About Me

I work on enabling robots to navigate and act in unstructured environments using foundation models.

Currently, I’m a student at UIUC pursuing an MS in Computer Science. I’m advised by Girish Chowdhary at the Distributed Autonomous Systems Lab. My research focuses on generalized navigation with Vision-Language-Action models and embodiment-aware grounding for Vision Large Language Models. This work is supported by NASA FireSense, AIFARMS, I-FARM, COALESCE, and NSF ACCESS.

Previously, I worked at EarthSense on hardware design, autonomy, and optimization for a 750 kg payload UGV. My contributions included developing the low-level control systems and implementing 4-wheel independent torque vectoring for the TerraMax robot’s dual Ackermann steering. I also developed steering controllers that were deployed across the fleet of 200+ smaller TerraSentia and TerraPreta robots.

I enjoy running and cycling, and I’m currently training to run a 5K in 25 minutes.

Education

University of Illinois Urbana-Champaign (UIUC)

MS in Computer Science

Relevant coursework: Humanoid Robotics, Autonomous Vehicle System Engineering, 3D Vision, Networked IoT, Software Engineering, Artificial Intelligence.

College of Engineering, Pune (COEP)

B.Tech in Mechanical Engineering, Minor in Computer Engineering

Relevant coursework: Machine Design, Robotics and Automation, Mechanics of Materials, Engineering Mechanics, Linear Algebra, Vector Calculus.

Publications

CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation

Aditya Potnis, Francisco Affonso, Shreya Gummadi, Naveen Kumar Uppalapati, Girish Chowdhary

Under Review.

Paper: Link | Website: Link

Zero-shot, embodiment-aware traversability framework using multimodal LLMs for costmap generation; visuosemantic caching reduces online VLM queries by 85.7%, achieving a 10% higher goal-reaching rate and 33% fewer behavioral constraint violations than state-of-the-art VLA baselines on a quadruped robot.

CTS-MoE: Implicit Policy Adaptation Enables Perceptive Locomotion on Discontinuous Terrain

Francisco Affonso, Matheus P. Angarola, Ana Luiza Mineiro, Aditya Potnis, Marcelo Becker, Girish Chowdhary

Under Review.

Concurrent teacher-student architecture for mixture-of-experts, enabling implicit policy adaptation for perceptive locomotion on discontinuous terrain.

Visual-Language-Guided Task Planning for Horticultural Robots

Jose Cuaran, Kendall Koe, Aditya Potnis, Naveen Kumar Uppalapati, Girish Chowdhary

Computers and Electronics in Agriculture.

Paper: Link | Website: Link

Modular VLM-guided framework for precision agriculture that interleaves natural language queries with action primitives for autonomous crop monitoring. Benchmarked long-horizon planning with MLLMs, finding human-comparable performance on short-horizon tasks but degradation in long-horizon scenarios with noisy semantic maps.

more coming soon …

Competitive Robotics

I am a FIRST and WRO alumnus and have participated in robotics competitions since 2013. I have represented India internationally in robotics competitions three times, and I mentored a team that won Runner-up for Best Project Research at the FLL Europe Opens in Estonia in 2018. I continue to mentor teams at Robominds for FLL, FTC, and VEX Robotics.

Achievements

Projects

Visual Navigation Policy with V-JEPA2 and Diffusion Policy

Code: Link

Trained a visual navigation policy using a frozen V-JEPA2 backbone for spatiotemporal feature extraction from egocentric video, with a Diffusion Policy head for action prediction.

Adapt: Diffusion-Predicted Pedestrian Avoidance with MPPI Control on the Polaris GEM e4 (CS 588)

Report: Link

Autonomous driving stack on the Polaris GEM e4 that pairs a diffusion-based pedestrian predictor with an MPPI planner for pedestrian-aware avoidance, plus a text-promptable LiDAR-camera module for open-vocabulary goals.

Low-Rank Adaptation for Video Generation with semantic relative pose prompts (CS 598 3D Vision, HACKER Project)

Code: Link

Fine-tuned Wan 2.2 (1.3B) with LoRA to generate robot-POV navigation videos from text and motion plans for scalable outdoor data synthesis. Evaluated motion fidelity and failure modes, proposing hierarchical motion-primitive curriculum training to improve alignment.

GhibliDream – Studio-Ghibli inspired Stylization of Stable Diffusion (CS 444)

Report: Link

Fine-tuned Stable Diffusion 2.0 with DreamBooth on curated Ghibli images using LLM-assisted auto-captioning, achieving 0.90+ CLIP-I cosine similarity on foreground characters while retaining background quality.
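For readers unfamiliar with the metric, CLIP-I is commonly computed as the mean pairwise cosine similarity between CLIP image embeddings of generated and reference images. The sketch below shows only that averaging step in plain Python. The helper names `cosine_similarity` and `clip_i_score` are hypothetical, and the toy 3-D vectors stand in for real CLIP embeddings; this is not the project's evaluation code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def clip_i_score(generated, reference):
    """Mean pairwise cosine similarity between two sets of embeddings.

    Hypothetical helper: in practice the inputs would be CLIP image
    embeddings of generated and reference images.
    """
    sims = [cosine_similarity(g, r) for g in generated for r in reference]
    return sum(sims) / len(sims)

# Toy vectors with made-up values, used only to exercise the function.
generated = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]
reference = [[1.0, 0.0, 0.0]]
score = clip_i_score(generated, reference)
print(f"CLIP-I (toy data): {score:.2f}")
```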

Salto Simulator for development

Code: Link

Gazebo plugin simulating Salto-1P jumping motion as a point object, supporting parabolic trajectory jumping and simulated odometry/pose estimation to accelerate autonomy development.
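As a rough illustration of the parabolic jump model described above, the sketch below samples the drag-free ballistic arc of a point mass in Python. It is not taken from the plugin, which runs inside Gazebo. The function name `parabolic_jump`, its parameters, and the flat-ground assumption are all hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def parabolic_jump(speed, pitch_deg, steps=20):
    """Sample (x, z) points along a drag-free ballistic arc over flat ground.

    Hypothetical helper for illustration only; not the plugin's API.
    """
    pitch = math.radians(pitch_deg)
    vx = speed * math.cos(pitch)  # horizontal take-off velocity
    vz = speed * math.sin(pitch)  # vertical take-off velocity
    flight_time = 2.0 * vz / G    # time until the point returns to z = 0
    points = []
    for i in range(steps + 1):
        t = flight_time * i / steps
        points.append((vx * t, vz * t - 0.5 * G * t * t))
    return points

trajectory = parabolic_jump(speed=3.0, pitch_deg=60.0)
apex = max(z for _, z in trajectory)
print(f"range: {trajectory[-1][0]:.2f} m, apex: {apex:.2f} m")
```

Treating the robot as a point mass keeps the model to two take-off parameters (speed and pitch), which is usually enough for exercising higher-level planning and state-estimation code before moving to a full dynamics model.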

Design and Analysis of Tendon actuated Robotic arm using Bowden cables and Mechanical Multiplexing (Thesis @ COEP)

Report: Link

Designed a modular 4-DOF tendon-actuated robot arm driven by Bowden cables, with a mechanical multiplexer that enables full control using only 2 stepper motors. Prototyped it with FDM printing and implemented FK/IK solvers. Awarded Best Working Project (Mechanical) at COEP; provisional patent filed (IN202321006687).

ROS1-ROS2 bridge for faster debugging and code refactoring

  • Created a Docker image generation repo to accelerate image creation and facilitate easier refactoring. Link

VLM-assisted Octomap generation for navigation

Implemented open-vocabulary segmentation with CLIPSeg and stereo depth-based point cloud generation via Open3D to build octomaps for 3D robot navigation, tested on the NVIDIA r2b_2023 dataset.

Semantic-aware segmentation and navigation using CLIPSeg (CS 440)

Slides: Link | Report: Link

Combined CLIPSeg with Depth-Anything V2 for obstacle-aware gridmap generation (94% accuracy); a modified A* planner with goal-object validation achieved an 81% success rate for line-of-sight pathfinding.

