About Me
I work on enabling robots to navigate and act in unstructured environments using foundation models.
Currently, I’m a student at UIUC pursuing an MS in Computer Science. I’m advised by Girish Chowdhary at the Distributed Autonomous Systems Lab. My research focuses on Generalized Navigation with Vision-Language-Action models and embodiment-aware grounding for Vision Large Language Models. My research is supported by NASA FireSense, AIFARMS, I-FARM, COALESCE, and NSF ACCESS.
Previously, I worked at EarthSense on hardware design, autonomy, and optimization for a 750 kg payload UGV. My contributions included developing the low-level control systems and implementing 4-wheel independent torque vectoring for the TerraMax robot’s dual Ackermann steering. I also developed steering controllers that were deployed across the fleet of 200+ smaller TerraSentia and TerraPreta robots.
I enjoy running and cycling, and I am currently training to run a 5K in 25 minutes.
Education
University of Illinois Urbana-Champaign 
MS in Computer Science
Relevant coursework: Humanoid Robotics, Autonomous Vehicle System Engineering, 3D Vision, Networked IoT, Software Engineering, Artificial Intelligence.
College of Engineering, Pune 
B.Tech in Mechanical Engineering, Minor in Computer Engineering
Relevant coursework: Machine Design, Robotics and Automation, Mechanics of Materials, Engineering Mechanics, Linear Algebra, Vector Calculus.
Publications
CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation
Aditya Potnis, Francisco Affonso, Shreya Gummadi, Naveen Kumar Uppalapati, Girish Chowdhary
Under Review.
Zero-shot embodiment-aware traversability framework using multimodal LLMs for costmap generation; visuosemantic caching reduces online VLM queries by 85.7%, achieving 10% higher goal-reaching rate and 33% fewer behavioral constraint violations versus state-of-the-art VLA baselines on a quadruped robot.
CTS-MoE: Implicit Policy Adaptation Enables Perceptive Locomotion on Discontinuous Terrain
Francisco Affonso, Matheus P. Angarola, Ana Luiza Mineiro, Aditya Potnis, Marcelo Becker, Girish Chowdhary
Under Review.
Concurrent teacher-student architecture for a mixture of experts, enabling implicit policy adaptation for perceptive locomotion on discontinuous terrain.
Visual-Language-Guided Task Planning for Horticultural Robots
Jose Cuaran, Kendall Koe, Aditya Potnis, Naveen Kumar Uppalapati, Girish Chowdhary
Computers and Electronics in Agriculture.
Modular VLM-guided framework for precision agriculture that interleaves natural language queries with action primitives for autonomous crop monitoring. Benchmarked long-horizon planning with MLLMs, finding human-comparable performance on short-horizon tasks but degradation in long-horizon scenarios with noisy semantic maps.
more coming soon …
Competitive Robotics

I am a FIRST and WRO alumnus and have participated in robotics competitions since 2013. I have represented India internationally in robotics competitions three times and mentored a team that was runner-up for Best Project Research at the FLL Europe Opens in Estonia in 2018. I continue to mentor teams at Robominds for FLL, FTC, and VEX Robotics.
Achievements
- Team India Coach, 2018 FIRST Lego League European Championships, Tallinn, Estonia. The team placed 2nd for Best Project Research and 4th overall in the Robot Game.
- Team India Representative, World Robot Olympiad Internationals 2017 (Costa Rica).
- Team India Representative, World Robot Olympiad Internationals 2016 (New Delhi, India); Gold Medal in the 2016 Nationals.
- Team India Representative, Top 10 in the FIRST Lego League Open European Championships, Spain (2016), fully sponsored by Tata Motors.
Projects
Visual Navigation Policy with V-JEPA2 and Diffusion Policy
Code: Link
Trained a visual navigation policy using a frozen V-JEPA2 backbone for spatiotemporal feature extraction from egocentric video, with a Diffusion Policy action head for action prediction.
Adapt: Diffusion-Predicted Pedestrian Avoidance with MPPI Control on the Polaris GEM e4 (CS 588)
Report: Link
Autonomous driving stack on the Polaris GEM e4 that pairs a diffusion-based pedestrian predictor with an MPPI planner for pedestrian-aware avoidance, plus a text-promptable LiDAR-camera module for open-vocabulary goals.
Low-Rank Adaptation for Video Generation with semantic relative pose prompts (CS 598 3D Vision, HACKER Project)
Code: Link
Fine-tuned Wan 2.2 (1.3B) with LoRA to generate robot-POV navigation videos from text and motion plans for scalable outdoor data synthesis. Evaluated motion fidelity and failure modes, proposing hierarchical motion-primitive curriculum training to improve alignment.
GhibliDream – Studio-Ghibli inspired Stylization of Stable Diffusion (CS 444)
Report: Link
Fine-tuned Stable Diffusion 2.0 with DreamBooth on curated Ghibli images using LLM-assisted auto-captioning, achieving 0.90+ CLIP-I cosine similarity on foreground characters while retaining background quality.
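For readers unfamiliar with the metric, CLIP-I is usually computed as the mean pairwise cosine similarity between CLIP image embeddings of generated and reference images. The sketch below illustrates only that averaging step on plain vectors; the function names are hypothetical, the embeddings are assumed to be precomputed, and this is not the project's evaluation code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def clip_i(generated, references):
    """Mean pairwise cosine similarity between two sets of image embeddings.

    Both arguments are lists of vectors that are assumed to come from a
    CLIP image encoder (not shown here).
    """
    scores = [cosine_similarity(g, r) for g in generated for r in references]
    return sum(scores) / len(scores)
```

A score near 1.0 means the generated images sit close to the references in CLIP embedding space.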
Salto Simulator for development
Code: Link
Gazebo plugin simulating Salto-1P jumping motion as a point object, supporting parabolic trajectory jumping and simulated odometry/pose estimation to accelerate autonomy development.
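As a rough illustration of the point-object model, a drag-free ballistic jump can be sampled as below. This is a minimal Python sketch under assumed conventions (launch speed, pitch, and yaw as inputs, landing at the launch height); the function name and parameters are hypothetical and it is not the plugin's code.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def jump_trajectory(speed, pitch_rad, yaw_rad=0.0, start=(0.0, 0.0, 0.0), dt=0.01):
    """Return sampled (t, x, y, z) points of a drag-free ballistic jump.

    The robot is treated as a point mass launched from `start`; the
    trajectory ends when it returns to the launch height.
    """
    vx = speed * math.cos(pitch_rad) * math.cos(yaw_rad)
    vy = speed * math.cos(pitch_rad) * math.sin(yaw_rad)
    vz = speed * math.sin(pitch_rad)
    if vz <= 0.0:
        raise ValueError("launch must have an upward velocity component")

    flight_time = 2.0 * vz / G  # time to come back to the launch height
    steps = max(1, math.ceil(flight_time / dt))
    points = []
    for i in range(steps + 1):
        t = flight_time * i / steps
        x = start[0] + vx * t
        y = start[1] + vy * t
        z = start[2] + vz * t - 0.5 * G * t * t
        points.append((t, x, y, z))
    return points
```

The final sample gives the landing point, which a simulated odometry source could report as the post-jump pose.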
Design and Analysis of a Tendon-Actuated Robotic Arm using Bowden Cables and Mechanical Multiplexing (Thesis @ COEP)
Report: Link
Designed a modular 4-DOF Bowden cable-driven tendon-actuated robot arm with a mechanical multiplexer enabling full control with only 2 stepper motors, prototyped with FDM printing and including FK/IK solvers. Awarded Best Working Project (Mechanical) at COEP; provisional patent filed (IN202321006687).
ROS1-ROS2 bridge for faster debugging and code refactoring
- Created a Docker image generation repo to accelerate image creation and facilitate easier refactoring. Link
VLM-assisted Octomap generation for navigation
Implemented open-vocabulary segmentation with CLIPSeg and stereo depth-based point cloud generation via Open3D to build octomaps for 3D robot navigation, tested on the NVIDIA r2b_2023 dataset.
Semantic-aware segmentation and navigation using CLIPSeg (CS 440)
Combined CLIPSeg with Depth-Anything V2 for obstacle-aware gridmap generation (94% accuracy); a modified A* planner with goal-object validation achieved an 81% success rate for line-of-sight pathfinding.
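The planning step can be pictured with a small grid-based A* search. The sketch below is an assumption-laden illustration, not the project's planner: the grid encoding (0 = free, 1 = obstacle), the `labels` mapping from cells to semantic labels, and the specific validation rule (the goal cell must be free and carry the requested label) are all hypothetical.

```python
import heapq


def plan_to_object(grid, labels, start, goal, target):
    """4-connected A* on an occupancy grid with a goal-label check.

    Returns a list of (row, col) cells from start to goal, or None if the
    goal fails validation or no path exists.
    """
    rows, cols = len(grid), len(grid[0])

    # Goal validation (assumed form): reject goals that are blocked or
    # that are not labelled as the requested object.
    if grid[goal[0]][goal[1]] != 0 or labels.get(goal) != target:
        return None

    def heuristic(cell):
        # Manhattan distance is admissible for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

Checking the goal before searching keeps the planner from spending time on targets that the segmentation step did not confirm.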
