Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Volkan Isler

Uncertainty Guided Exploratory Trajectory Optimization for Sampling-Based Model Predictive Control

Apr 13, 2026

O. Goktug Poyrazoglu, Yukang Cao, Rahul Moorthy, Volkan Isler

Abstract:Trajectory optimization depends heavily on initialization. In particular, sampling-based approaches are highly sensitive to initial solutions, and limited exploration frequently leads them to converge to local minima in complex environments. We present Uncertainty Guided Exploratory Trajectory Optimization (UGE-TO), a trajectory optimization algorithm that generates well-separated samples to achieve a better coverage of the configuration space. UGE-TO represents trajectories as probability distributions induced by uncertainty ellipsoids. Unlike sampling-based approaches that explore only in the action space, this representation captures the effects of both system dynamics and action selection. By incorporating the impact of dynamics, in addition to the action space, into our distributions, our method enhances trajectory diversity by enforcing distributional separation via the Hellinger distance between them. It enables a systematic exploration of the configuration space and improves robustness against local minima. Further, we present UGE-MPC, which integrates UGE-TO into sampling-based model predictive controller methods. Experiments demonstrate that UGE-MPC achieves higher exploration and faster convergence in trajectory optimization compared to baselines under the same sampling budget, achieving 72.1% faster convergence in obstacle-free environments and 66% faster convergence with a 6.7% higher success rate in the cluttered environment compared to the best-performing baseline. Additionally, we validate the approach through a range of simulation scenarios and real-world experiments. Our results indicate that UGE-MPC has higher success rates and faster convergence, especially in environments that demand significant deviations from nominal trajectories to avoid failures. The project and code are available at https://ogpoyrazoglu.github.io/cuniform_sampling/.

* This paper has been accepted for presentation at the IEEE International Conference on Robotics and Automation (ICRA) 2026

Via

Access Paper or Ask Questions

SV-GS: Sparse View 4D Reconstruction with Skeleton-Driven Gaussian Splatting

Jan 01, 2026

Jun-Jee Chao, Volkan Isler

Abstract:Reconstructing a dynamic target moving over a large area is challenging. Standard approaches for dynamic object reconstruction require dense coverage in both the viewing space and the temporal dimension, typically relying on multi-view videos captured at each time step. However, such setups are only possible in constrained environments. In real-world scenarios, observations are often sparse over time and captured sparsely from diverse viewpoints (e.g., from security cameras), making dynamic reconstruction highly ill-posed. We present SV-GS, a framework that simultaneously estimates a deformation model and the object's motion over time under sparse observations. To initialize SV-GS, we leverage a rough skeleton graph and an initial static reconstruction as inputs to guide motion estimation. (Later, we show that this input requirement can be relaxed.) Our method optimizes a skeleton-driven deformation field composed of a coarse skeleton joint pose estimator and a module for fine-grained deformations. By making only the joint pose estimator time-dependent, our model enables smooth motion interpolation while preserving learned geometric details. Experiments on synthetic datasets show that our method outperforms existing approaches under sparse observations by up to 34% in PSNR, and achieves comparable performance to dense monocular video methods on real-world datasets despite using significantly fewer frames. Moreover, we demonstrate that the input initial static reconstruction can be replaced by a diffusion-based generative prior, making our method more practical for real-world scenarios.

Via

Access Paper or Ask Questions

Fast Navigation Through Occluded Spaces via Language-Conditioned Map Prediction

Dec 24, 2025

Rahul Moorthy Mahesh, Oguzhan Goktug Poyrazoglu, Yukang Cao, Volkan Isler

Abstract:In cluttered environments, motion planners often face a trade-off between safety and speed due to uncertainty caused by occlusions and limited sensor range. In this work, we investigate whether co-pilot instructions can help robots plan more decisively while remaining safe. We introduce PaceForecaster, as an approach that incorporates such co-pilot instructions into local planners. PaceForecaster takes the robot's local sensor footprint (Level-1) and the provided co-pilot instructions as input and predicts (i) a forecasted map with all regions visible from Level-1 (Level-2) and (ii) an instruction-conditioned subgoal within Level-2. The subgoal provides the planner with explicit guidance to exploit the forecasted environment in a goal-directed manner. We integrate PaceForecaster with a Log-MPPI controller and demonstrate that using language-conditioned forecasts and goals improves navigation performance by 36% over a local-map-only baseline while in polygonal environments.

Via

Access Paper or Ask Questions

Optimal Robotic Velcro Peeling with Force Feedback

Jun 06, 2025

Jiacheng Yuan, Changhyun Choi, Volkan Isler

Abstract:We study the problem of peeling a Velcro strap from a surface using a robotic manipulator. The surface geometry is arbitrary and unknown. The robot has access to only the force feedback and its end-effector position. This problem is challenging due to the partial observability of the environment and the incompleteness of the sensor feedback. To solve it, we first model the system with simple analytic state and action models based on quasi-static dynamics assumptions. We then study the fully-observable case where the state of both the Velcro and the robot are given. For this case, we obtain the optimal solution in closed-form which minimizes the total energy cost. Next, for the partially-observable case, we design a state estimator which estimates the underlying state using only force and position feedback. Then, we present a heuristics-based controller that balances exploratory and exploitative behaviors in order to peel the velcro efficiently. Finally, we evaluate our proposed method in environments with complex geometric uncertainties and sensor noises, achieving 100% success rate with less than 80% increase in energy cost compared to the optimal solution when the environment is fully-observable, outperforming the baselines by a large margin.

Via

Access Paper or Ask Questions

VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction and Recognition

Oct 07, 2024

Rahul Moorthy, Volkan Isler

Abstract:The capability to learn latent representations plays a key role in the effectiveness of recent machine learning methods. An active frontier in representation learning is understanding representations for combinatorial structures which may not admit well-behaved local neighborhoods or distance functions. For example, for polygons, slightly perturbing vertex locations might lead to significant changes in their combinatorial structure and may even lead to invalid polygons. In this paper, we investigate representations to capture the underlying combinatorial structures of polygons. Specifically, we study the open problem of Visibility Reconstruction: Given a visibility graph G, construct a polygon P whose visibility graph is G. We introduce VisDiff, a novel diffusion-based approach to reconstruct a polygon from its given visibility graph G. Our method first estimates the signed distance function (SDF) of P from G. Afterwards, it extracts ordered vertex locations that have the pairwise visibility relationship given by the edges of G. Our main insight is that going through the SDF significantly improves learning for reconstruction. In order to train VisDiff, we make two main contributions: (1) We design novel loss components for computing the visibility in a differentiable manner and (2) create a carefully curated dataset. We use this dataset to benchmark our method and achieve 21% improvement in F1-Score over standard methods. We also demonstrate effective generalization to out-of-distribution polygon types and show that learning a generative model allows us to sample the set of polygons with a given visibility graph. Finally, we extend our method to the related combinatorial problem of reconstruction from a triangulation. We achieve 95% classification accuracy of triangulation edges and a 4% improvement in Chamfer distance compared to current architectures.

Via

Access Paper or Ask Questions

Neural L1 Adaptive Control of Vehicle Lateral Dynamics

May 25, 2024

Pratik Mukherjee, Burak M. Gonultas, O. Goktug Poyrazoglu, Volkan Isler

Abstract:We address the problem of stable and robust control of vehicles with lateral error dynamics for the application of lane keeping. Lane departure is the primary reason for half of the fatalities in road accidents, making the development of stable, adaptive and robust controllers a necessity. Traditional linear feedback controllers achieve satisfactory tracking performance, however, they exhibit unstable behavior when uncertainties are induced into the system. Any disturbance or uncertainty introduced to the steering-angle input can be catastrophic for the vehicle. Therefore, controllers must be developed to actively handle such uncertainties. In this work, we introduce a Neural L1 Adaptive controller (Neural-L1) which learns the uncertainties in the lateral error dynamics of a front-steered Ackermann vehicle and guarantees stability and robustness. Our contributions are threefold: i) We extend the theoretical results for guaranteed stability and robustness of conventional L1 Adaptive controllers to Neural-L1; ii) We implement a Neural-L1 for the lane keeping application which learns uncertainties in the dynamics accurately; iii)We evaluate the performance of Neural-L1 on a physics-based simulator, PyBullet, and conduct extensive real-world experiments with the F1TENTH platform to demonstrate superior reference trajectory tracking performance of Neural-L1 compared to other state-of-the-art controllers, in the presence of uncertainties. Our project page, including supplementary material and videos, can be found at https://mukhe027.github.io/Neural-Adaptive-Control/

Via

Access Paper or Ask Questions

Learning to Play Pursuit-Evasion with Dynamic and Sensor Constraints

May 08, 2024

Burak M. Gonultas, Volkan Isler

Abstract:We present a multi-agent reinforcement learning approach to solve a pursuit-evasion game between two players with car-like dynamics and sensing limitations. We develop a curriculum for an existing multi-agent deterministic policy gradient algorithm to simultaneously obtain strategies for both players, and deploy the learned strategies on real robots moving as fast as 2 m/s in indoor environments. Through experiments we show that the learned strategies improve over existing baselines by up to 30% in terms of capture rate for the pursuer. The learned evader model has up to 5% better escape rate over the baselines even against our competitive pursuer model. We also present experiment results which show how the pursuit-evasion game and its results evolve as the player dynamics and sensor constraints are varied. Finally, we deploy learned policies on physical robots for a game between the F1TENTH and JetRacer platforms and show that the learned strategies can be executed on real-robots. Our code and supplementary material including videos from experiments are available at https: //gonultasbu.github.io/pursuit-evasion/.

Via

Access Paper or Ask Questions

Map-Aware Human Pose Prediction for Robot Follow-Ahead

Mar 20, 2024

Qingyuan Jiang, Burak Susam, Jun-Jee Chao, Volkan Isler

Abstract:In the robot follow-ahead task, a mobile robot is tasked to maintain its relative position in front of a moving human actor while keeping the actor in sight. To accomplish this task, it is important that the robot understand the full 3D pose of the human (since the head orientation can be different than the torso) and predict future human poses so as to plan accordingly. This prediction task is especially tricky in a complex environment with junctions and multiple corridors. In this work, we address the problem of forecasting the full 3D trajectory of a human in such environments. Our main insight is to show that one can first predict the 2D trajectory and then estimate the full 3D trajectory by conditioning the estimator on the predicted 2D trajectory. With this approach, we achieve results comparable or better than the state-of-the-art methods three times faster. As part of our contribution, we present a new dataset where, in contrast to existing datasets, the human motion is in a much larger area than a single room. We also present a complete robot system that integrates our human pose forecasting network on the mobile robot to enable real-time robot follow-ahead and present results from real-world experiments in multiple buildings on campus. Our project page, including supplementary material and videos, can be found at: https://qingyuan-jiang.github.io/iros2024_poseForecasting/

Via

Access Paper or Ask Questions

FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection

Dec 14, 2023

Hongsuk Choi, Isaac Kasahara, Selim Engin, Moritz Graule, Nikhil Chavan-Dafle, Volkan Isler

Abstract:Recently introduced ControlNet has the ability to steer the text-driven image generation process with geometric input such as human 2D pose, or edge features. While ControlNet provides control over the geometric form of the instances in the generated image, it lacks the capability to dictate the visual appearance of each instance. We present FineControlNet to provide fine control over each instance's appearance while maintaining the precise pose control capability. Specifically, we develop and demonstrate FineControlNet with geometric control via human pose images and appearance control via instance-level text prompts. The spatial alignment of instance-specific text prompts and 2D poses in latent space enables the fine control capabilities of FineControlNet. We evaluate the performance of FineControlNet with rigorous comparison against state-of-the-art pose-conditioned text-to-image diffusion models. FineControlNet achieves superior performance in generating images that follow the user-provided instance-specific text prompts and poses compared with existing methods. Project webpage: https://samsunglabs.github.io/FineControlNet-project-page

* Hongsuk Choi and Isaac Kasahara have eqaul contributions. 19 pages, 15 figures, 3 tables

Via

Access Paper or Ask Questions

VioLA: Aligning Videos to 2D LiDAR Scans

Nov 08, 2023

Jun-Jee Chao, Selim Engin, Nikhil Chavan-Dafle, Bhoram Lee, Volkan Isler

Figure 1 for VioLA: Aligning Videos to 2D LiDAR Scans

Figure 2 for VioLA: Aligning Videos to 2D LiDAR Scans

Figure 3 for VioLA: Aligning Videos to 2D LiDAR Scans

Figure 4 for VioLA: Aligning Videos to 2D LiDAR Scans

Abstract:We study the problem of aligning a video that captures a local portion of an environment to the 2D LiDAR scan of the entire environment. We introduce a method (VioLA) that starts with building a semantic map of the local scene from the image sequence, then extracts points at a fixed height for registering to the LiDAR map. Due to reconstruction errors or partial coverage of the camera scan, the reconstructed semantic map may not contain sufficient information for registration. To address this problem, VioLA makes use of a pre-trained text-to-image inpainting model paired with a depth completion model for filling in the missing scene content in a geometrically consistent fashion to support pose registration. We evaluate VioLA on two real-world RGB-D benchmarks, as well as a self-captured dataset of a large office scene. Notably, our proposed scene completion module improves the pose registration performance by up to 20%.

* 8 pages

Via

Access Paper or Ask Questions