Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pratik Chaudhari

Active Perception using Neural Radiance Fields

Oct 15, 2023

Siming He, Christopher D. Hsu, Dexter Ong, Yifei Simon Shao, Pratik Chaudhari

Abstract:We study active perception from first principles to argue that an autonomous agent performing active perception should maximize the mutual information that past observations posses about future ones. Doing so requires (a) a representation of the scene that summarizes past observations and the ability to update this representation to incorporate new observations (state estimation and mapping), (b) the ability to synthesize new observations of the scene (a generative model), and (c) the ability to select control trajectories that maximize predictive information (planning). This motivates a neural radiance field (NeRF)-like representation which captures photometric, geometric and semantic properties of the scene grounded. This representation is well-suited to synthesizing new observations from different viewpoints. And thereby, a sampling-based planner can be used to calculate the predictive information from synthetic observations along dynamically-feasible trajectories. We use active perception for exploring cluttered indoor environments and employ a notion of semantic uncertainty to check for the successful completion of an exploration task. We demonstrate these ideas via simulation in realistic 3D indoor environments.

Via

Access Paper or Ask Questions

TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards

Oct 03, 2023

Derek Cheng, Fernando Cladera Ojeda, Ankit Prabhu, Xu Liu, Alan Zhu, Patrick Corey Green, Reza Ehsani, Pratik Chaudhari, Vijay Kumar

Figure 1 for TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards

Figure 2 for TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards

Figure 3 for TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards

Figure 4 for TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards

Abstract:Data collection for forestry, timber, and agriculture currently relies on manual techniques which are labor-intensive and time-consuming. We seek to demonstrate that robotics offers improvements over these techniques and accelerate agricultural research, beginning with semantic segmentation and diameter estimation of trees in forests and orchards. We present TreeScope v1.0, the first robotics dataset for precision agriculture and forestry addressing the counting and mapping of trees in forestry and orchards. TreeScope provides LiDAR data from agricultural environments collected with robotics platforms, such as UAV and mobile robot platforms carried by vehicles and human operators. In the first release of this dataset, we provide ground-truth data with over 1,800 manually annotated semantic labels for tree stems and field-measured tree diameters. We share benchmark scripts for these tasks that researchers may use to evaluate the accuracy of their algorithms. Finally, we run our open-source diameter estimation and off-the-shelf semantic segmentation algorithms and share our baseline results.

* Submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA 2024) for review

Via

Access Paper or Ask Questions

Design and Evaluation of Motion Planners for Quadrotors

Sep 24, 2023

Yifei Simon Shao, Yuwei Wu, Laura Jarin-Lipschitz, Pratik Chaudhari, Vijay Kumar

Figure 1 for Design and Evaluation of Motion Planners for Quadrotors

Figure 2 for Design and Evaluation of Motion Planners for Quadrotors

Figure 3 for Design and Evaluation of Motion Planners for Quadrotors

Figure 4 for Design and Evaluation of Motion Planners for Quadrotors

Abstract:The field of quadrotor motion planning has experienced significant advancements over the last decade. Most successful approaches rely on two stages: a front-end that determines the best path by incorporating geometric (and in some cases kinematic or input) constraints, that effectively specify the homotopy class of the trajectory; and a back-end that optimizes the path with a suitable objective function, constrained by the robot's dynamics as well as state/input constraints. However, there is no systematic approach or design guidelines to design both the front and the back ends for a wide range of environments, and no literature evaluates the performance of the trajectory planning algorithm with varying degrees of environment complexity. In this paper, we propose a modular approach to designing the software planning stack and offer a parameterized set of environments to systematically evaluate the performance of two-stage planners. Our parametrized environments enable us to access different front and back-end planners as a function of environmental clutter and complexity. We use simulation and experimental results to demonstrate the performance of selected planning algorithms across a range of environments. Finally, we open source the planning/evaluation stack and parameterized environments to facilitate more in-depth studies of quadrotor motion planning, available at https://github.com/KumarRobotics/kr_mp_design

Via

Access Paper or Ask Questions

Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience

Aug 06, 2023

Rongguang Wang, Guray Erus, Pratik Chaudhari, Christos Davatzikos

Figure 1 for Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience

Figure 2 for Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience

Figure 3 for Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience

Figure 4 for Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience

Abstract:Machine learning (ML) has shown great promise for revolutionizing a number of areas, including healthcare. However, it is also facing a reproducibility crisis, especially in medicine. ML models that are carefully constructed from and evaluated on a training set might not generalize well on data from different patient populations or acquisition instrument settings and protocols. We tackle this problem in the context of neuroimaging of Alzheimer's disease (AD), schizophrenia (SZ) and brain aging. We develop a weighted empirical risk minimization approach that optimally combines data from a source group, e.g., subjects are stratified by attributes such as sex, age group, race and clinical cohort to make predictions on a target group, e.g., other sex, age group, etc. using a small fraction (10%) of data from the target group. We apply this method to multi-source data of 15,363 individuals from 20 neuroimaging studies to build ML models for diagnosis of AD and SZ, and estimation of brain age. We found that this approach achieves substantially better accuracy than existing domain adaptation techniques: it obtains area under curve greater than 0.95 for AD classification, area under curve greater than 0.7 for SZ classification and mean absolute error less than 5 years for brain age prediction on all target groups, achieving robustness to variations of scanners, protocols, and demographic or clinical characteristics. In some cases, it is even better than training on all data from the target group, because it leverages the diversity and size of a larger training set. We also demonstrate the utility of our models for prognostic tasks such as predicting disease progression in individuals with mild cognitive impairment. Critically, our brain age prediction models lead to new clinical insights regarding correlations with neurophysiological tests.

Via

Access Paper or Ask Questions

Budgeting Counterfactual for Offline RL

Jul 12, 2023

Yao Liu, Pratik Chaudhari, Rasool Fakoor

Figure 1 for Budgeting Counterfactual for Offline RL

Figure 2 for Budgeting Counterfactual for Offline RL

Figure 3 for Budgeting Counterfactual for Offline RL

Figure 4 for Budgeting Counterfactual for Offline RL

Abstract:The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action? These circumstances frequently give rise to extrapolation errors, which tend to accumulate exponentially with the problem horizon. Hence, it becomes crucial to acknowledge that not all decision steps are equally important to the final outcome, and to budget the number of counterfactual decisions a policy make in order to control the extrapolation. Contrary to existing approaches that use regularization on either the policy or value function, we propose an approach to explicitly bound the amount of out-of-distribution actions during training. Specifically, our method utilizes dynamic programming to decide where to extrapolate and where not to, with an upper bound on the decisions different from behavior policy. It balances between the potential for improvement from taking out-of-distribution actions and the risk of making errors due to extrapolation. Theoretically, we justify our method by the constrained optimality of the fixed point solution to our $Q$ updating rules. Empirically, we show that the overall performance of our method is better than the state-of-the-art offline RL methods on tasks in the widely-used D4RL benchmarks.

Via

Access Paper or Ask Questions

Taming AI Bots: Controllability of Neural States in Large Language Models

May 29, 2023

Stefano Soatto, Paulo Tabuada, Pratik Chaudhari, Tian Yu Liu

Abstract:We tackle the question of whether an agent can, by suitable choice of prompts, control an AI bot to any state. To that end, we first introduce a formal definition of ``meaning'' that is amenable to analysis. Then, we characterize ``meaningful data'' on which large language models (LLMs) are ostensibly trained, and ``well-trained LLMs'' through conditions that are largely met by today's LLMs. While a well-trained LLM constructs an embedding space of meanings that is Euclidean, meanings themselves do not form a vector (linear) subspace, but rather a quotient space within. We then characterize the subset of meanings that can be reached by the state of the LLMs for some input prompt, and show that a well-trained bot can reach any meaning albeit with small probability. We then introduce a stronger notion of controllability as {\em almost certain reachability}, and show that, when restricted to the space of meanings, an AI bot is controllable. We do so after introducing a functional characterization of attentive AI bots, and finally derive necessary and sufficient conditions for controllability. The fact that AI bots are controllable means that an adversary could steer them towards any state. However, the sampling process can be designed to counteract adverse actions and avoid reaching undesirable regions of state space before their boundary is crossed.

* TLDR: AI Bots are stochastic dynamical systems whose mental state can be controlled by both the user and the designer. The space of meanings, defined as equivalence classes of sentences, is learned during fine-tuning with human supervision, and safeguarding can be designed into the bot by establishing controls both at its input and output

Via

Access Paper or Ask Questions

Learning Capacity: A Measure of the Effective Dimensionality of a Model

May 27, 2023

Daiwei Chen, Weikai Chang, Pratik Chaudhari

Figure 1 for Learning Capacity: A Measure of the Effective Dimensionality of a Model

Figure 2 for Learning Capacity: A Measure of the Effective Dimensionality of a Model

Figure 3 for Learning Capacity: A Measure of the Effective Dimensionality of a Model

Figure 4 for Learning Capacity: A Measure of the Effective Dimensionality of a Model

Abstract:We exploit a formal correspondence between thermodynamics and inference, where the number of samples can be thought of as the inverse temperature, to define a "learning capacity'' which is a measure of the effective dimensionality of a model. We show that the learning capacity is a tiny fraction of the number of parameters for many deep networks trained on typical datasets, depends upon the number of samples used for training, and is numerically consistent with notions of capacity obtained from the PAC-Bayesian framework. The test error as a function of the learning capacity does not exhibit double descent. We show that the learning capacity of a model saturates at very small and very large sample sizes; this provides guidelines, as to whether one should procure more data or whether one should search for new architectures, to improve performance. We show how the learning capacity can be used to understand the effective dimensionality, even for non-parametric models such as random forests and $k$-nearest neighbor classifiers.

Via

Access Paper or Ask Questions

The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold

May 02, 2023

Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari

Abstract:We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures, sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations lie on the same manifold in the prediction space. We study the details of this manifold to find that networks with different architectures follow distinguishable trajectories but other factors have a minimal influence; larger networks train along a similar manifold as that of smaller networks, just faster; and networks initialized at very different parts of the prediction space converge to the solution along a similar manifold.

Via

Access Paper or Ask Questions

Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis

Apr 22, 2023

Yansong Gao, Zhihong Pan, Xin Zhou, Le Kang, Pratik Chaudhari

Figure 1 for Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis

Figure 2 for Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis

Figure 3 for Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis

Figure 4 for Fast Diffusion Probabilistic Model Sampling through the lens of Backward Error Analysis

Abstract:Denoising diffusion probabilistic models (DDPMs) are a class of powerful generative models. The past few years have witnessed the great success of DDPMs in generating high-fidelity samples. A significant limitation of the DDPMs is the slow sampling procedure. DDPMs generally need hundreds or thousands of sequential function evaluations (steps) of neural networks to generate a sample. This paper aims to develop a fast sampling method for DDPMs requiring much fewer steps while retaining high sample quality. The inference process of DDPMs approximates solving the corresponding diffusion ordinary differential equations (diffusion ODEs) in the continuous limit. This work analyzes how the backward error affects the diffusion ODEs and the sample quality in DDPMs. We propose fast sampling through the \textbf{Restricting Backward Error schedule (RBE schedule)} based on dynamically moderating the long-time backward error. Our method accelerates DDPMs without any further training. Our experiments show that sampling with an RBE schedule generates high-quality samples within only 8 to 20 function evaluations on various benchmark datasets. We achieved 12.01 FID in 8 function evaluations on the ImageNet $128\times128$, and a $20\times$ speedup compared with previous baseline samplers.

* arXiv admin note: text overlap with arXiv:2101.12176 by other authors

Via

Access Paper or Ask Questions

Mesh Strikes Back: Fast and Efficient Human Reconstruction from RGB videos

Mar 15, 2023

Rohit Jena, Pratik Chaudhari, James Gee, Ganesh Iyer, Siddharth Choudhary, Brandon M. Smith

Abstract:Human reconstruction and synthesis from monocular RGB videos is a challenging problem due to clothing, occlusion, texture discontinuities and sharpness, and framespecific pose changes. Many methods employ deferred rendering, NeRFs and implicit methods to represent clothed humans, on the premise that mesh-based representations cannot capture complex clothing and textures from RGB, silhouettes, and keypoints alone. We provide a counter viewpoint to this fundamental premise by optimizing a SMPL+D mesh and an efficient, multi-resolution texture representation using only RGB images, binary silhouettes and sparse 2D keypoints. Experimental results demonstrate that our approach is more capable of capturing geometric details compared to visual hull, mesh-based methods. We show competitive novel view synthesis and improvements in novel pose synthesis compared to NeRF-based methods, which introduce noticeable, unwanted artifacts. By restricting the solution space to the SMPL+D model combined with differentiable rendering, we obtain dramatic speedups in compute, training times (up to 24x) and inference times (up to 192x). Our method therefore can be used as is or as a fast initialization to NeRF-based methods.

Via

Access Paper or Ask Questions