Yu Xiang

Training-Conditional Coverage Bounds under Covariate Shift

May 26, 2024

Mehrdad Pournaderi, Yu Xiang

Abstract: Training-conditional coverage guarantees in conformal prediction concern the concentration of the error distribution, conditional on the training data, below some nominal level. The conformal prediction methodology has recently been generalized to the covariate shift setting, in which the covariate distribution changes between the training and test data. In this paper, we study the training-conditional coverage properties of a range of conformal prediction methods under covariate shift via a weighted version of the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality tailored for distribution change. The result for the split conformal method is almost assumption-free, while the results for the full conformal and jackknife+ methods rely on strong assumptions including the uniform stability of the training algorithm.

* arXiv admin note: text overlap with arXiv:2404.13731
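
As a rough illustration of the kind of statement the abstract above refers to, the following sketch uses the classical (unweighted, two-sided) DKW inequality to lower-bound the training-conditional coverage of split conformal prediction when there is no covariate shift. It is not the weighted bound derived in the paper; the function names and the no-shift, i.i.d. calibration assumption are introduced here for illustration only.

```python
import math


def dkw_epsilon(n: int, delta: float) -> float:
    """Two-sided DKW deviation: sup|F_n - F| <= eps with prob. >= 1 - delta."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def split_conformal_quantile(scores, alpha: float) -> float:
    """Calibration score of rank ceil((1 - alpha) * (n + 1))."""
    n = len(scores)
    k = math.ceil((1.0 - alpha) * (n + 1))
    if k > n:
        return math.inf  # too few calibration points: the set covers everything
    return sorted(scores)[k - 1]


def coverage_lower_bound(n: int, alpha: float, delta: float) -> float:
    """Assumes i.i.d. calibration and test scores (no covariate shift).

    The empirical CDF at the chosen quantile is at least 1 - alpha, so by DKW
    the true coverage is at least 1 - alpha - eps with prob. >= 1 - delta.
    """
    return max(0.0, 1.0 - alpha - dkw_epsilon(n, delta))


if __name__ == "__main__":
    print(coverage_lower_bound(n=1000, alpha=0.1, delta=0.05))
```

The bound tightens at rate 1/sqrt(n) in the calibration size n; under covariate shift the empirical CDF would have to be reweighted, which is where a weighted DKW-type inequality becomes necessary.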

Causal Inference from Slowly Varying Nonstationary Processes

May 11, 2024

Kang Du, Yu Xiang

Abstract: Causal inference from observational data following the restricted structural causal models (SCM) framework hinges largely on the asymmetry between cause and effect from the data generating mechanisms, such as non-Gaussianity or non-linearity. This methodology can be adapted to stationary time series, yet inferring causal relationships from nonstationary time series remains a challenging task. In this work, we propose a new class of restricted SCM, via a time-varying filter and stationary noise, and exploit the asymmetry from nonstationarity for causal identification in both bivariate and network settings. We propose efficient procedures by leveraging powerful estimates of the bivariate evolutionary spectra for slowly varying processes. Various synthetic and real datasets that involve high-order and non-smooth filters are evaluated to demonstrate the effectiveness of our proposed methodology.

* Accepted to the IEEE Transactions on Signal and Information Processing over Networks. arXiv admin note: substantial text overlap with arXiv:2012.13025

Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification

Apr 23, 2024

Austin Goddard, Kang Du, Yu Xiang

Abstract: Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear data generation mechanisms. We identify a unique form of invariance that exists solely in a binary setting that allows us to train models invariant over environments. We provide sufficient conditions for such invariance and show it is robust even when environmental conditions vary greatly. Our formulation admits a causal interpretation, allowing us to compare it with various frameworks. Finally, we propose a heuristic prediction method and conduct experiments using real and synthetic datasets.

* Accepted to the 2024 International Symposium on Information Theory (ISIT)

Training-Conditional Coverage Bounds for Uniformly Stable Learning Algorithms

Apr 21, 2024

Mehrdad Pournaderi, Yu Xiang

Abstract: The training-conditional coverage performance of conformal prediction is known to be empirically sound. Recently, there have been efforts to support this observation with theoretical guarantees. The training-conditional coverage bounds for jackknife+ and full-conformal prediction regions have been established via the notion of $(m,n)$-stability by Liang and Barber [2023]. Although this notion is weaker than uniform stability, it is not clear how to evaluate it for practical models. In this paper, we study the training-conditional coverage bounds of full-conformal, jackknife+, and CV+ prediction regions from a uniform stability perspective, which is known to hold for empirical risk minimization over reproducing kernel Hilbert spaces with convex regularization. We derive coverage bounds for finite-dimensional models by a concentration argument for the (estimated) predictor function, and compare the bounds with existing ones under ridge regression.

* Accepted to the ISIT 2024 workshop on Information-Theoretic Methods for Trustworthy Machine Learning (IT-TML)

Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

Apr 17, 2024

Shivvrat Arya, Yu Xiang, Vibhav Gogate

Abstract: We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with deep learning architectures, they provide an intuitive, easy-to-use loss function for multi-label classification. A drawback of DDNs compared to Markov networks is their lack of advanced inference schemes, necessitating the use of Gibbs sampling. To address this challenge, we propose novel inference schemes based on local search and integer linear programming for computing the most likely assignment to the labels given observations. We evaluate our novel methods on three video datasets (Charades, TACoS, Wetlab) and three image datasets (MS-COCO, PASCAL VOC, NUS-WIDE), comparing their performance with (a) basic neural architectures and (b) neural architectures combined with Markov networks equipped with advanced inference and learning techniques. Our results demonstrate the superiority of our new DDN methods over the two competing approaches.

* Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633

Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis

Mar 25, 2024

Jie Qiao, Yu Xiang, Zhengming Chen, Ruichu Cai, Zhifeng Hao

Abstract: Count data naturally arise in many fields, such as finance, neuroscience, and epidemiology, and discovering causal structure among count data is a crucial task in various scientific and industrial scenarios. One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator and an independent Poisson distribution that captures both branching and noise. For instance, in a population count scenario, mortality and immigration contribute to the count, where survival follows a Bernoulli distribution, and immigration follows a Poisson distribution. However, causal discovery from such data is challenging due to the non-identifiability issue: a single causal pair is Markov equivalent, i.e., $X\rightarrow Y$ and $Y\rightarrow X$ are distributionally equivalent. Fortunately, in this work, we found that the causal order from $X$ to its child $Y$ is identifiable if $X$ is a root vertex and has at least two directed paths to $Y$, or the ancestor of $X$ with the most directed path to $X$ has a directed path to $Y$ without passing $X$. Specifically, we propose a Poisson Branching Structure Causal Model (PB-SCM) and perform a path analysis on PB-SCM using high-order cumulants. Theoretical results establish the connection between the path and cumulant and demonstrate that the path information can be obtained from the cumulant. With the path information, causal order is identifiable under some graphical conditions. A practical algorithm for learning causal structure under PB-SCM is proposed, and the experiments verify the effectiveness of the proposed method.

* Accepted by AAAI-2024
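
The branching mechanism described in the abstract above (binomial thinning plus independent Poisson noise) can be simulated for a single edge X -> Y as in the sketch below. The exact parameterisation of PB-SCM is not given in this listing, so the form Y = thin(X, a) + Poisson(mu), the helper names, and the parameter values are assumptions made for illustration.

```python
import math
import random


def poisson(rng: random.Random, lam: float) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def thin(rng: random.Random, x: int, a: float) -> int:
    """Binomial thinning: each of the x units survives independently w.p. a."""
    return sum(1 for _ in range(x) if rng.random() < a)


def sample_pair(rng: random.Random, lam_x: float, a: float, mu: float):
    """Assumed single-edge model: X ~ Poisson(lam_x), Y = thin(X, a) + Poisson(mu)."""
    x = poisson(rng, lam_x)
    y = thin(rng, x, a) + poisson(rng, mu)
    return x, y


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = [sample_pair(rng, lam_x=3.0, a=0.5, mu=1.0) for _ in range(20000)]
    print(sum(y for _, y in pairs) / len(pairs))  # close to a * lam_x + mu
```

In the population example from the abstract, the thinning term plays the role of survival and the Poisson term the role of immigration.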

MultiGripperGrasp: A Dataset for Robotic Grasping from Parallel Jaw Grippers to Dexterous Hands

Mar 14, 2024

Luis Felipe Casas Murrilo, Ninad Khargonkar, Balakrishnan Prabhakaran, Yu Xiang

Abstract: We introduce a large-scale dataset named MultiGripperGrasp for robotic grasping. Our dataset contains 30.4M grasps from 11 grippers for 345 objects. These grippers range from two-finger grippers to five-finger grippers, including a human hand. All grasps in the dataset are verified in Isaac Sim to classify them as successful or unsuccessful grasps. Additionally, the object fall-off time for each grasp is recorded as a grasp quality measurement. Furthermore, the grippers in our dataset are aligned according to the orientation and position of their palms, allowing us to transfer grasps from one gripper to another. The grasp transfer significantly increases the number of successful grasps for each gripper in the dataset. Our dataset is useful to study generalized grasp planning and grasp transfer across different grippers.

Grasping Trajectory Optimization with Point Clouds

Mar 08, 2024

Yu Xiang, Sai Haneesh Allu, Rohith Peddi, Tyler Summers, Vibhav Gogate

Abstract: We introduce a new trajectory optimization method for robotic grasping based on a point-cloud representation of robots and task spaces. In our method, robots are represented by 3D points on their link surfaces. The task space of a robot is represented by a point cloud that can be obtained from depth sensors. Using the point-cloud representation, goal reaching in grasping can be formulated as point matching, while collision avoidance can be efficiently achieved by querying the signed distance values of the robot points in the signed distance field of the scene points. Consequently, a constrained non-linear optimization problem is formulated to solve the joint motion and grasp planning problem. The advantage of our method is that the point-cloud representation is general enough to be used with any robot in any environment. We demonstrate the effectiveness of our method by conducting experiments on a tabletop scene and a shelf scene for grasping with a Fetch mobile manipulator and a Franka Panda arm.
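
The two ingredients named in the abstract above, point matching for goal reaching and signed-distance queries for collision avoidance, can be sketched as simple cost terms. This is only a toy illustration: the analytic sphere stands in for a signed distance field built from scene points, and the hinge penalty with a safety margin, along with all function names, are assumptions rather than the paper's formulation.

```python
import math


def sphere_sdf(p, center, radius):
    """Signed distance to a sphere: negative inside, positive outside.

    Stand-in for a signed distance field computed from a scene point cloud.
    """
    return math.dist(p, center) - radius


def collision_cost(robot_points, sdf, margin):
    """Hinge penalty on robot surface points closer than `margin` to the scene."""
    return sum(max(0.0, margin - sdf(p)) for p in robot_points)


def point_matching_cost(robot_points, goal_points):
    """Sum of squared distances between corresponding robot and goal points."""
    return sum(math.dist(p, g) ** 2 for p, g in zip(robot_points, goal_points))


if __name__ == "__main__":
    def scene(p):
        return sphere_sdf(p, (0.0, 0.0, 0.0), 1.0)

    gripper = [(2.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
    goal = [(2.0, 0.5, 0.0), (1.5, 0.5, 0.0)]
    print(collision_cost(gripper, scene, margin=0.5))
    print(point_matching_cost(gripper, goal))
```

In a trajectory optimizer, terms like these would be evaluated on the robot points at every waypoint, with the joint configuration determining where those points lie.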

RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features

Mar 04, 2024

Howard H. Qian, Yangxiao Lu, Kejia Ren, Gaotian Wang, Ninad Khargonkar, Yu Xiang, Kaiyu Hang

Abstract: In order to successfully perform manipulation tasks in new environments, such as grasping, robots must be proficient in segmenting unseen objects from the background and/or other objects. Previous works perform unseen object instance segmentation (UOIS) by training deep neural networks on large-scale data to learn RGB/RGB-D feature embeddings, where cluttered environments often result in inaccurate segmentations. We build upon these methods and introduce a novel approach to correct inaccurate segmentation, such as under-segmentation, of static image-based UOIS masks by using robot interaction and a designed body frame-invariant feature. We demonstrate that the relative linear and rotational velocities of frames randomly attached to rigid bodies due to robot interactions can be used to identify objects and accumulate corrected object-level segmentation masks. By introducing motion to regions of segmentation uncertainty, we are able to drastically improve segmentation accuracy in an uncertainty-driven manner with minimal, non-disruptive interactions (ca. 2-3 per scene). We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%, an increase of 28.2% when compared with other state-of-the-art UOIS methods.

* 7 pages, 5 figures, ICRA 2024

Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning

Feb 10, 2024

Kang Du, Yu Xiang

Abstract: We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness. With an infinite amount of labeled samples, we provide a sufficient and necessary condition for perfect linear approximation. The condition reveals a full-rank component that preserves the label classes of Y, along with a redundant component. Motivated by the condition, we propose to approximate the redundant component by a low-rank factorization and measure the approximation quality by introducing a new quantity $\epsilon_s$, parameterized by the rank of factorization s. We incorporate $\epsilon_s$ into the excess risk analysis under both linear regression and ridge regression settings, where the regularization in the latter handles scenarios in which the dimension of the learned features is much larger than the number of labeled samples n for downstream tasks. We design three stylized experiments to compare SSL with supervised learning under different settings to support our theoretical findings.

* Accepted to the 3rd Conference on Causal Learning and Reasoning (CLeaR)
