
Fan Yang

refer to the report for detailed contributions

Calibrate to Discriminate: Improve In-Context Learning with Label-Free Comparative Inference

Oct 03, 2024

Wei Cheng, Tianlu Wang, Yanmin Ji, Fan Yang, Keren Tan, Yiyu Zheng

Abstract: While in-context learning with large language models (LLMs) has shown impressive performance, we have discovered a unique miscalibration behavior where both correct and incorrect predictions are assigned the same level of confidence. We refer to this phenomenon as indiscriminate miscalibration. We found that traditional calibration metrics, such as the Expected Calibration Error (ECE), are unable to capture this behavior effectively. To address this issue, we propose new metrics to measure the severity of indiscriminate miscalibration. Additionally, we develop a novel in-context comparative inference method to alleviate miscalibration and improve classification performance. Through extensive experiments on five datasets, we demonstrate that our proposed method can achieve more accurate and calibrated predictions compared to regular zero-shot and few-shot prompting.

* 19 pages
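
To make the abstract's claim about ECE concrete, here is a minimal sketch in Python. It uses the standard equal-width-bin ECE; the `confidence_gap` helper is a hypothetical illustration and is not one of the metrics proposed in the paper. The toy data (ten predictions, all at confidence 0.7, seven correct) is invented for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: bin-weighted mean of |accuracy - mean confidence|."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 falls in the last bin
        bins[idx].append((c, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(acc - avg_conf)
    return ece


def confidence_gap(confidences, correct):
    """Hypothetical helper: mean confidence on correct minus mean on incorrect."""
    right = [c for c, y in zip(confidences, correct) if y]
    wrong = [c for c, y in zip(confidences, correct) if not y]
    return sum(right) / len(right) - sum(wrong) / len(wrong)


# Toy data (assumed): every prediction gets confidence 0.7, and 7 of 10 are correct.
confidences = [0.7] * 10
correct = [1] * 7 + [0] * 3

ece = expected_calibration_error(confidences, correct)
gap = confidence_gap(confidences, correct)
```

On this toy data the ECE is zero up to floating-point error, because average confidence matches accuracy, yet the gap is also zero: the confidence scores say nothing about which individual predictions are wrong.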

Diffusion-Informed Probabilistic Contact Search for Multi-Finger Manipulation

Oct 01, 2024

Abhinav Kumar, Thomas Power, Fan Yang, Sergio Aguilera Marinovic, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson

Abstract: Planning contact-rich interactions for multi-finger manipulation is challenging due to the high dimensionality and hybrid nature of the dynamics. Recent advances in data-driven methods have shown promise, but are sensitive to the quality of training data. Combining learning with classical methods like trajectory optimization and search adds structure to the problem and domain knowledge in the form of constraints, which can lead to outperforming the data on which models are trained. We present Diffusion-Informed Probabilistic Contact Search (DIPS), which uses an A* search to plan a sequence of contact modes informed by a diffusion model. We train the diffusion model on a dataset of demonstrations consisting of contact modes and trajectories generated by a trajectory optimizer given those modes. In addition, we use a particle filter-inspired method to reason about variability in diffusion sampling arising from model error, estimating likelihoods of trajectories using a learned discriminator. We show that our method outperforms ablations that do not reason about variability and can plan contact sequences that outperform those found in training data across multiple tasks. We evaluate on simulated tabletop card sliding and screwdriver turning tasks, as well as the screwdriver task in hardware to show that our combined learning and planning approach transfers to the real world.

Steering Prediction via a Multi-Sensor System for Autonomous Racing

Sep 28, 2024

Zhuyun Zhou, Zongwei Wu, Florian Bolli, Rémi Boutteau, Fan Yang, Radu Timofte, Dominique Ginhac, Tobi Delbruck

Abstract: Autonomous racing has rapidly gained research attention. Traditionally, racing cars rely on 2D LiDAR as their primary visual system. In this work, we explore the integration of an event camera with the existing system to provide enhanced temporal information. Our goal is to fuse the 2D LiDAR data with event data in an end-to-end learning framework for steering prediction, which is crucial for autonomous racing. To the best of our knowledge, this is the first study addressing this challenging research topic. We start by creating a multisensor dataset specifically for steering prediction. Using this dataset, we establish a benchmark by evaluating various SOTA fusion methods. Our observations reveal that existing methods often incur substantial computational costs. To address this, we apply low-rank techniques to propose a novel, efficient, and effective fusion design. We introduce a new fusion learning policy to guide the fusion process, enhancing robustness against misalignment. Our fusion architecture provides better steering prediction than LiDAR alone, significantly reducing the RMSE from 7.72 to 1.28. Compared to the second-best fusion method, our design uses only 11% of the learnable parameters while achieving better accuracy. The source code, dataset, and benchmark will be released to promote future research.

Fourier neural operators for spatiotemporal dynamics in two-dimensional turbulence

Sep 25, 2024

Mohammad Atif, Pulkit Dubey, Pratik P. Aghor, Vanessa Lopez-Marrero, Tao Zhang, Abdullah Sharfuddin, Kwangmin Yu, Fan Yang, Foluso Ladeinde, Yangang Liu (+2 more)

Abstract: High-fidelity direct numerical simulation of turbulent flows for most real-world applications remains an outstanding computational challenge. Several machine learning approaches have recently been proposed to alleviate the computational cost, even though they become unstable or unphysical for long-time predictions. We identify that Fourier neural operator (FNO) based models combined with a partial differential equation (PDE) solver can accelerate fluid dynamic simulations and thus address the computational expense of large-scale turbulence simulations. We treat the FNO model on the same footing as a PDE solver and answer important questions about the volume and temporal resolution of data required to build pre-trained models for turbulence. We also discuss the pitfalls of purely data-driven approaches that need to be avoided by the machine learning models to become viable and competitive tools for long-time simulations of turbulence.

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Sep 16, 2024

Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang (+4 more)

Abstract: Transformer-based Large Language Models (LLMs) have become increasingly important in various domains. However, the quadratic time complexity of the attention operation poses a significant challenge for scaling to longer contexts due to the extremely high inference latency and GPU memory consumption for caching key-value (KV) vectors. This paper proposes RetrievalAttention, a training-free approach to accelerate attention computation. To leverage the dynamic sparse property of attention, RetrievalAttention builds approximate nearest neighbor search (ANNS) indexes upon KV vectors in CPU memory and retrieves the most relevant ones via vector search during generation. Due to the out-of-distribution (OOD) gap between query vectors and key vectors, off-the-shelf ANNS indexes still need to scan O(N) (usually 30% of all keys) data for accurate retrieval, which fails to exploit the high sparsity. RetrievalAttention first identifies the OOD challenge of ANNS-based attention, and addresses it via an attention-aware vector search algorithm that can adapt to queries and only access 1-3% of data, thus achieving a sub-linear time complexity. RetrievalAttention greatly reduces the inference cost of long-context LLMs with much lower GPU memory requirements while maintaining the model accuracy. In particular, RetrievalAttention only needs 16GB GPU memory for serving 128K tokens in LLMs with 8B parameters, and is capable of generating one token in 0.188 seconds on a single NVIDIA RTX4090 (24GB).

* 16 pages
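
The sparsity idea in the abstract can be sketched in a few lines of Python. This is a toy illustration, not the RetrievalAttention algorithm: `top_k_indices` does an exact top-k scan as a stand-in for an ANNS index lookup, and the query, keys, and values are invented so that two of 100 keys dominate the attention weights.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values, indices=None):
    """Single-query scaled dot-product attention over the selected key indices."""
    indices = list(range(len(keys))) if indices is None else list(indices)
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, keys[i])) for i in indices]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, indices)) for d in range(dim)]


def top_k_indices(query, keys, k):
    """Exact top-k by dot product (assumed stand-in for an approximate index)."""
    def score(i):
        return sum(q * kk for q, kk in zip(query, keys[i]))
    return sorted(range(len(keys)), key=score, reverse=True)[:k]


# Toy data (assumed): 100 keys, of which only keys 3 and 42 align with the query.
query = [8.0, 0.0, 0.0, 0.0]
keys = [[0.0, 1.0, 0.0, 0.0] for _ in range(100)]
keys[3] = [4.0, 0.0, 0.0, 0.0]
keys[42] = [4.0, 0.0, 0.0, 0.0]
values = [[float(i)] for i in range(100)]

selected = top_k_indices(query, keys, k=2)            # 2% of the keys
full_out = attention(query, keys, values)             # attends to all 100 keys
sparse_out = attention(query, keys, values, selected) # attends to the top 2 only
```

Because nearly all of the softmax mass sits on the two retrieved keys, the sparse output stays very close to the full output here. How well this holds in a real model depends on how concentrated the attention actually is.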

A Scalable Matrix Visualization for Understanding Tree Ensemble Classifiers

Sep 05, 2024

Zhen Li, Weikai Yang, Jun Yuan, Jing Wu, Changjian Chen, Yao Ming, Fan Yang, Hui Zhang, Shixia Liu

Abstract: The high performance of tree ensemble classifiers benefits from a large set of rules, which, in turn, makes the models hard to understand. To improve interpretability, existing methods extract a subset of rules for approximation using model reduction techniques. However, by focusing on the reduced rule set, these methods often lose fidelity and ignore anomalous rules that, despite their infrequency, play crucial roles in real-world applications. This paper introduces a scalable visual analysis method to explain tree ensemble classifiers that contain tens of thousands of rules. The key idea is to address the issue of losing fidelity by adaptively organizing the rules as a hierarchy rather than reducing them. To ensure the inclusion of anomalous rules, we develop an anomaly-biased model reduction method to prioritize these rules at each hierarchical level. Synergized with this hierarchical organization of rules, we develop a matrix-based hierarchical visualization to support exploration at different levels of detail. Our quantitative experiments and case studies demonstrate how our method fosters a deeper understanding of both common and anomalous rules, thereby enhancing interpretability without sacrificing comprehensiveness.

* 15 pages, 10 figures

Multi-frequency Neural Born Iterative Method for Solving 2-D Inverse Scattering Problems

Sep 02, 2024

Daoqi Liu, Tao Shan, Maokun Li, Fan Yang, Shenheng Xu

Abstract: In this work, we propose a deep learning-based imaging method for addressing the multi-frequency electromagnetic (EM) inverse scattering problem (ISP). By combining deep learning technology with EM physical laws, we have successfully developed a multi-frequency neural Born iterative method (NeuralBIM), guided by the principles of the single-frequency NeuralBIM. This method integrates multitask learning techniques with NeuralBIM's efficient iterative inversion process to construct a robust multi-frequency Born iterative inversion model. During training, the model employs a multitask learning approach guided by homoscedastic uncertainty to adaptively allocate the weights of each frequency's data. Additionally, an unsupervised learning method, constrained by the physical laws of ISP, is used to train the multi-frequency NeuralBIM model, eliminating the need for contrast and total field data. The effectiveness of the multi-frequency NeuralBIM is validated through synthetic and experimental data, demonstrating improvements in accuracy and computational efficiency for solving ISP. Moreover, this method exhibits strong generalization capabilities and noise resistance. The multi-frequency NeuralBIM method explores a novel inversion method for multi-frequency EM data and provides an effective solution for the electromagnetic ISP of multi-frequency data.
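
The abstract does not give the loss formula, so the following Python sketch assumes the commonly used homoscedastic-uncertainty weighting for multitask learning, in which each task loss L_i is scaled by exp(-s_i)/2 and regularized by s_i/2, with s_i a learnable log-variance. The function name and the per-frequency loss values are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Sum over tasks of 0.5 * exp(-s_i) * L_i + 0.5 * s_i (assumed common form)."""
    return sum(
        0.5 * math.exp(-s) * loss + 0.5 * s
        for loss, s in zip(task_losses, log_variances)
    )


# Hypothetical per-frequency data-misfit losses at some point during training.
frequency_losses = [4.0, 1.0, 0.25]

# With all log-variances at zero, every frequency gets the same weight.
uniform_total = uncertainty_weighted_loss(frequency_losses, [0.0, 0.0, 0.0])

# For fixed losses, the objective is minimized at s_i = log(L_i), so the
# effective weight exp(-s_i) is 1 / L_i: noisier frequencies are down-weighted.
optimal_log_variances = [math.log(loss) for loss in frequency_losses]
adaptive_total = uncertainty_weighted_loss(frequency_losses, optimal_log_variances)
effective_weights = [math.exp(-s) for s in optimal_log_variances]
```

In training, the log-variances would be optimized jointly with the network parameters rather than set in closed form; the closed-form values are used here only to show which direction the weights move.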

Adaptive Variational Continual Learning via Task-Heuristic Modelling

Aug 29, 2024

Fan Yang

Abstract: Variational continual learning (VCL) is a turn-key learning algorithm that has state-of-the-art performance among the best continual learning models. In our work, we explore an extension of the generalized variational continual learning (GVCL) model, named AutoVCL, which combines task heuristics for informed learning and model optimization. We demonstrate that our model outperforms the standard GVCL with fixed hyperparameters, benefiting from the automatic adjustment of the hyperparameter based on the difficulty and similarity of the incoming task compared to the previous tasks.

* 4 pages, 2 figures, 3 tables

BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline

Aug 27, 2024

Guosheng Dong, Da Pan, Yiding Sun, Shusen Zhang, Zheng Liang, Xin Wu, Yanjun Shen, Fan Yang, Haoze Sun, Tianpeng Li (+10 more)

Abstract: The general capabilities of Large Language Models (LLMs) rely heavily on the composition and selection of extensive pretraining datasets, which several institutions treat as commercial secrets. To mitigate this issue, we open-source the details of a universally applicable data processing pipeline and validate its effectiveness and potential by introducing a competitive LLM baseline. Specifically, the data processing pipeline consists of broad collection to scale up and reweighting to improve quality. We then pretrain a 7B model, BaichuanSEED, with 3T tokens processed by our pipeline without any deliberate downstream task-related optimization, followed by an easy but effective supervised fine-tuning stage. BaichuanSEED demonstrates consistency and predictability throughout training and achieves comparable performance on comprehensive benchmarks with several commercial advanced large language models, such as Qwen1.5 and Llama3. We also conduct several heuristic experiments to discuss the potential for further optimization of downstream tasks, such as mathematics and coding.

* 19 pages, 6 figures

Multi-finger Manipulation via Trajectory Optimization with Differentiable Rolling and Geometric Constraints

Aug 23, 2024

Fan Yang, Thomas Power, Sergio Aguilera Marinovic, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson

Abstract: Parameterizing finger rolling and finger-object contacts in a differentiable manner is important for formulating dexterous manipulation as a trajectory optimization problem. In contrast to previous methods which often assume simplified geometries of the robot and object or do not explicitly model finger rolling, we propose a method to further extend the capabilities of dexterous manipulation by accounting for non-trivial geometries of both the robot and the object. By integrating the object's Signed Distance Field (SDF) with a sampling method, our method estimates contact and rolling-related variables and includes those in a trajectory optimization framework. This formulation naturally allows for the emergence of finger-rolling behaviors, enabling the robot to locally adjust the contact points. Our method is tested in a peg alignment task and a screwdriver turning task, where it outperforms the baselines in terms of achieving desired object configurations and avoiding dropping the object. We also successfully apply our method to a real-world screwdriver turning task, demonstrating its robustness to the sim2real gap.
