Yisong Yue

End-to-End Sequential Sampling and Reconstruction for MR Imaging

May 13, 2021
Tianwei Yin, Zihui Wu, He Sun, Adrian V. Dalca, Yisong Yue, Katherine L. Bouman

Accelerated MRI shortens acquisition time by subsampling in the measurement k-space. Recovering a high-fidelity anatomical image from subsampled measurements requires close cooperation between two components: (1) a sampler that chooses the subsampling pattern and (2) a reconstructor that recovers images from incomplete measurements. In this paper, we leverage the sequential nature of MRI measurements, and propose a fully differentiable framework that jointly learns a sequential sampling policy and a reconstruction strategy. This co-designed framework is able to adapt during acquisition in order to capture the most informative measurements for a particular target (Figure 1). Experimental results on the fastMRI knee dataset demonstrate that the proposed approach successfully utilizes intermediate information during the sampling process to boost reconstruction performance. In particular, our proposed method outperforms the current state-of-the-art learned k-space sampling baseline on up to 96.96% of test samples. We also investigate the individual and collective benefits of the sequential sampling and co-design strategies. Code and more visualizations are available at http://imaging.cms.caltech.edu/seq-mri
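
To make the sampler/reconstructor loop concrete, here is a minimal NumPy sketch of sequential Cartesian k-space acquisition. It is not the paper's method: the learned policy and learned reconstructor are replaced by a hand-written heuristic (`neighbour_energy_policy`) and a zero-filled inverse FFT, and all function names are hypothetical. It only illustrates how each new measurement can depend on what has already been observed.

```python
import numpy as np


def neighbour_energy_policy(observed_kspace, mask):
    """Hypothetical hand-written policy (stands in for a learned one): pick the
    unsampled k-space column whose two neighbours carry the most measured energy."""
    energy = (np.abs(observed_kspace) ** 2).sum(axis=0)  # zero where unsampled
    score = np.roll(energy, 1) + np.roll(energy, -1)
    score[mask] = -np.inf  # never re-acquire a sampled column
    return int(np.argmax(score))


def sequential_acquire(image, budget, policy=neighbour_energy_policy):
    """Acquire `budget` k-space columns one at a time, then reconstruct."""
    kspace = np.fft.fft2(image)  # fully sampled measurements (simulation only)
    n_cols = image.shape[1]
    budget = min(budget, n_cols)
    mask = np.zeros(n_cols, dtype=bool)
    mask[0] = True  # start from the zero-frequency column
    while mask.sum() < budget:
        observed = kspace * mask  # the boolean mask broadcasts over columns
        mask[policy(observed, mask)] = True
    recon = np.fft.ifft2(kspace * mask)  # zero-filled reconstruction
    return recon, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((16, 16))
    for budget in (4, 8, 16):
        recon, mask = sequential_acquire(image, budget)
        print(budget, np.linalg.norm(recon - image))
```

Because the policy only sees `observed`, the sampling pattern adapts to the target being imaged; a learned policy would slot into the same `policy(observed, mask)` interface.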

* Code and supplementary materials are available at http://imaging.cms.caltech.edu/seq-mri

The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions

Apr 07, 2021
Jennifer J. Sun, Tomomi Karigo, Dipam Chakraborty, Sharada P. Mohanty, David J. Anderson, Pietro Perona, Yisong Yue, Ann Kennedy

Multi-agent behavior modeling aims to understand the interactions that occur between agents. We present a multi-agent dataset from behavioral neuroscience, the Caltech Mouse Social Interactions (CalMS21) Dataset. Our dataset consists of trajectory data of social interactions, recorded from videos of freely behaving mice in a standard resident-intruder assay. The CalMS21 dataset is part of the Multi-Agent Behavior Challenge 2021; as a next step, we aim to incorporate datasets from other domains studying multi-agent behavior. To help accelerate behavioral studies, the CalMS21 dataset provides a benchmark to evaluate the performance of automated behavior classification methods in three settings: (1) training on large behavioral datasets all annotated by a single annotator, (2) style transfer to learn inter-annotator differences in behavior definitions, and (3) learning new behaviors of interest given limited training data. The dataset consists of 6 million frames of unlabelled tracked poses of interacting mice, as well as over 1 million frames with tracked poses and corresponding frame-level behavior annotations. The challenge of our dataset is to classify behaviors accurately using both labelled and unlabelled tracking data, and to generalize to new annotators and behaviors.

* Dataset and challenge: https://www.aicrowd.com/challenges/multi-agent-behavior-representation-modeling-measurement-and-applications
* Part of the MABe workshop at CVPR 2021: https://sites.google.com/view/mabe21/home

Learning Unstable Dynamics with One Minute of Data: A Differentiation-based Gaussian Process Approach

Mar 08, 2021
Ivan D. Jimenez Rodriguez, Ugo Rosolia, Aaron D. Ames, Yisong Yue

We present a straightforward and efficient way to estimate dynamics models for unstable robotic systems. Specifically, we show how to exploit the differentiability of Gaussian processes to create a state-dependent linearized approximation of the true continuous dynamics. Our approach is compatible with most Gaussian process approaches for system identification, and can learn an accurate model using modest amounts of training data. We validate our approach by iteratively learning the system dynamics of an unstable system such as a 9-D segway (using only one minute of data) and we show that the resulting controller is robust to unmodelled dynamics and disturbances, while state-of-the-art control methods based on nominal models can fail under small perturbations.
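
The idea of differentiating a Gaussian process to obtain a state-dependent linearization can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the authors' implementation: it uses a squared-exponential kernel with a fixed length scale, a toy inverted pendulum (angle measured from upright) instead of the segway, and hypothetical helper names. The Jacobian of the posterior mean follows from the kernel derivative `d/dx k(x, x_i) = -(x - x_i) / l^2 * k(x, x_i)`.

```python
import numpy as np


def rbf(a, b, length_scale):
    """Squared-exponential kernel matrix between rows of a and rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def fit_gp(x_train, y_train, length_scale=1.0, noise=1e-6):
    """Return alpha = (K + noise * I)^{-1} Y, the weights of the posterior mean."""
    gram = rbf(x_train, x_train, length_scale)
    return np.linalg.solve(gram + noise * np.eye(len(x_train)), y_train)


def gp_mean_and_jacobian(x_query, x_train, alpha, length_scale=1.0):
    """Posterior mean m(x) and its Jacobian dm/dx at a single query state."""
    k = rbf(x_query[None, :], x_train, length_scale)[0]  # shape (N,)
    mean = k @ alpha                                      # shape (d_out,)
    # Derivative of each kernel value with respect to the query state.
    dk = -(x_query[None, :] - x_train) / length_scale**2 * k[:, None]  # (N, d_in)
    jacobian = alpha.T @ dk                               # shape (d_out, d_in)
    return mean, jacobian


# Toy data: state (theta, omega), dynamics (omega, sin(theta)), unstable at 0.
grid = np.linspace(-1.0, 1.0, 8)
x_train = np.array([[th, om] for th in grid for om in grid])
y_train = np.column_stack([x_train[:, 1], np.sin(x_train[:, 0])])
alpha = fit_gp(x_train, y_train)
mean, a_matrix = gp_mean_and_jacobian(np.zeros(2), x_train, alpha)
print(a_matrix)  # local linear model A(x) at the upright equilibrium
```

The returned Jacobian plays the role of the state-dependent `A(x)` matrix that a linear controller could consume at each query state.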

Minimax Model Learning

Mar 02, 2021
Cameron Voloshin, Nan Jiang, Yisong Yue

We present a novel off-policy loss function for learning a transition model in model-based reinforcement learning. Notably, our loss is derived from the off-policy policy evaluation objective with an emphasis on correcting distribution shift. Compared to previous model-based techniques, our approach allows for greater robustness under model misspecification or distribution shift induced by learning/evaluating policies that are distinct from the data-generating policy. We provide a theoretical analysis and show empirical improvements over existing model-based off-policy evaluation methods. We provide further analysis showing our loss can be used for off-policy optimization (OPO) and demonstrate its integration with more recent improvements in OPO.

* PMLR, Volume 130, 2021

Computing the Information Content of Trained Neural Networks

Mar 01, 2021
Jeremy Bernstein, Yisong Yue

How much information does a learning algorithm extract from the training data and store in a neural network's weights? Too much, and the network would overfit to the training data. Too little, and the network would not fit to anything at all. Naïvely, the amount of information the network stores should scale in proportion to the number of trainable weights. This raises the question: how can neural networks with vastly more weights than training data still generalise? A simple resolution to this conundrum is that the number of weights is usually a bad proxy for the actual amount of information stored. For instance, typical weight vectors may be highly compressible. Then another question occurs: is it possible to compute the actual amount of information stored? This paper derives both a consistent estimator and a closed-form upper bound on the information content of infinitely wide neural networks. The derivation is based on an identification between neural information content and the negative log probability of a Gaussian orthant. This identification yields bounds that analytically control the generalisation behaviour of the entire solution space of infinitely wide networks. The bounds have a simple dependence on both the network architecture and the training data. Corroborating the findings of Valle-Pérez et al. (2019), who conducted a similar analysis using approximate Gaussian integration techniques, the bounds are found to be both non-vacuous and correlated with the empirical generalisation behaviour at finite width.
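
As a rough illustration of "the negative log probability of a Gaussian orthant", the sketch below estimates that quantity by naive Monte Carlo. It is not the paper's estimator or bound: the covariance matrix is supplied by the caller (in the paper's setting it would come from the infinite-width network and the training inputs, which is assumed here rather than computed), the function name is hypothetical, and naive sampling is only practical for a handful of data points because the orthant probability shrinks quickly.

```python
import numpy as np


def orthant_information_bits(cov, signs, n_samples=200_000, seed=0):
    """Monte Carlo estimate of -log2 P(sign(z) == signs) for z ~ N(0, cov)."""
    rng = np.random.default_rng(seed)
    signs = np.asarray(signs, dtype=float)
    chol = np.linalg.cholesky(cov)  # cov must be positive definite
    z = rng.standard_normal((n_samples, len(signs))) @ chol.T
    inside = np.all(z * signs > 0, axis=1)  # samples falling in the orthant
    return -np.log2(inside.mean())


if __name__ == "__main__":
    # Three uncorrelated outputs: each sign costs about one bit.
    print(orthant_information_bits(np.eye(3), [1, -1, 1]))
    # Two positively correlated outputs with matching labels cost less than two bits.
    print(orthant_information_bits(np.array([[1.0, 0.5], [0.5, 1.0]]), [1, 1]))
```

For two unit-variance outputs with correlation 0.5, the exact orthant probability is 1/4 + arcsin(0.5)/(2π) = 1/3, so the second estimate should be close to log2(3) bits.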

Learning Invariant Representation of Tasks for Robust Surgical State Estimation

Feb 18, 2021
Yidan Qin, Max Allan, Yisong Yue, Joel W. Burdick, Mahdi Azizian

Surgical state estimators in robot-assisted surgery (RAS), especially those trained via learning techniques, rely heavily on datasets that capture surgeon actions in laboratory or real-world surgical tasks. Real-world RAS datasets are costly to acquire, are obtained from multiple surgeons who may use different surgical strategies, and are recorded under uncontrolled conditions in highly complex environments. The combination of high diversity and limited data calls for new learning methods that are robust and invariant to operating conditions and surgical techniques. We propose StiseNet, a Surgical Task Invariance State Estimation Network with an invariance induction framework that minimizes the effects of variations in surgical technique and operating environments inherent to RAS datasets. StiseNet's adversarial architecture learns to separate nuisance factors from information needed for surgical state estimation. StiseNet is shown to outperform state-of-the-art state estimation methods on three datasets (including a new real-world RAS dataset: HERNIA-20).

* Accepted to IEEE Robotics & Automation Letters

Learning by Turning: Neural Architecture Aware Optimisation

Feb 14, 2021
Yang Liu, Jeremy Bernstein, Markus Meister, Yisong Yue

Descent methods for deep networks are notoriously capricious: they require careful tuning of step size, momentum and weight decay, and which method will work best on a new benchmark is a priori unclear. To address this problem, this paper conducts a combined study of neural architecture and optimisation, leading to a new optimiser called Nero: the neuronal rotator. Nero trains reliably without momentum or weight decay, works in situations where Adam and SGD fail, and requires little to no learning rate tuning. Also, Nero's memory footprint is roughly the square root of that of Adam or LAMB. Nero combines two ideas: (1) projected gradient descent over the space of balanced networks; (2) neuron-specific updates, where the step size sets the angle through which each neuron's hyperplane turns. The paper concludes by discussing how this geometric connection between architecture and optimisation may impact theories of generalisation in deep learning.
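
The two ideas can be illustrated with a simplified NumPy sketch. It is not the released Nero optimiser: it assumes "balanced" means every neuron (row of the weight matrix) has zero mean and unit norm, normalises the gradient with a single running scalar per neuron, and uses hypothetical function names. Under these assumptions, one scalar of state per neuron is what keeps the memory footprint small, and a step of size `lr` turns each neuron's weight vector by at most `arcsin(lr)` on the first update.

```python
import numpy as np


def project_balanced(w):
    """Give every neuron (row) zero mean and unit Euclidean norm."""
    w = w - w.mean(axis=1, keepdims=True)
    return w / np.linalg.norm(w, axis=1, keepdims=True)


def per_neuron_step(w, grad, sq_avg, step, lr=0.01, beta=0.999, eps=1e-12):
    """One simplified update of a balanced weight matrix (rows are neurons)."""
    grad_sq = (grad**2).sum(axis=1, keepdims=True)  # one scalar per neuron
    sq_avg = beta * sq_avg + (1.0 - beta) * grad_sq
    corrected = sq_avg / (1.0 - beta**step)         # bias correction, step >= 1
    direction = grad / (np.sqrt(corrected) + eps)   # per-neuron normalised gradient
    # Gradient step followed by projection back onto the balanced constraint set.
    return project_balanced(w - lr * direction), sq_avg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = project_balanced(rng.standard_normal((5, 20)))
    state = np.zeros((5, 1))
    w, state = per_neuron_step(w, rng.standard_normal((5, 20)), state, step=1)
    print(np.linalg.norm(w, axis=1))  # rows stay on the unit sphere
```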

Disentangling Observed Causal Effects from Latent Confounders using Method of Moments

Jan 17, 2021
Anqi Liu, Hao Liu, Tongxin Li, Saeed Karimi-Bidhendi, Yisong Yue, Anima Anandkumar

Discovering the complete set of causal relations among a group of variables is a challenging unsupervised learning problem. Often, this challenge is compounded by the fact that there are latent or hidden confounders. When only observational data is available, the problem is ill-posed, i.e. the causal relationships are non-identifiable unless strong modeling assumptions are made. When interventions are available, we provide guarantees on identifiability and learnability under mild assumptions. We assume a linear structural equation model (SEM) with independent latent factors and directed acyclic graph (DAG) relationships among the observables. Since the latent variable inference is based on independent component analysis (ICA), we call this model SEM-ICA. We use the method of moments principle to establish model identifiability. We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions. Thus, we provide a principled approach to tackling the joint problem of causal discovery and latent variable inference.
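
The model class described above can be made concrete with a small simulation. This sketch only generates data from a linear SEM with latent factors; it does not implement the method-of-moments or tensor-decomposition algorithms. The symbols `B` (DAG weights among observables), `A` (latent-to-observed loadings) and `h` (latent factors), the noise-free form `x = B x + A h`, and the Laplace distribution for `h` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_linear_sem(b, a, n_samples, seed=0):
    """Draw samples of x solving x = B x + A h with independent latent h."""
    rng = np.random.default_rng(seed)
    n_obs, n_latent = a.shape
    # Independent, non-Gaussian latent factors (Laplace chosen for illustration).
    h = rng.laplace(size=(n_samples, n_latent))
    # B strictly lower triangular encodes a DAG in topological order,
    # so I - B is always invertible.
    mixing = np.linalg.solve(np.eye(n_obs) - b, a)  # (I - B)^{-1} A
    return h @ mixing.T, h


if __name__ == "__main__":
    b = np.array([[0.0, 0.0, 0.0],
                  [0.8, 0.0, 0.0],
                  [0.0, -0.5, 0.0]])   # x1 -> x2 -> x3
    a = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])         # two latent confounders
    x, h = simulate_linear_sem(b, a, n_samples=1000)
    print(np.cov(x, rowvar=False))     # second moments mix B and A together
```

The observed covariance depends only on the product `(I - B)^{-1} A`, which is one way to see why second moments of observational data alone cannot separate the causal weights from the confounder loadings.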

Neural-Swarm2: Planning and Control of Heterogeneous Multirotor Swarms using Learned Interactions

Dec 10, 2020
Guanya Shi, Wolfgang Hönig, Xichen Shi, Yisong Yue, Soon-Jo Chung

We present Neural-Swarm2, a learning-based method for motion planning and control that allows heterogeneous multirotors in a swarm to safely fly in close proximity. Such operation for drones is challenging due to complex aerodynamic interaction forces, such as downwash generated by nearby drones and ground effect. Conventional planning and control methods neglect these interaction forces, resulting in sparse swarm configurations during flight. Our approach combines a physics-based nominal dynamics model with learned Deep Neural Networks (DNNs) with strong Lipschitz properties. We evolve two techniques to accurately predict the aerodynamic interactions between heterogeneous multirotors: i) spectral normalization for stability and generalization guarantees on unseen data and ii) heterogeneous deep sets for supporting any number of heterogeneous neighbors in a permutation-invariant manner without reducing expressiveness. The learned residual dynamics benefit both the proposed interaction-aware multi-robot motion planning and the nonlinear tracking control designs because the learned interaction forces reduce the modelling errors. Experimental results demonstrate that Neural-Swarm2 is able to generalize to larger swarms beyond training cases and significantly outperforms a baseline nonlinear tracking controller with up to three times reduction in worst-case tracking errors.
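
The two techniques named above, spectral normalization and heterogeneous deep sets, can be sketched in a few lines of NumPy. This is a toy forward pass with random single-layer encoders, not the trained Neural-Swarm2 networks; the function names, the `"small"`/`"large"` vehicle types and the layer sizes are hypothetical. Rescaling each weight matrix to spectral norm at most 1 bounds the Lipschitz constant of each layer, and summing per-type encodings makes the prediction independent of neighbour order and count.

```python
import numpy as np


def spectral_normalize(w, max_norm=1.0):
    """Rescale w so that its largest singular value is at most max_norm."""
    sigma = np.linalg.norm(w, 2)
    return w if sigma <= max_norm else w * (max_norm / sigma)


def relu(x):
    return np.maximum(x, 0.0)


def heterogeneous_deep_set(neighbours, phi_weights, rho_weight):
    """Predict an interaction force from (type, relative_state) neighbour pairs.

    Each neighbour type has its own encoder phi; the encodings are summed
    before the decoder rho, so the output ignores neighbour ordering.
    """
    pooled = np.zeros(rho_weight.shape[1])
    for kind, rel_state in neighbours:
        pooled += relu(phi_weights[kind] @ rel_state)
    return rho_weight @ pooled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = {kind: spectral_normalize(rng.standard_normal((8, 6)))
           for kind in ("small", "large")}
    rho = spectral_normalize(rng.standard_normal((3, 8)))
    neighbours = [("small", rng.standard_normal(6)),
                  ("large", rng.standard_normal(6))]
    print(heterogeneous_deep_set(neighbours, phi, rho))
```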

* Video is available at https://youtu.be/Y02juH6BDxo

Task Programming: Learning Data Efficient Behavior Representations

Nov 27, 2020
Jennifer J. Sun, Ann Kennedy, Eric Zhan, Yisong Yue, Pietro Perona

Specialized domain knowledge is often necessary to accurately annotate training sets for in-depth analysis, but can be burdensome and time-consuming to acquire from domain experts. This issue arises prominently in automated behavior analysis, in which agent movements or actions of interest are detected from video tracking data. To reduce annotation effort, we present TREBA: a method to learn annotation-sample efficient trajectory embedding for behavior analysis, based on multi-task self-supervised learning. The tasks in our method can be efficiently engineered by domain experts through a process we call "task programming", which uses programs to explicitly encode structured knowledge from domain experts. Total domain expert effort can be reduced by exchanging data annotation time for the construction of a small number of programmed tasks. We evaluate this trade-off using data from behavioral neuroscience, in which specialized domain knowledge is used to identify behaviors. We present experimental results in three datasets across two domains: mice and fruit flies. Using embeddings from TREBA, we reduce annotation burden by up to a factor of 10 without compromising accuracy compared to state-of-the-art features. Our results thus suggest that task programming can be an effective way to reduce annotation effort for domain experts.
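
To give a flavour of what a "programmed task" might look like, the sketch below writes two small programs that compute per-frame attributes from a two-agent trajectory; their outputs could then serve as self-supervised training targets. This is an assumption-laden illustration, not TREBA itself: the centroid-only trajectory layout `(frames, agents, xy)`, the specific programs, and all names are hypothetical.

```python
import numpy as np


def inter_agent_distance(traj):
    """Per-frame distance between the two agents' centroids."""
    return np.linalg.norm(traj[:, 0] - traj[:, 1], axis=-1)


def agent_speed(traj, agent=0):
    """Per-frame displacement of one agent (zero for the first frame)."""
    steps = np.linalg.norm(np.diff(traj[:, agent], axis=0), axis=-1)
    return np.concatenate([[0.0], steps])


# Domain experts would extend this registry instead of labelling more frames.
PROGRAMMED_TASKS = {"distance": inter_agent_distance, "speed": agent_speed}


def task_targets(traj):
    """Stack every program's output into a (frames, n_tasks) target array."""
    return np.column_stack([task(traj) for task in PROGRAMMED_TASKS.values()])


if __name__ == "__main__":
    # Agent 0 moves from (0, 0) to (3, 4) and stops; agent 1 stays at the origin.
    traj = np.array([[[0.0, 0.0], [0.0, 0.0]],
                     [[3.0, 4.0], [0.0, 0.0]],
                     [[3.0, 4.0], [0.0, 0.0]]])
    print(task_targets(traj))
```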
