Kumar Krishna Agrawal

Attribute Diversity Determines the Systematicity Gap in VQA

Nov 15, 2023
Ian Berlot-Attwell, A. Michael Carrell, Kumar Krishna Agrawal, Yash Sharma, Naomi Saphra

The degree to which neural networks can generalize to new combinations of familiar concepts, and the conditions under which they are able to do so, has long been an open question. In this work, we study the systematicity gap in visual question answering: the performance difference between reasoning on previously seen and unseen combinations of object attributes. To measure this gap, we introduce a novel diagnostic dataset, CLEVR-HOPE. We find that while an increased quantity of training data does not reduce the systematicity gap, increased diversity of training data covering the attributes in the unseen combination does. In all, our experiments suggest that the more distinct attribute-type combinations are seen during training, the more systematic we can expect the resulting model to be.
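As a rough, hypothetical sketch (not the paper's evaluation code), the systematicity gap can be read as a simple difference of accuracies between questions about seen and held-out attribute combinations; the prediction and answer lists below are illustrative placeholders.

```python
# Hypothetical sketch: the systematicity gap as the accuracy difference between
# questions about attribute combinations seen during training vs. held out.
from typing import Iterable

def accuracy(predictions: Iterable[str], answers: Iterable[str]) -> float:
    pairs = list(zip(predictions, answers))
    return sum(p == a for p, a in pairs) / len(pairs)

def systematicity_gap(seen_preds, seen_answers, unseen_preds, unseen_answers) -> float:
    """Accuracy on seen attribute combinations minus accuracy on unseen ones."""
    return accuracy(seen_preds, seen_answers) - accuracy(unseen_preds, unseen_answers)

# Example: a model that is perfect on seen combinations but 75% correct on
# unseen ones has a systematicity gap of 0.25.
print(systematicity_gap(["red"], ["red"],
                        ["cube", "ball", "red", "blue"],
                        ["cube", "ball", "red", "cyan"]))
```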

* 18 pages, 20 figures 

Context-Aware Streaming Perception in Dynamic Environments

Aug 16, 2022
Gur-Eyal Sela, Ionel Gog, Justin Wong, Kumar Krishna Agrawal, Xiangxi Mo, Sukrit Kalra, Peter Schafhalter, Eric Leong, Xin Wang, Bharathan Balaji, Joseph Gonzalez, Ion Stoica

Work on efficient vision maximizes accuracy under a latency budget, but typically evaluates accuracy offline, one image at a time. However, real-time vision applications like autonomous driving operate in streaming settings, where the ground truth changes between the start and finish of inference, resulting in a significant accuracy drop. A recent line of work therefore proposed maximizing accuracy in streaming settings on average. In this paper, we propose to maximize streaming accuracy for every environment context. We posit that scenario difficulty influences the initial (offline) accuracy difference, while obstacle displacement in the scene affects the subsequent accuracy degradation. Our method, Octopus, uses these scenario properties to select configurations that maximize streaming accuracy at test time. It improves tracking performance (S-MOTA) by 7.4% over the conventional static approach, and this improvement comes in addition to, and not instead of, advances in offline accuracy.
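A minimal sketch of per-context configuration selection, assuming a profiling table keyed by coarse context features (scenario difficulty and obstacle displacement); the configuration names and numbers are made up and are not Octopus's actual profiles.

```python
# Hypothetical sketch: pick, per environment context, the perception configuration
# with the best profiled streaming accuracy. Table contents are illustrative only.
from typing import Dict, Tuple

# (difficulty_bucket, displacement_bucket) -> {config_name: streaming_accuracy}
PROFILE: Dict[Tuple[str, str], Dict[str, float]] = {
    ("easy", "low"):  {"large_model_slow": 0.61, "small_model_fast": 0.58},
    ("easy", "high"): {"large_model_slow": 0.52, "small_model_fast": 0.57},
    ("hard", "low"):  {"large_model_slow": 0.49, "small_model_fast": 0.44},
    ("hard", "high"): {"large_model_slow": 0.40, "small_model_fast": 0.43},
}

def select_config(difficulty: str, displacement: str) -> str:
    """Return the configuration with the highest profiled streaming accuracy."""
    scores = PROFILE[(difficulty, displacement)]
    return max(scores, key=scores.get)

print(select_config("easy", "high"))  # the fast model wins when obstacles move a lot
```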

* 26 pages, 10 figures, to be published in ECCV 2022 

Investigating Power laws in Deep Representation Learning

Feb 11, 2022
Arna Ghosh, Arnab Kumar Mondal, Kumar Krishna Agrawal, Blake Richards

Representation learning that leverages large-scale labelled datasets is central to recent progress in machine learning. However, access to task-relevant labels at scale is often scarce or expensive, motivating the need to learn from unlabelled datasets with self-supervised learning (SSL). Such large unlabelled datasets (with data augmentations) often provide good coverage of the underlying input distribution. Yet evaluating the representations learned by SSL algorithms still requires task-specific labelled samples in the training pipeline, and the generalization of task-specific encodings is often sensitive to potential distribution shift. Inspired by recent advances in theoretical machine learning and vision neuroscience, we observe that the eigenspectrum of the empirical feature covariance matrix often follows a power law. For visual representations, we estimate the coefficient of the power law, $\alpha$, across three key attributes which influence representation learning: learning objective (supervised, SimCLR, Barlow Twins and BYOL), network architecture (VGG, ResNet and Vision Transformer), and tasks (object and scene recognition). We observe that under mild conditions, proximity of $\alpha$ to 1 is strongly correlated with downstream generalization performance. Furthermore, $\alpha \approx 1$ is a strong indicator of robustness to label noise during fine-tuning. Notably, $\alpha$ is computable from the representations without knowledge of any labels, thereby offering a framework to evaluate the quality of representations in unlabelled datasets.
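Since $\alpha$ is computable from the representations alone, a small illustration is easy to write down. The sketch below (my own, not the authors' released code) estimates $\alpha$ by a least-squares fit of log eigenvalue against log rank over a chosen rank range; the fit range and random features are placeholders.

```python
# Illustrative sketch: estimate the power-law exponent alpha of the eigenspectrum
# of the empirical feature covariance matrix. No labels are needed.
import numpy as np

def eigenspectrum_alpha(features: np.ndarray, fit_range=(10, 256)) -> float:
    """features: (n_samples, n_dims) representation matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / centered.shape[0]        # empirical feature covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]                # eigenvalues, descending
    ranks = np.arange(1, eigvals.size + 1)
    lo, hi = fit_range
    hi = min(hi, eigvals.size)
    # Fit log(lambda_i) ~ -alpha * log(i) over the chosen rank range.
    slope, _ = np.polyfit(np.log(ranks[lo:hi]), np.log(eigvals[lo:hi] + 1e-12), 1)
    return -slope

rng = np.random.default_rng(0)
feats = rng.standard_normal((2048, 512))                   # placeholder representations
print(eigenspectrum_alpha(feats))
```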


Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits

Jun 28, 2021
Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, Ashwin Pananjady

We introduce the "inverse bandit" problem of estimating the rewards of a multi-armed bandit instance from observing the learning process of a low-regret demonstrator. Existing approaches to the related problem of inverse reinforcement learning assume the execution of an optimal policy, and thereby suffer from an identifiability issue. In contrast, our paradigm leverages the demonstrator's behavior en route to optimality, and in particular, the exploration phase, to obtain consistent reward estimates. We develop simple and efficient reward estimation procedures for demonstrations within a class of upper-confidence-based algorithms, showing that reward estimation gets progressively easier as the regret of the algorithm increases. We match these upper bounds with information-theoretic lower bounds that apply to any demonstrator algorithm, thereby characterizing the optimal tradeoff between exploration and reward estimation. Extensive empirical evaluations on both synthetic data and simulated experimental design data from the natural sciences corroborate our theoretical results.
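As a back-of-the-envelope illustration only (explicitly not the paper's estimator), the standard UCB regret analysis suggests a suboptimal arm $i$ is pulled on the order of $\log T / \Delta_i^2$ times, so the demonstrator's observed pull counts alone already hint at the reward gaps, up to constant factors.

```python
# Crude sketch (NOT the paper's estimator): infer reward gaps from the pull counts
# of a UCB-style demonstrator, using n_i(T) ~ log T / Delta_i^2 up to constants.
import numpy as np

def crude_gap_estimates(pull_counts: np.ndarray) -> np.ndarray:
    """pull_counts: how often the demonstrator pulled each arm over T rounds in total."""
    T = pull_counts.sum()
    gaps = np.sqrt(np.log(T) / pull_counts)      # Delta_i up to constant factors
    gaps[np.argmax(pull_counts)] = 0.0            # treat the most-pulled arm as (near-)optimal
    return gaps

# A demonstrator that pulled arm 0 most of the time and explored arms 1 and 2 less.
print(crude_gap_estimates(np.array([9000.0, 700.0, 300.0])))
```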


Discrete Flows: Invertible Generative Models of Discrete Data

May 24, 2019
Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole

While normalizing flows have led to significant advances in modeling high-dimensional continuous distributions, their applicability to discrete distributions remains unknown. In this paper, we show that flows can in fact be extended to discrete events, using a simple change-of-variables formula that requires no log-determinant-Jacobian computation. Discrete flows have numerous applications. We consider two flow architectures: discrete autoregressive flows, which enable bidirectionality, allowing, for example, tokens in text to depend on both left-to-right and right-to-left contexts in an exact language model; and discrete bipartite flows, which enable efficient non-autoregressive generation as in RealNVP. Empirically, we find that discrete autoregressive flows outperform autoregressive baselines on synthetic discrete distributions, an addition task, and Potts models, and that bipartite flows obtain competitive performance with autoregressive baselines on character-level language modeling for Penn Tree Bank and text8.
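A tiny, self-contained sketch of the change-of-variables idea on a single symbol (my illustration, not the paper's code): a modular-affine map on {0, ..., K-1} is a bijection whenever the scale is coprime to K, so probability mass transfers with no Jacobian term.

```python
# Minimal discrete-flow layer sketch: an invertible modular-affine transform
# y = (scale * x + shift) mod K. Since the map is a bijection on {0,...,K-1},
# p(y) = p(x = inverse(y)) with no log-determinant-Jacobian correction.
K = 5              # vocabulary size; scale must be coprime to K for invertibility
scale, shift = 2, 3

def forward(x: int) -> int:
    return (scale * x + shift) % K

def inverse(y: int) -> int:
    scale_inv = pow(scale, -1, K)          # modular inverse of the scale
    return (scale_inv * (y - shift)) % K

assert all(inverse(forward(x)) == x for x in range(K))
print([forward(x) for x in range(K)])      # a permutation of 0..4
```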


GANSynth: Adversarial Neural Audio Synthesis

Apr 15, 2019
Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts

Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence. Autoregressive models, such as WaveNet, model local structure well but lack global latent structure and suffer from slow iterative sampling, while Generative Adversarial Networks (GANs) have global latent conditioning and efficient parallel sampling, but struggle to generate locally coherent audio waveforms. Herein, we demonstrate that GANs can in fact generate high-fidelity and locally coherent audio by modeling log magnitudes and instantaneous frequencies with sufficient frequency resolution in the spectral domain. Through extensive empirical investigations on the NSynth dataset, we demonstrate that GANs outperform strong WaveNet baselines on automated and human evaluation metrics, and efficiently generate audio several orders of magnitude faster than their autoregressive counterparts.
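The spectral representation described above is straightforward to compute. Here is a hedged sketch (not the GANSynth code) that extracts log magnitudes and instantaneous frequencies, i.e. frame-to-frame differences of unwrapped STFT phase, from a synthetic test tone; the STFT parameters are placeholders.

```python
# Sketch: log-magnitude / instantaneous-frequency features from an STFT.
import numpy as np
from scipy.signal import stft

sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)                 # 1 s of a 440 Hz tone

_, _, spec = stft(audio, fs=sr, nperseg=1024, noverlap=768)
log_mag = np.log(np.abs(spec) + 1e-6)                 # log magnitudes
phase = np.unwrap(np.angle(spec), axis=-1)            # unwrap phase along time
inst_freq = np.diff(phase, axis=-1)                   # frame-to-frame phase difference

print(log_mag.shape, inst_freq.shape)
```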

* Colab Notebook: http://goo.gl/magenta/gansynth-demo 

Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning

Oct 15, 2018
Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson

We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework. The first is implicit bias in the reward functions used by these algorithms: while these biases might work well for some environments, they can lead to sub-optimal behavior in others. The second is that, even though these algorithms can learn from few expert demonstrations, for many real-world applications they require a prohibitively large number of environment interactions to imitate the expert. To address these issues, we propose a new algorithm, Discriminator-Actor-Critic, that uses off-policy Reinforcement Learning to reduce policy-environment interaction sample complexity by an average factor of 10. Furthermore, since our reward function is designed to be unbiased, we can apply our algorithm to many problems without making any task-specific adjustments.
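One common unbiased choice, shown here purely as an illustration of a discriminator-derived reward for off-policy training (the paper's exact reward and absorbing-state handling are omitted), is the AIRL-style form $\log D - \log(1 - D)$, which reduces to the discriminator's raw logit:

```python
# Illustrative discriminator reward r(s, a) = log D(s, a) - log(1 - D(s, a)),
# avoiding the strictly positive (-log(1 - D)) and strictly negative (log D)
# biased variants discussed above. Network sizes are placeholders.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),                  # logit of D(s, a)
        )

    def reward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        logit = self.net(torch.cat([obs, act], dim=-1))
        # With D = sigmoid(logit), log D - log(1 - D) equals the raw logit.
        return logit.squeeze(-1)

disc = Discriminator(obs_dim=4, act_dim=2)
print(disc.reward(torch.randn(8, 4), torch.randn(8, 2)).shape)   # torch.Size([8])
```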


Towards Mixed Optimization for Reinforcement Learning with Program Synthesis

Jul 03, 2018
Surya Bhupatiraju, Kumar Krishna Agrawal, Rishabh Singh

Deep reinforcement learning has led to several recent breakthroughs, but the learned policies are often based on black-box neural networks, which makes them difficult to interpret and makes it hard to impose desired specification constraints during learning. We present an iterative framework, MORL, for improving learned policies using program synthesis. Concretely, we propose to use synthesis techniques to obtain a symbolic representation of the learned policy, which can then be debugged manually or automatically using program repair. After the repair step, we use behavior cloning to obtain the policy corresponding to the repaired program, which is then further improved using gradient descent. This process continues until the learned policy satisfies the desired constraints. We instantiate MORL on the simple CartPole problem and show that the programmatic representation allows for high-level modifications that in turn lead to improved learning of the policies.
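A hypothetical skeleton of the loop described above; every helper below is a named placeholder standing in for synthesis, repair, behavior cloning, RL fine-tuning, and the constraint check, not an implementation of them.

```python
# Hypothetical MORL outer loop: synthesize -> repair -> behavior-clone -> fine-tune,
# repeated until the policy satisfies the constraints. All helpers are stubs.
def synthesize_program(policy):      return {"rule": policy}        # symbolic stand-in
def repair(program, constraints):    return {**program, **constraints}
def behavior_clone(program):         return program                 # program -> policy
def rl_finetune(policy):             return policy
def satisfies(policy, constraints):  return True                    # placeholder check

def morl(initial_policy, constraints, max_iters=5):
    policy = initial_policy
    for _ in range(max_iters):
        program = repair(synthesize_program(policy), constraints)
        policy = rl_finetune(behavior_clone(program))
        if satisfies(policy, constraints):
            break
    return policy

print(morl({"params": 0}, {"max_angle": 12}))
```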

* Updated publication details, format. Accepted at NAMPI workshop, ICML '18 

Recurrent Memory Addressing for describing videos

Mar 23, 2017
Arnav Kumar Jain, Abhinav Agarwalla, Kumar Krishna Agrawal, Pabitra Mitra

In this paper, we introduce Key-Value Memory Networks to a multimodal setting, together with a novel key-addressing mechanism for sequence-to-sequence models. The proposed model naturally decomposes the problem of video captioning into vision and language segments, treating them as key-value pairs. More specifically, we learn a semantic embedding (v) corresponding to each frame (k) in the video, thereby creating (k, v) memory slots. In the memory-addressing scheme, we propose computing the next-step attention weights over the key-value memory slots conditioned on the previous attention distributions. Exploiting this flexibility of the framework, we additionally capture spatial dependencies while mapping from the visual to the semantic embedding. Experiments on the Youtube2Text dataset demonstrate the usefulness of recurrent key-addressing, achieving competitive BLEU@4 and METEOR scores against state-of-the-art models.
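A small sketch of key-value memory addressing with a naive recurrent conditioning on the previous attention weights (my illustration; the mixing rule is an assumption, not the paper's exact key-addressing scheme).

```python
# Sketch: attention weights are computed over keys (frame embeddings) and used to
# read out values (semantic embeddings); a recurrent variant conditions each step
# on the previous attention distribution.
import numpy as np

def key_value_read(query, keys, values, prev_weights=None, mix=0.5):
    """query: (d,), keys: (n, d), values: (n, d_v), prev_weights: (n,) or None."""
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over memory slots
    if prev_weights is not None:                      # simple recurrent conditioning
        weights = mix * weights + (1 - mix) * prev_weights
    return weights @ values, weights

rng = np.random.default_rng(0)
keys, values = rng.standard_normal((10, 8)), rng.standard_normal((10, 16))
read1, w = key_value_read(rng.standard_normal(8), keys, values)
read2, _ = key_value_read(rng.standard_normal(8), keys, values, prev_weights=w)
print(read1.shape, read2.shape)
```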
