
Gabriel Kreiman

Sparse Distributed Memory is a Continual Learner

Mar 20, 2023
Trenton Bricken, Xander Davies, Deepak Singh, Dmitry Krotov, Gabriel Kreiman

Continual learning is a problem for artificial neural networks that their biological counterparts are adept at solving. Building on work using Sparse Distributed Memory (SDM) to connect a core neural circuit with the powerful Transformer model, we create a modified Multi-Layered Perceptron (MLP) that is a strong continual learner. We find that every component of our MLP variant translated from biology is necessary for continual learning. Our solution is also free from any memory replay or task information, and introduces novel methods to train sparse networks that may be broadly applicable.
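
For readers unfamiliar with SDM, the classic Kanerva read/write rule can be sketched in a few lines. This is a minimal NumPy illustration of vanilla SDM, not the paper's biologically refined MLP variant; the address count, dimensionality, and Hamming radius below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ADDR, DIM, RADIUS = 1000, 256, 120  # illustrative sizes, not the paper's

# Hard locations: fixed random binary addresses, each with a counter vector.
addresses = rng.integers(0, 2, size=(N_ADDR, DIM))
counters = np.zeros((N_ADDR, DIM))

def nearby(query):
    """Boolean mask of hard locations within Hamming radius of the query."""
    return (addresses != query).sum(axis=1) <= RADIUS

def write(addr, pattern):
    # Superimpose the pattern (as +/-1) onto every activated location.
    counters[nearby(addr)] += 2 * pattern - 1

def read(addr):
    # Sum the activated counters and threshold back to binary.
    return (counters[nearby(addr)].sum(axis=0) > 0).astype(int)

key = rng.integers(0, 2, size=DIM)
write(key, key)       # autoassociative store
recalled = read(key)  # recall from the same address
```

Because many hard locations are activated per query, stored patterns are distributed and superimposed, which is the property the paper connects to Transformer attention.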

* ICLR 2023, 9 pages 

BrainBERT: Self-supervised representation learning for intracranial recordings

Feb 28, 2023
Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, Andrei Barbu

We create a reusable Transformer, BrainBERT, for intracranial recordings, bringing modern representation learning approaches to neuroscience. Much like in NLP and speech recognition, this Transformer enables classifying complex concepts, i.e., decoding neural data, with higher accuracy and much less data, because it is pretrained in an unsupervised manner on a large corpus of unannotated neural recordings. Our approach generalizes to new subjects with electrodes in new positions and to unrelated tasks, showing that the representations robustly disentangle the neural signal. Just as in NLP, where one can study language by investigating what a language model learns, this approach opens the door to investigating the brain through what a model of the brain learns. As a first step along this path, we demonstrate a new analysis of the intrinsic dimensionality of the computations in different areas of the brain. To construct these representations, we combine a technique for producing super-resolution spectrograms of neural data with an approach designed for generating contextual representations of audio by masking. In the future, far more concepts will be decodable from neural recordings by using representation learning, potentially unlocking the brain the way language models unlocked language.
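
The masked-pretraining objective the abstract alludes to can be sketched generically. This is a standard masked-spectrogram setup in NumPy, not BrainBERT's actual architecture; the mask fraction, mask value, and spectrogram shape are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def mask_spectrogram(spec, mask_frac=0.15, mask_value=0.0):
    """Hide a random subset of time bins; return the masked spec and mask."""
    t = spec.shape[1]
    idx = rng.choice(t, size=max(1, int(mask_frac * t)), replace=False)
    mask = np.zeros(t, dtype=bool)
    mask[idx] = True
    masked = spec.copy()
    masked[:, mask] = mask_value
    return masked, mask

def masked_reconstruction_loss(pred, target, mask):
    """MSE computed only over the masked time bins."""
    return float(((pred[:, mask] - target[:, mask]) ** 2).mean())

spec = rng.standard_normal((40, 100))  # freq x time "spectrogram"
masked, mask = mask_spectrogram(spec)
# A model would be trained to predict spec from masked; a perfect
# reconstruction has zero loss on the masked bins.
loss_perfect = masked_reconstruction_loss(spec, spec, mask)
```

Restricting the loss to masked positions forces the model to infer hidden content from context, which is what makes the learned representations useful downstream.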

* 9 pages, 6 figures, ICLR 2023 

Forward Learning with Top-Down Feedback: Empirical and Analytical Characterization

Feb 10, 2023
Ravi Francesco Srinivasan, Francesca Mignacco, Martino Sorbaro, Maria Refinetti, Avi Cooper, Gabriel Kreiman, Giorgia Dellaferrera

"Forward-only" algorithms, which train neural networks while avoiding a backward pass, have recently gained attention as a way of solving the biologically unrealistic aspects of backpropagation. Here, we first discuss the similarities between two "forward-only" algorithms, the Forward-Forward and PEPITA frameworks, and demonstrate that PEPITA is equivalent to a Forward-Forward with top-down feedback connections. Then, we focus on PEPITA to address compelling challenges related to the "forward-only" rules, which include providing an analytical understanding of their dynamics and reducing the gap between their performance and that of backpropagation. We propose a theoretical analysis of the dynamics of PEPITA. In particular, we show that PEPITA is well-approximated by an "adaptive-feedback-alignment" algorithm and we analytically track its performance during learning in a prototype high-dimensional setting. Finally, we develop a strategy to apply the weight mirroring algorithm on "forward-only" algorithms with top-down feedback and we show how it impacts PEPITA's accuracy and convergence rate.

Learning to Learn: How to Continuously Teach Humans and Machines

Nov 28, 2022
Parantak Singh, You Li, Ankur Sikarwar, Weixian Lei, Daniel Gao, Morgan Bruce Talbot, Ying Sun, Mike Zheng Shou, Gabriel Kreiman, Mengmi Zhang

Our education system comprises a series of curricula. For example, when we learn mathematics at school, we learn in order from addition, to multiplication, and later to integration. Delineating a curriculum for teaching either a human or a machine shares the underlying goal of maximizing the positive knowledge transfer from early to later tasks and minimizing forgetting of the early tasks. Here, we exhaustively surveyed the effect of curricula on existing continual learning algorithms in the class-incremental setting, where algorithms must learn classes one at a time from a continuous stream of data. We observed that across a breadth of possible class orders (curricula), curricula influence the retention of information, and that this effect is not just a product of stochasticity. Further, as a first effort toward automated curriculum design, we proposed a method capable of designing and ranking effective curricula based on inter-class feature similarities. We compared the predicted curricula against empirically determined effective curricula and observed significant overlap between the two. To support the study of a curriculum designer, we conducted a series of human psychophysics experiments and contributed a new Continual Learning benchmark in object recognition. We assessed the degree of agreement in effective curricula between humans and machines. Surprisingly, our curriculum designer successfully predicts an optimal set of curricula that is effective for human learning. There are many considerations in curriculum design, such as timely student feedback and learning with multiple modalities. Our study is the first attempt to set a standard framework for the community to tackle the problem of teaching humans and machines to learn to learn continuously.
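
A similarity-based curriculum designer can be illustrated with a small sketch. This brute-force ranking over class orders is a toy stand-in for the paper's method, whose exact scoring rule is not given here; scoring a curriculum by the summed feature similarity of consecutively taught classes is an assumption.

```python
import numpy as np
from itertools import permutations

def class_means(features, labels):
    """Mean feature vector per class."""
    labels = np.array(labels)
    return {c: features[labels == c].mean(axis=0) for c in sorted(set(labels))}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def curriculum_score(order, means):
    """Summed similarity between consecutively taught classes."""
    return sum(cosine(means[a], means[b]) for a, b in zip(order, order[1:]))

def rank_curricula(features, labels):
    means = class_means(features, labels)
    orders = list(permutations(means))  # feasible only for a few classes
    return sorted(orders, key=lambda o: curriculum_score(o, means), reverse=True)

# Toy example: class 1 lies "between" classes 0 and 2 in feature space.
feats = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
best = rank_curricula(feats, [0, 1, 2])[0]
```

With this scoring rule, the top-ranked curricula place feature-similar classes adjacently; for larger class counts the exhaustive search would need to be replaced by a greedy or heuristic ordering.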

Efficient Zero-shot Visual Search via Target and Context-aware Transformer

Nov 24, 2022
Zhiwei Ding, Xuezhe Ren, Erwan David, Melissa Vo, Gabriel Kreiman, Mengmi Zhang

Visual search is a ubiquitous challenge in natural vision, including daily tasks such as finding a friend in a crowd or searching for a car in a parking lot. Humans rely heavily on relevant target features to perform goal-directed visual search. Meanwhile, context is of critical importance for locating a target object in complex scenes, as it helps narrow down the search area and makes the search process more efficient. However, few works have combined both target and context information in computational models of visual search. Here we propose a zero-shot deep learning architecture, TCT (Target and Context-aware Transformer), that modulates self-attention in the Vision Transformer with target-relevant and context-relevant information to enable human-like zero-shot visual search performance. Target modulation is computed as patch-wise local relevance between the target and search images, whereas contextual modulation is applied in a global fashion. We evaluate TCT and other competitive visual search models on three natural-scene datasets with varying levels of difficulty. TCT demonstrates human-like performance in terms of search efficiency and beats the SOTA models in challenging visual search tasks. Importantly, TCT generalizes well across datasets with novel objects without retraining or fine-tuning. Furthermore, we introduce a new dataset to benchmark models for invariant visual search under incongruent contexts. TCT manages to search flexibly via target and context modulation, even under incongruent contexts.
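
The patch-wise target relevance described above amounts to a similarity map between the target's features and each patch of the search image. A minimal NumPy sketch, with the feature extractor abstracted away and shapes chosen purely for illustration:

```python
import numpy as np

def target_modulation_map(patch_feats, target_feat):
    """Cosine similarity between the target feature and every search-image
    patch feature; high values mark patches worth attending to."""
    p = patch_feats / (np.linalg.norm(patch_feats, axis=-1, keepdims=True) + 1e-9)
    t = target_feat / (np.linalg.norm(target_feat) + 1e-9)
    return p @ t  # (H, W) relevance map

# Toy 4x4 grid of 8-dim patch features; one patch matches the target exactly.
feats = np.zeros((4, 4, 8))
feats[2, 3] = np.ones(8)
sim = target_modulation_map(feats, np.ones(8))
best = np.unravel_index(sim.argmax(), sim.shape)
```

In a full model such a map would reweight attention toward target-like patches, while a separate global signal would encode contextual plausibility.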

Human or Machine? Turing Tests for Vision and Language

Nov 23, 2022
Mengmi Zhang, Giorgia Dellaferrera, Ankur Sikarwar, Marcelo Armendariz, Noga Mudrik, Prachi Agrawal, Spandan Madan, Andrei Barbu, Haochen Yang, Tanishq Kumar, Meghna Sadwani, Stella Dellaferrera, Michele Pizzochero, Hanspeter Pfister, Gabriel Kreiman

As AI algorithms increasingly participate in daily activities that used to be the sole province of humans, we are inevitably called upon to consider how much machines are really like us. To address this question, we turn to the Turing test and systematically benchmark current AIs in their abilities to imitate humans. We establish a methodology to evaluate humans versus machines in Turing-like tests and systematically evaluate a representative set of selected domains, parameters, and variables. The experiments involved testing 769 human agents, 24 state-of-the-art AI agents, 896 human judges, and 8 AI judges, in 21,570 Turing tests across 6 tasks encompassing vision and language modalities. Surprisingly, the results reveal that current AIs are not far from being able to impersonate humans in the eyes of human judges across different ages, genders, and educational levels in complex visual and language challenges. In contrast, simple AI judges outperform human judges in distinguishing human answers from machine answers. The curated large-scale Turing test datasets introduced here and their evaluation metrics provide valuable insights for assessing whether an agent is human or not. The proposed formulation to benchmark human imitation ability in current AIs paves the way for the research community to expand Turing tests to other research areas and conditions. All source code and data are publicly available at https://tinyurl.com/8x8nha7p

* 134 pages 

Reason from Context with Self-supervised Learning

Nov 23, 2022
Xiao Liu, Ankur Sikarwar, Joo Hwee Lim, Gabriel Kreiman, Zenglin Shi, Mengmi Zhang

A tiny object in the sky cannot be an elephant. Context reasoning is critical in visual recognition, where current inputs need to be interpreted in the light of previous experience and knowledge. To date, research into contextual reasoning in visual recognition has largely proceeded with supervised learning methods. The question of whether contextual knowledge can be captured with self-supervised learning regimes remains under-explored. Here, we established a methodology for context-aware self-supervised learning. We proposed a novel Self-supervised Learning Method for Context Reasoning (SeCo), where the only inputs to SeCo are unlabeled images with multiple objects present in natural scenes. Similar to the distinction between fovea and periphery in human vision, SeCo processes self-proposed target object regions and their contexts separately, and then employs a learnable external memory for retrieving and updating context-relevant target information. To evaluate the contextual associations learned by the computational models, we introduced two evaluation protocols, lift-the-flap and object priming, addressing the problems of "what" and "where" in context reasoning. In both tasks, SeCo outperformed all state-of-the-art (SOTA) self-supervised learning methods by a significant margin. Our network analysis revealed that the external memory in SeCo learns to store prior contextual knowledge, facilitating target identity inference in the lift-the-flap task. Moreover, we conducted psychophysics experiments and introduced a Human benchmark in Object Priming (HOP) dataset. Our quantitative and qualitative results demonstrate that SeCo approximates human-level performance and exhibits human-like behavior. All our source code and data are publicly available here.
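
An external memory of the kind described can be sketched as key-value attention. This is a generic attention-based memory read/update in NumPy, not SeCo's actual module; the slot count, dimensionality, and the simple moving-average update rule are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ContextMemory:
    """Sketch of an external memory: context-relevant target information is
    retrieved by attending over memory slots with a context query."""

    def __init__(self, n_slots=8, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.standard_normal((n_slots, dim))
        self.values = rng.standard_normal((n_slots, dim))

    def read(self, context_query):
        attn = softmax(self.keys @ context_query)
        return attn @ self.values  # attention-weighted mixture of slots

    def update(self, context_query, target_info, lr=0.1):
        attn = softmax(self.keys @ context_query)
        # Move attended slots toward the observed target information.
        self.values += lr * attn[:, None] * (target_info - self.values)
```

Repeated updates pull the slots most relevant to a context toward the target information seen in that context, which is the mechanism a lift-the-flap readout could then exploit.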

White-Box Adversarial Policies in Deep Reinforcement Learning

Sep 05, 2022
Stephen Casper, Dylan Hadfield-Menell, Gabriel Kreiman

Adversarial examples against AI systems pose both risks via malicious attacks and opportunities for improving robustness via adversarial training. In multiagent settings, adversarial policies can be developed by training an adversarial agent to minimize a victim agent's rewards. Prior work has studied black-box attacks where the adversary only sees the state observations and effectively treats the victim as any other part of the environment. In this work, we experiment with white-box adversarial policies to study whether an agent's internal state can offer useful information for other agents. We make three contributions. First, we introduce white-box adversarial policies in which an attacker can observe a victim's internal state at each timestep. Second, we demonstrate that white-box access to a victim makes for better attacks in two-agent environments, resulting in both faster initial learning and higher asymptotic performance against the victim. Third, we show that training against white-box adversarial policies can be used to make learners in single-agent environments more robust to domain shifts.
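
The first contribution, giving the attacker access to the victim's internal state, reduces in the simplest case to augmenting the adversary's observation. A toy sketch in NumPy; the victim network and wrapper interface are hypothetical, not the paper's code:

```python
import numpy as np

class WhiteBoxObservation:
    """Builds the white-box adversary's observation by appending the victim
    policy's internal activations at each timestep (hypothetical interface)."""

    def __init__(self, victim_hidden_fn):
        self.victim_hidden_fn = victim_hidden_fn  # victim obs -> activations

    def augment(self, env_obs, victim_obs):
        hidden = self.victim_hidden_fn(victim_obs)
        return np.concatenate([env_obs, hidden])

# Toy victim: a one-layer policy whose hidden activations the attacker reads.
W = np.ones((3, 4))
victim_hidden = lambda obs: np.tanh(obs @ W)  # (4,) internal state
wrapper = WhiteBoxObservation(victim_hidden)
aug = wrapper.augment(np.zeros(5), np.zeros(3))  # 5 env dims + 4 hidden dims
```

A black-box adversary would see only `env_obs`; the extra activation dimensions are what distinguish the white-box setting.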

* Code is available at https://github.com/thestephencasper/white_box_rarl 

What makes domain generalization hard?

Jun 15, 2022
Spandan Madan, Li You, Mengmi Zhang, Hanspeter Pfister, Gabriel Kreiman

While several methodologies have been proposed for the daunting task of domain generalization, understanding what makes this task challenging has received little attention. Here we present SemanticDG (Semantic Domain Generalization): a benchmark with 15 photo-realistic domains with the same geometry, scene layout and camera parameters as the popular 3D ScanNet dataset, but with controlled domain shifts in lighting, materials, and viewpoints. Using this benchmark, we investigate the impact of each of these semantic shifts on generalization independently. Visual recognition models easily generalize to novel lighting, but struggle with distribution shifts in materials and viewpoints. Inspired by human vision, we hypothesize that scene context can serve as a bridge to help models generalize across material and viewpoint domain shifts, and propose a context-aware vision transformer along with a contrastive loss over material and viewpoint changes to address these domain shifts. Our approach (dubbed CDCNet) outperforms existing domain generalization methods by over an 18% margin. As a critical benchmark, we also conduct psychophysics experiments and find that humans generalize equally well across lighting, materials and viewpoints. The benchmark and computational model introduced here help understand the challenges associated with generalization across domains and provide initial steps towards extrapolation to semantic distribution shifts. We include all data and source code in the supplement.
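
A contrastive loss over material and viewpoint changes can be illustrated with a generic InfoNCE-style formulation. This is the standard loss, not necessarily CDCNet's exact objective; the temperature value is an illustrative assumption.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temp=0.1):
    """InfoNCE-style contrastive loss: pull the anchor toward the positive
    (same scene under a different material/viewpoint), push it away from
    negatives (other scenes)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temp
    logits -= logits.max()                 # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))        # positive sits at index 0

v = np.array([1.0, 0.0])
easy = info_nce(v, v, [-v])   # anchor matches the positive: low loss
hard = info_nce(v, -v, [v])   # anchor matches a negative: high loss
```

Treating the same scene under different materials or viewpoints as positives is what encourages representations that are invariant to exactly those shifts.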

Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass

Jan 27, 2022
Giorgia Dellaferrera, Gabriel Kreiman

Supervised learning in artificial neural networks typically relies on backpropagation, where the weights are updated based on the error-function gradients and sequentially propagated from the output layer to the input layer. Although this approach has proven effective across a wide range of applications, it lacks biological plausibility in many regards, including the weight symmetry problem, the dependence of learning on non-local signals, the freezing of neural activity during error propagation, and the update locking problem. Alternative training schemes - such as sign symmetry, feedback alignment, and direct feedback alignment - have been introduced, but invariably rely on a backward pass that hinders the possibility of solving all the issues simultaneously. Here, we propose to replace the backward pass with a second forward pass in which the input signal is modulated based on the error of the network. We show that this novel learning rule comprehensively addresses all the above-mentioned issues and can be applied to both fully connected and convolutional models. We test this learning rule on MNIST, CIFAR-10, and CIFAR-100. These results help incorporate biological principles into machine learning.
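
The two-pass rule can be made concrete in a few lines of NumPy. This is a minimal sketch of the error-driven input-modulation idea on a toy two-layer network; the network sizes, nonlinearity, learning rate, and projection scale are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(x):
    return np.maximum(x, 0.0)

n_in, n_hid, n_out, lr = 8, 16, 4, 0.01
W1 = rng.standard_normal((n_hid, n_in)) * 0.1
W2 = rng.standard_normal((n_out, n_hid)) * 0.1
F = rng.standard_normal((n_in, n_out)) * 0.1  # fixed random error projection

def modulated_step(x, target):
    global W1, W2
    # First (clean) forward pass.
    h = relu(W1 @ x)
    y = W2 @ h
    e = y - target                 # output error
    # Second forward pass on the error-modulated input; no backward pass.
    x_err = x + F @ e
    h_err = relu(W1 @ x_err)
    # Local updates from the difference between the two passes.
    W1 -= lr * np.outer(h - h_err, x_err)
    W2 -= lr * np.outer(e, h_err)
    return float((e ** 2).sum())

x = rng.standard_normal(n_in)
t = rng.standard_normal(n_out)
losses = [modulated_step(x, t) for _ in range(200)]
```

Both updates use only quantities available during the two forward passes, which is how the rule sidesteps weight transport, non-local signals, and update locking.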
