Hanspeter Pfister

On the Capability of Neural Networks to Generalize to Unseen Category-Pose Combinations

Jul 15, 2020
Spandan Madan, Timothy Henry, Jamell Dozier, Helen Ho, Nishchal Bhandari, Tomotake Sasaki, Frédo Durand, Hanspeter Pfister, Xavier Boix

Recognizing an object's category and pose lies at the heart of visual understanding. Recent works suggest that deep neural networks (DNNs) often fail to generalize to category-pose combinations not seen during training. However, it is unclear when and how such generalization may be possible. Does the number of combinations seen during training impact generalization? Is it better to learn category and pose in separate networks, or in a single shared network? Furthermore, what are the neural mechanisms that drive the network's generalization? In this paper, we answer these questions by analyzing state-of-the-art DNNs trained to recognize both object category and pose (position, scale, and 3D viewpoint) with quantitative control over the number of category-pose combinations seen during training. We also investigate the emergence of two types of specialized neurons that can explain generalization to unseen combinations: neurons selective to category and invariant to pose, and vice versa. We perform experiments on MNIST extended with position or scale, the iLab dataset with vehicles at different viewpoints, and a challenging new dataset for car model recognition and viewpoint estimation that we introduce in this paper, the Biased-Cars dataset. Our results demonstrate that as the number of combinations seen during training increases, networks generalize better to unseen category-pose combinations, facilitated by an increase in the selectivity and invariance of individual neurons. We find that learning category and pose in separate networks compared to a shared one leads to an increase in such selectivity and invariance, as separate networks are not forced to preserve information about both category and pose. This enables separate networks to significantly outperform shared ones at predicting unseen category-pose combinations.
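The idea of controlling how many category-pose combinations are seen during training can be illustrated with a small sketch. The function name `split_combinations`, the cyclic selection rule, and the 5x5 grid are assumptions made for illustration only; they are not the paper's actual protocol.

```python
from itertools import product


def split_combinations(n_categories, n_poses, n_seen_per_category):
    """Split a category-pose grid into seen and unseen combinations.

    Hypothetical rule: category c is paired with poses
    c, c+1, ..., c+n_seen_per_category-1 (taken modulo n_poses).
    """
    if not 1 <= n_seen_per_category <= n_poses:
        raise ValueError("n_seen_per_category must be in [1, n_poses]")
    # Every possible (category, pose) pair in the grid.
    all_combos = set(product(range(n_categories), range(n_poses)))
    # Combinations available at training time.
    seen = {
        (c, (c + j) % n_poses)
        for c in range(n_categories)
        for j in range(n_seen_per_category)
    }
    # Combinations held out for testing generalization.
    unseen = all_combos - seen
    return seen, unseen


seen, unseen = split_combinations(n_categories=5, n_poses=5, n_seen_per_category=2)
print(len(seen), len(unseen))  # 10 15
```

With this rule every category and every pose still appears in training, so a test on the unseen set probes only the novel pairings. Raising `n_seen_per_category` increases the number of seen combinations.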

A New Age of Computing and the Brain

Apr 27, 2020
Polina Golland, Jack Gallant, Greg Hager, Hanspeter Pfister, Christos Papadimitriou, Stefan Schaal, Joshua T. Vogelstein

The histories of computer science and brain sciences are intertwined. In his unfinished manuscript "The Computer and the Brain," von Neumann debates whether or not the brain can be thought of as a computing machine and identifies some of the similarities and differences between natural and artificial computation. Turing, in his 1950 article in Mind, argues that computing devices could ultimately emulate intelligence, leading to his proposed Turing test. Herbert Simon predicted in 1957 that most psychological theories would take the form of a computer program. In 1976, David Marr proposed that the function of the visual system could be abstracted and studied at computational and algorithmic levels that did not depend on the underlying physical substrate. In December 2014, a two-day workshop supported by the Computing Community Consortium (CCC) and the National Science Foundation's Computer and Information Science and Engineering Directorate (NSF CISE) was convened in Washington, DC, with the goal of bringing together computer scientists and brain researchers to explore these new opportunities and connections, and develop a new, modern dialogue between the two research communities. Specifically, our objectives were:
1. To articulate a conceptual framework for research at the interface of brain sciences and computing and to identify key problems in this interface, presented in a way that will attract both CISE and brain researchers into this space.
2. To inform and excite researchers within the CISE research community about brain research opportunities and to identify and explain strategic roles they can play in advancing this initiative.
3. To develop new connections, conversations, and collaborations between brain sciences and CISE researchers that will lead to highly relevant and competitive proposals, high-impact research, and influential publications.

* A Computing Community Consortium (CCC) workshop report, 24 pages

A Topological Nomenclature for 3D Shape Analysis in Connectomics

Sep 27, 2019
Abhimanyu Talwar, Zudi Lin, Donglai Wei, Yuesong Wu, Bowen Zheng, Jinglin Zhao, Won-Dong Jang, Xueying Wang, Jeff W. Lichtman, Hanspeter Pfister

An essential task in nano-scale connectomics is the morphology analysis of neurons and organelles such as mitochondria to shed light on their biological properties. However, these biological objects often have tangled parts or complex branching patterns, which makes it hard to abstract, categorize, and manipulate their morphology. Here we propose a topological nomenclature to name these objects, like chemical compounds, for neuroscience analysis. To this end, we convert the volumetric representation into a topology-preserving reduced graph, develop nomenclature rules for pyramidal neurons and mitochondria from the reduced graph, and learn a feature embedding for shape manipulation. In ablation studies, we show that the proposed reduced graph extraction method yields graphs that accord better with the perception of experts. On 3D shape retrieval and decomposition tasks, we show that the encoded topological nomenclature features achieve better results than state-of-the-art shape descriptors. To advance neuroscience, we will release a 3D mesh dataset of mitochondria and pyramidal neurons reconstructed from a 100 μm cube electron microscopy (EM) volume. Code is publicly available at https://github.com/donglaiw/ibexHelper.

* Technical report

White-Box Adversarial Defense via Self-Supervised Data Estimation

Sep 13, 2019
Zudi Lin, Hanspeter Pfister, Ziming Zhang

In this paper, we study how to defend classifiers against adversarial attacks that fool them with subtly modified input data. In contrast to previous work, we focus on white-box adversarial defense, where attackers are granted full access not only to the classifiers but also to the defenders, so that they can produce attacks that are as strong as possible. In this context, we propose viewing a defender as a functional, a higher-order function that takes functions as its argument to represent a function space, rather than as a fixed function, as is conventional. From this perspective, a defender should be realized and optimized individually for each adversarial input. To this end, we propose RIDE, an efficient and provably convergent self-supervised learning algorithm for individual data estimation to protect the predictions from adversarial attacks. We demonstrate significant improvements in adversarial defense performance on image recognition, e.g., 98%, 76%, and 43% test accuracy on the MNIST, CIFAR-10, and ImageNet datasets, respectively, under the state-of-the-art BPDA attacker.

FDive: Learning Relevance Models using Pattern-based Similarity Measures

Jul 30, 2019
Frederik L. Dennig, Tom Polk, Zudi Lin, Tobias Schreck, Hanspeter Pfister, Michael Behrisch

The detection of interesting patterns in large high-dimensional datasets is difficult because of their dimensionality and pattern complexity. Therefore, analysts require automated support for the extraction of relevant patterns. In this paper, we present FDive, a visual active learning system that helps to create visually explorable relevance models, assisted by learning a pattern-based similarity. We use a small set of user-provided labels to rank similarity measures, consisting of feature descriptor and distance function combinations, by their ability to distinguish relevant from irrelevant data. Based on the best-ranked similarity measure, the system calculates an interactive Self-Organizing Map-based relevance model, which classifies data according to the cluster affiliation. It also automatically prompts further relevance feedback to improve its accuracy. Uncertain areas, especially near the decision boundaries, are highlighted and can be refined by the user. We evaluate our approach by comparison to state-of-the-art feature selection techniques and demonstrate the usefulness of our approach by a case study classifying electron microscopy images of brain cells. The results show that FDive enhances both the quality and understanding of relevance models and can thus lead to new insights for brain research.

* 12 pages, 7 figures, 2 tables, LaTeX; corrected typo

Visual Interaction with Deep Learning Models through Collaborative Semantic Inference

Jul 24, 2019
Sebastian Gehrmann, Hendrik Strobelt, Robert Krüger, Hanspeter Pfister, Alexander M. Rush

Automation of tasks can have critical consequences when humans lose agency over decision processes. Deep learning models are particularly susceptible since current black-box approaches lack explainable reasoning. We argue that both the visual interface and model structure of deep learning systems need to take into account interaction design. We propose a framework of collaborative semantic inference (CSI) for the co-design of interactions and models to enable visual collaboration between humans and algorithms. The approach exposes the intermediate reasoning process of models, which allows semantic interactions with the visual metaphors of a problem, so that a user can both understand and control parts of the model reasoning process. We demonstrate the feasibility of CSI with a co-designed case study of a document summarization system.

* IEEE VIS 2019 (VAST)

Fast Mitochondria Segmentation for Connectomics

Dec 14, 2018
Vincent Casser, Kai Kang, Hanspeter Pfister, Daniel Haehn

In connectomics, scientists create the wiring diagram of a mammalian brain by identifying synaptic connections between neurons in nano-scale electron microscopy images. This allows for the identification of dysfunctional mitochondria, which are linked to a variety of diseases such as autism or bipolar disorder. However, manual analysis is not feasible since connectomics datasets can be petabytes in size. To process such large data, we present a fully automatic mitochondria detector based on a modified U-Net architecture that yields high accuracy and fast processing times. We evaluate our method on multiple real-world connectomics datasets, including an improved version of the EPFL Hippocampus mitochondria detection benchmark. Our results show a Jaccard index of up to 0.90 with inference times below 16 ms for a 512x512 image tile. This is faster than the acquisition time of modern electron microscopes, allowing mitochondria detection in real time. Compared to previous work, our detector ranks first among real-time methods and third overall. Our data, results, and code are freely available.
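For readers unfamiliar with the metric reported above, the Jaccard index of two binary segmentation masks is the size of their intersection divided by the size of their union. The following is a minimal pure-Python sketch. The function name `jaccard_index`, the toy masks, and the handling of two empty masks are assumptions for illustration, not the authors' evaluation code.

```python
def jaccard_index(pred, target):
    """Jaccard index (intersection over union) of two binary masks.

    Masks are equally sized nested lists of 0/1 values.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # Pixel is foreground in both masks.
            intersection += 1 if (p and t) else 0
            # Pixel is foreground in at least one mask.
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(jaccard_index(pred, target))  # 2 shared pixels / 4 in the union = 0.5
```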

Detecting Synapse Location and Connectivity by Signed Proximity Estimation and Pruning with Deep Nets

Oct 25, 2018
Toufiq Parag, Daniel Berger, Lee Kamentsky, Benedikt Staffler, Donglai Wei, Moritz Helmstaedter, Jeff W. Lichtman, Hanspeter Pfister

Synaptic connectivity detection is a critical task for neural reconstruction from Electron Microscopy (EM) data. Most of the existing algorithms for synapse detection do not identify the cleft location and direction of connectivity simultaneously. The few methods that compute direction along with contact location have only been demonstrated to work on either dyadic (most common in the vertebrate brain) or polyadic (found in the fruit fly brain) synapses, but not on both types. In this paper, we present an algorithm to automatically predict the location as well as the direction of both dyadic and polyadic synapses. The proposed algorithm first generates candidate synaptic connections from voxelwise predictions of signed proximity generated by a 3D U-net. A second 3D CNN then prunes the set of candidates to produce the final detection of cleft and connectivity orientation. Experimental results demonstrate that the proposed method outperforms the existing methods for determining synapses in both rodent and fruit fly brains.

Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models

Oct 16, 2018
Hendrik Strobelt, Sebastian Gehrmann, Michael Behrisch, Adam Perer, Hanspeter Pfister, Alexander M. Rush

Neural sequence-to-sequence models have proven to be accurate and robust for many sequence prediction tasks, and have become the standard approach for automatic translation of text. The models work in a five-stage black-box process that involves encoding a source sequence to a vector space and then decoding out to a new target sequence. This process is now standard but, like many deep learning methods, remains quite difficult to understand or debug. In this work, we present a visual analysis tool that allows interaction with a trained sequence-to-sequence model through each stage of the translation process. The aim is to identify which patterns have been learned and to detect model errors. We demonstrate the utility of our tool through several real-world large-scale sequence-to-sequence use cases.

* VAST - IEEE VIS 2018

Parallel Separable 3D Convolution for Video and Volumetric Data Understanding

Sep 11, 2018
Felix Gonda, Donglai Wei, Toufiq Parag, Hanspeter Pfister

For video and volumetric data understanding, 3D convolution layers are widely used in deep learning, but at the cost of increased computation and training time. Recent works seek to replace the 3D convolution layer with convolution blocks, e.g., structured combinations of 2D and 1D convolution layers. In this paper, we propose a novel convolution block, Parallel Separable 3D Convolution (PmSCn), which applies m parallel streams of n 2D and one 1D convolution layers along different dimensions. We first mathematically justify the need for parallel streams (Pm) to replace a single 3D convolution layer through tensor decomposition. Then we jointly replace consecutive 3D convolution layers, common in modern network architectures, with multiple 2D convolution layers (Cn). Lastly, we empirically show that PmSCn is applicable to different backbone architectures, such as ResNet, DenseNet, and UNet, for different applications, such as video action recognition, MRI brain segmentation, and electron microscopy segmentation. In all three applications, we replace the 3D convolution layers in state-of-the-art models with PmSCn and achieve, on average, around a 14% improvement in test performance and a 40% reduction in model size.
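To give a rough sense of why replacing a 3D convolution with 2D and 1D layers can shrink a model, the sketch below counts weights under a simplified layout. The channel arrangement (first 2D layer maps input to output channels, every later layer keeps the output width), the function names, and the example sizes are all assumptions for illustration. This is not the PmSCn block as defined in the paper, and it does not reproduce the reported 40% figure.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a full k x k x k 3D convolution (biases ignored)."""
    return c_in * c_out * k ** 3


def separable_stream_params(c_in, c_out, k, n_2d=1):
    """Weights in one stream of n_2d 2D convolutions plus one 1D convolution.

    Assumed layout: the first 2D layer maps c_in -> c_out channels and
    all later layers keep c_out channels.
    """
    first_2d = c_in * c_out * k ** 2
    later_2d = (n_2d - 1) * c_out * c_out * k ** 2
    one_d = c_out * c_out * k
    return first_2d + later_2d + one_d


def parallel_separable_params(c_in, c_out, k, m, n_2d=1):
    """Weights in m parallel streams of the assumed separable layout."""
    return m * separable_stream_params(c_in, c_out, k, n_2d)


full = conv3d_params(c_in=64, c_out=64, k=3)
block = parallel_separable_params(c_in=64, c_out=64, k=3, m=2)
print(full, block)  # 110592 98304
```

Under these assumptions a single stream needs k^2 + k weights per channel pair instead of k^3, so the saving grows with kernel size and shrinks as more parallel streams are added.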
