"Information": models, code, and papers

Contact-less Material Probing with Distributed Sensors: Joint Sensing and Communication Optimization

May 17, 2022
Ali Kariminezhad, Soheil Gherekhloo, Aydin Sezgin

The use of RF signals to probe material properties of objects is of great interest in both academia and industry. To this end, a setup is investigated in which a transmitter equipped with a two-dimensional multi-antenna array dispatches a signal, which hits objects in the environment, and the reflections from the objects are captured by distributed sensors. The received signals at those sensors are then amplified and forwarded to a multiple-antenna fusion center, which performs space-time post-processing in order to optimize the information extraction. In this process, the optimal design of the power allocation per object alongside the sensor amplifications is of crucial importance. Here, the power allocation and the sensor amplifications are jointly optimized, given maximum-ratio combining (MRC) at the fusion center. We formulate this challenge as a sum-power minimization under per-object SINR constraints, a sum-power constraint at the transmitter and individual power constraints at the sensors. Moreover, the advantage of deploying zero-forcing (ZF) and minimum mean-squared error (MMSE) at the fusion center is discussed. Asymptotic analysis is also provided for the case in which a large number of sensors are deployed in the sensing environment.
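
To make the per-object SINR constraints more concrete, here is a minimal sketch of how an SINR after maximum-ratio combining could be evaluated. It is an illustration under simplifying assumptions, not the paper's system model: the function name `mrc_sinr`, the effective channel vectors, and the single noise variance are all hypothetical, and the amplify-and-forward stage at the sensors is folded into the channels.

```python
def mrc_sinr(channels, powers, noise_var):
    """Per-object SINR after maximum-ratio combining (MRC).

    channels[k]: assumed effective channel vector of object k at the
                 fusion center (list of complex numbers).
    powers[k]:   transmit power allocated to object k.
    noise_var:   noise variance per receive antenna (assumed white).
    """
    def inner(a, b):
        # Hermitian inner product a^H b.
        return sum(x.conjugate() * y for x, y in zip(a, b))

    sinrs = []
    for k, h_k in enumerate(channels):
        # MRC uses the object's own channel as the combining vector.
        signal = abs(inner(h_k, h_k)) ** 2 * powers[k]
        interference = sum(
            abs(inner(h_k, h_j)) ** 2 * powers[j]
            for j, h_j in enumerate(channels)
            if j != k
        )
        noise = noise_var * inner(h_k, h_k).real
        sinrs.append(signal / (interference + noise))
    return sinrs


# Two objects with orthogonal channels: no mutual interference.
example = mrc_sinr([[1 + 0j, 0j], [0j, 1 + 0j]], [2.0, 3.0], 0.5)
```

With orthogonal channels the interference term vanishes, so each SINR reduces to the allocated power divided by the noise variance. A sum-power minimization would search for the smallest `powers` that keep every entry above its target.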

* arXiv admin note: text overlap with arXiv:1902.11117

Local Attention Graph-based Transformer for Multi-target Genetic Alteration Prediction

May 13, 2022
Daniel Reisenbüchler, Sophia J. Wagner, Melanie Boxberg, Tingying Peng

Classical multiple instance learning (MIL) methods are often based on the assumption that instances are independent and identically distributed, hence neglecting the potentially rich contextual information beyond individual entities. On the other hand, Transformers with global self-attention modules have been proposed to model the interdependencies among all instances. However, in this paper we ask: Is global relation modeling using self-attention necessary, or can we appropriately restrict self-attention calculations to local regimes in large-scale whole slide images (WSIs)? We propose a general-purpose local attention graph-based Transformer for MIL (LA-MIL), introducing an inductive bias by explicitly contextualizing instances in adaptive local regimes of arbitrary size. Additionally, an efficiently adapted loss function enables our approach to learn expressive WSI embeddings for the joint analysis of multiple biomarkers. We demonstrate that LA-MIL achieves state-of-the-art results in mutation prediction for gastrointestinal cancer, outperforming existing models on important biomarkers such as microsatellite instability for colorectal cancer. This suggests that local self-attention models dependencies sufficiently well, on par with global modules. Our implementation will be published.
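
As a rough illustration of restricting self-attention to local regimes, the sketch below lets each patch attend only to its k nearest patches in slide coordinates. The k-nearest-neighbour graph, the function names, and the use of raw features as queries, keys, and values are assumptions made for brevity; the abstract does not specify how LA-MIL builds its neighbourhoods.

```python
import math


def knn_neighbors(coords, k):
    """For each patch, indices of its k nearest patches (itself included)."""
    neighbors = []
    for ci in coords:
        # Ties are broken by index so the result is deterministic.
        order = sorted(
            range(len(coords)),
            key=lambda j: (math.dist(ci, coords[j]), j),
        )
        neighbors.append(order[:k])
    return neighbors


def local_attention(features, coords, k):
    """Softmax attention where each patch only sees its k nearest patches."""
    dim = len(features[0])
    outputs = []
    for i, nbrs in enumerate(knn_neighbors(coords, k)):
        # Scaled dot-product scores against the local neighbourhood only.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * features[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)
        ])
    return outputs
```

The cost per patch grows with k rather than with the total number of patches, which is the practical appeal of local attention on very large slides.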

Finding patterns in Knowledge Attribution for Transformers

May 04, 2022
Jeevesh Juneja, Ritu Agarwal

We analyze the Knowledge Neurons framework for the attribution of factual and relational knowledge to particular neurons in the transformer network. We use a 12-layer multi-lingual BERT model for our experiments. Our study reveals various interesting phenomena. We observe that factual knowledge can mostly be attributed to the middle and higher layers of the network ($\ge 6$). Further analysis reveals that the middle layers ($6-9$) are mostly responsible for relational information, which is further refined into actual factual knowledge or the "correct answer" in the last few layers ($10-12$). Our experiments also show that the model handles prompts that are written in different languages but represent the same fact similarly, providing further evidence for the effectiveness of multi-lingual pre-training. Applying the attribution scheme to grammatical knowledge, we find that grammatical knowledge is far more dispersed among the neurons than factual knowledge.

Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation

Apr 26, 2022
Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino

Recent general-purpose audio representations show state-of-the-art performance on various audio tasks. These representations are pre-trained by self-supervised learning methods that create training signals from the input. For example, typical audio contrastive learning uses temporal relationships among input sounds to create training signals, whereas some methods use a difference among input views created by data augmentations. However, these training signals do not provide information derived from the intact input sound, which we think is suboptimal for learning a representation that describes the input as it is. In this paper, we seek to learn audio representations from the input itself as supervision using a pretext task of auto-encoding of masked spectrogram patches, Masked Spectrogram Modeling (MSM, a variant of Masked Image Modeling applied to audio spectrograms). To implement MSM, we use Masked Autoencoders (MAE), an image self-supervised learning method. MAE learns to efficiently encode the small number of visible patches into latent representations to carry essential information for reconstructing a large number of masked patches. While training, MAE minimizes the reconstruction error, which uses the input as the training signal, consequently achieving our goal. We conducted experiments on our MSM using MAE (MSM-MAE) models under the evaluation benchmark of the HEAR 2021 NeurIPS Challenge. Our MSM-MAE models outperformed the HEAR 2021 Challenge results on seven out of 15 tasks (e.g., accuracies of 73.4% on CREMA-D and 85.8% on LibriCount), while showing top performance on other tasks where specialized models perform better. We also investigate how the design choices of MSM-MAE impact the performance and conduct a qualitative analysis of visualization outcomes to gain an understanding of the learned representations. We make our code available online.
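
The masking and reconstruction objective described above can be sketched in a few lines. This is a toy illustration, not the authors' code: the helper names are hypothetical, the 0.75 mask ratio in the usage line is an assumed value rather than one given in the abstract, and patches are plain lists of numbers instead of spectrogram tensors.

```python
import random


def split_visible_masked(num_patches, mask_ratio, seed=0):
    """Randomly partition patch indices into visible and masked sets."""
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def masked_mse(target, prediction, masked):
    """Mean squared reconstruction error, computed on masked patches only."""
    errors = [
        (t - p) ** 2
        for i in masked
        for t, p in zip(target[i], prediction[i])
    ]
    return sum(errors) / len(errors)


# Assumed mask ratio of 0.75: the encoder would see only 4 of 16 patches.
visible, masked = split_visible_masked(16, 0.75)
```

Only the visible patches would be passed to the encoder, while the loss compares the decoder output with the original input on the masked patches, so the input itself acts as the training signal.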

* 22 pages, 8 figures. Under the review process

Information-theoretic Task Selection for Meta-Reinforcement Learning

Nov 02, 2020
Ricardo Luna Gutierrez, Matteo Leonetti

In Meta-Reinforcement Learning (meta-RL), an agent is trained on a set of tasks to prepare for and learn faster in new, unseen, but related tasks. The training tasks are usually hand-crafted to be representative of the expected distribution of test tasks and hence are all used in training. We show that, given a set of training tasks, learning can be both faster and more effective (leading to better performance in the test tasks) if the training tasks are appropriately selected. We propose a task selection algorithm, Information-Theoretic Task Selection (ITTS), based on information theory, which optimizes the set of tasks used for training in meta-RL, irrespective of how they are generated. The algorithm establishes which training tasks are both sufficiently relevant for the test tasks and different enough from one another. We reproduce different meta-RL experiments from the literature and show that ITTS improves the final performance in all of them.
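
The selection rule stated in the abstract (keep tasks that are relevant and mutually different) can be sketched as a greedy filter. Everything specific here is an assumption: the `difference` and `relevance` callables, the two thresholds, and the `reference_tasks` argument are hypothetical stand-ins, since the abstract does not give the information-theoretic measures that ITTS actually uses.

```python
def select_tasks(candidates, reference_tasks, difference, relevance,
                 min_difference, min_relevance):
    """Greedily keep tasks that are novel and relevant.

    difference(a, b): assumed dissimilarity between two training tasks.
    relevance(a, r):  assumed usefulness of task a for reference task r.
    """
    selected = []
    for task in candidates:
        # Different enough from every task already kept.
        is_novel = all(
            difference(task, kept) >= min_difference for kept in selected
        )
        # Relevant to at least one reference task.
        is_relevant = any(
            relevance(task, ref) >= min_relevance for ref in reference_tasks
        )
        if is_novel and is_relevant:
            selected.append(task)
    return selected


# Toy usage: tasks are numbers, closeness stands in for relevance.
chosen = select_tasks(
    candidates=[0.0, 0.1, 1.0, 5.0],
    reference_tasks=[0.5],
    difference=lambda a, b: abs(a - b),
    relevance=lambda a, r: 1.0 / (1.0 + abs(a - r)),
    min_difference=0.5,
    min_relevance=0.5,
)
```

In the toy run, the task at 0.1 is dropped as a near-duplicate of 0.0 and the task at 5.0 is dropped as irrelevant, leaving two training tasks.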

* Work to be presented at NeurIPS 2020

RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds

May 24, 2022
Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan

3D motion estimation, including scene flow and point cloud registration, has drawn increasing interest. Inspired by 2D flow estimation, recent methods employ deep neural networks to construct the cost volume for estimating accurate 3D flow. However, these methods are limited by the fact that it is difficult to define a search window on point clouds because of the irregular data structure. In this paper, we avoid this irregularity with a simple yet effective method. We decompose the problem into two interlaced stages, where the 3D flows are optimized point-wise at the first stage and then globally regularized in a recurrent network at the second stage. Therefore, the recurrent network only receives the regular point-wise information as the input. In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task. For 3D scene flow estimation, we make comparisons on the widely used FlyingThings3D and KITTI datasets. For point cloud registration, we follow previous works and evaluate on data pairs with large pose differences and partial overlap from ModelNet40. The results show that our method outperforms previous methods and achieves a new state-of-the-art performance on both 3D scene flow estimation and point cloud registration, which demonstrates the superiority of the proposed zero-order method on irregular point cloud data.
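
A point-wise, closest-point style flow estimate can be illustrated as follows. This is only a brute-force nearest-neighbour sketch under the assumption that each source point moves toward its closest target point; the function name is hypothetical, and the learned point-wise optimization and the recurrent regularization stage of the paper are not reproduced.

```python
import math


def closest_point_flow(source, target):
    """For each source point, the displacement to its nearest target point."""
    flows = []
    for p in source:
        # Brute-force nearest neighbour; fine for a toy example.
        q = min(target, key=lambda t: math.dist(p, t))
        flows.append(tuple(qc - pc for pc, qc in zip(p, q)))
    return flows


source = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.5, 0.0), (1.0, 0.0, 1.0)]
flows = closest_point_flow(source, target)
```

Because every source point yields exactly one flow vector, the output is regular per-point data even though the clouds themselves are irregular, which is the property the abstract highlights for the recurrent stage.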

* Accepted to CVPR 2022

Assembly Planning from Observations under Physical Constraints

Apr 20, 2022
Thomas Chabal, Robin Strudel, Etienne Arlaud, Jean Ponce, Cordelia Schmid

This paper addresses the problem of copying an unknown assembly of primitives with known shape and appearance using information extracted from a single photograph by an off-the-shelf procedure for object detection and pose estimation. The proposed algorithm uses a simple combination of physical stability constraints, convex optimization and Monte Carlo tree search to plan assemblies as sequences of pick-and-place operations represented by STRIPS operators. It is efficient and, most importantly, robust to the errors in object detection and pose estimation unavoidable in any real robotic system. The proposed approach is demonstrated with thorough experiments on a UR5 manipulator.
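
For readers unfamiliar with STRIPS operators, the sketch below shows how a pick-and-place action could be encoded as preconditions, add effects, and delete effects over a set of facts. The predicates `free`, `clear`, and `on`, and the `place` helper, are hypothetical choices for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS operator over states represented as frozensets of facts."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset

    def applicable(self, state):
        # All preconditions must hold in the current state.
        return self.preconditions <= state

    def apply(self, state):
        if not self.applicable(state):
            raise ValueError(f"{self.name} is not applicable")
        return (state - self.delete_effects) | self.add_effects


def place(block, support):
    """Hypothetical pick-and-place: put an unplaced block on a clear support."""
    return Operator(
        name=f"place({block}, {support})",
        preconditions=frozenset({f"free({block})", f"clear({support})"}),
        add_effects=frozenset({f"on({block}, {support})", f"clear({block})"}),
        delete_effects=frozenset({f"free({block})", f"clear({support})"}),
    )


start = frozenset({"free(A)", "free(B)", "clear(table)"})
after_a = place("A", "table").apply(start)
after_b = place("B", "A").apply(after_a)
```

A planner such as Monte Carlo tree search would explore sequences of applicable operators like these, with stability checks deciding which placements are physically admissible.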

* See the project webpage at https://www.di.ens.fr/willow/research/assembly-planning/

A Nonlocal Graph-PDE and Higher-Order Geometric Integration for Image Labeling

May 09, 2022
Dmitrij Sitenko, Bastian Boll, Christoph Schnörr

This paper introduces a novel nonlocal partial difference equation (PDE) for labeling metric data on graphs. The PDE is derived as a nonlocal reparametrization of the assignment flow approach that was introduced in J. Math. Imaging & Vision 58(2), 2017. Due to this parametrization, solving the PDE numerically is shown to be equivalent to computing the Riemannian gradient flow with respect to a nonconvex potential. We devise an entropy-regularized difference-of-convex-functions (DC) decomposition of this potential and show that the basic geometric Euler scheme for integrating the assignment flow is equivalent to solving the PDE by an established DC programming scheme. Moreover, the viewpoint of geometric integration reveals a basic way to exploit higher-order information of the vector field that drives the assignment flow, in order to devise a novel accelerated DC programming scheme. A detailed convergence analysis of both numerical schemes is provided and illustrated by numerical experiments.
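
A geometric Euler step on the probability simplex can be sketched as a multiplicative update followed by renormalization. This is a minimal single-node illustration under the assumption that the driving vector `v` is given and held fixed during the step; in the assignment flow the vector field depends on the assignments at neighbouring nodes, which is not modelled here, and the function name is hypothetical.

```python
import math


def geometric_euler_step(w, v, h):
    """One multiplicative update w <- w * exp(h * v), renormalized.

    w: current assignment vector (positive entries summing to 1).
    v: assumed driving vector for this node, held fixed during the step.
    h: step size.
    """
    scaled = [wi * math.exp(h * vi) for wi, vi in zip(w, v)]
    total = sum(scaled)
    return [s / total for s in scaled]


# A label with a larger driving value gains probability mass.
w_next = geometric_euler_step([0.5, 0.5], [math.log(3.0), 0.0], 1.0)
```

The update keeps all entries strictly positive and summing to one, so the iterate never leaves the simplex, unlike a plain additive Euler step.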

Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition

May 06, 2022
Yuan Gong, Jin Yu, James Glass

Recognizing human non-speech vocalizations is an important task and has broad applications such as automatic sound transcription and health condition monitoring. However, existing datasets have a relatively small number of vocal sound samples or noisy labels. As a consequence, state-of-the-art audio event classification models may not perform well in detecting human vocal sounds. To support research on building robust and accurate vocal sound recognition, we have created the VocalSound dataset, consisting of over 21,000 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. Experiments show that the vocal sound recognition performance of a model can be significantly improved by 41.9% by adding the VocalSound dataset to an existing dataset as training material. In addition, unlike previous datasets, the VocalSound dataset contains meta information such as speaker age, gender, native language, country, and health condition.

* Accepted at ICASSP 2022. Dataset and code at https://github.com/YuanGongND/vocalsound. Interactive Colab demo at https://colab.research.google.com/github/YuanGongND/vocalsound/blob/main/colab/VocalSound.ipynb

BronchusNet: Region and Structure Prior Embedded Representation Learning for Bronchus Segmentation and Classification

May 24, 2022
Wenhao Huang, Haifan Gong, Huan Zhang, Yu Wang, Haofeng Li, Guanbin Li, Hong Shen

CT-based bronchial tree analysis plays an important role in the computer-aided diagnosis of respiratory diseases, as it can provide structured information for clinicians. The basis of airway analysis is bronchial tree reconstruction, which consists of bronchus segmentation and classification. However, accurate bronchial analysis remains a challenge due to individual variations and severe class imbalance. In this paper, we propose a region and structure prior embedded framework named BronchusNet to achieve accurate segmentation and classification of bronchial regions in CT images. For bronchus segmentation, we propose an adaptive hard region-aware UNet that incorporates multi-level prior guidance of hard pixel-wise samples in the general UNet segmentation network to achieve better hierarchical feature learning. For the classification of bronchial branches, we propose a hybrid point-voxel graph learning module to fully exploit bronchial structure priors and to support simultaneous feature interactions across different branches. To facilitate the study of bronchial analysis, we contribute BRSC: an open-access benchmark of BRonchus imaging analysis with high-quality pixel-wise Segmentation masks and the Class of bronchial segments. Experimental results on BRSC show that our proposed method not only achieves state-of-the-art performance for binary segmentation of the bronchial region but also exceeds the best existing method on bronchial branch classification by 6.9%.
