Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

An Overview Of 3D Object Detection

Oct 29, 2020
Yilin Wang, Jiayi Ye

Point cloud 3D object detection has recently received major attention and becomes an active research topic in 3D computer vision community. However, recognizing 3D objects in LiDAR (Light Detection and Ranging) is still a challenge due to the complexity of point clouds. Objects such as pedestrians, cyclists, or traffic cones are usually represented by quite sparse points, which makes the detection quite complex using only point cloud. In this project, we propose a framework that uses both RGB and point cloud data to perform multiclass object recognition. We use existing 2D detection models to localize the region of interest (ROI) on the RGB image, followed by a pixel mapping strategy in the point cloud, and finally, lift the initial 2D bounding box to 3D space. We use the recently released nuScenes dataset---a large-scale dataset contains many data formats---to training and evaluate our proposed architecture.

  Access Paper or Ask Questions

Do Deeper Convolutional Networks Perform Better?

Oct 19, 2020
Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler

Over-parameterization is a recent topic of much interest in the machine learning community. While over-parameterized neural networks are capable of perfectly fitting (interpolating) training data, these networks often perform well on test data, thereby contradicting classical learning theory. Recent work provided an explanation for this phenomenon by introducing the double descent curve, showing that increasing model capacity past the interpolation threshold can lead to a decrease in test error. In line with this, it was recently shown empirically and theoretically that increasing neural network capacity through width leads to double descent. In this work, we analyze the effect of increasing depth on test performance. In contrast to what is observed for increasing width, we demonstrate through a variety of classification experiments on CIFAR10 and ImageNet32 using ResNets and fully-convolutional networks that test performance worsens beyond a critical depth. We posit an explanation for this phenomenon by drawing intuition from the principle of minimum norm solutions in linear networks.

* 16 pages, 16 figures 

  Access Paper or Ask Questions

Speaker Diarization Using Stereo Audio Channels: Preliminary Study on Utterance Clustering

Sep 10, 2020
Yingjun Dong, Neil G. MacLaren, Yiding Cao, Francis J. Yammarino, Shelley D. Dionne, Michael D. Mumford, Shane Connelly, Hiroki Sayama, Gregory A. Ruark

Speaker diarization is one of the actively researched topics in audio signal processing and machine learning. Utterance clustering is a critical part of a speaker diarization task. In this study, we aim to improve the performance of utterance clustering by processing multichannel (stereo) audio signals. We generated processed audio signals by combining left- and right-channel audio signals in a few different ways and then extracted embedded features (also called d-vectors) from those processed audio signals. We applied the Gaussian mixture model (GMM) for supervised utterance clustering. In the training phase, we used a parameter sharing GMM to train the model for each speaker. In the testing phase, we selected the speaker with the maximum likelihood as the detected speaker. Results of experiments with real audio recordings of multi-person discussion sessions showed that our proposed method that used multichannel audio signals achieved significantly better performance than a conventional method with mono audio signals.

  Access Paper or Ask Questions

Zero-Resource Knowledge-Grounded Dialogue Generation

Aug 29, 2020
Linxiao Li, Can Xu, Wei Wu, Yufan Zhao, Xueliang Zhao, Chongyang Tao

While neural conversation models have shown great potentials towards generating informative and engaging responses via introducing external knowledge, learning such a model often requires knowledge-grounded dialogues that are difficult to obtain. To overcome the data challenge and reduce the cost of building a knowledge-grounded dialogue system, we explore the problem under a zero-resource setting by assuming no context-knowledge-response triples are needed for training. To this end, we propose representing the knowledge that bridges a context and a response and the way that the knowledge is expressed as latent variables, and devise a variational approach that can effectively estimate a generation model from a dialogue corpus and a knowledge corpus that are independent with each other. Evaluation results on three benchmarks of knowledge-grounded dialogue generation indicate that our model can achieve comparable performance with state-of-the-art methods that rely on knowledge-grounded dialogues for training, and exhibits a good generalization ability over different topics and different datasets.

  Access Paper or Ask Questions

A Comparison of Synthetic Oversampling Methods for Multi-class Text Classification

Aug 11, 2020
Anna Glazkova

The authors compared oversampling methods for the problem of multi-class topic classification. The SMOTE algorithm underlies one of the most popular oversampling methods. It consists in choosing two examples of a minority class and generating a new example based on them. In the paper, the authors compared the basic SMOTE method with its two modifications (Borderline SMOTE and ADASYN) and random oversampling technique on the example of one of text classification tasks. The paper discusses the k-nearest neighbor algorithm, the support vector machine algorithm and three types of neural networks (feedforward network, long short-term memory (LSTM) and bidirectional LSTM). The authors combine these machine learning algorithms with different text representations and compared synthetic oversampling methods. In most cases, the use of oversampling techniques can significantly improve the quality of classification. The authors conclude that for this task, the quality of the KNN and SVM algorithms is more influenced by class imbalance than neural networks.

* 12 pages, 5 figures 

  Access Paper or Ask Questions

PneumoXttention: A CNN compensating for Human Fallibility when Detecting Pneumonia through CXR images with Attention

Aug 11, 2020
Sanskriti Singh

Automatic Chest Radiograph X-ray (CXR) interpretation by machines is an important research topic of Artificial Intelligence. As part of my journey through the California Science Fair, I have developed an algorithm that can detect pneumonia from a CXR image to compensate for human fallibility. My algorithm, PneumoXttention, is an ensemble of two 13 layer convolutional neural network trained on the RSNA dataset, a dataset provided by the Radiological Society of North America, containing 26,684 frontal X-ray images split into the categories of pneumonia and no pneumonia. The dataset was annotated by many professional radiologists in North America. It achieved an impressive F1 score, 0.82, on the test set (20% random split of RSNA dataset) and completely compensated Human Radiologists on a random set of 25 test images drawn from RSNA and NIH. I don't have a direct comparison but Stanford's Chexnet has a F1 score of 0.435 on the NIH dataset for category Pneumonia.

* 9 pages, 7 figures 

  Access Paper or Ask Questions

EZLDA: Efficient and Scalable LDA on GPUs

Jul 17, 2020
Shilong Wang, Hang Liu, Anil Gaihre, Hengyong Yu

LDA is a statistical approach for topic modeling with a wide range of applications. However, there exist very few attempts to accelerate LDA on GPUs which come with exceptional computing and memory throughput capabilities. To this end, we introduce EZLDA which achieves efficient and scalable LDA training on GPUs with the following three contributions: First, EZLDA introduces three-branch sampling method which takes advantage of the convergence heterogeneity of various tokens to reduce the redundant sampling task. Second, to enable sparsity-aware format for both D and W on GPUs with fast sampling and updating, we introduce hybrid format for W along with corresponding token partition to T and inverted index designs. Third, we design a hierarchical workload balancing solution to address the extremely skewed workload imbalance problem on GPU and scaleEZLDA across multiple GPUs. Taken together, EZLDA achieves superior performance over the state-of-the-art attempts with lower memory consumption.

  Access Paper or Ask Questions

Explaining predictive models with mixed features using Shapley values and conditional inference trees

Jul 02, 2020
Annabelle Redelmeier, Martin Jullum, Kjersti Aas

It is becoming increasingly important to explain complex, black-box machine learning models. Although there is an expanding literature on this topic, Shapley values stand out as a sound method to explain predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was then extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain mixed (i.e. continuous, discrete, ordinal, and categorical) dependent features by modeling the dependence structure of the features using conditional inference trees. We demonstrate our proposed method against the current industry standards in various simulation studies and find that our method often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to the FICO challenge Recognition Award winning team.

  Access Paper or Ask Questions

Deep Geometric Texture Synthesis

Jun 30, 2020
Amir Hertz, Rana Hanocka, Raja Giryes, Daniel Cohen-Or

Recently, deep generative adversarial networks for image generation have advanced rapidly; yet, only a small amount of research has focused on generative models for irregular structures, particularly meshes. Nonetheless, mesh generation and synthesis remains a fundamental topic in computer graphics. In this work, we propose a novel framework for synthesizing geometric textures. It learns geometric texture statistics from local neighborhoods (i.e., local triangular patches) of a single reference 3D model. It learns deep features on the faces of the input triangulation, which is used to subdivide and generate offsets across multiple scales, without parameterization of the reference or target mesh. Our network displaces mesh vertices in any direction (i.e., in the normal and tangential direction), enabling synthesis of geometric textures, which cannot be expressed by a simple 2D displacement map. Learning and synthesizing on local geometric patches enables a genus-oblivious framework, facilitating texture transfer between shapes of different genus.

* SIGGRAPH 2020 

  Access Paper or Ask Questions

Streaming Coresets for Symmetric Tensor Factorization

Jun 01, 2020
Rachit Chhaya, Jayesh Choudhari, Anirban Dasgupta, Supratim Shit

Factorizing tensors has recently become an important optimization module in a number of machine learning pipelines, especially in latent variable models. We show how to do this efficiently in the streaming setting. Given a set of $n$ vectors, each in $\mathbb{R}^d$, we present algorithms to select a sublinear number of these vectors as coreset, while guaranteeing that the CP decomposition of the $p$-moment tensor of the coreset approximates the corresponding decomposition of the $p$-moment tensor computed from the full data. We introduce two novel algorithmic techniques: online filtering and kernelization. Using these two, we present four algorithms that achieve different tradeoffs of coreset size, update time and working space, beating or matching various state of the art algorithms. In case of matrices (2-ordered tensor) our online row sampling algorithm guarantees $(1 \pm \epsilon)$ relative error spectral approximation. We show applications of our algorithms in learning single topic modeling.

* To appear at ICML 2020 

  Access Paper or Ask Questions