"Topic": models, code, and papers

Efficient Codebook and Factorization for Second Order Representation Learning

Jun 05, 2019
Pierre Jacob, David Picard, Aymeric Histace, Edouard Klein

Learning rich and compact representations is an open topic in many fields such as object recognition or image retrieval. Deep neural networks have made a major breakthrough during the last few years for these tasks, but their representations are not necessarily as rich as needed nor as compact as expected. To build richer representations, high-order statistics have been exploited and have shown excellent performance, but they produce higher-dimensional features. While this drawback has been partially addressed with factorization schemes, the original compactness of first-order models has never been recovered, or only at the cost of a strong performance decrease. Our method, by jointly integrating a codebook strategy with a factorization scheme, is able to produce compact representations while keeping second-order performance with few additional parameters. This formulation leads to state-of-the-art results on three image retrieval datasets.

* Accepted at IEEE International Conference on Image Processing (ICIP) 2019 

Identifying collaborators in large codebases

May 07, 2019
Waren Long, Vadim Markovtsev, Hugo Mougard, Egor Bulychev, Jan Hula

The way developers collaborate inside and particularly across teams often escapes management's attention, despite a formal organization with designated teams being defined. Observability of the actual, organically formed engineering structure gives decision makers invaluable additional tools to manage their talent pool. To identify existing inter- and intra-team interactions, and to suggest relevant opportunities for suitable collaborations, this paper studies contributors' commit activity, usage of programming languages, and code identifier topics by embedding and clustering them. We evaluate our findings in collaboration with the GitLab organization, analyzing 117 of their open source projects. We show that we are able to restore their engineering organization in broad strokes, and also reveal hidden coding collaborations as well as justify in-house technical decisions.

* 4 pages; Workshop on Machine Learning for Software Engineering 2019 

Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models --- Is Single-Corpus Evaluation Enough?

Apr 05, 2019
Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui

This study explores the necessity of performing cross-corpora evaluation for grammatical error correction (GEC) models. GEC models have been previously evaluated based on a single commonly applied corpus: the CoNLL-2014 benchmark. However, the evaluation remains incomplete because the task difficulty varies depending on the test corpus and conditions such as the proficiency levels of the writers and essay topics. To overcome this limitation, we evaluate the performance of several GEC models, including NMT-based (LSTM, CNN, and transformer) and an SMT-based model, against various learner corpora (CoNLL-2013, CoNLL-2014, FCE, JFLEG, ICNALE, and KJ). Evaluation results reveal that the models' rankings considerably vary depending on the corpus, indicating that single-corpus evaluation is insufficient for GEC models.

* accepted by NAACL-HLT 2019 

Keyphrase Generation: A Text Summarization Struggle

Apr 03, 2019
Erion Çano, Ondřej Bojar

Authors' keyphrases assigned to scientific articles are essential for recognizing content and topic aspects. Most of the proposed supervised and unsupervised methods for keyphrase generation are unable to produce terms that are valuable but do not appear in the text. In this paper, we explore the possibility of considering the keyphrase string as an abstractive summary of the title and the abstract. First, we collect, process and release a large dataset of scientific paper metadata that contains 2.2 million records. Then we experiment with popular text summarization neural architectures. Despite using advanced deep learning models, large quantities of data and many days of computation, our systematic evaluation on four test datasets reveals that the explored text summarization methods could not produce better keyphrases than the simpler unsupervised methods, or the existing supervised ones.

* 7 pages, 3 tables. Published in proceedings of 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Identical to the previous version 

Towards Partner-Aware Humanoid Robot Control Under Physical Interactions

Mar 20, 2019
Yeshasvi Tirupachuri, Gabriele Nava, Claudia Latella, Diego Ferigo, Lorenzo Rapetti, Luca Tagliapietra, Francesco Nori, Daniele Pucci

The topic of physical human-robot interaction has received a lot of attention from the robotics community because of its many promising application domains. However, studying physical interaction between a robot and an external agent, like a human or another robot, without considering the dynamics of both systems may lead to many shortcomings in fully exploiting the interaction. In this paper, we present a coupled-dynamics formalism followed by a sound approach to exploiting helpful interaction with a humanoid robot. In particular, we propose the first attempt to define and exploit human help for the robot to accomplish a specific task. As a result, we present task-based partner-aware robot control techniques. The theoretical results are validated by conducting experiments with two iCub humanoid robots involved in physical interaction.

* Accepted to Intelligent Systems 2019 Conference 

Machine Learning Based Prediction and Classification of Computational Jobs in Cloud Computing Centers

Mar 09, 2019
Zheqi Zhu, Pingyi Fan

With the rapid growth of data volume and the fast-increasing complexity of computational models in cloud computing, handling users' requests by scheduling computational jobs and assigning resources in the data center has become an important topic. To better understand the computing jobs and their resource requests, we analyze their characteristics and focus on predicting and classifying the jobs with machine learning approaches. Specifically, we apply an LSTM neural network to predict the arrival of jobs and the aggregated requests for computing resources. We then evaluate it on the Google Cluster dataset and show that accuracy improves over existing methods. Additionally, to better understand the computing jobs, we use an unsupervised hierarchical clustering algorithm, BIRCH, to classify them and gain some interpretability of our results in computing centers.

A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search

Dec 17, 2018
Yesmina Jaafra, Jean Luc Laurent, Aline Deruyver, Mohamed Saber Naceur

Deep neural networks are efficient and flexible models that perform well on a variety of tasks such as image and speech recognition and natural language understanding. In particular, convolutional neural networks (CNNs) generate keen interest among researchers in computer vision, and more specifically in classification tasks. The CNN architecture and its related hyperparameters are generally tied to the nature of the task being processed, as the network extracts complex and relevant characteristics that allow optimal convergence. Designing such architectures requires significant human expertise and substantial computation time, and does not always lead to the optimal network. The topic of model configuration has been extensively studied in machine learning without leading to a standard automatic method. This survey focuses on reviewing and discussing the current progress in automating CNN architecture search.

Feature Selection Approach with Missing Values Conducted for Statistical Learning: A Case Study of Entrepreneurship Survival Dataset

Oct 02, 2018
Diego Nascimento, Anderson Ara, Francisco Louzada Neto

In this article, we investigate the features that best discriminate survival among micro and small businesses (MSE) using a data mining approach with feature selection. Given the complexity of the data set, we compare three data imputation methods, namely mean imputation (MI), k-nearest neighbor (KNN) and expectation maximization (EM), each combined with variable selection by t-test. The data mining process then applies logistic regression, the naive Bayes algorithm, linear discriminant analysis and support vector machines as classifiers, and we compare their respective performances. The experimental results will be used to develop a model that predicts MSE survival, providing a better understanding of the topic, since MSEs account for a significant part of Brazil's GDP and macroeconomy.
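
The first two stages named in this abstract, mean imputation followed by t-test variable selection, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the use of Welch's unequal-variance form of the t statistic, and the toy columns are all hypothetical, and the KNN and EM imputers and the downstream classifiers are left out.

```python
import math
from statistics import mean, variance


def mean_impute(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples (assumed variant of the t-test)."""
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_features(features, labels):
    """Impute each feature, then rank features by |t| between the two classes."""
    scores = []
    for name, column in features.items():
        filled = mean_impute(column)
        positives = [v for v, y in zip(filled, labels) if y == 1]
        negatives = [v for v, y in zip(filled, labels) if y == 0]
        scores.append((name, abs(welch_t(positives, negatives))))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = business survived, 0 = it did not.
toy_features = {
    "revenue": [10.0, None, 12.0, 3.0, 2.0, None],
    "age": [5.0, 6.0, None, 5.0, 6.0, 5.5],
}
toy_labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features(toy_features, toy_labels)
```

Mean imputation shrinks a feature's variance, which can inflate the t statistic; that bias is one reason a comparison against KNN and EM imputation, as the abstract describes, is worthwhile.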

Cost-Sensitive Active Learning for Intracranial Hemorrhage Detection

Sep 08, 2018
Weicheng Kuo, Christian Häne, Esther Yuh, Pratik Mukherjee, Jitendra Malik

Deep learning for clinical applications is subject to stringent performance requirements, which raises a need for large labeled datasets. However, the enormous cost of labeling medical data makes this challenging. In this paper, we build a cost-sensitive active learning system for the problem of intracranial hemorrhage detection and segmentation on head computed tomography (CT). We show that our ensemble method compares favorably with the state-of-the-art, while running faster and using less memory. Moreover, our experiments are done using a substantially larger dataset than earlier papers on this topic. Since the labeling time could vary tremendously across examples, we model the labeling time and optimize the return on investment. We validate this idea by core-set selection on our large labeled dataset and by growing it with data from the wild.

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots

Sep 06, 2018
Shaojie Jiang, Maarten de Rijke

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
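
The two remedies named at the end of this abstract, label smoothing and confidence penalties, can be written down compactly for a single prediction step. The sketch below uses the standard textbook forms of both (uniform smoothing with weight `epsilon`, and an entropy bonus scaled by `beta`); the function names and default values are assumptions for illustration and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) on the target plus epsilon / K on every class."""
    share = epsilon / num_classes
    return [share + (1.0 - epsilon if i == target_index else 0.0)
            for i in range(num_classes)]


def cross_entropy(target_dist, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs) if t > 0)


def entropy(probs):
    """Shannon entropy of the predicted distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def confidence_penalized_loss(target_index, logits, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy.

    Low-entropy (over-confident) outputs earn a smaller bonus, so they are
    penalized relative to flatter predictions.
    """
    probs = softmax(logits)
    nll = -math.log(probs[target_index])
    return nll - beta * entropy(probs)


# With smoothed targets, an extremely peaked prediction costs more than a moderate one.
smoothed = smooth_labels(target_index=0, num_classes=4, epsilon=0.1)
peaked_loss = cross_entropy(smoothed, softmax([20.0, 0.0, 0.0, 0.0]))
moderate_loss = cross_entropy(smoothed, softmax([3.0, 0.0, 0.0, 0.0]))
```

Both terms push in the same direction: they stop the loss from rewarding ever-sharper output distributions, which is the over-confidence the abstract identifies as a source of low diversity.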
