"Topic": models, code, and papers

Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models

Jul 08, 2021
Alexander Hvatov, Mikhail Maslyaev, Iana S. Polonskaya, Mikhail Sarafanov, Mark Merezhnikov, Nikolay O. Nikitin

In modern data science, it is often not enough to obtain only a data-driven model with good prediction quality. It is often more interesting to understand the properties of the model and which parts could be replaced to obtain better results. Such questions fall under machine learning interpretability, which could be considered one of the area's rising topics. In the paper, we use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties. This means that while one of the apparent objectives is precision, the other could be chosen as the complexity of the model, robustness, or many others. The application of the method is shown on examples of multi-objective learning of composite models, differential equations, and closed-form algebraic expressions, which are unified to form an approach for model-agnostic learning of interpretable models.

* OL2A conference 


Point-of-Interest Recommender Systems: A Survey from an Experimental Perspective

Jun 18, 2021
Pablo Sánchez, Alejandro Bellogín

Point-of-Interest recommendation is a growing area of research and development within the widely adopted technologies known as Recommender Systems. Among them, those that exploit information coming from Location-Based Social Networks (LBSNs) are very popular nowadays and can work with different information sources, which poses several challenges and research questions to the community as a whole. We present a systematic review focused on the research done in the last 10 years on this topic. We discuss and categorize the algorithms and evaluation methodologies used in these works and point out the opportunities and challenges that remain open in the field. More specifically, we report the leading recommendation techniques and information sources that have been exploited most often (such as the geographical signal and deep learning approaches), while we also warn about the lack of reproducibility in the field, which may hinder real performance improvements.

* Submitted in Jul 2020 (revised in Jun 2021, still under review) to ACM Computing Surveys 


Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval

May 16, 2021
Kazuya Ueki

Visual-semantic embedding is an interesting research topic because it is useful for various tasks, such as visual question answering (VQA), image-text retrieval, image captioning, and scene graph generation. In this paper, we focus on zero-shot image retrieval using sentences as queries and present a survey of the technological trends in this area. First, we provide a comprehensive overview of the history of the technology, starting with a discussion of the early studies of image-to-text matching and how the technology has evolved over time. In addition, a description of the datasets commonly used in experiments and a comparison of the evaluation results of each method are presented. We also introduce the implementation available on GitHub, which can be used to confirm the accuracy of experiments and for further improvement. We hope that this survey paper will encourage researchers to further develop their research on bridging images and languages.



Exact Sparse Orthogonal Dictionary Learning

Mar 20, 2021
Kai Liu, Yongjian Zhao, Hua Wang

Over the past decade, learning a dictionary from input images for sparse modeling has been one of the topics receiving the most research attention in image processing and compressed sensing. Most existing dictionary learning methods, such as the K-SVD method, consider an over-complete dictionary, which may result in high mutual incoherence and therefore has a negative impact on recognition. On the other hand, the sparse codes are usually optimized by adding an $\ell_0$- or $\ell_1$-norm penalty, but with no strict sparsity guarantee. In this paper, we propose an orthogonal dictionary learning model which can obtain strictly sparse codes and an orthogonal dictionary with a global sequence convergence guarantee. We find that our method can yield better denoising results than learning methods based on over-complete dictionaries, and has the additional advantage of high computational efficiency.



Utilising Graph Machine Learning within Drug Discovery and Development

Dec 09, 2020
Thomas Gaudelet, Ben Day, Arian R. Jamasb, Jyothish Soman, Cristian Regep, Gertrude Liu, Jeremy B. R. Hayter, Richard Vickers, Charles Roberts, Jian Tang, David Roblin, Tom L. Blundell, Michael M. Bronstein, Jake P. Taylor-King

Graph Machine Learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures and the functional relationships between them, and to integrate multi-omic datasets, amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarise work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones, including repurposed drugs entering in vivo studies, suggest graph machine learning will become a modelling framework of choice within biomedical machine learning.

* 19 pages, 8 figures 


WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization

Oct 07, 2020
Faisal Ladhak, Esin Durmus, Claire Cardie, Kathleen McKeown

We introduce WikiLingua, a large-scale, multilingual dataset for the evaluation of crosslingual abstractive summarization systems. We extract article and summary pairs in 18 languages from WikiHow, a high quality, collaborative resource of how-to guides on a diverse set of topics written by human authors. We create gold-standard article-summary alignments across languages by aligning the images that are used to describe each how-to step in an article. As a set of baselines for further studies, we evaluate the performance of existing cross-lingual abstractive summarization methods on our dataset. We further propose a method for direct crosslingual summarization (i.e., without requiring translation at inference time) by leveraging synthetic data and Neural Machine Translation as a pre-training step. Our method significantly outperforms the baseline approaches, while being more cost efficient during inference.

* Findings of EMNLP 2020 


CoVoST 2 and Massively Multilingual Speech-to-Text Translation

Aug 20, 2020
Changhan Wang, Anne Wu, Juan Pino

Speech translation has recently become an increasingly popular topic of research, partly due to the development of benchmark datasets. Nevertheless, current datasets cover a limited number of languages. With the aim of fostering research in massively multilingual speech translation and speech translation for low-resource language pairs, we release CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. This represents the largest open dataset available to date in terms of total volume and language coverage. Data sanity checks provide evidence about the quality of the data, which is released under a CC0 license. We also provide extensive speech recognition, bilingual and multilingual machine translation, and speech translation baselines.



CoVoST 2: A Massively Multilingual Speech-to-Text Translation Corpus

Jul 20, 2020
Changhan Wang, Anne Wu, Juan Pino

Speech translation has recently become an increasingly popular topic of research, partly due to the development of benchmark datasets. Nevertheless, current datasets cover a limited number of languages. With the aim of fostering research in massively multilingual speech translation and speech translation for low-resource language pairs, we release CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. This represents the largest open dataset available to date in terms of total volume and language coverage. Data sanity checks provide evidence about the quality of the data, which is released under a CC0 license. We also provide extensive speech recognition, bilingual and multilingual machine translation, and speech translation baselines.



Stance Detection in Web and Social Media: A Comparative Study

Jul 12, 2020
Shalmoli Ghosh, Prajwal Singhania, Siddharth Singh, Koustav Rudra, Saptarshi Ghosh

Online forums and social media platforms are increasingly being used to discuss topics of varying polarities where different people take different stances. Several methodologies for automatic stance detection from text have been proposed in the literature. To our knowledge, there has not been any systematic investigation of their reproducibility or their comparative performance. In this work, we explore the reproducibility of several existing stance detection models, including both neural models and classical classifier-based models. Through experiments on two datasets, (i) the popular SemEval microblog dataset and (ii) a set of health-related online news articles, we also perform a detailed comparative analysis of various methods and explore their shortcomings. Implementations of all algorithms discussed in this paper are available at https://github.com/prajwal1210/Stance-Detection-in-Web-and-Social-Media.

* Proceedings of Conference and Labs of the Evaluation Forum (CLEF) 2019; Lecture Notes in Computer Science, vol 11696, pp. 75-87 

