Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Recommendation": models, code, and papers

Active Evaluation: Efficient NLG Evaluation with Few Pairwise Comparisons

Mar 11, 2022
Akash Kumar Mohankumar, Mitesh M. Khapra

Recent studies have shown the advantages of evaluating NLG systems using pairwise comparisons as opposed to direct assessment. Given $k$ systems, a naive approach for identifying the top-ranked system would be to uniformly obtain pairwise comparisons from all ${k \choose 2}$ pairs of systems. However, this can be very expensive as the number of human annotations required would grow quadratically with $k$. In this work, we introduce Active Evaluation, a framework to efficiently identify the top-ranked system by actively choosing system pairs for comparison using dueling bandit algorithms. We perform extensive experiments with 13 dueling bandits algorithms on 13 NLG evaluation datasets spanning 5 tasks and show that the number of human annotations can be reduced by 80%. To further reduce the number of human annotations, we propose model-based dueling bandit algorithms which combine automatic evaluation metrics with human evaluations. Specifically, we eliminate sub-optimal systems even before the human annotation process and perform human evaluations only on test examples where the automatic metric is highly uncertain. This reduces the number of human annotations required further by 89%. In effect, we show that identifying the top-ranked system requires only a few hundred human annotations, which grow linearly with $k$. Lastly, we provide practical recommendations and best practices to identify the top-ranked system efficiently. Our code has been made publicly available at

* Accepted at ACL 2022; 21 pages and 12 figures 

  Access Paper or Ask Questions

LaMDA: Language Models for Dialog Applications

Feb 10, 2022
Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, Yanqi Zhou, Chung-Ching Chang, Igor Krivokon, Will Rusch, Marc Pickett, Pranesh Srinivasan, Laichee Man, Kathleen Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Soraker, Ben Zevenbergen, Vinodkumar Prabhakaran, Mark Diaz, Ben Hutchinson, Kristen Olson, Alejandra Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravi Rajakumar, Alena Butryna, Matthew Lamm, Viktoriya Kuzmina, Joe Fenton, Aaron Cohen, Rachel Bernstein, Ray Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Croak, Ed Chi, Quoc Le

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.

  Access Paper or Ask Questions

Graph Few-shot Class-incremental Learning

Dec 23, 2021
Zhen Tan, Kaize Ding, Ruocheng Guo, Huan Liu

The ability to incrementally learn new classes is vital to all real-world artificial intelligence systems. A large portion of high-impact applications like social media, recommendation systems, E-commerce platforms, etc. can be represented by graph models. In this paper, we investigate the challenging yet practical problem, Graph Few-shot Class-incremental (Graph FCL) problem, where the graph model is tasked to classify both newly encountered classes and previously learned classes. Towards that purpose, we put forward a Graph Pseudo Incremental Learning paradigm by sampling tasks recurrently from the base classes, so as to produce an arbitrary number of training episodes for our model to practice the incremental learning skill. Furthermore, we design a Hierarchical-Attention-based Graph Meta-learning framework, HAG-Meta. We present a task-sensitive regularizer calculated from task-level attention and node class prototypes to mitigate overfitting onto either novel or base classes. To employ the topological knowledge, we add a node-level attention module to adjust the prototype representation. Our model not only achieves greater stability of old knowledge consolidation, but also acquires advantageous adaptability to new knowledge with very limited data samples. Extensive experiments on three real-world datasets, including Amazon-clothing, Reddit, and DBLP, show that our framework demonstrates remarkable advantages in comparison with the baseline and other related state-of-the-art methods.

* Accepted to WSDM 2022 

  Access Paper or Ask Questions

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights

Jul 02, 2020
Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li

Machine learning (ML) models are widely used in many domains including media processing and generation, computer vision, medical diagnosis, embedded systems, high-performance and scientific computing, and recommendation systems. For efficiently processing these computational- and memory-intensive applications, tensors of these over-parameterized models are compressed by leveraging sparsity, size reduction, and quantization of tensors. Unstructured sparsity and tensors with varying dimensions yield irregular-shaped computation, communication, and memory access patterns; processing them on hardware accelerators in a conventional manner does not inherently leverage acceleration opportunities. This paper provides a comprehensive survey on how to efficiently execute sparse and irregular tensor computations of ML models on hardware accelerators. In particular, it discusses additional enhancement modules in architecture design and software support; categorizes different hardware designs and acceleration techniques and analyzes them in terms of hardware and execution costs; highlights further opportunities in terms of hardware/software/algorithm co-design optimizations and joint optimizations among described hardware and software enhancement modules. The takeaways from this paper include: understanding the key challenges in accelerating sparse, irregular-shaped, and quantized tensors; understanding enhancements in acceleration systems for supporting their efficient computations; analyzing trade-offs in opting for a specific type of design enhancement; understanding how to map and compile models with sparse tensors on the accelerators; understanding recent design trends for efficient accelerations and further opportunities.

  Access Paper or Ask Questions

The Big Picture: Ethical Considerations and Statistical Analysis of Industry Involvement in Machine Learning Research

Jun 08, 2020
Thilo Hagendorff, Kristof Meding

It is commonly believed among the machine learning (ML) community that industry influence on the community itself as well as the scientific process is increasing since tech companies have begun to allocate a large amount of human and monetary resources to ML. However, concrete ethical implications and the quantitative scale of this influence are rather unknown. For this purpose we have not only carried out an informed ethical analysis of the field, but have inspected all papers of the main ML conferences NeurIPS, CVPR and ICML of the last 5 years - almost 11000 papers in total. Our statistical approach focuses on conflicts of interest, innovation and gender equality. We have obtained four main findings: (1) Academic-corporate collaborations are growing in numbers. At the same time, we found that conflicts of interest are rarely disclosed. (2) Industry publishes papers about trending ML topics on average two years earlier than academia. (3) Industry papers are not lagging behind academic papers concerning social impact considerations. (4) Finally, we demonstrate that industrial papers fall short of their academic counterparts with respect to the ratio of gender diversity. The results have been reviewed in the light of related research works from ethics and other disciplines. For the first time we have quantitatively analysed the influence of industry on the ML community. We believe that this is a good starting point for further fine-grained discussion. The main recommendation that follows from our research is for the community to openly declare conflicts of interest, also subtle or only potential ones, to foster trustworthiness and transparency.

  Access Paper or Ask Questions

ArduCode: Predictive Framework for Automation Engineering

Sep 11, 2019
Arquimedes Canedo, Palash Goyal, Di Huang, Amit Pandey

Automation engineering is the task of integrating, via software, various sensors, actuators, and controls for automating a real-world process. Today, automation engineering is supported by a suite of software tools including integrated development environments (IDE), hardware configurators, compilers, and runtimes. These tools focus on the automation code itself, but leave the automation engineer unassisted in their decision making. This can lead to increased time for software development because of imperfections in decision making leading to multiple iterations between software and hardware. To address this, this paper defines multiple challenges often faced in automation engineering and propose solutions using machine learning to assist engineers tackle such challenges. We show that machine learning can be leveraged to assist the automation engineer in classifying automation, finding similar code snippets, and reasoning about the hardware selection of sensors and actuators. We validate our architecture on two real datasets consisting of 2,927 Arduino projects, and 683 Programmable Logic Controller (PLC) projects. Our results show that paragraph embedding techniques can be utilized to classify automation using code snippets with precision close to human annotation, giving an F1-score of 72%. Further, we show that such embedding techniques can help us find similar code snippets with high accuracy. Finally, we use autoencoder models for hardware recommendation and achieve a [email protected] of 0.79 and [email protected] of 0.95.

* 6 pages, 5 figures, 4 tables 

  Access Paper or Ask Questions

Reducing malicious use of synthetic media research: Considerations and potential release practices for machine learning

Jul 29, 2019
Aviv Ovadya, Jess Whittlestone

The aim of this paper is to facilitate nuanced discussion around research norms and practices to mitigate the harmful impacts of advances in machine learning (ML). We focus particularly on the use of ML to create "synthetic media" (e.g. to generate or manipulate audio, video, images, and text), and the question of what publication and release processes around such research might look like, though many of the considerations discussed will apply to ML research more broadly. We are not arguing for any specific approach on when or how research should be distributed, but instead try to lay out some useful tools, analogies, and options for thinking about these issues. We begin with some background on the idea that ML research might be misused in harmful ways, and why advances in synthetic media, in particular, are raising concerns. We then outline in more detail some of the different paths to harm from ML research, before reviewing research risk mitigation strategies in other fields and identifying components that seem most worth emulating in the ML and synthetic media research communities. Next, we outline some important dimensions of disagreement on these issues which risk polarizing conversations. Finally, we conclude with recommendations, suggesting that the machine learning community might benefit from: working with subject matter experts to increase understanding of the risk landscape and possible mitigation strategies; building a community and norms around understanding the impacts of ML research, e.g. through regular workshops at major conferences; and establishing institutions and systems to support release practices that would otherwise be onerous and error-prone.

* 11 pages. Language fixes and tweaks for clarity 

  Access Paper or Ask Questions

Making BREAD: Biomimetic strategies for Artificial Intelligence Now and in the Future

Dec 04, 2018
Jeffrey L. Krichmar, William Severa, Salar M. Khan, James L. Olds

The Artificial Intelligence (AI) revolution foretold of during the 1960s is well underway in the second decade of the 21st century. Its period of phenomenal growth likely lies ahead. Still, we believe, there are crucial lessons that biology can offer that will enable a prosperous future for AI. For machines in general, and for AI's especially, operating over extended periods or in extreme environments will require energy usage orders of magnitudes more efficient than exists today. In many operational environments, energy sources will be constrained. Any plans for AI devices operating in a challenging environment must begin with the question of how they are powered, where fuel is located, how energy is stored and made available to the machine, and how long the machine can operate on specific energy units. Hence, the materials and technologies that provide the needed energy represent a critical challenge towards future use-scenarios of AI and should be integrated into their design. Here we make four recommendations for stakeholders and especially decision makers to facilitate a successful trajectory for this technology. First, that scientific societies and governments coordinate Biomimetic Research for Energy-efficient, AI Designs (BREAD); a multinational initiative and a funding strategy for investments in the future integrated design of energetics into AI. Second, that biomimetic energetic solutions be central to design consideration for future AI. Third, that a pre-competitive space be organized between stakeholder partners and fourth, that a trainee pipeline be established to ensure the human capital required for success in this area.

  Access Paper or Ask Questions

Learning Mixtures of Discrete Product Distributions using Spectral Decompositions

May 17, 2014
Prateek Jain, Sewoong Oh

We study the problem of learning a distribution from samples, when the underlying distribution is a mixture of product distributions over discrete domains. This problem is motivated by several practical applications such as crowd-sourcing, recommendation systems, and learning Boolean functions. The existing solutions either heavily rely on the fact that the number of components in the mixtures is finite or have sample/time complexity that is exponential in the number of components. In this paper, we introduce a polynomial time/sample complexity method for learning a mixture of $r$ discrete product distributions over $\{1, 2, \dots, \ell\}^n$, for general $\ell$ and $r$. We show that our approach is statistically consistent and further provide finite sample guarantees. We use techniques from the recent work on tensor decompositions for higher-order moment matching. A crucial step in these moment matching methods is to construct a certain matrix and a certain tensor with low-rank spectral decompositions. These tensors are typically estimated directly from the samples. The main challenge in learning mixtures of discrete product distributions is that these low-rank tensors cannot be obtained directly from the sample moments. Instead, we reduce the tensor estimation problem to: $a$) estimating a low-rank matrix using only off-diagonal block elements; and $b$) estimating a tensor using a small number of linear measurements. Leveraging on recent developments in matrix completion, we give an alternating minimization based method to estimate the low-rank matrix, and formulate the tensor completion problem as a least-squares problem.

* 30 pages no figures 

  Access Paper or Ask Questions