Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yi Su

for the Alzheimer's Disease Neuroimaging Initiative

OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

Jun 25, 2024

Jikai Wang, Yi Su, Juntao Li, Qinrong Xia, Zi Ye, Xinyu Duan, Zhefeng Wang, Min Zhang

Figure 1 for OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

Figure 2 for OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

Figure 3 for OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

Figure 4 for OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

Abstract:Autoregressive language models demonstrate excellent performance in various scenarios. However, the inference efficiency is limited by its one-step-one-word generation mode, which has become a pressing problem recently as the models become increasingly larger. Speculative decoding employs a "draft and then verify" mechanism to allow multiple tokens to be generated in one step, realizing lossless acceleration. Existing methods mainly adopt fixed heuristic draft structures, which fail to adapt to different situations to maximize the acceptance length during verification. To alleviate this dilemma, we proposed OPT-Tree, an algorithm to construct adaptive and scalable draft trees. It searches the optimal tree structure that maximizes the mathematical expectation of the acceptance length in each decoding step. Experimental results reveal that OPT-Tree outperforms the existing draft structures and achieves a speed-up ratio of up to 3.2 compared with autoregressive decoding. If the draft model is powerful enough and the node budget is sufficient, it can generate more than ten tokens in a single step. Our code is available at https://github.com/Jikai0Wang/OPT-Tree.

Via

Access Paper or Ask Questions

Demonstration Augmentation for Zero-shot In-context Learning

Jun 03, 2024

Yi Su, Yunpeng Tai, Yixin Ji, Juntao Li, Bowen Yan, Min Zhang

Figure 1 for Demonstration Augmentation for Zero-shot In-context Learning

Figure 2 for Demonstration Augmentation for Zero-shot In-context Learning

Figure 3 for Demonstration Augmentation for Zero-shot In-context Learning

Figure 4 for Demonstration Augmentation for Zero-shot In-context Learning

Abstract:Large Language Models (LLMs) have demonstrated an impressive capability known as In-context Learning (ICL), which enables them to acquire knowledge from textual demonstrations without the need for parameter updates. However, many studies have highlighted that the model's performance is sensitive to the choice of demonstrations, presenting a significant challenge for practical applications where we lack prior knowledge of user queries. Consequently, we need to construct an extensive demonstration pool and incorporate external databases to assist the model, leading to considerable time and financial costs. In light of this, some recent research has shifted focus towards zero-shot ICL, aiming to reduce the model's reliance on external information by leveraging their inherent generative capabilities. Despite the effectiveness of these approaches, the content generated by the model may be unreliable, and the generation process is time-consuming. To address these issues, we propose Demonstration Augmentation for In-context Learning (DAIL), which employs the model's previously predicted historical samples as demonstrations for subsequent ones. DAIL brings no additional inference cost and does not rely on the model's generative capabilities. Our experiments reveal that DAIL can significantly improve the model's performance over direct zero-shot inference and can even outperform few-shot ICL without any external information.

* Accepted to ACL 2024 Findings

Via

Access Paper or Ask Questions

OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

May 09, 2024

Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji(+11 more)

Figure 1 for OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Figure 2 for OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Figure 3 for OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Figure 4 for OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Abstract:Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities.However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived from multi-stage compression and continual pre-training from the original 15B OpenBA model. OpenBA-V2 utilizes more data, more flexible training objectives, and techniques such as layer pruning, neural pruning, and vocabulary pruning to achieve a compression rate of 77.3\% with minimal performance loss. OpenBA-V2 demonstrates competitive performance compared to other open-source models of similar size, achieving results close to or on par with the 15B OpenBA model in downstream tasks such as common sense reasoning and Named Entity Recognition (NER). OpenBA-V2 illustrates that LLMs can be compressed into smaller ones with minimal performance loss by employing advanced training objectives and data strategies, which may help deploy LLMs in resource-limited scenarios.

Via

Access Paper or Ask Questions

VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs

Apr 09, 2024

Yi Gui, Zhen Li, Yao Wan, Yemin Shi, Hongyu Zhang, Yi Su, Shaoling Dong, Xing Zhou, Wenbin Jiang

Abstract:Automatically generating UI code from webpage design visions can significantly alleviate the burden of developers, enabling beginner developers or designers to directly generate Web pages from design diagrams. Currently, prior research has accomplished the objective of generating UI code from rudimentary design visions or sketches through designing deep neural networks. Inspired by the groundbreaking advancements achieved by Multimodal Large Language Models (MLLMs), the automatic generation of UI code from high-fidelity design images is now emerging as a viable possibility. Nevertheless, our investigation reveals that existing MLLMs are hampered by the scarcity of authentic, high-quality, and large-scale datasets, leading to unsatisfactory performance in automated UI code generation. To mitigate this gap, we present a novel dataset, termed VISION2UI, extracted from real-world scenarios, augmented with comprehensive layout information, tailored specifically for finetuning MLLMs in UI code generation. Specifically, this dataset is derived through a series of operations, encompassing collecting, cleaning, and filtering of the open-source Common Crawl dataset. In order to uphold its quality, a neural scorer trained on labeled samples is utilized to refine the data, retaining higher-quality instances. Ultimately, this process yields a dataset comprising 2,000 (Much more is coming soon) parallel samples encompassing design visions and UI code. The dataset is available at https://huggingface.co/datasets/xcodemind/vision2ui.

Via

Access Paper or Ask Questions

Online Feature Updates Improve Online (Generalized) Label Shift Adaptation

Feb 05, 2024

Ruihan Wu, Siddhartha Datta, Yi Su, Dheeraj Baby, Yu-Xiang Wang, Kilian Q. Weinberger

Abstract:This paper addresses the prevalent issue of label shift in an online setting with missing labels, where data distributions change over time and obtaining timely labels is challenging. While existing methods primarily focus on adjusting or updating the final layer of a pre-trained classifier, we explore the untapped potential of enhancing feature representations using unlabeled data at test-time. Our novel method, Online Label Shift adaptation with Online Feature Updates (OLS-OFU), leverages self-supervised learning to refine the feature extraction process, thereby improving the prediction model. Theoretical analyses confirm that OLS-OFU reduces algorithmic regret by capitalizing on self-supervised learning for feature refinement. Empirical studies on various datasets, under both online label shift and generalized label shift conditions, underscore the effectiveness and robustness of OLS-OFU, especially in cases of domain shifts.

Via

Access Paper or Ask Questions

KGLens: A Parameterized Knowledge Graph Solution to Assess What an LLM Does and Doesn't Know

Dec 15, 2023

Shangshang Zheng, He Bai, Yizhe Zhang, Yi Su, Xiaochuan Niu, Navdeep Jaitly

Abstract:Current approaches to evaluating large language models (LLMs) with pre-existing Knowledge Graphs (KG) mostly ignore the structure of the KG and make arbitrary choices of which part of the graph to evaluate. In this paper, we introduce KGLens, a method to evaluate LLMs by generating natural language questions from a KG in a structure aware manner so that we can characterize its performance on a more aggregated level. KGLens uses a parameterized KG, where each edge is augmented with a beta distribution that guides how to sample edges from the KG for QA testing. As the evaluation proceeds, different edges of the parameterized KG are sampled and assessed appropriately, converging to a more global picture of the performance of the LLMs on the KG as a whole. In our experiments, we construct three domain-specific KGs for knowledge assessment, comprising over 19,000 edges, 700 relations, and 21,000 entities. The results demonstrate that KGLens can not only assess overall performance but also provide topic, temporal, and relation analyses of LLMs. This showcases the adaptability and customizability of KGLens, emphasizing its ability to focus the evaluation based on specific criteria.

* In progress

Via

Access Paper or Ask Questions

A Novel Hybrid Ordinal Learning Model with Health Care Application

Dec 15, 2023

Lujia Wang, Hairong Wang, Yi Su, Fleming Lure, Jing Li

Figure 1 for A Novel Hybrid Ordinal Learning Model with Health Care Application

Figure 2 for A Novel Hybrid Ordinal Learning Model with Health Care Application

Figure 3 for A Novel Hybrid Ordinal Learning Model with Health Care Application

Figure 4 for A Novel Hybrid Ordinal Learning Model with Health Care Application

Abstract:Ordinal learning (OL) is a type of machine learning models with broad utility in health care applications such as diagnosis of different grades of a disease (e.g., mild, modest, severe) and prediction of the speed of disease progression (e.g., very fast, fast, moderate, slow). This paper aims to tackle a situation when precisely labeled samples are limited in the training set due to cost or availability constraints, whereas there could be an abundance of samples with imprecise labels. We focus on imprecise labels that are intervals, i.e., one can know that a sample belongs to an interval of labels but cannot know which unique label it has. This situation is quite common in health care datasets due to limitations of the diagnostic instrument, sparse clinical visits, or/and patient dropout. Limited research has been done to develop OL models with imprecise/interval labels. We propose a new Hybrid Ordinal Learner (HOL) to integrate samples with both precise and interval labels to train a robust OL model. We also develop a tractable and efficient optimization algorithm to solve the HOL formulation. We compare HOL with several recently developed OL methods on four benchmarking datasets, which demonstrate the superior performance of HOL. Finally, we apply HOL to a real-world dataset for predicting the speed of progressing to Alzheimer's Disease (AD) for individuals with Mild Cognitive Impairment (MCI) based on a combination of multi-modality neuroimaging and demographic/clinical datasets. HOL achieves high accuracy in the prediction and outperforms existing methods. The capability of accurately predicting the speed of progression to AD for each individual with MCI has the potential for helping facilitate more individually-optimized interventional strategies.

* 16 pages, 3 figures, 2 tables

Via

Access Paper or Ask Questions

Leveraging Large Language Models for Exploiting ASR Uncertainty

Sep 12, 2023

Pranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik

Figure 1 for Leveraging Large Language Models for Exploiting ASR Uncertainty

Figure 2 for Leveraging Large Language Models for Exploiting ASR Uncertainty

Figure 3 for Leveraging Large Language Models for Exploiting ASR Uncertainty

Figure 4 for Leveraging Large Language Models for Exploiting ASR Uncertainty

Abstract:While large language models excel in a variety of natural language processing (NLP) tasks, to perform well on spoken language understanding (SLU) tasks, they must either rely on off-the-shelf automatic speech recognition (ASR) systems for transcription, or be equipped with an in-built speech modality. This work focuses on the former scenario, where LLM's accuracy on SLU tasks is constrained by the accuracy of a fixed ASR system on the spoken input. Specifically, we tackle speech-intent classification task, where a high word-error-rate can limit the LLM's ability to understand the spoken intent. Instead of chasing a high accuracy by designing complex or specialized architectures regardless of deployment costs, we seek to answer how far we can go without substantially changing the underlying ASR and LLM, which can potentially be shared by multiple unrelated tasks. To this end, we propose prompting the LLM with an n-best list of ASR hypotheses instead of only the error-prone 1-best hypothesis. We explore prompt-engineering to explain the concept of n-best lists to the LLM; followed by the finetuning of Low-Rank Adapters on the downstream tasks. Our approach using n-best lists proves to be effective on a device-directed speech detection task as well as on a keyword spotting task, where systems using n-best list prompts outperform those using 1-best ASR hypothesis; thus paving the way for an efficient method to exploit ASR uncertainty via LLMs for speech-based applications.

* Added references

Via

Access Paper or Ask Questions

Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts

Jun 24, 2023

Wang-Tao Zhou, Zhao Kang, Ling Tian, Yi Su

Figure 1 for Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts

Figure 2 for Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts

Figure 3 for Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts

Figure 4 for Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts

Abstract:Event prediction in the continuous-time domain is a crucial but rather difficult task. Temporal point process (TPP) learning models have shown great advantages in this area. Existing models mainly focus on encoding global contexts of events using techniques like recurrent neural networks (RNNs) or self-attention mechanisms. However, local event contexts also play an important role in the occurrences of events, which has been largely ignored. Popular convolutional neural networks, which are designated for local context capturing, have never been applied to TPP modelling due to their incapability of modelling in continuous time. In this work, we propose a novel TPP modelling approach that combines local and global contexts by integrating a continuous-time convolutional event encoder with an RNN. The presented framework is flexible and scalable to handle large datasets with long sequences and complex latent patterns. The experimental result shows that the proposed model improves the performance of probabilistic sequential modelling and the accuracy of event prediction. To our best knowledge, this is the first work that applies convolutional neural networks to TPP modelling.

* Accepted to Information Sciences

Via

Access Paper or Ask Questions

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Jun 13, 2023

Zeyu Zhang, Yi Su, Hui Yuan, Yiran Wu, Rishab Balasubramanian, Qingyun Wu, Huazheng Wang, Mengdi Wang

Figure 1 for Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Figure 2 for Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Figure 3 for Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Figure 4 for Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Abstract:Off-policy Learning to Rank (LTR) aims to optimize a ranker from data collected by a deployed logging policy. However, existing off-policy learning to rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence need to tailor their methods specifically under different click models. In this paper, we unified the ranking process under general stochastic click models as a Markov Decision Process (MDP), and the optimal ranking could be learned with offline reinforcement learning (RL) directly. Building upon this, we leverage offline RL techniques for off-policy LTR and propose the Click Model-Agnostic Unified Off-policy Learning to Rank (CUOLR) method, which could be easily applied to a wide range of click models. Through a dedicated formulation of the MDP, we show that offline RL algorithms can adapt to various click models without complex debiasing techniques and prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms while maintaining consistency and robustness under different click models.

Via

Access Paper or Ask Questions