Di Wu

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

Nov 01, 2023
Po-Nien Kung, Fan Yin, Di Wu, Kai-Wei Chang, Nanyun Peng

Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions. However, how to select new tasks to improve the performance and generalizability of IT models remains an open question. Training on all existing tasks is impractical due to prohibitive computational requirements, and randomly selecting tasks can lead to suboptimal performance. In this work, we propose active instruction tuning based on prompt uncertainty, a novel framework to identify informative tasks, and then actively tune the models on the selected tasks. We represent the informativeness of new tasks with the disagreement of the current model outputs over perturbed prompts. Our experiments on NIV2 and Self-Instruct datasets demonstrate that our method consistently outperforms other baseline strategies for task selection, achieving better out-of-distribution generalization with fewer training tasks. Additionally, we introduce a task map that categorizes and diagnoses tasks based on prompt uncertainty and prediction probability. We discover that training on ambiguous (prompt-uncertain) tasks improves generalization while training on difficult (prompt-certain and low-probability) tasks offers no benefit, underscoring the importance of task selection for instruction tuning.

* EMNLP 2023 Main 
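
The abstract describes scoring a task by how much the current model's outputs disagree when the prompt is perturbed. The sketch below is one possible reading of that idea, not the authors' implementation: `score` is a hypothetical callable returning the model's probability for its original prediction, word dropping is an assumed perturbation, and the mean absolute difference is an assumed disagreement measure.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: score(prompt, task_input) -> probability in [0, 1]
ScoreFn = Callable[[str, str], float]


def perturb_prompt(prompt: str, drop_rate: float, rng: random.Random) -> str:
    """Drop each word with probability drop_rate (an assumed perturbation)."""
    words = prompt.split()
    kept = [w for w in words if rng.random() >= drop_rate]
    # Fall back to the original prompt if every word was dropped.
    return " ".join(kept) if kept else prompt


def prompt_uncertainty(
    score: ScoreFn,
    prompt: str,
    inputs: Sequence[str],
    n_perturb: int = 8,
    drop_rate: float = 0.2,
    seed: int = 0,
) -> float:
    """Mean absolute change in the score between original and perturbed prompts."""
    rng = random.Random(seed)
    total = 0.0
    for x in inputs:
        base = score(prompt, x)
        for _ in range(n_perturb):
            noisy = perturb_prompt(prompt, drop_rate, rng)
            total += abs(base - score(noisy, x))
    return total / (len(inputs) * n_perturb)


def select_most_uncertain(uncertainty_by_task: Dict[str, float], k: int) -> List[str]:
    """Return the k task names with the highest uncertainty, highest first."""
    return sorted(uncertainty_by_task, key=uncertainty_by_task.get, reverse=True)[:k]
```

In an active loop, each candidate task would be scored this way with the current model, and the top-k tasks would be added to the training pool before the next round of tuning.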

Online Thermal Field Prediction for Metal Additive Manufacturing of Thin Walls

Oct 24, 2023
Yifan Tang, M. Rahmani Dehaghani, Pouyan Sajadi, Shahriar Bakrani Balani, Akshay Dhalpe, Suraj Panicker, Di Wu, Eric Coatanea, G. Gary Wang

This paper aims to study a practical issue in metal AM, i.e., how to predict the thermal field of yet-to-print parts online when only a few sensors are available. This work proposes an online thermal field prediction method using mapping and reconstruction, which could be integrated into a metal AM process for online performance control. Based on the similarity of temperature curves (curve segments of a temperature profile of one point), the thermal field mapping applies an artificial neural network to estimate the temperature curves of points on the yet-to-print layer from measured temperatures of certain points on the previously printed layer. With measured/predicted temperature profiles of several points on the same layer, the thermal field reconstruction proposes a reduced order model (ROM) to construct the temperature profiles of all points on the same layer, which could be used to build the temperature field of the entire layer. The training of ROM is performed with an extreme learning machine (ELM) for computational efficiency. Fifteen wire arc AM experiments and nine simulations are designed for thin walls with a fixed length and unidirectional printing of each layer. The test results indicate that the proposed prediction method could construct the thermal field of a yet-to-print layer within 0.1 seconds on a low-cost desktop. Meanwhile, the method has acceptable generalization capability in most cases from lower layers to higher layers in the same simulation and from one simulation to a new simulation on different AM process parameters. More importantly, after fine-tuning the proposed method with limited experimental data, the relative errors of all predicted temperature profiles on a new experiment are sufficiently small, demonstrating the applicability and generalization of the proposed thermal field prediction method in online applications for metal AM.

* 36 pages, 26 figures, 5 tables 

Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

Oct 22, 2023
Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. The advent of sequence-to-sequence (seq2seq) pre-trained language models (PLMs) has ushered in a transformative era for KPG, yielding promising performance improvements. However, many design decisions remain unexplored and are often made arbitrarily. This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG. We begin by elucidating why seq2seq PLMs are apt for KPG, anchored by an attention-driven hypothesis. We then establish that conventional wisdom for selecting seq2seq PLMs lacks depth: (1) merely increasing model size or performing task-specific adaptation is not parameter-efficient; (2) although combining in-domain pre-training with task adaptation benefits KPG, it does partially hinder generalization. Regarding decoding, we demonstrate that while greedy search achieves strong F1 scores, it lags in recall compared with sampling-based methods. Based on these insights, we propose DeSel, a likelihood-based decode-select algorithm for seq2seq PLMs. DeSel improves greedy search by an average of 4.7% semantic F1 across five datasets. Our collective findings pave the way for deeper future investigations into PLM-based KPG.

* EMNLP 2023 camera ready 

A Better Match for Drivers and Riders: Reinforcement Learning at Lyft

Oct 20, 2023
Xabi Azagirre, Akshay Balwally, Guillaume Candeli, Nicholas Chamandy, Benjamin Han, Alona King, Hyungjun Lee, Martin Loncaric, Sébastien Martin, Vijay Narasiman, Zhiwei Qin, Baptiste Richard, Sara Smoot, Sean Taylor, Garrett van Ryzin, Di Wu, Fei Yu, Alex Zamoshchin

To better match drivers to riders in our ridesharing application, we revised Lyft's core matching algorithm. We use a novel online reinforcement learning approach that estimates the future earnings of drivers in real time and uses this information to find more efficient matches. This change was the first documented implementation of a ridesharing matching algorithm that can learn and improve in real time. We evaluated the new approach during weeks of switchback experimentation in most Lyft markets, and estimated how it benefited drivers, riders, and the platform. In particular, it enabled our drivers to serve millions of additional riders each year, leading to more than $30 million per year in incremental revenue. Lyft rolled out the algorithm globally in 2021.
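
The abstract does not spell out how future earnings are estimated or used. As a rough illustration only, the sketch below uses a generic tabular temporal-difference (TD(0)) update, which is a swapped-in textbook technique and not Lyft's algorithm. The class `EarningsEstimator`, the zone names, and the scoring rule in `match_score` are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict


class EarningsEstimator:
    """Tabular TD(0) estimate of a driver's future earnings per zone (illustrative)."""

    def __init__(self, step_size: float = 0.1, discount: float = 0.95) -> None:
        self.step_size = step_size
        self.discount = discount
        # Estimated future earnings for a driver idle in each zone; starts at 0.
        self.value: DefaultDict[str, float] = defaultdict(float)

    def update(self, origin: str, destination: str, fare: float) -> None:
        """Online update after observing one completed trip."""
        target = fare + self.discount * self.value[destination]
        self.value[origin] += self.step_size * (target - self.value[origin])

    def match_score(self, origin: str, destination: str, fare: float) -> float:
        """Assumed score: fare plus the change in the driver's estimated future value."""
        return fare + self.discount * self.value[destination] - self.value[origin]
```

A matcher could then prefer driver and rider pairs with higher scores, so that a trip ending in a zone with high estimated future earnings is favored over an otherwise identical trip ending in a low-value zone.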

UvA-MT's Participation in the WMT23 General Translation Shared Task

Oct 15, 2023
Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz

This paper describes UvA-MT's submission to the WMT 2023 shared task on general machine translation. We participate in the constrained track in two directions: English <-> Hebrew. In this competition, we show that by using one model to handle bidirectional tasks, as a minimal setting of Multilingual Machine Translation (MMT), it is possible to achieve results comparable to those of traditional bilingual translation for both directions. By including effective strategies, like back-translation, re-parameterized embedding table, and task-oriented fine-tuning, we obtained competitive final results in the automatic evaluation for both English -> Hebrew and Hebrew -> English directions.

* This paper has been accepted by the WMT2023 Conference 

Decoding Modular Reconfigurable Robots: A Survey on Mechanisms and Design

Oct 15, 2023
Guanqi Liang, Di Wu, Yuxiao Tu, Tin Lun Lam

The intrinsic modularity and reconfigurability of modular reconfigurable robots (MRR) confer advantages such as versatility, fault tolerance, and economic efficacy, thereby showcasing considerable potential across diverse applications. The continuous evolution of the technology landscape and the emergence of diverse conceptual designs have generated multiple MRR categories, each described by its respective morphology or capability characteristics, leading to some ambiguity in the taxonomy. This paper conducts a comprehensive survey encompassing the entirety of MRR hardware and design, spanning from the inception in 1985 to 2023. This paper introduces an innovative, unified conceptual framework for understanding MRR hardware, which encompasses three pivotal elements: connectors, actuators, and homogeneity. Through the utilization of this trilateral framework, this paper provides an intuitive understanding of the diverse spectrum of MRR hardware iterations while systematically deciphering and classifying the entire range, offering a more structured perspective. This survey elucidates the fundamental attributes characterizing MRRs and their compositional aspects, providing insights into their design, technology, functionality, and categorization. Augmented by the proposed trilateral framework, this paper also elaborates on the trajectory of evolution, prevailing trends, principal challenges, and potential prospects within the field of MRRs.

Realizing XR Applications Using 5G-Based 3D Holographic Communication and Mobile Edge Computing

Oct 05, 2023
Dun Yuan, Ekram Hossain, Di Wu, Xue Liu, Gregory Dudek

3D holographic communication has the potential to revolutionize the way people interact with each other in virtual spaces, offering immersive and realistic experiences. However, demands for high data rates, extremely low latency, and high computations to enable this technology pose a significant challenge. To address this challenge, we propose a novel job scheduling algorithm that leverages Mobile Edge Computing (MEC) servers in order to minimize the total latency in 3D holographic communication. One of the motivations for this work is to prevent the uncanny valley effect, which can occur when the latency hinders the seamless and real-time rendering of holographic content, leading to a less convincing and less engaging user experience. Our proposed algorithm dynamically allocates computation tasks to MEC servers, considering the network conditions, computational capabilities of the servers, and the requirements of the 3D holographic communication application. We conduct extensive experiments to evaluate the performance of our algorithm in terms of latency reduction, and the results demonstrate that our approach significantly outperforms other baseline methods. Furthermore, we present a practical scenario involving Augmented Reality (AR), which not only illustrates the applicability of our algorithm but also highlights the importance of minimizing latency in achieving high-quality holographic views. By efficiently distributing the computation workload among MEC servers and reducing the overall latency, our proposed algorithm enhances the user experience in 3D holographic communications and paves the way for the widespread adoption of this technology in various applications, such as telemedicine, remote collaboration, and entertainment.
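
The abstract names the inputs to the scheduler (network conditions, server capabilities, job requirements) but not the algorithm itself. The sketch below is a simple greedy baseline under assumed inputs, not the proposed method: each job goes to the server with the earliest estimated finish time, where finish time is upload time or queue wait, whichever is later, plus compute time. `Server`, `Job`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Server:
    name: str
    cycles_per_s: float       # computational capability
    uplink_bps: float         # current link rate to this server
    busy_until: float = 0.0   # time at which the server becomes free


@dataclass
class Job:
    name: str
    data_bits: float          # input size to upload
    cycles: float             # required computation


def finish_time(job: Job, server: Server) -> float:
    """Estimated completion time of job on server, in seconds."""
    transfer = job.data_bits / server.uplink_bps
    start = max(transfer, server.busy_until)  # wait for upload and for the queue
    return start + job.cycles / server.cycles_per_s


def schedule(jobs: List[Job], servers: List[Server]) -> Dict[str, str]:
    """Greedily assign jobs, largest first, to the earliest-finishing server."""
    assignment: Dict[str, str] = {}
    for job in sorted(jobs, key=lambda j: j.cycles, reverse=True):
        best = min(servers, key=lambda s: finish_time(job, s))
        best.busy_until = finish_time(job, best)
        assignment[job.name] = best.name
    return assignment
```

Sorting the largest jobs first is a common heuristic for reducing the overall completion time; a dynamic scheduler would re-run the assignment as link rates and queues change.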

A Read Margin Enhancement Circuit with Dynamic Bias Optimization for MRAM

Sep 18, 2023
Renhe Chen, Albert Lee, Zirui Wang, Di Wu, Xufeng Kou

This brief introduces a read bias circuit to improve readout yield of magnetic random access memories (MRAMs). A dynamic bias optimization (DBO) circuit is proposed to enable the real-time tracking of the optimal read voltage across process-voltage-temperature (PVT) variations within an MRAM array. It optimizes read performance by adjusting the read bias voltage dynamically for maximum sensing margin. Simulation results on a 28-nm 1Mb MRAM macro show that the tracking accuracy of the proposed DBO circuit remains above 90% even when the optimal sensing voltage varies up to 50%. Such a dynamic tracking strategy further results in up to two orders of magnitude reduction in the bit error rate with respect to different variations, highlighting its effectiveness in enhancing MRAM performance and reliability.

A Tutorial on Environment-Aware Communications via Channel Knowledge Map for 6G

Sep 14, 2023
Yong Zeng, Junting Chen, Jie Xu, Di Wu, Xiaoli Xu, Shi Jin, Xiqi Gao, David Gesbert, Shuguang Cui, Rui Zhang

Sixth-generation (6G) mobile communication networks are expected to have dense infrastructures, large-dimensional channels, cost-effective hardware, diversified positioning methods, and enhanced intelligence. Such trends bring both new challenges and opportunities for the practical design of 6G. On one hand, acquiring channel state information (CSI) in real time for all wireless links becomes quite challenging in 6G. On the other hand, there would be numerous data sources in 6G containing high-quality location-tagged channel data, making it possible to better learn the local wireless environment. By exploiting such new opportunities and for tackling the CSI acquisition challenge, there is a promising paradigm shift from the conventional environment-unaware communications to the new environment-aware communications based on the novel approach of channel knowledge map (CKM). This article aims to provide a comprehensive tutorial overview on environment-aware communications enabled by CKM to fully harness its benefits for 6G. First, the basic concept of CKM is presented, and a comparison of CKM with various existing channel inference techniques is discussed. Next, the main techniques for CKM construction are discussed, including both the model-free and model-assisted approaches. Furthermore, a general framework is presented for the utilization of CKM to achieve environment-aware communications, followed by some typical CKM-aided communication scenarios. Finally, important open problems in CKM research are highlighted and potential solutions are discussed to inspire future work.
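
A channel knowledge map can be thought of as a lookup from location to channel knowledge built from location-tagged measurements. The sketch below shows one simple model-free construction, assuming inverse-distance-weighted interpolation over the k nearest samples. This interpolation rule, the function name `estimate_channel_gain`, and the direct averaging of dB values are illustrative assumptions, not techniques taken from the tutorial.

```python
import math
from typing import List, Sequence, Tuple

# One measurement: ((x, y) location in meters, channel gain in dB)
Sample = Tuple[Tuple[float, float], float]


def estimate_channel_gain(
    query: Sequence[float],
    samples: List[Sample],
    k: int = 3,
    power: float = 2.0,
) -> float:
    """Estimate channel gain at query from the k nearest location-tagged samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for location, gain_db in nearest:
        d = math.dist(query, location)
        if d == 0.0:
            return gain_db  # exact hit: return the stored measurement
        w = 1.0 / d ** power  # closer samples count more
        weighted_sum += w * gain_db
        weight_total += w
    return weighted_sum / weight_total
```

A transmitter could query such a map with a user's reported position to get a channel estimate without sending pilots; a model-assisted construction would instead combine the samples with a propagation model.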