Cathy Jiao

Efficient Dataset Selection for Continual Adaptation of Generative Recommenders

Apr 09, 2026

Cathy Jiao, Juan Elenter, Praveen Ravichandran, Bernd Huber, Joseph Cauteruccio, Todd Wasson, Timothy Heath, Chenyan Xiong, Mounia Lalmas, Paul Bennett

Abstract: Recommendation systems must continuously adapt to evolving user behavior, yet the volume of data generated in large-scale streaming environments makes frequent full retraining impractical. This work investigates how targeted data selection can mitigate performance degradation caused by temporal distributional drift while maintaining scalability. We evaluate a range of representation choices and sampling strategies for curating small but informative subsets of user interaction data. Our results demonstrate that gradient-based representations, coupled with distribution-matching, improve downstream model performance, achieving training efficiency gains while preserving robustness to drift. These findings highlight data curation as a practical mechanism for scalable monitoring and adaptive model updates in production-scale recommendation systems.

* ICLR 2026 CAO Workshop (Oral)

In-Context Probing Approximates Influence Function for Data Valuation

Jul 17, 2024

Cathy Jiao, Gary Gao, Chenyan Xiong

Abstract: Data valuation quantifies the value of training data, and is used for data attribution (i.e., determining the contribution of training data towards model predictions) and data selection, both of which are important for curating high-quality datasets to train large language models. In our paper, we show that data valuation through in-context probing (i.e., prompting an LLM) approximates influence functions for selecting training data. We provide a theoretical sketch on this connection based on transformer models performing "implicit" gradient descent on their in-context inputs. Our empirical findings show that in-context probing and gradient-based influence frameworks are similar in how they rank training data. Furthermore, fine-tuning experiments on data selected by either method reveal similar model performance.

ET tu, CLIP? Addressing Common Object Errors for Unseen Environments

Jun 25, 2024

Ye Won Byun, Cathy Jiao, Shahriar Noroozizadeh, Jimin Sun, Rosa Vitiello

Abstract: We introduce a simple method that employs pre-trained CLIP encoders to enhance model generalization in the ALFRED task. In contrast to previous literature where CLIP replaces the visual encoder, we suggest using CLIP as an additional module through an auxiliary object detection objective. We validate our method on the recently proposed Episodic Transformer architecture and demonstrate that incorporating CLIP improves task performance on the unseen validation set. Additionally, our analysis results support that CLIP especially helps with leveraging object descriptions, detecting small objects, and interpreting rare words.

* Conference on Computer Vision and Pattern Recognition (CVPR 2022) - Embodied AI Workshop

Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation

Jan 27, 2023

Jessica Huynh, Cathy Jiao, Prakhar Gupta, Shikib Mehri, Payal Bajaj, Vishrav Chaudhary, Maxine Eskenazi

Abstract: Language models have steadily increased in size over the past few years. They achieve a high level of performance on various natural language processing (NLP) tasks such as question answering and summarization. Large language models (LLMs) have been used for generation and can now output human-like text. Due to this, there are other downstream tasks in the realm of dialog that can now harness the LLMs' language understanding capabilities. Dialog evaluation is one task that this paper will explore. It concentrates on prompting with LLMs: BLOOM, OPT, GPT-3, Flan-T5, InstructDial and TNLGv2. The paper shows that the choice of datasets used for training a model contributes to how well it performs on a task as well as on how the prompt should be structured. Specifically, the more diverse and relevant the group of datasets that a model is trained on, the better dialog evaluation performs. This paper also investigates how the number of examples in the prompt and the type of example selection used affect the model's performance.

* Accepted for publication at IWSDS 2023

The DialPort tools

Aug 18, 2022

Jessica Huynh, Shikib Mehri, Cathy Jiao, Maxine Eskenazi

Abstract: The DialPort project (http://dialport.org/), funded by the National Science Foundation (NSF), covers a group of tools and services that aim at fulfilling the needs of the dialog research community. Over the course of six years, several offerings have been created, including the DialPort Portal and DialCrowd. This paper describes these contributions, which will be demoed at SIGDIAL, including implementation, prior studies, corresponding discoveries, and the locations at which the tools will remain freely available to the community going forward.

* Accepted to SIGDIAL 2022

Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning

May 25, 2022

Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, Jeffrey P. Bigham

Abstract: Instruction tuning is an emergent paradigm in NLP wherein natural language instructions are leveraged with language models to induce zero-shot performance on unseen tasks. Instructions have been shown to enable good performance on unseen tasks and datasets in both large and small language models. Dialogue is an especially interesting area to explore instruction tuning because dialogue systems perform multiple kinds of tasks related to language (e.g., natural language understanding and generation, domain-specific interaction), yet instruction tuning has not been systematically explored for dialogue-related tasks. We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. Next, we explore cross-task generalization ability on models tuned on InstructDial across diverse dialogue tasks. Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting. To ensure that models adhere to instructions, we introduce novel meta-tasks. We establish benchmark zero-shot and few-shot performance of models trained using the proposed framework on multiple dialogue tasks.
