Prithviraj Ammanabrolu

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

Oct 17, 2023
Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu

Fine-Grained Human Feedback Gives Better Rewards for Language Model Training

Jun 02, 2023
Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

May 27, 2023
Bill Yuchen Lin, Yicheng Fu, Karina Yang, Prithviraj Ammanabrolu, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Yejin Choi, Xiang Ren

Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

May 24, 2023
Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi

Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling

Jan 28, 2023
Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, Roy Fox

An AI Dungeon Master's Guide: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons

Dec 20, 2022
Pei Zhou, Andrew Zhu, Jennifer Hu, Jay Pujara, Xiang Ren, Chris Callison-Burch, Yejin Choi, Prithviraj Ammanabrolu

Behavior Cloned Transformers are Neurosymbolic Reasoners

Oct 13, 2022
Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu

Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

Oct 03, 2022
Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi

INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions

Jul 02, 2022
Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi

Quark: Controllable Text Generation with Reinforced Unlearning

May 26, 2022
Ximing Lu, Sean Welleck, Liwei Jiang, Jack Hessel, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi
