
Ashwin Kalyan

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

Apr 12, 2024
Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva

GEO: Generative Engine Optimization

Nov 16, 2023
Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik R Narasimhan, Ameet Deshpande

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Nov 08, 2023
Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, Tushar Khot

QualEval: Qualitative Evaluation for Model Improvement

Nov 06, 2023
Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

Estimating Numbers without Regression

Oct 09, 2023
Avijit Thawani, Jay Pujara, Ashwin Kalyan

Distraction-free Embeddings for Robust VQA

Aug 31, 2023
Atharvan Dogra, Deeksha Varshney, Ashwin Kalyan, Ameet Deshpande, Neeraj Kumar

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations

Aug 07, 2023
Nirbhay Modhe, Qiaozi Gao, Ashwin Kalyan, Dhruv Batra, Govind Thattai, Gaurav Sukhatme

CSTS: Conditional Semantic Textual Similarity

May 24, 2023
Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan

Anthropomorphization of AI: Opportunities and Risks

May 24, 2023
Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, Ashwin Kalyan

RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

May 15, 2023
Afra Feyza Akyürek, Ekin Akyürek, Aman Madaan, Ashwin Kalyan, Peter Clark, Derry Wijaya, Niket Tandon
