Arman Cohan

Evaluating LLMs at Detecting Errors in LLM Responses

Apr 04, 2024
Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang

MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

Apr 03, 2024
Chunyuan Deng, Xiangru Tang, Yilun Zhao, Hanming Wang, Haoran Wang, Wangchunshu Zhou, Arman Cohan, Mark Gerstein

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

Mar 22, 2024
Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization

Mar 09, 2024
Lorenzo Jaime Yu Flores, Arman Cohan

Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models

Mar 06, 2024
Martin Riddell, Ansong Ni, Arman Cohan

Calibrating Long-form Generations from Large Language Models

Feb 09, 2024
Yukun Huang, Yixin Liu, Raghuveer Thirukovalluru, Arman Cohan, Bhuwan Dhingra

OLMo: Accelerating the Science of Language Models

Feb 07, 2024
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science

Feb 07, 2024
Xiangru Tang, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang, Wangchunshu Zhou, Meng Qu, Yilun Zhao, Jian Tang, Zhuosheng Zhang, Arman Cohan, Zhiyong Lu, Mark Gerstein

Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers

Dec 26, 2023
Jacob Dunefsky, Arman Cohan
