
Ruochen Xu


Rho-1: Not All Tokens Are What You Need

Apr 11, 2024
Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen

Via arXiv

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models

Mar 08, 2024
Jio Oh, Soyeon Kim, Junseok Seo, Jindong Wang, Ruochen Xu, Xing Xie, Steven Euijong Whang

Via arXiv

DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Feb 21, 2024
Kaijie Zhu, Jindong Wang, Qinlin Zhao, Ruochen Xu, Xing Xie

Via arXiv

SciAgent: Tool-augmented Language Models for Scientific Reasoning

Feb 21, 2024
Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen

Via arXiv

Supervised Knowledge Makes Large Language Models Better In-context Learners

Dec 26, 2023
Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang

Via arXiv

Language Models can be Logical Solvers

Nov 10, 2023
Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen

Via arXiv

In-Context Demonstration Selection with Cross Entropy Difference

May 24, 2023
Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu

Via arXiv

LMGQS: A Large-scale Dataset for Query-focused Summarization

May 22, 2023
Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng

Via arXiv

InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT

May 22, 2023
Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng

Via arXiv

G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

Apr 06, 2023
Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu

Via arXiv