Wenlin Yao

When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives

Jun 17, 2024
Via arXiv

MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions

May 29, 2024
Via arXiv

Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era

Mar 13, 2024
Via arXiv

Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models

Feb 27, 2024
Via arXiv

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

Jan 28, 2024
Via arXiv

InFoBench: Evaluating Instruction Following Ability in Large Language Models

Jan 07, 2024
Via arXiv

MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

Nov 15, 2023
Via arXiv

TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs

Nov 09, 2023
Via arXiv

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning

Sep 30, 2023
Via arXiv

Thrust: Adaptively Propels Large Language Models with External Knowledge

Jul 19, 2023
Via arXiv