Yue Zhang

Renmin University of China

Tables as Images? Exploring the Strengths and Limitations of LLMs on Multimodal Representations of Tabular Data

Feb 23, 2024
Via arXiv

RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models

Feb 22, 2024
Via arXiv

Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Feb 22, 2024
Via arXiv

Potential and Challenges of Model Editing for Social Debiasing

Feb 21, 2024
Via arXiv

SQL-CRAFT: Text-to-SQL through Interactive Refinement and Enhanced Reasoning

Feb 20, 2024
Via arXiv

MRKE: The Multi-hop Reasoning Evaluation of LLMs by Knowledge Edition

Feb 19, 2024
Via arXiv

Fine-grained and Explainable Factuality Evaluation for Multimodal Summarization

Feb 18, 2024
Via arXiv

Detecting Multimedia Generated by Large AI Models: A Survey

Feb 07, 2024
Via arXiv

NavHint: Vision and Language Navigation Agent with a Hint Generator

Feb 04, 2024
Via arXiv

Common Sense Reasoning for Deep Fake Detection

Jan 31, 2024
Via arXiv