Jinhao Duan

LRR-Bench: Left, Right or Rotate? Vision-Language models Still Struggle With Spatial Understanding Tasks

Jul 27, 2025

DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation

Mar 13, 2025

TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Mar 13, 2025

GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing

Feb 10, 2025

Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak

Jan 23, 2025

ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees

Jun 29, 2024

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

Mar 18, 2024

Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Feb 22, 2024

GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations

Feb 19, 2024

A Survey on Large Language Model Security and Privacy: The Good, the Bad, and the Ugly

Dec 04, 2023