Chengcheng Han

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Sep 30, 2025
Via arXiv

MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models

Sep 18, 2025
Via arXiv

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

Sep 17, 2025
Via arXiv

RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation

May 30, 2025
Via arXiv

Length Desensitization in Directed Preference Optimization

Sep 10, 2024
Via arXiv

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

Mar 21, 2024
Via arXiv

Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models

Feb 23, 2024
Via arXiv

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Feb 15, 2024
Via arXiv

DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models

Oct 23, 2023
Via arXiv

Exchanging-based Multimodal Fusion with Transformer

Sep 05, 2023
Via arXiv