Chaozheng Wang

WARBENCH: A Comprehensive Benchmark for Evaluating LLMs in Military Decision-Making

Mar 22, 2026
Via arXiv

Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue

Feb 26, 2026
Via arXiv

SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue

Feb 03, 2026
Via arXiv

REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration

Oct 02, 2025
Via arXiv

UMoE: Unifying Attention and FFN with Shared Experts

May 12, 2025
Via arXiv

CODECRASH: Stress Testing LLM Reasoning under Structural and Semantic Perturbations

Apr 19, 2025
Via arXiv

IDInit: A Universal and Stable Initialization Method for Neural Network Training

Mar 06, 2025
Via arXiv

How Should I Build A Benchmark?

Jan 18, 2025
Via arXiv

The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation

Jan 02, 2025
Via arXiv

Learning to Ask: When LLMs Meet Unclear Instruction

Aug 31, 2024
Via arXiv