Wenyue Hua

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Mar 23, 2026

Individual Turing Test: A Case Study of LLM-based Simulation Using Longitudinal Personal Data

Mar 01, 2026

Epistemic Context Learning: Building Trust the Right Way in LLM-Based Multi-Agent Systems

Jan 29, 2026

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Jul 28, 2025

MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation

Jun 25, 2025

Semantic Scheduling for LLM Inference

Jun 13, 2025

THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models

Apr 17, 2025

REALM: A Dataset of Real-World LLM Use Cases

Mar 24, 2025

AgentOrca: A Dual-System Framework to Evaluate Language Agents on Operational Routine and Constraint Adherence

Mar 11, 2025

InductionBench: LLMs Fail in the Simplest Complexity Class

Feb 26, 2025