Weiwen Liu

CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions

Oct 30, 2025
Via arXiv

ColorEcosystem: Powering Personalized, Standardized, and Trustworthy Agentic Service in massive-agent Ecosystem

Oct 27, 2025
Via arXiv

VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents

Sep 09, 2025
Via arXiv

Fast, Slow, and Tool-augmented Thinking for LLMs: A Review

Aug 17, 2025
Via arXiv

Quick on the Uptake: Eliciting Implicit Intents from Human Demonstrations for Personalized Mobile-Use Agents

Aug 12, 2025
Via arXiv

The Real Barrier to LLM Agent Usability is Agentic ROI

May 23, 2025
Via arXiv

Superplatforms Have to Attack AI Agents

May 23, 2025
Via arXiv

Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs' Reasoning

May 23, 2025
Via arXiv

InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation

May 21, 2025
Via arXiv

NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging

May 21, 2025
Via arXiv