Taiwei Shi

Self-Evolving LLM Memory Extraction Across Heterogeneous Tasks

Apr 13, 2026
Via arXiv

The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents

Apr 12, 2026
Via arXiv

Video-Based Reward Modeling for Computer-Use Agents

Mar 10, 2026
Via arXiv

Experiential Reinforcement Learning

Feb 15, 2026
Via arXiv

One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence

Feb 03, 2026
Via arXiv

CoAct-1: Computer-using Agents with Coding as Actions

Aug 05, 2025
Via arXiv

STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models

May 27, 2025
Via arXiv

The Hallucination Tax of Reinforcement Finetuning

May 20, 2025
Via arXiv

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Apr 07, 2025
Via arXiv

Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

Mar 30, 2025
Via arXiv