Shangding Gu

Agents' Last Exam

Jun 03, 2026

From Model Scaling to System Scaling: Scaling the Harness in Agentic AI

May 25, 2026

LLMs Should Express Uncertainty Explicitly

Apr 07, 2026

Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization

Feb 16, 2026

AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions

Feb 05, 2026

Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity

Feb 03, 2026

AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond

Sep 30, 2025

RLBenchNet: The Right Network for the Right Reinforcement Learning Task

May 21, 2025

Few-Shot Test-Time Optimization Without Retraining for Semiconductor Recipe Generation and Beyond

May 21, 2025

Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics

Mar 13, 2025