Huan Sun

AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists

Jun 09, 2025

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

May 28, 2025

Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure

Apr 02, 2025

An Illusion of Progress? Assessing the Current State of Web Agents

Apr 02, 2025

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Mar 31, 2025

AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection

Feb 18, 2025

Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving

Nov 11, 2024

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

Nov 10, 2024

AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts

Oct 29, 2024

AdvWeb: Controllable Black-box Attacks on VLM-powered Web Agents

Oct 22, 2024