Huan Sun

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

Oct 07, 2024
Via arXiv

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Sep 17, 2024
Via arXiv

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Sep 04, 2024
Via arXiv

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

May 27, 2024
Via arXiv

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

Apr 11, 2024
Via arXiv

Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents

Apr 05, 2024
Via arXiv

AttributionBench: How Hard is Automatic Attribution Evaluation?

Feb 23, 2024
Via arXiv

A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

Feb 18, 2024
Via arXiv

LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

Feb 17, 2024
Via arXiv

When is Tree Search Useful for LLM Planning? It Depends on the Discriminator

Feb 16, 2024
Via arXiv