Dehan Kong

RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits

Mar 11, 2026
Via arXiv

WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

Mar 05, 2026
Via arXiv

WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents

Mar 05, 2026
Via arXiv

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

Dec 18, 2025
Via arXiv

The BrowserGym Ecosystem for Web Agent Research

Dec 10, 2024
Via arXiv

WebCanvas: Benchmarking Web Agents in Online Environments

Jun 18, 2024
Via arXiv

General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level

Nov 23, 2023
Via arXiv

From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework

May 29, 2023
Via arXiv

Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning

May 26, 2023
Via arXiv

Federated Learning for Computational Pathology on Gigapixel Whole Slide Images

Sep 23, 2020
Via arXiv