Siva Reddy

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Apr 11, 2025

Not All Data Are Unlearned Equally

Apr 08, 2025

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

Mar 11, 2025

SafeArena: Evaluating the Safety of Autonomous Web Agents

Mar 06, 2025

How to Get Your LLM to Generate Challenging Problems for Evaluation

Feb 20, 2025

MMTEB: Massive Multilingual Text Embedding Benchmark

Feb 19, 2025

Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation

Feb 17, 2025

ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval

Feb 11, 2025

Language Models Largely Exhibit Human-like Constituent Ordering Preferences

Feb 08, 2025

The BrowserGym Ecosystem for Web Agent Research

Dec 10, 2024