Siva Reddy

How to Get Your LLM to Generate Challenging Problems for Evaluation

Feb 20, 2025
Via arXiv

MMTEB: Massive Multilingual Text Embedding Benchmark

Feb 19, 2025
Via arXiv

Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation

Feb 17, 2025
Via arXiv

ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval

Feb 11, 2025
Via arXiv

Language Models Largely Exhibit Human-like Constituent Ordering Preferences

Feb 08, 2025
Via arXiv

The BrowserGym Ecosystem for Web Agent Research

Dec 10, 2024
Via arXiv

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Dec 05, 2024
Via arXiv

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Oct 02, 2024
Via arXiv

Benchmarking Vision Language Models for Cultural Understanding

Jul 15, 2024
Via arXiv

ROSA: Random Subspace Adaptation for Efficient Fine-Tuning

Jul 10, 2024
Via arXiv