Sathwik Tejaswi Madhusudhan

Super Apriel: One Checkpoint, Many Speeds

Apr 21, 2026

EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

Mar 13, 2026

AprielGuard

Dec 23, 2025

Grammar Search for Multi-Agent Systems

Dec 16, 2025

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

Sep 11, 2025

LALM-Eval: An Open-Source Toolkit for Holistic Evaluation of Large Audio Language Models

Sep 09, 2025

Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

May 22, 2025

DNA Bench: When Silence is Smarter -- Benchmarking Over-Reasoning in Reasoning LLMs

Mar 20, 2025

Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance

Mar 07, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Feb 03, 2025