Vidhisha Balachandran

Just Do It!? Computer-Use Agents Exhibit Blind Goal-Directedness

Oct 02, 2025
Via arXiv

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025
Via arXiv

Phi-4-reasoning Technical Report

Apr 30, 2025
Via arXiv

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead

Mar 31, 2025
Via arXiv

FACTS&EVIDENCE: An Interactive Tool for Transparent Fine-Grained Factual Verification of Machine-Generated Text

Mar 19, 2025
Via arXiv

MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation

Jan 07, 2025
Via arXiv

BENCHAGENTS: Automated Benchmark Creation with Agent Interaction

Oct 29, 2024
Via arXiv

Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models

Oct 17, 2024
Via arXiv

Eureka: Evaluating and Understanding Large Foundation Models

Sep 13, 2024
Via arXiv

Teaching LLMs to Abstain across Languages via Multilingual Feedback

Jun 22, 2024
Via arXiv