Jayanth Srinivasa

EXP-Bench: Can AI Conduct AI Research Experiments?

May 30, 2025

An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems

May 23, 2025

SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning

May 22, 2025

Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection

May 18, 2025

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Apr 09, 2025

Attention Reveals More Than Tokens: Training-Free Long-Context Reasoning with Attention-guided Retrieval

Mar 12, 2025

Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents

Feb 26, 2025

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

Feb 18, 2025

Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning

Feb 08, 2025

Diverse Score Distillation

Dec 09, 2024