Karthik Narasimhan

Princeton University

$τ^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Jun 09, 2025
Via arXiv

Contextual Experience Replay for Self-Improvement of Language Agents

Jun 07, 2025
Via arXiv

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Jun 05, 2025
Via arXiv

IMPersona: Evaluating Individual Level LM Impersonation

Apr 08, 2025
Via arXiv

ShieldGemma 2: Robust and Tractable Image Content Moderation

Apr 01, 2025
Via arXiv

LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks

Oct 16, 2024
Via arXiv

An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them

Oct 14, 2024
Via arXiv

EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges

Sep 24, 2024
Via arXiv

LLMs are Superior Feedback Providers: Bootstrapping Reasoning for Lie Detection with Self-Generated Feedback

Aug 25, 2024
Via arXiv

ShieldGemma: Generative AI Content Moderation Based on Gemma

Jul 31, 2024
Via arXiv