Bhavya Kailkhura

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Apr 22, 2025
Via arXiv

LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks

Apr 16, 2025
Via arXiv

STAR-1: Safer Alignment of Reasoning LLMs with 1K Data

Apr 02, 2025
Via arXiv

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Mar 24, 2025
Via arXiv

TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Mar 13, 2025
Via arXiv

Constrained Language Generation with Discrete Diffusion Models

Mar 12, 2025
Via arXiv

GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models

Mar 03, 2025
Via arXiv

EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants

Feb 27, 2025
Via arXiv

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Feb 07, 2025
Via arXiv

Extracting and Understanding the Superficial Knowledge in Alignment

Feb 07, 2025
Via arXiv