
Robin Jia

When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction

May 22, 2025

Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models

May 20, 2025

Teaching Models to Understand (but not Generate) High-risk Data

May 05, 2025

Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions

Apr 15, 2025

Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge

Mar 06, 2025

Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries

Feb 27, 2025

Interrogating LLM design under a fair learning doctrine

Feb 22, 2025

Mechanistic Interpretability of Emotion Inference in Large Language Models

Feb 08, 2025

Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics

Jan 24, 2025

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Oct 07, 2024