Satvik Golechha

ABBEL: LLM Agents Acting through Belief Bottlenecks Expressed in Language

Dec 23, 2025
Via arXiv

Auditing Games for Sandbagging

Dec 08, 2025
Via arXiv

Among Us: A Sandbox for Agentic Deception

Apr 05, 2025
Via arXiv

Auditing language models for hidden objectives

Mar 14, 2025
Via arXiv

Modular Training of Neural Networks aids Interpretability

Feb 04, 2025
Via arXiv

Progress Measures for Grokking on Real-world Datasets

May 21, 2024
Via arXiv

NICE: To Optimize In-Context Examples or Not?

Feb 16, 2024
Via arXiv

CataractBot: An LLM-Powered Expert-in-the-Loop Chatbot for Cataract Patients

Feb 07, 2024
Via arXiv

Position Paper: Toward New Frameworks for Studying Model Representations

Feb 06, 2024
Via arXiv

Predicting Treatment Adherence of Tuberculosis Patients at Scale

Nov 15, 2022
Via arXiv