Neil Rathi

Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment

Oct 06, 2025

Humans overrely on overconfident language models, across languages

Jul 08, 2025

Mechanistic evaluation of Transformers and state space models

May 21, 2025

TopoLM: brain-like spatio-functional organization in a topographic language model

Oct 15, 2024