Elena Tutubalina

Harnessing non-adversarial robustness in large language models

May 28, 2026
Via arXiv

SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization

Apr 08, 2026
Via arXiv

Evolutionary Search for Automated Design of Uncertainty Quantification Methods

Apr 03, 2026
Via arXiv

Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures

Mar 17, 2026
Via arXiv

Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Mar 05, 2026
Via arXiv

Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning

Feb 24, 2026
Via arXiv

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Feb 15, 2026
Via arXiv

Bring the Apple, Not the Sofa: Impact of Irrelevant Context in Embodied AI Commands on VLA Models

Oct 08, 2025
Via arXiv

OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features

Sep 26, 2025
Via arXiv

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Sep 26, 2025
Via arXiv