Mikhail Belkin

Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing

Mar 17, 2026

General and Efficient Steering of Unconditional Diffusion

Feb 11, 2026

A Gap Between the Gaussian RKHS and Neural Networks: An Infinite-Center Asymptotic Analysis

Feb 22, 2025

Task Generalization With AutoRegressive Compositional Structure: Can Learning From $D$ Tasks Generalize to $D^{T}$ Tasks?

Feb 13, 2025

Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers

Feb 06, 2025

Fast training of large kernel models with delayed projections

Nov 25, 2024

Mirror Descent on Reproducing Kernel Banach Spaces

Nov 18, 2024

Context-Scaling versus Task-Scaling in In-Context Learning

Oct 16, 2024

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Jul 29, 2024

Average gradient outer product as a mechanism for deep neural collapse

Feb 21, 2024