Igor Fedorov

Short Data, Long Context: Distilling Positional Knowledge in Transformers

Apr 07, 2026
Via arXiv

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale

Mar 16, 2026
Via arXiv

The Path Not Taken: RLVR Provably Learns Off the Principals

Nov 11, 2025
Via arXiv

MobileLLM-Pro Technical Report

Nov 10, 2025
Via arXiv

Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations

Nov 18, 2024
Via arXiv

DεpS: Delayed ε-Shrinking for Faster Once-For-All Training

Jul 08, 2024
Via arXiv

SpinQuant: LLM quantization with learned rotations

May 28, 2024
Via arXiv

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Feb 22, 2024
Via arXiv

SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape

Nov 22, 2023
Via arXiv

Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

Nov 14, 2023
Via arXiv