Torsten Hoefler

When Data Is Scarce: Scaling Sparse Language Models with Repeated Training

May 31, 2026
Via arXiv

Memory-Efficient LLM Training with Dynamic Sparsity: From Stability to Practical Scaling

May 30, 2026
Via arXiv

Can AI Weather Models Predict Beyond Two Weeks? A Quantitative Benchmark and Analysis of Long Rollouts

May 28, 2026
Via arXiv

Confounder Detection via Treatment Intent: A New Observational Study Design

May 26, 2026
Via arXiv

Large Language Model Selection with Limited Annotations

May 24, 2026
Via arXiv

Grid Games: The Power of Multiple Grids for Quantizing Large Language Models

May 12, 2026
Via arXiv

Resilient AI Supercomputer Networking using MRC and SRv6

May 05, 2026
Via arXiv

SFT-then-RL Outperforms Mixed-Policy Methods for LLM Reasoning

Apr 26, 2026
Via arXiv

Process Reward Agents for Steering Knowledge-Intensive Reasoning

Apr 10, 2026
Via arXiv

Scaling Laws of Global Weather Models

Feb 26, 2026
Via arXiv