Daniel Hsu

Microsoft Research

Group-realizable multi-group learning by minimizing empirical risk

Jan 23, 2026

Time-Aware Synthetic Control

Jan 06, 2026

Panprediction: Optimal Predictions for Any Downstream Task and Loss

Oct 31, 2025

Fast attention mechanisms: a tale of parallelism

Sep 10, 2025

Learning Compositional Functions with Transformers from Easy-to-Hard Data

May 29, 2025

Survey on Algorithms for multi-index models

Apr 07, 2025

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence

Nov 13, 2024

Interactive Machine Teaching by Labeling Rules and Instances

Sep 08, 2024

One-layer transformers fail to solve the induction heads task

Aug 26, 2024

Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot

Jun 11, 2024