Tong Yang

Michael Pokorny

Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent

Aug 11, 2025
Via arXiv

Fairy$\pm i$: the First 2-bit Complex LLM with All Parameters in $\{\pm1, \pm i\}$

Aug 07, 2025
Via arXiv

FAF: A Feature-Adaptive Framework for Few-Shot Time Series Forecasting

Jun 24, 2025
Via arXiv

SciDA: Scientific Dynamic Assessor of LLMs

Jun 15, 2025
Via arXiv

Continuous Semi-Implicit Models

Jun 07, 2025
Via arXiv

KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference

Apr 14, 2025
Via arXiv

TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Mar 06, 2025
Via arXiv

ReFocus: Reinforcing Mid-Frequency and Key-Frequency Modeling for Multivariate Time Series Forecasting

Feb 24, 2025
Via arXiv

FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference

Feb 19, 2025
Via arXiv

LLM-Sketch: Enhancing Network Sketches with LLM

Feb 11, 2025
Via arXiv