Zhiyuan Li

A Survey of Retentive Network

Jun 07, 2025

THINK-Bench: Evaluating Thinking Efficiency and Chain-of-Thought Quality of Large Reasoning Models

May 28, 2025

On Learning Verifiers for Chain-of-Thought Reasoning

May 28, 2025

UWSAM: Segment Anything Model Guided Underwater Instance Segmentation and A Large-scale Benchmark Dataset

May 21, 2025

PENCIL: Long Thoughts with Short Memory

Mar 18, 2025

Structured Preconditioners in Adaptive Optimization: A Unified Analysis

Mar 13, 2025

A Theory of Learning with Autoregressive Chain of Thought

Mar 11, 2025

Weak-to-Strong Generalization Even in Random Feature Networks, Provably

Mar 04, 2025

External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

Feb 26, 2025

Reasoning with Latent Thoughts: On the Power of Looped Transformers

Feb 24, 2025