Xinran Gu

Kimi K2: Open Agentic Intelligence

Jul 28, 2025
Via arXiv

Data Mixing Can Induce Phase Transitions in Knowledge Acquisition

May 23, 2025
Via arXiv

Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates

Feb 28, 2024
Via arXiv

A Quadratic Synchronization Rule for Distributed Deep Learning

Oct 22, 2023
Via arXiv

Why does Local SGD Generalize Better than SGD?

Mar 09, 2023
Via arXiv

Fast Federated Learning in the Presence of Arbitrary Device Unavailability

Jun 08, 2021
Via arXiv