Yulun Du

Kimi K2: Open Agentic Intelligence

Jul 28, 2025
Via arXiv

Kimi-Audio Technical Report

Apr 25, 2025
Via arXiv

Kimi-VL Technical Report

Apr 10, 2025
Via arXiv

Muon is Scalable for LLM Training

Feb 24, 2025
Via arXiv

MoBA: Mixture of Block Attention for Long-Context LLMs

Feb 18, 2025
Via arXiv

STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation

May 14, 2023
Via arXiv

GPS: Genetic Prompt Search for Efficient Few-shot Learning

Oct 31, 2022
Via arXiv

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization

Jan 18, 2022
Via arXiv

Distribution Matching for Rationalization

Jun 01, 2021
Via arXiv

Multimodal Polynomial Fusion for Detecting Driver Distraction

Oct 24, 2018
Via arXiv