Y. K. Li

DeepSeek-V3 Technical Report

Dec 27, 2024
Via arXiv

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Feb 06, 2024
Via arXiv

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Jan 26, 2024
Via arXiv

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Jan 11, 2024
Via arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Jan 05, 2024
Via arXiv