Y. Wu

Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Jul 02, 2024
Via arXiv

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Jun 17, 2024
Via arXiv

Hyperbolic Secant representation of the logistic function: Application to probabilistic Multiple Instance Learning for CT intracranial hemorrhage detection

Mar 21, 2024
Via arXiv

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Feb 06, 2024
Via arXiv

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Jan 26, 2024
Via arXiv

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Jan 11, 2024
Via arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Jan 05, 2024
Via arXiv

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

Dec 28, 2023
Via arXiv

A Generative deep learning approach for shape recognition of arbitrary objects from phaseless acoustic scattering data

Jul 12, 2022
Via arXiv

Machine learning for knowledge acquisition and accelerated inverse-design for non-Hermitian systems

Apr 28, 2022
Via arXiv