
Young Jin Kim

Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford; Department of Radiology, Severance Hospital, South Korea

Scaling Reasoning Efficiently via Relaxed On-Policy Distillation

Mar 11, 2026

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation

Jul 09, 2025

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Apr 30, 2025

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025

GRIN: GRadient-INformed MoE

Sep 18, 2024

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Feb 02, 2024

PEMA: Plug-in External Memory Adaptation for Language Models

Nov 14, 2023

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness

Oct 03, 2023

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Sep 20, 2023

Task-Based MoE for Multitask Multilingual Machine Translation

Sep 11, 2023