Picture for Shuaiwen Leon Song

Shuaiwen Leon Song

Disentangling Reasoning and Knowledge in Medical Large Language Models

Add code
May 16, 2025
Viaarxiv icon

Improving Model Alignment Through Collective Intelligence of Open-Source LLMS

Add code
May 05, 2025
Viaarxiv icon

How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos?

Add code
Apr 19, 2025
Viaarxiv icon

Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods

Add code
Apr 18, 2025
Viaarxiv icon

Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping

Add code
Jan 11, 2025
Figure 1 for Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
Figure 2 for Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
Figure 3 for Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
Figure 4 for Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
Viaarxiv icon

CorDA: Context-Oriented Decomposition Adaptation of Large Language Models

Add code
Jun 07, 2024
Viaarxiv icon

Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model

Add code
Jun 03, 2024
Figure 1 for Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
Figure 2 for Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
Figure 3 for Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
Figure 4 for Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
Viaarxiv icon

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Add code
Jan 25, 2024
Figure 1 for FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Figure 2 for FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Figure 3 for FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Figure 4 for FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Viaarxiv icon

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Add code
Oct 11, 2023
Figure 1 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 2 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 3 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 4 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Viaarxiv icon

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Add code
Sep 19, 2023
Viaarxiv icon