Zijia Chen

Celine

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Apr 17, 2025
Via arXiv

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

Apr 15, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

Hymba: A Hybrid-head Architecture for Small Language Models

Nov 20, 2024
Via arXiv