
Yonggan Fu


LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement

Apr 22, 2025

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Apr 17, 2025

Hymba: A Hybrid-head Architecture for Small Language Models

Nov 20, 2024

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Nov 15, 2024

MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation

Jul 02, 2024

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Jun 22, 2024

Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications

Mar 17, 2024

Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI

Jan 02, 2024

NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

Oct 24, 2023

GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models

Sep 19, 2023