Yunhua Zhou

BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments

Oct 31, 2024
Via arXiv

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Oct 27, 2024
Via arXiv

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Oct 10, 2024
Via arXiv

Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance

Mar 25, 2024
Via arXiv

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Feb 26, 2024
Via arXiv

Data-free Weight Compress and Denoise for Large Language Models

Feb 26, 2024
Via arXiv

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

Feb 21, 2024
Via arXiv

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

Feb 20, 2024
Via arXiv

DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning

Jan 24, 2024
Via arXiv

An Open-World Lottery Ticket for Out-of-Domain Intent Classification

Oct 13, 2022
Via arXiv