Zhangyang Wang

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Jun 24, 2023
Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

Jun 18, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

Jun 18, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot, Generalizable Approach using RGB Images

Jun 13, 2023
Panwang Pan, Zhiwen Fan, Brandon Y. Feng, Peihao Wang, Chenxin Li, Zhangyang Wang

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Jun 06, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang

Dynamic Sparsity Is Channel-Level Sparsity Learner

May 30, 2023
Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

Are Large Kernels Better Teachers than Transformers for ConvNets?

May 30, 2023
Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts

May 30, 2023
Rishov Sarkar, Hanxue Liang, Zhiwen Fan, Zhangyang Wang, Cong Hao

Towards Constituting Mathematical Structures for Learning to Optimize

May 29, 2023
Jialin Liu, Xiaohan Chen, Zhangyang Wang, Wotao Yin, HanQin Cai

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

May 25, 2023
Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi
