
Jianchao Tan

C2T: A Classifier-Based Tree Construction Method in Speculative Decoding

Feb 19, 2025

MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures

Feb 19, 2025

PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation

Dec 04, 2024

EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference

Oct 16, 2024

CountFormer: Multi-View Crowd Counting Transformer

Jul 02, 2024

ASP: Automatic Selection of Proxy dataset for efficient AutoML

Oct 17, 2023

USDC: Unified Static and Dynamic Compression for Visual Transformer

Oct 17, 2023

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

Sep 29, 2023

SHARK: A Lightweight Model Compression Approach for Large-scale Recommender Systems

Aug 18, 2023

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

Aug 09, 2023