Chuanyang Zheng

SAS: Simulated Attention Score

Jul 10, 2025
Via arXiv

SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model

Jul 03, 2025
Via arXiv

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Jun 11, 2025
Via arXiv

Logits-Based Finetuning

May 30, 2025
Via arXiv

Self-Adjust Softmax

Feb 25, 2025
Via arXiv

ParallelComp: Parallel Long-Context Compressor for Length Extrapolation

Feb 20, 2025
Via arXiv

The Linear Attention Resurrection in Vision Transformer

Jan 27, 2025
Via arXiv

iFormer: Integrating ConvNet and Transformer for Mobile Application

Jan 26, 2025
Via arXiv

Efficient Multi-modal Large Language Models via Visual Token Grouping

Nov 26, 2024
Via arXiv

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

Oct 07, 2024
Via arXiv