Longxiang Tang

AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model

Jun 05, 2025
Via arXiv

UnfoldIR: Rethinking Deep Unfolding Network in Illumination Degradation Image Restoration

May 10, 2025
Via arXiv

UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer

Apr 15, 2025
Via arXiv

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

Mar 20, 2025
Via arXiv

Does Your Vision-Language Model Get Lost in the Long Video Sampling Dilemma?

Mar 16, 2025
Via arXiv

DreamRelation: Relation-Centric Video Customization

Mar 10, 2025
Via arXiv

Gamma: Toward Generic Image Assessment with Mixture of Assessment Experts

Mar 09, 2025
Via arXiv

Integrating Extra Modality Helps Segmentor Find Camouflaged Objects Well

Feb 20, 2025
Via arXiv

RUN: Reversible Unfolding Network for Concealed Object Segmentation

Jan 30, 2025
Via arXiv

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Dec 12, 2024
Via arXiv