Haotian Tang

USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network

Jul 03, 2025

Redundant feature screening method for human activity recognition based on attention purification mechanism

Mar 30, 2025

CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition

Mar 27, 2025

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Feb 20, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Oct 15, 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Oct 14, 2024

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Oct 14, 2024

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

Oct 14, 2024

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Sep 06, 2024

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Aug 21, 2024