Liang Lin

Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions

Apr 30, 2025

Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning

Apr 23, 2025

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Apr 22, 2025

3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians

Apr 16, 2025

DreamFuse: Adaptive Image Fusion with Diffusion Transformer

Apr 11, 2025

Exploiting Temporal Audio-Visual Correlation Embedding for Audio-Driven One-Shot Talking Head Animation

Apr 08, 2025

Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation

Apr 08, 2025

YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction

Mar 18, 2025

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Mar 15, 2025

Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

Mar 14, 2025