Yuxin Guo

Aligned Better, Listen Better for Audio-Visual Large Language Models

Apr 02, 2025

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Mar 25, 2025

Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion

Feb 20, 2025

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Feb 17, 2025

UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving

Dec 06, 2024

HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving

Dec 03, 2024

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

Oct 07, 2024

On the Nonlinearity of Layer Normalization

Jun 03, 2024

CoReS: Orchestrating the Dance of Reasoning and Segmentation

Apr 08, 2024

Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization

Mar 05, 2024