Yuxin Peng

Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models

Feb 28, 2026
Via arXiv

Venus: Benchmarking and Empowering Multimodal Large Language Models for Aesthetic Guidance and Cropping

Feb 27, 2026
Via arXiv

TiFRe: Text-guided Video Frame Reduction for Efficient Video Multi-modal Large Language Models

Feb 09, 2026
Via arXiv

Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

Feb 07, 2026
Via arXiv

Multi-Resolution Alignment for Voxel Sparsity in Camera-Based 3D Semantic Scene Completion

Feb 03, 2026
Via arXiv

Bi-C2R: Bidirectional Continual Compatible Representation for Re-indexing Free Lifelong Person Re-identification

Dec 31, 2025
Via arXiv

CKDA: Cross-modality Knowledge Disentanglement and Alignment for Visible-Infrared Lifelong Person Re-identification

Nov 19, 2025
Via arXiv

HD$^2$-SSC: High-Dimension High-Density Semantic Scene Completion for Autonomous Driving

Nov 13, 2025
Via arXiv

Interact-Custom: Customized Human Object Interaction Image Generation

Aug 28, 2025
Via arXiv

Investigating Domain Gaps for Indoor 3D Object Detection

Aug 24, 2025
Via arXiv