Victor Shea-Jay Huang

Driving Intents Amplify Planning-Oriented Reinforcement Learning

May 14, 2026
Via arXiv

Action Emergence from Streaming Intent

May 14, 2026
Via arXiv

MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving

May 14, 2026
Via arXiv

The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning

Jan 20, 2026
Via arXiv

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering

Aug 07, 2025
Via arXiv

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Jul 23, 2025
Via arXiv

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

May 21, 2025
Via arXiv

Vision-to-Music Generation: A Survey

Mar 27, 2025
Via arXiv

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Mar 10, 2025
Via arXiv

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Nov 25, 2024
Via arXiv