Picture for Victor Shea-Jay Huang

Victor Shea-Jay Huang

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering

Add code
Aug 07, 2025
Viaarxiv icon

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Add code
Jul 23, 2025
Viaarxiv icon

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

Add code
May 21, 2025
Viaarxiv icon

Vision-to-Music Generation: A Survey

Add code
Mar 27, 2025
Viaarxiv icon

TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Add code
Mar 10, 2025
Viaarxiv icon

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Add code
Nov 25, 2024
Viaarxiv icon

Towards Precise Scaling Laws for Video Diffusion Transformers

Add code
Nov 25, 2024
Figure 1 for Towards Precise Scaling Laws for Video Diffusion Transformers
Figure 2 for Towards Precise Scaling Laws for Video Diffusion Transformers
Figure 3 for Towards Precise Scaling Laws for Video Diffusion Transformers
Figure 4 for Towards Precise Scaling Laws for Video Diffusion Transformers
Viaarxiv icon

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Add code
Oct 29, 2024
Figure 1 for Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Figure 2 for Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Figure 3 for Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Figure 4 for Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Viaarxiv icon

BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models

Add code
Oct 13, 2024
Figure 1 for BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Figure 2 for BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Figure 3 for BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Figure 4 for BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Viaarxiv icon