Jinchao Zhang

Tutti: Expressive Multi-Singer Synthesis via Structure-Level Timbre Control and Vocal Texture Modeling

Feb 09, 2026

Exploring Specular Reflection Inconsistency for Generalizable Face Forgery Detection

Feb 06, 2026

StyleDecoupler: Generalizable Artistic Style Disentanglement

Jan 25, 2026

F2RVLM: Boosting Fine-grained Fragment Retrieval for Multi-Modal Long-form Dialogue with Vision Language Model

Aug 25, 2025

Enhancing Visual Reliance in Text Generation: A Bayesian Perspective on Mitigating Hallucination in Large Vision-Language Models

May 26, 2025

Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Apr 24, 2025

Exploiting Prefix-Tree in Structured Output Interfaces for Enhancing Jailbreak Attacking

Feb 19, 2025

Control-CLIP: Decoupling Category and Style Guidance in CLIP for Specific-Domain Generation

Feb 17, 2025

Semantic to Structure: Learning Structural Representations for Infringement Detection

Feb 11, 2025

ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation

Dec 30, 2024