Hongbin Zhou

JANUS: A Lightweight Framework for Jailbreaking Text-to-Image Models via Distribution Optimization

Mar 22, 2026

Explainable Token-level Noise Filtering for LLM Fine-tuning Datasets

Feb 16, 2026

S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamlessly Speech-Text Alignment and Streaming Speech Decoder

Jun 16, 2025

ClapFM-EVC: High-Fidelity and Flexible Emotional Voice Conversion with Dual Control from Natural Language and Speech

May 20, 2025

GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling

Apr 30, 2025

TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving

Apr 22, 2025

Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech

Feb 05, 2025

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Dec 16, 2024

StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching

Dec 10, 2024

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Dec 10, 2024