Bo Du

Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation via Self-distillation

Jul 03, 2025
Via arXiv

Rethink Sparse Signals for Pose-guided Text-to-image Generation

Jun 26, 2025
Via arXiv

InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning

Jun 14, 2025
Via arXiv

Enhancing Medical Dialogue Generation through Knowledge Refinement and Dynamic Prompt Adjustment

Jun 12, 2025
Via arXiv

Class Similarity-Based Multimodal Classification under Heterogeneous Category Sets

Jun 11, 2025
Via arXiv

Intra-Trajectory Consistency for Reward Modeling

Jun 10, 2025
Via arXiv

Towards Unified Modeling in Federated Multi-Task Learning via Subspace Decoupling

May 30, 2025
Via arXiv

GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking

May 28, 2025
Via arXiv

Adapting Segment Anything Model for Power Transmission Corridor Hazard Segmentation

May 28, 2025
Via arXiv

Resolving Knowledge Conflicts in Domain-specific Data Selection: A Case Study on Medical Instruction-tuning

May 28, 2025
Via arXiv