Yang Yang

bupt.edu.cn

Text as Any-Modality for Zero-Shot Classification by Consistent Prompt Tuning

Aug 08, 2025
Via arXiv

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Aug 07, 2025
Via arXiv

Implicit Counterfactual Learning for Audio-Visual Segmentation

Jul 28, 2025
Via arXiv

Step-Audio 2 Technical Report

Jul 24, 2025
Via arXiv

Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement

Jul 24, 2025
Via arXiv

Multimodal Mathematical Reasoning with Diverse Solving Perspective

Jul 03, 2025
Via arXiv

Uncertainty-aware Reward Design Process

Jul 03, 2025
Via arXiv

PanTS: The Pancreatic Tumor Segmentation Dataset

Jul 02, 2025
Via arXiv

Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models

Jun 16, 2025
Via arXiv

Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025

Jun 14, 2025
Via arXiv