Yang Yang

Department of Biostatistics and Bioinformatics, Duke University, Durham, USA

Text as Any-Modality for Zero-Shot Classification by Consistent Prompt Tuning

Aug 08, 2025

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Aug 07, 2025

Implicit Counterfactual Learning for Audio-Visual Segmentation

Jul 28, 2025

Step-Audio 2 Technical Report

Jul 24, 2025

Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement

Jul 24, 2025

Uncertainty-aware Reward Design Process

Jul 03, 2025

Multimodal Mathematical Reasoning with Diverse Solving Perspective

Jul 03, 2025

PanTS: The Pancreatic Tumor Segmentation Dataset

Jul 02, 2025

Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models

Jun 16, 2025

Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025

Jun 14, 2025