Yang Wang

Microsoft Research

Beyond the GUI Paradigm: Do Mobile Agents Need the Phone Screen?

Jun 16, 2026
Via arXiv

STEDiff: Strengthening Text Embedding for Text-to-Image Alignment in Diffusion Model

Jun 09, 2026
Via arXiv

TALAN: Task-Aligned Latent Adaptation Networks for Targeted Post-Training of Large Language Models

Jun 05, 2026
Via arXiv

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

Jun 03, 2026
Via arXiv

Hybrid Adversarial Defence for Natural Language Understanding Tasks

Jun 03, 2026
Via arXiv

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

Jun 03, 2026
Via arXiv

SkySense: A Semi-Supervised Generative Framework for UAV Localization in ISAC Networks

Jun 02, 2026
Via arXiv

MOSS-Audio Technical Report

Jun 01, 2026
Via arXiv

AI-T2I: Aggregating-and-Isolating Cross-Attention to Diffusion Models for Text-to-Image Synthesis

May 27, 2026
Via arXiv

FedMPT: Federated Multi-label Prompt Tuning of Vision-Language Models

May 27, 2026
Via arXiv