Yanfeng Wang

Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, China and Shanghai AI Laboratory, China

GLM-OCR Technical Report

Mar 11, 2026
Via arXiv

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

Mar 10, 2026
Via arXiv

Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference

Feb 26, 2026
Via arXiv

VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization

Feb 10, 2026
Via arXiv

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Feb 09, 2026
Via arXiv

PhenoLIP: Integrating Phenotype Ontology Knowledge into Medical Vision-Language Pretraining

Feb 05, 2026
Via arXiv

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Jan 27, 2026
Via arXiv

AgentEHR: Advancing Autonomous Clinical Decision-Making via Retrospective Summarization

Jan 20, 2026
Via arXiv

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

Jan 17, 2026
Via arXiv

Miner: Mining Intrinsic Mastery for Data-Efficient RL in Large Reasoning Models

Jan 08, 2026
Via arXiv