Shihao Wang

DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images

Jun 11, 2026, via arXiv

Cosmos 3: Omnimodal World Models for Physical AI

Jun 01, 2026, via arXiv

PRISM: Gauge-Invariant Tangent-Space Differentially Private LoRA

May 31, 2026, via arXiv

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

May 27, 2026, via arXiv

DEL: Digit Entropy Loss for Numerical Learning of Large Language Models

May 19, 2026, via arXiv

Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents

May 13, 2026, via arXiv

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Apr 27, 2026, via arXiv

VOSR: A Vision-Only Generative Model for Image Super-Resolution

Apr 03, 2026, via arXiv

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Mar 05, 2026, via arXiv

PhyCritic: Multimodal Critic Models for Physical AI

Feb 11, 2026, via arXiv