Weidi Xie

OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams

Mar 12, 2026
Via arXiv

Real-World Point Tracking with Verifier-Guided Pseudo-Labeling

Mar 12, 2026
Via arXiv

FAIL: Flow Matching Adversarial Imitation Learning for Image Generation

Feb 12, 2026
Via arXiv

VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization

Feb 10, 2026
Via arXiv

Weaver: End-to-End Agentic System Training for Video Interleaved Reasoning

Feb 05, 2026
Via arXiv

PhenoLIP: Integrating Phenotype Ontology Knowledge into Medical Vision-Language Pretraining

Feb 05, 2026
Via arXiv

Revisiting Multi-Task Visual Representation Learning

Jan 20, 2026
Via arXiv

SoccerMaster: A Vision Foundation Model for Soccer Understanding

Dec 11, 2025
Via arXiv

Inferring Dynamic Physical Properties from Video Foundation Models

Oct 02, 2025
Via arXiv

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Jun 23, 2025
Via arXiv