Dongmei Jiang

Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts

Jun 12, 2025
Via arXiv

Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills

Jun 12, 2025
Via arXiv

Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection

May 28, 2025
Via arXiv

Open-Det: An Efficient Learning Framework for Open-Ended Detection

May 27, 2025
Via arXiv

Harmony: A Unified Framework for Modality Incremental Learning

Apr 17, 2025
Via arXiv

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Apr 16, 2025
Via arXiv

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

Mar 17, 2025
Via arXiv

Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy

Feb 27, 2025
Via arXiv

PolaFormer: Polarity-aware Linear Attention for Vision Transformers

Jan 25, 2025
Via arXiv

CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation

Jan 20, 2025
Via arXiv