Winson Han

MolmoAct2: Action Reasoning Models for Real-world Deployment

May 04, 2026

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

Apr 09, 2026

WildDet3D: Scaling Promptable 3D Detection in the Wild

Apr 09, 2026

MolmoPoint: Better Pointing for VLMs with Grounding Tokens

Mar 30, 2026

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

Mar 18, 2026

MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation

Mar 17, 2026

MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation

Feb 11, 2026

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Jan 15, 2026

Visual Representations inside the Language Model

Oct 06, 2025

MolmoAct: Action Reasoning Models that can Reason in Space

Aug 12, 2025