Jiayao Ma

ExPosST: Explicit Positioning with Adaptive Masking for LLM-Based Simultaneous Machine Translation

Mar 16, 2026

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

Mar 12, 2026

Text-based Aerial-Ground Person Retrieval

Nov 11, 2025

Genie Centurion: Accelerating Scalable Real-World Robot Training with Human Rewind-and-Refine Guidance

May 24, 2025

Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review

Feb 23, 2025