Daichi Yashima

AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation

Mar 16, 2026
Via arXiv

NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries

Mar 05, 2026
Via arXiv

ReMoRa: Multimodal Large Language Model based on Refined Motion Representation for Long-Video Understanding

Feb 18, 2026
Via arXiv

Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement

Jan 28, 2025
Via arXiv

Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling

Dec 24, 2024
Via arXiv