Ya Jing

GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation

Oct 08, 2024

Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent

Apr 02, 2024

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

Dec 21, 2023

Vision-Language Foundation Models as Effective Robot Imitators

Nov 06, 2023

MOMA-Force: Visual-Force Imitation for Real-World Mobile Manipulation

Aug 07, 2023

Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and Methods

Aug 07, 2023

Learning to Explore Informative Trajectories and Samples for Embodied Perception

Mar 20, 2023

Towards Unifying Reference Expression Generation and Comprehension

Oct 24, 2022

Locate then Segment: A Strong Pipeline for Referring Image Segmentation

Mar 30, 2021

Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection

Sep 22, 2018