Ray Zhang

Vision-Conditioned Variational Bayesian Last Layer Dynamics Models

Jan 16, 2026

Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Dec 11, 2025

Semantic Property Maps for Driving Applications

Nov 13, 2025

CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms

May 22, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems

Mar 13, 2025

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025

SLAM assisted 3D tracking system for laparoscopic surgery

Sep 18, 2024

Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning

Jul 29, 2024

MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception

Jun 22, 2024

RKHS-BA: A Semantic Correspondence-Free Multi-View Registration Framework with Global Tracking

Mar 02, 2024