Xinzhuo Li

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Mar 24, 2026

EgoForge: Goal-Directed Egocentric World Simulator

Mar 20, 2026

DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising

Mar 19, 2026

SafeCRS: Personalized Safety Alignment for LLM-Based Conversational Recommender Systems

Mar 03, 2026

Toward Cognitive Supersensing in Multimodal Large Language Model

Feb 02, 2026

CoRe3D: Collaborative Reasoning as a Foundation for 3D Intelligence

Dec 14, 2025

HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation

Jun 26, 2025

PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation

Dec 19, 2024