Jianwen Xie

AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation

Mar 26, 2026
Via arXiv

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation

Mar 24, 2026
Via arXiv

Inference-Time Rethinking with Latent Thought Vectors for Math Reasoning

Feb 06, 2026
Via arXiv

GraphDancer: Training LLMs to Explore and Reason over Graphs via Curriculum Reinforcement Learning

Jan 24, 2026
Via arXiv

CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning

Dec 09, 2025
Via arXiv

VideoNSA: Native Sparse Attention Scales Video Understanding

Oct 02, 2025
Via arXiv

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Jul 30, 2025
Via arXiv

Tensor Decomposition Networks for Fast Machine Learning Interatomic Potential Computations

Jul 01, 2025
Via arXiv

Natural Language Guided Ligand-Binding Protein Design

Jun 11, 2025
Via arXiv

DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic

May 22, 2025
Via arXiv