Yuan Xie

EscherVerse: An Open World Benchmark and Dataset for Teleo-Spatial Intelligence with Physical-Dynamic and Intent-Driven Understanding

Jan 04, 2026

FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views

Dec 19, 2025

3D Guard-Layer: An Integrated Agentic AI Safety System for Edge Artificial Intelligence

Nov 11, 2025

CaRF: Enhancing Multi-View Consistency in Referring 3D Gaussian Splatting Segmentation

Nov 06, 2025

Switchable Token-Specific Codebook Quantization For Face Image Compression

Oct 27, 2025

InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models

Sep 26, 2025

SeqVLM: Proposal-Guided Multi-View Sequences Reasoning via VLM for Zero-Shot 3D Visual Grounding

Aug 28, 2025

MOL: Joint Estimation of Micro-Expression, Optical Flow, and Landmark via Transformer-Graph-Style Convolution

Jun 17, 2025

UniForward: Unified 3D Scene and Semantic Field Reconstruction via Feed-Forward Gaussian Splatting from Only Sparse-View Images

Jun 11, 2025

SHTOcc: Effective 3D Occupancy Prediction with Sparse Head and Tail Voxels

May 29, 2025