Keze Wang

3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation

Jan 07, 2026
Via arXiv

Stable Language Guidance for Vision-Language-Action Models

Jan 07, 2026
Via arXiv

Robust Egocentric Referring Video Object Segmentation via Dual-Modal Causal Intervention

Dec 30, 2025
Via arXiv

Self-Rewarded Multimodal Coherent Reasoning Across Diverse Visual Domains

Dec 27, 2025
Via arXiv

CoAgent: Collaborative Planning and Consistency Agent for Coherent Video Generation

Dec 27, 2025
Via arXiv

RevFFN: Memory-Efficient Full-Parameter Fine-Tuning of Mixture-of-Experts LLMs with Reversible Blocks

Dec 24, 2025
Via arXiv

FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models

Dec 23, 2025
Via arXiv

SirenPose: Dynamic Scene Reconstruction via Geometric Supervision

Dec 23, 2025
Via arXiv

LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction

Dec 21, 2025
Via arXiv

Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction

Dec 21, 2025
Via arXiv