Zhixi Cai

DexAvatar: 3D Sign Language Reconstruction with Hand and Body Pose Priors

Dec 24, 2025
Via arXiv

Do Blind Spots Matter for Word-Referent Mapping? A Computational Study with Infant Egocentric Video

Nov 13, 2025
Via arXiv

Explain Before You Answer: A Survey on Compositional Visual Reasoning

Aug 24, 2025
Via arXiv

JRDB-Reasoning: A Difficulty-Graded Benchmark for Visual Reasoning in Robotics

Aug 14, 2025
Via arXiv

AV-Deepfake1M++: A Large-Scale Audio-Visual Deepfake Benchmark with Real-World Perturbations

Jul 28, 2025
Via arXiv

M-MRE: Extending the Mutual Reinforcement Effect to Multimodal Information Extraction

Apr 24, 2025
Via arXiv

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Mar 25, 2025
Via arXiv

NEUSIS: A Compositional Neuro-Symbolic Framework for Autonomous Perception, Reasoning, and Planning in Complex UAV Search Missions

Sep 16, 2024
Via arXiv

1M-Deepfakes Detection Challenge

Sep 11, 2024
Via arXiv

MRAC Track 1: 2nd Workshop on Multimodal, Generative and Responsible Affective Computing

Sep 11, 2024
Via arXiv