Shijing Wang

Enhancing Gaze Reasoning in Vision Foundation Models for Gaze Following

May 21, 2026

CineSRD: Leveraging Visual, Acoustic, and Linguistic Cues for Open-World Visual Media Speaker Diarization

Mar 17, 2026

From Utterance to Vividity: Training Expressive Subtitle Translation LLM via Adaptive Local Preference Optimization

Feb 01, 2026

Hermes the Polyglot: A Unified Framework to Enhance Expressiveness for Multimodal Interlingual Subtitling

Jan 31, 2026

Agentic Reward Modeling: Verifying GUI Agent via Online Proactive Interaction

Jan 31, 2026

VL4Gaze: Unleashing Vision-Language Models for Gaze Following

Dec 23, 2025

Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization

Aug 12, 2025

Suppressing Uncertainty in Gaze Estimation

Dec 17, 2024

Cross-Dataset Gaze Estimation by Evidential Inter-intra Fusion

Sep 07, 2024

PCIE_LAM Solution for Ego4D Looking At Me Challenge

Jun 18, 2024