Jiyuan Zhang

NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

Apr 12, 2026

ERNIE 5.0 Technical Report

Feb 04, 2026

OmniSparse: Training-Aware Fine-Grained Sparse Attention for Long-Video MLLMs

Nov 18, 2025

X-Driver: Explainable Autonomous Driving with Vision-Language Models

May 08, 2025

Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding

Jan 28, 2025

Automatic Item Generation for Personality Situational Judgment Tests with Large Language Models

Dec 10, 2024

Evaluating and Advancing Multimodal Large Language Models in Ability Lens

Nov 22, 2024

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Nov 15, 2024

Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization

Sep 22, 2024

SpikeGS: 3D Gaussian Splatting from Spike Streams with High-Speed Camera Motion

Jul 14, 2024