Fanrui Zhang

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025
Via arXiv

Sekai: A Video Dataset towards World Exploration

Jun 18, 2025
Via arXiv

A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Jun 11, 2025
Via arXiv

Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

May 22, 2025
Via arXiv

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Apr 08, 2025
Via arXiv

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Mar 09, 2025
Via arXiv

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Mar 09, 2025
Via arXiv

ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization

Oct 14, 2024
Via arXiv

Hierarchical Information Enhancement Network for Cascade Prediction in Social Networks

Mar 22, 2024
Via arXiv

Multi-perspective Memory Enhanced Network for Identifying Key Nodes in Social Networks

Mar 22, 2024
Via arXiv