Picture for Renrui Zhang

Renrui Zhang

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

Add code
Apr 02, 2026
Viaarxiv icon

MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints

Add code
Mar 20, 2026
Viaarxiv icon

PEARL: Personalized Streaming Video Understanding Model

Add code
Mar 20, 2026
Viaarxiv icon

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

Add code
Mar 02, 2026
Viaarxiv icon

R-Diverse: Mitigating Diversity Illusion in Self-Play LLM Training

Add code
Feb 16, 2026
Viaarxiv icon

GENIUS: Generative Fluid Intelligence Evaluation Suite

Add code
Feb 11, 2026
Viaarxiv icon

Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation

Add code
Feb 02, 2026
Viaarxiv icon

Automated Safety Benchmarking: A Multi-agent Pipeline for LVLMs

Add code
Jan 27, 2026
Viaarxiv icon

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Add code
Jan 27, 2026
Viaarxiv icon

NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework

Add code
Dec 20, 2025
Figure 1 for NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework
Figure 2 for NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework
Figure 3 for NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework
Figure 4 for NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework
Viaarxiv icon