Qi Zhang

School of Information, North China University of Technology

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Nov 18, 2025

MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns

Nov 16, 2025

xHAP: Cross-Modal Attention for Haptic Feedback Estimation in the Tactile Internet

Nov 12, 2025

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress

Nov 11, 2025

Counteracting Matthew Effect in Self-Improvement of LVLMs through Head-Tail Re-balancing

Oct 30, 2025

GTR-Mamba: Geometry-to-Tangent Routing for Hyperbolic POI Recommendation

Oct 27, 2025

Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction

Oct 16, 2025

From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling

Oct 01, 2025

Query-Kontext: An Unified Multimodal Model for Image Generation and Editing

Sep 30, 2025

MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark

Sep 26, 2025