Tong Zhang

Nanjing University of Science and Technology, Nanjing, China

Mitigating Hallucinations in Video Large Language Models via Spatiotemporal-Semantic Contrastive Decoding

Jan 30, 2026
Via arXiv

Farewell to Item IDs: Unlocking the Scaling Potential of Large Ranking Models via Semantic Tokens

Jan 30, 2026
Via arXiv

PhysProver: Advancing Automatic Theorem Proving for Physics

Jan 22, 2026
Via arXiv

A Training-Free Guess What Vision Language Model from Snippets to Open-Vocabulary Object Detection

Jan 21, 2026
Via arXiv

PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary

Jan 15, 2026
Via arXiv

Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification

Jan 12, 2026
Via arXiv

WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Jan 07, 2026
Via arXiv

AlignDrive: Aligned Lateral-Longitudinal Planning for End-to-End Autonomous Driving

Jan 05, 2026
Via arXiv

From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs

Dec 22, 2025
Via arXiv

StarCraft+: Benchmarking Multi-agent Algorithms in Adversary Paradigm

Dec 18, 2025
Via arXiv