Jusheng Zhang

The Fourth Challenge on Image Super-Resolution ($\times$4) at NTIRE 2026: Benchmark Results and Method Overview

Apr 16, 2026

Aligning Multimodal Sequential Recommendations via Robust Direct Preference Optimization with Sparse MoE

Mar 31, 2026

Process-of-Thought Reasoning for Videos

Feb 07, 2026

Spectral Gating Networks

Feb 07, 2026

Rational ANOVA Networks

Feb 03, 2026

Why Keep Your Doubts to Yourself? Trading Visual Uncertainties in Multi-Agent Bandit Systems

Jan 26, 2026

ResAgent: Entropy-based Prior Point Discovery and Visual Reasoning for Referring Expression Segmentation

Jan 23, 2026

3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation

Jan 07, 2026

FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models

Dec 23, 2025

HybridToken-VLM: Hybrid Token Compression for Vision-Language Models

Dec 09, 2025