Le Zhang

Unlocking LLM Code Correction with Iterative Feedback Loops

Jun 16, 2026
Via arXiv

Bridging Short Videos and Live Streams: Reasoning-Guided Multimodal LLMs for Cross-Domain Representation Learning

Jun 03, 2026
Via arXiv

DynFrame: Adaptive Reasoning-Driven Multimodal Framework with Dynamic Frame Augmentation for Complex Video Understanding

May 26, 2026
Via arXiv

How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning

May 26, 2026
Via arXiv

How Far Has AI Come in Liver Fibrosis Staging? A Large-Scale Real-World Dataset and Benchmark

May 25, 2026
Via arXiv

RiT: Vanilla Diffusion Transformers Suffice in Representation Space

May 21, 2026
Via arXiv

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs

May 04, 2026
Via arXiv

MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention

Apr 21, 2026
Via arXiv

Make It Up: Fake Images, Real Gains in Generalized Few-shot Semantic Segmentation

Mar 28, 2026
Via arXiv

End-to-End Dexterous Grasp Learning from Single-View Point Clouds via a Multi-Object Scene Dataset

Mar 16, 2026
Via arXiv