Bin Zhu

Spatiotemporal Sycophancy: Negation-Based Gaslighting in Video Large Language Models

Apr 20, 2026

Learning ECG Image Representations via Dual Physiological-Aware Alignments

Apr 02, 2026

Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

Mar 30, 2026

Teacher-Student Diffusion Model for Text-Driven 3D Hand Motion Generation

Mar 25, 2026

OSCBench: Benchmarking Object State Change in Text-to-Video Generation

Mar 12, 2026

Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution

Mar 12, 2026

RC-NF: Robot-Conditioned Normalizing Flow for Real-Time Anomaly Detection in Robotic Manipulation

Mar 11, 2026

SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation

Feb 12, 2026

Table-as-Search: Formulate Long-Horizon Agentic Information Seeking as Table Completion

Feb 06, 2026

DevPiolt: Operation Recommendation for IoT Devices at Xiaomi Home

Nov 18, 2025