Bo Zhang

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

Mar 18, 2026
Via arXiv

VisBrowse-Bench: Benchmarking Visual-Native Search for Multimodal Browsing Agents

Mar 17, 2026
Via arXiv

AutoSkill: Experience-Driven Lifelong Learning via Skill Self-Evolution

Mar 05, 2026
Via arXiv

Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions

Mar 04, 2026
Via arXiv

DriveCode: Domain Specific Numerical Encoding for LLM-Based Autonomous Driving

Mar 01, 2026
Via arXiv

Grounding LLMs in Scientific Discovery via Embodied Actions

Feb 24, 2026
Via arXiv

STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens

Feb 17, 2026
Via arXiv

DeepSight: An All-in-One LM Safety Toolkit

Feb 12, 2026
Via arXiv

Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

Feb 12, 2026
Via arXiv

Beyond Next-Token Alignment: Distilling Multimodal Large Language Models via Token Interactions

Feb 10, 2026
Via arXiv