Yuxin Zhang

Tony

DuplexSLA: A Full-Duplex Spoken Language Model with Synchronized Speech, Language, and Action

May 20, 2026

Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation

May 13, 2026

Learning Responsibility-Attributed Adversarial Scenarios for Testing Autonomous Vehicles

May 13, 2026

Step-Audio-R1.5 Technical Report

Apr 28, 2026

Prototype-Based Test-Time Adaptation of Vision-Language Models

Apr 23, 2026

The Fourth Challenge on Image Super-Resolution ($\times$4) at NTIRE 2026: Benchmark Results and Method Overview

Apr 16, 2026

ID-Selection: Importance-Diversity Based Visual Token Selection for Efficient LVLM Inference

Apr 07, 2026

A Synthetic Eye Movement Dataset for Script Reading Detection: Real Trajectory Replay on a 3D Simulator

Apr 07, 2026

NavCrafter: Exploring 3D Scenes from a Single Image

Apr 03, 2026

VRUD: A Drone Dataset for Complex Vehicle-VRU Interactions within Mixed Traffic

Apr 01, 2026