Yusuke Iwasawa

JMed48k: A Multi-Profession Japanese Medical Licensing Benchmark for Vision-Language Model Evaluation

May 21, 2026
Via arXiv

E3VS-Bench: A Benchmark for Viewpoint-Dependent Active Perception in 3D Gaussian Splatting Scenes

Apr 20, 2026
Via arXiv

C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions

Apr 15, 2026
Via arXiv

Thinking While Listening: Fast-Slow Recurrence for Long-Horizon Sequential Modeling

Apr 02, 2026
Via arXiv

EC-Bench: Enumeration and Counting Benchmark for Ultra-Long Videos

Mar 31, 2026
Via arXiv

Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

Mar 20, 2026
Via arXiv

PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models

Mar 17, 2026
Via arXiv

Omanic: Towards Step-wise Evaluation of Multi-hop Reasoning in Large Language Models

Mar 17, 2026
Via arXiv

SAIL: Test-Time Scaling for In-Context Imitation Learning with VLM

Mar 09, 2026
Via arXiv

Residual Koopman Spectral Profiling for Predicting and Preventing Transformer Training Instability

Feb 26, 2026
Via arXiv