Han Yin

A Multi-Stage Separation-and-Classification Framework Guided by Complementary Acoustic-to-Semantic Clues

Jun 23, 2026
Via arXiv

AudioDER: A Deduplication-Enhanced Reasoning Dataset for Post-Training Large Audio-Language Models

Jun 12, 2026
Via arXiv

Why Can't They Remember? Uncovering Representation and Retrieval Bottlenecks in Multi-Turn Acoustic Memory

May 26, 2026
Via arXiv

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

May 18, 2026
Via arXiv

Towards Generalist Game Players: An Investigation of Foundation Models in the Game Multiverse

May 11, 2026
Via arXiv

PolyBench: A Benchmark for Compositional Reasoning in Polyphonic Audio

Mar 05, 2026
Via arXiv

Dynamic Fusion Multimodal Network for SpeechWellness Detection

Aug 25, 2025
Via arXiv

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation

Aug 10, 2025
Via arXiv

SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models

Aug 08, 2025
Via arXiv

EnvSDD: Benchmarking Environmental Sound Deepfake Detection

May 25, 2025
Via arXiv