Han Yin

Why Can't They Remember? Uncovering Representation and Retrieval Bottlenecks in Multi-Turn Acoustic Memory

May 26, 2026
Via arXiv

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

May 18, 2026
Via arXiv

Towards Generalist Game Players: An Investigation of Foundation Models in the Game Multiverse

May 11, 2026
Via arXiv

PolyBench: A Benchmark for Compositional Reasoning in Polyphonic Audio

Mar 05, 2026
Via arXiv

Dynamic Fusion Multimodal Network for SpeechWellness Detection

Aug 25, 2025
Via arXiv

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation

Aug 10, 2025
Via arXiv

SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models

Aug 08, 2025
Via arXiv

EnvSDD: Benchmarking Environmental Sound Deepfake Detection

May 25, 2025
Via arXiv

Pushing the Frontiers of Self-Distillation Prototypes Network with Dimension Regularization and Score Normalization

May 20, 2025
Via arXiv

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection

Nov 02, 2024
Via arXiv