Masahiro Yasuda

Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis

Apr 25, 2025
Via arXiv

Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes

Mar 28, 2025
Via arXiv

M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP

Mar 28, 2025
Via arXiv

M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation

Jun 04, 2024
Via arXiv

Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis

Apr 12, 2024
Via arXiv

6DoF SELD: Sound Event Localization and Detection Using Microphones and Motion Tracking Sensors on self-motioning human

Mar 04, 2024
Via arXiv

First-shot anomaly sound detection for machine condition monitoring: A domain generalization baseline

Mar 01, 2023
Via arXiv

Multi-view and Multi-modal Event Detection Utilizing Transformer-based Multi-sensor fusion

Feb 18, 2022
Via arXiv

Echo-aware Adaptation of Sound Event Localization and Detection in Unknown Environments

Feb 18, 2022
Via arXiv

Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head

Feb 17, 2022
Via arXiv