Okko Räsänen

Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances

Jun 16, 2023
Huang Xie, Khazar Khorrami, Okko Räsänen, Tuomas Virtanen

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

Jun 08, 2023
Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System

Jun 05, 2023
Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Räsänen

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model

May 19, 2023
Puyuan Peng, Shang-Wen Li, Okko Räsänen, Abdelrahman Mohamed, David Harwath

Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors

May 16, 2023
Einari Vaaras, Manu Airaksinen, Sampsa Vanhatalo, Okko Räsänen

Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research

May 03, 2023
María Andrea Cruz Blandón, Alejandrina Cristia, Okko Räsänen

On Negative Sampling for Contrastive Audio-Text Retrieval

Nov 08, 2022
Huang Xie, Okko Räsänen, Tuomas Virtanen

Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition

Jun 21, 2022
Einari Vaaras, Manu Airaksinen, Okko Räsänen

Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel

Nov 04, 2021
Kevin Eloff, Arnu Pretorius, Okko Räsänen, Herman A. Engelbrecht, Herman Kamper

Unsupervised Audio-Caption Aligning Learns Correspondences between Individual Sound Events and Textual Phrases

Oct 06, 2021
Huang Xie, Okko Räsänen, Konstantinos Drossos, Tuomas Virtanen
