Thomas Pellegrini

IRIT-SAMoVA

Audio classification with Dilated Convolution with Learnable Spacings

Sep 25, 2023
Ismail Khalfaoui-Hassani, Timothée Masquelier, Thomas Pellegrini

Multilingual Audio Captioning using machine translated data

Sep 14, 2023
Matéo Cousin, Étienne Labbé, Thomas Pellegrini

CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding

Sep 01, 2023
Étienne Labbé, Thomas Pellegrini, Julien Pinquier

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?

Aug 29, 2023
Etienne Labbé, Thomas Pellegrini, Julien Pinquier

Adapting a ConvNeXt model to audio classification on AudioSet

Jun 01, 2023
Thomas Pellegrini, Ismail Khalfaoui-Hassani, Etienne Labbé, Timothée Masquelier

Dilated Convolution with Learnable Spacings: beyond bilinear interpolation

Jun 01, 2023
Ismail Khalfaoui-Hassani, Thomas Pellegrini, Timothée Masquelier

Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer

May 02, 2023
Etienne Labbé, Julien Pinquier, Thomas Pellegrini

Is my automatic audio captioning system so bad? SPIDEr-max: a metric to consider several caption candidates

Nov 14, 2022
Etienne Labbé, Thomas Pellegrini, Julien Pinquier

Audio-video fusion strategies for active speaker detection in meetings

Jun 09, 2022
Lionel Pibre, Francisco Madrigal, Cyrille Equoy, Frédéric Lerasle, Thomas Pellegrini, Julien Pinquier, Isabelle Ferrané

Dilated convolution with learnable spacings

Dec 07, 2021
Ismail Khalfaoui Hassani, Thomas Pellegrini, Timothée Masquelier
