Aren Jansen

Dataset balancing can hurt model performance

Jun 30, 2023
R. Channing Moore, Daniel P. W. Ellis, Eduardo Fonseca, Shawn Hershey, Aren Jansen, Manoj Plakal

V2Meow: Meowing to the Visual Beat via Music Generation

May 11, 2023
Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk

MusicLM: Generating Music From Text

Jan 26, 2023
Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank

MAQA: A Multimodal QA Benchmark for Negation

Jan 09, 2023
Judith Yue Li, Aren Jansen, Qingqing Huang, Joonseok Lee, Ravi Ganti, Dima Kuzmin

MuLan: A Joint Embedding of Music Audio and Natural Language

Aug 26, 2022
Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis

Text-Driven Separation of Arbitrary Sounds

Apr 12, 2022
Kevin Kilgour, Beat Gfeller, Qingqing Huang, Aren Jansen, Scott Wisdom, Marco Tagliasacchi

Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

Oct 09, 2021
Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Oct 01, 2021
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu

Attention Bottlenecks for Multimodal Fusion

Jun 30, 2021
Arsha Nagrani, Shan Yang, Anurag Arnab, Aren Jansen, Cordelia Schmid, Chen Sun

Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation

Jun 01, 2021
Scott Wisdom, Aren Jansen, Ron J. Weiss, Hakan Erdogan, John R. Hershey
