V2Meow: Meowing to the Visual Beat via Music Generation


May 11, 2023
Kun Su, Judith Yue Li, Qingqing Huang, Dima Kuzmin, Joonseok Lee, Chris Donahue, Fei Sha, Aren Jansen, Yu Wang, Mauro Verzetti, Timo I. Denk


MusicLM: Generating Music From Text


Jan 26, 2023
Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank

* Supplementary material at https://google-research.github.io/seanet/musiclm/examples and https://kaggle.com/datasets/googleai/musiccaps 


MAQA: A Multimodal QA Benchmark for Negation


Jan 09, 2023
Judith Yue Li, Aren Jansen, Qingqing Huang, Joonseok Lee, Ravi Ganti, Dima Kuzmin

* NeurIPS 2022 SyntheticData4ML Workshop 


MuLan: A Joint Embedding of Music Audio and Natural Language


Aug 26, 2022
Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis

* To appear in ISMIR 2022 


Text-Driven Separation of Arbitrary Sounds


Apr 12, 2022
Kevin Kilgour, Beat Gfeller, Qingqing Huang, Aren Jansen, Scott Wisdom, Marco Tagliasacchi

* Submitted to INTERSPEECH 2022 


Universal Paralinguistic Speech Representations Using Self-Supervised Conformers


Oct 09, 2021
Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang


BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition


Oct 01, 2021
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu

* 14 pages, 7 figures, 13 tables; v2: minor corrections, reference baselines and bibliography updated 


Attention Bottlenecks for Multimodal Fusion


Jun 30, 2021
Arsha Nagrani, Shan Yang, Anurag Arnab, Aren Jansen, Cordelia Schmid, Chen Sun


Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation


Jun 01, 2021
Scott Wisdom, Aren Jansen, Ron J. Weiss, Hakan Erdogan, John R. Hershey

* 5 pages, 1 figure. Submitted to WASPAA 2021


The Benefit Of Temporally-Strong Labels In Audio Event Classification


May 14, 2021
Shawn Hershey, Daniel P W Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R Channing Moore, Manoj Plakal

* Accepted for publication at ICASSP 2021 
