Xinhao Mei

First-Shot Unsupervised Anomalous Sound Detection With Unknown Anomalies Estimated by Metadata-Assisted Audio Generation

Oct 22, 2023
Hejing Zhang, Qiaoxi Zhu, Jian Guan, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang

Via arXiv

FoleyGen: Visually-Guided Audio Generation

Sep 19, 2023
Xinhao Mei, Varun Nagaraja, Gael Le Lan, Zhaoheng Ni, Ernie Chang, Yangyang Shi, Vikas Chandra

Via arXiv

Enhance audio generation controllability through representation similarity regularization

Sep 15, 2023
Yangyang Shi, Gael Le Lan, Varun Nagaraja, Zhaoheng Ni, Xinhao Mei, Ernie Chang, Forrest Iandola, Yang Liu, Vikas Chandra

Via arXiv

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Aug 10, 2023
Haohe Liu, Qiao Tian, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley

Via arXiv

Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning

May 30, 2023
Jianyuan Sun, Xubo Liu, Xinhao Mei, Volkan Kılıç, Mark D. Plumbley, Wenwu Wang

Via arXiv

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

Mar 30, 2023
Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang

Via arXiv

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

Feb 16, 2023
Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo Mandic, Wenwu Wang, Mark D. Plumbley

Via arXiv

Towards Generating Diverse Audio Captions via Adversarial Training

Dec 05, 2022
Xinhao Mei, Xubo Liu, Jianyuan Sun, Mark D. Plumbley, Wenwu Wang

Via arXiv

Ontology-aware Learning and Evaluation for Audio Tagging

Nov 22, 2022
Haohe Liu, Qiuqiang Kong, Xubo Liu, Xinhao Mei, Wenwu Wang, Mark D. Plumbley

Via arXiv