Michael Zeng

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

Feb 12, 2024
Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Steven Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Nov 10, 2023
Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan

Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction

Sep 25, 2023
Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition

Aug 03, 2023
Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

May 30, 2023
Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

May 24, 2023
Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, Xuedong Huang

i-Code Studio: A Configurable and Composable Framework for Integrative AI

May 23, 2023
Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

LMGQS: A Large-scale Dataset for Query-focused Summarization

May 22, 2023
Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng

InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT

May 22, 2023
Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng
