Alert button
Picture for Xuedong Huang

Xuedong Huang

Alert button

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

May 24, 2023
Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, Xuedong Huang

Figure 1 for ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Figure 2 for ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Figure 3 for ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Figure 4 for ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Viaarxiv icon

i-Code Studio: A Configurable and Composable Framework for Integrative AI

May 23, 2023
Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

Figure 1 for i-Code Studio: A Configurable and Composable Framework for Integrative AI
Figure 2 for i-Code Studio: A Configurable and Composable Framework for Integrative AI
Figure 3 for i-Code Studio: A Configurable and Composable Framework for Integrative AI
Figure 4 for i-Code Studio: A Configurable and Composable Framework for Integrative AI
Viaarxiv icon

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

May 21, 2023
Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

Figure 1 for i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Figure 2 for i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Figure 3 for i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Figure 4 for i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Viaarxiv icon

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

Aug 21, 2022
Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang

Figure 1 for Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
Figure 2 for Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
Figure 3 for Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
Figure 4 for Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
Viaarxiv icon

i-Code: An Integrative and Composable Multimodal Learning Framework

May 05, 2022
Ziyi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

Figure 1 for i-Code: An Integrative and Composable Multimodal Learning Framework
Figure 2 for i-Code: An Integrative and Composable Multimodal Learning Framework
Figure 3 for i-Code: An Integrative and Composable Multimodal Learning Framework
Figure 4 for i-Code: An Integrative and Composable Multimodal Learning Framework
Viaarxiv icon

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

Dec 14, 2021
Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang

Figure 1 for Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Figure 2 for Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Figure 3 for Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Figure 4 for Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
Viaarxiv icon

Florence: A New Foundation Model for Computer Vision

Nov 22, 2021
Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

Figure 1 for Florence: A New Foundation Model for Computer Vision
Figure 2 for Florence: A New Foundation Model for Computer Vision
Figure 3 for Florence: A New Foundation Model for Computer Vision
Figure 4 for Florence: A New Foundation Model for Computer Vision
Viaarxiv icon

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

Oct 20, 2021
Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang

Figure 1 for One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
Figure 2 for One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
Figure 3 for One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
Figure 4 for One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
Viaarxiv icon

Personalized Speech Enhancement: New Models and Comprehensive Evaluation

Oct 18, 2021
Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang

Figure 1 for Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Figure 2 for Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Figure 3 for Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Viaarxiv icon