"Text": models, code, and papers

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

May 21, 2023
Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

Via arXiv

When ChatGPT for Computer Vision Will Come? From 2D to 3D

May 10, 2023
Chenghao Li, Chaoning Zhang

Via arXiv

Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation

May 23, 2023
Shohei Higashiyama, Hiroki Ouchi, Hiroki Teranishi, Hiroyuki Otomo, Yusuke Ide, Aitaro Yamamoto, Hiroyuki Shindo, Yuki Matsuda, Shoko Wakamiya, Naoya Inoue, Ikuya Yamada, Taro Watanabe

Via arXiv

Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data

May 18, 2023
Yusheng Tian, Wei Liu, Tan Lee

Via arXiv

Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling

May 19, 2023
Shengqiong Wu, Hao Fei, Yixin Cao, Lidong Bing, Tat-Seng Chua

Via arXiv

DUB: Discrete Unit Back-translation for Speech Translation

May 19, 2023
Dong Zhang, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou

Via arXiv

Joint Representation Learning for Text and 3D Point Cloud

Jan 18, 2023
Rui Huang, Xuran Pan, Henry Zheng, Haojun Jiang, Zhifeng Xie, Shiji Song, Gao Huang

Via arXiv

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions

May 29, 2023
Qian Wang, Biao Zhang, Michael Birsak, Peter Wonka

Via arXiv

Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation From Deductive, Inductive and Abductive Views

Jun 16, 2023
Fangzhi Xu, Qika Lin, Jiawei Han, Tianzhe Zhao, Jun Liu, Erik Cambria

Via arXiv

A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition

Jun 16, 2023
Yuntao Shou, Xiangyong Cao, Deyu Meng, Bo Dong, Qinghua Zheng

Via arXiv