"Text": models, code, and papers

ReelFramer: Co-creating News Reels on Social Media with Generative AI

Apr 19, 2023
Sitong Wang, Samia Menon, Tao Long, Keren Henderson, Dingzeyu Li, Kevin Crowston, Mark Hansen, Jeffrey V. Nickerson, Lydia B. Chilton

Open-Vocabulary Point-Cloud Object Detection without 3D Annotation

Apr 03, 2023
Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

Optimizing Prompts for Text-to-Image Generation

Dec 19, 2022
Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language

Dec 16, 2022
Yusuke Yasuda, Tomoki Toda

SIA-FTP: A Spoken Instruction Aware Flight Trajectory Prediction Framework

May 02, 2023
Dongyue Guo, Jianwei Zhang, Yi Lin

token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text

Oct 30, 2022
Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

LP-SLAM: Language-Perceptive RGB-D SLAM system based on Large Language Model

Mar 17, 2023
Weiyi Zhang, Yushi Guo, Liting Niu, Peijun Li, Chun Zhang, Zeyu Wan, Jiaxiang Yan, Fasih Ud Din Farrukh, Debing Zhang

Fast Text-Conditional Discrete Denoising on Vector-Quantized Latent Spaces

Nov 14, 2022
Dominic Rampas, Pablo Pernias, Elea Zhong, Marc Aubreville

TextCraft: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Text

Nov 04, 2022
Aditya Sanghi, Rao Fu, Vivian Liu, Karl Willis, Hooman Shayani, Amir Hosein Khasahmadi, Srinath Sridhar, Daniel Ritchie

Prior-RadGraphFormer: A Prior-Knowledge-Enhanced Transformer for Generating Radiology Graphs from X-Rays

Mar 27, 2023
Yiheng Xiong, Jingsong Liu, Kamilia Zaripova, Sahand Sharifzadeh, Matthias Keicher, Nassir Navab
