"Text": models, code, and papers

Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation

Jul 28, 2023
Zhiyuan Li, Dongnan Liu, Heng Wang, Chaoyi Zhang, Weidong Cai

DiffVoice: Text-to-Speech with Latent Diffusion

Apr 23, 2023
Zhijun Liu, Yiwei Guo, Kai Yu

Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models

Apr 04, 2023
Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang

Cascaded Cross-Modal Transformer for Request and Complaint Detection

Jul 27, 2023
Nicolae-Catalin Ristea, Radu Tudor Ionescu

Explainable Topic-Enhanced Argument Mining from Heterogeneous Sources

Jul 22, 2023
Jiasheng Si, Yingjie Zhu, Xingyu Shi, Deyu Zhou, Yulan He

Dialogue Shaping: Empowering Agents through NPC Interaction

Jul 28, 2023
Wei Zhou, Xiangyu Peng, Mark Riedl

Efficient Guided Generation for Large Language Models

Jul 20, 2023
Brandon T. Willard, Rémi Louf

Findings of Factify 2: Multimodal Fake News Detection

Jul 19, 2023
S Suryavardan, Shreyash Mishra, Megha Chakraborty, Parth Patwa, Anku Rani, Aman Chadha, Aishwarya Reganti, Amitava Das, Amit Sheth, Manoj Chinnakotla, Asif Ekbal, Srijan Kumar

Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation

May 23, 2023
Susung Hong, Junyoung Seo, Sunghwan Hong, Heeseong Shin, Seungryong Kim

KDSTM: Neural Semi-supervised Topic Modeling with Knowledge Distillation

Jul 04, 2023
Weijie Xu, Xiaoyu Jiang, Jay Desai, Bin Han, Fuqin Yan, Francis Iannacci
