"Text": models, code, and papers

Magic3D: High-Resolution Text-to-3D Content Creation

Nov 18, 2022
Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning

Mar 21, 2023
Sung-Feng Huang, Chia-ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-yi Lee

Inducing anxiety in large language models increases exploration and bias

Apr 21, 2023
Julian Coda-Forno, Kristin Witte, Akshay K. Jagadish, Marcel Binz, Zeynep Akata, Eric Schulz

Eyettention: An Attention-based Dual-Sequence Model for Predicting Human Scanpaths during Reading

Apr 21, 2023
Shuwen Deng, David R. Reich, Paul Prasse, Patrick Haller, Tobias Scheffer, Lena A. Jäger

Using Large Text-to-Image Models with Structured Prompts for Skin Disease Identification: A Case Study

Jan 17, 2023
Sajith Rajapaksa, Jean Marie Uwabeza Vianney, Renell Castro, Farzad Khalvati, Shubhra Aich

TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation

Aug 03, 2022
Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, Ran Xu, Joseph F. JaJa, Larry S. Davis

Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation

Mar 27, 2023
Alex Jones, Isaac Caswell, Ishank Saxena, Orhan Firat

Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens

Mar 27, 2023
Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang

TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models

Mar 29, 2023
Md Kamrul Hasan, Md Saiful Islam, Sangwu Lee, Wasifur Rahman, Iftekhar Naim, Mohammed Ibrahim Khan, Ehsan Hoque

Multi-step Jailbreaking Privacy Attacks on ChatGPT

Apr 11, 2023
Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Yangqiu Song
