"Text": models, code, and papers

Patchwork Learning: A Paradigm Towards Integrative Analysis across Diverse Biomedical Data Sources

May 13, 2023
Suraj Rajendran, Weishen Pan, Mert R. Sabuncu, Yong Chen, Jiayu Zhou, Fei Wang

Interpreting Vision and Language Generative Models with Semantic Visual Priors

May 04, 2023
Michele Cafagna, Lina M. Rojas-Barahona, Kees van Deemter, Albert Gatt

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Dec 22, 2022
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou

On Negative Sampling for Contrastive Audio-Text Retrieval

Nov 08, 2022
Huang Xie, Okko Räsänen, Tuomas Virtanen

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari

Muse: Text-To-Image Generation via Masked Generative Transformers

Jan 02, 2023
Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Prefix tuning for automated audio captioning

Mar 30, 2023
Minkyu Kim, Kim Sung-Bin, Tae-Hyun Oh

Otter: A Multi-Modal Model with In-Context Instruction Tuning

May 05, 2023
Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu

Token Imbalance Adaptation for Radiology Report Generation

Apr 18, 2023
Yuexin Wu, I-Chan Huang, Xiaolei Huang

Enhancing Indic Handwritten Text Recognition Using Global Semantic Information

Dec 15, 2022
Ajoy Mondal, C. V. Jawahar
