"Text": models, code, and papers

To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer

Oct 12, 2023
Md Mushfiqur Rahman, Fardin Ahsan Sakib, Fahim Faisal, Antonios Anastasopoulos

Via arXiv

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

Sep 29, 2023
Pan Zhang, Xiaoyi Dong, Bin Wang, Yuhang Cao, Chao Xu, Linke Ouyang, Zhiyuan Zhao, Shuangrui Ding, Songyang Zhang, Haodong Duan, Hang Yan, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang

Via arXiv

Semantic-aware Video Representation for Few-shot Action Recognition

Nov 10, 2023
Yutao Tang, Benjamin Bejar, Rene Vidal

Via arXiv

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Sep 27, 2023
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou

Via arXiv

Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

Nov 02, 2023
Peng Jin, Yang Wu, Yanbo Fan, Zhongqian Sun, Yang Wei, Li Yuan

Via arXiv

Personalizing Keyword Spotting with Speaker Information

Nov 06, 2023
Beltrán Labrador, Pai Zhu, Guanlong Zhao, Angelo Scorza Scarpati, Quan Wang, Alicia Lozano-Diez, Alex Park, Ignacio López Moreno

Via arXiv

Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective

Oct 16, 2023
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su

Via arXiv

DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines

Nov 17, 2023
Chenyu Jiang, Zhen Jia, Shuai Zheng, Yida Wang, Chuan Wu

Via arXiv

NewsGPT: ChatGPT Integration for Robot-Reporter

Nov 11, 2023
Abdelhadi Hireche, Abdelkader Nasreddine Belkacem, Sadia Jamil, Chao Chen

Via arXiv

Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips

Oct 01, 2023
Reshma Ramaprasad

Via arXiv