"Text": models, code, and papers

Vietnamese Poem Generation & The Prospect Of Cross-Language Poem-To-Poem Translation

Jan 04, 2024
Triet Minh Huynh, Quan Le Bao

Via arXiv

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

Nov 29, 2023
Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

Via arXiv

GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse

Jan 07, 2024
Hongzhan Lin, Ziyang Luo, Bo Wang, Ruichao Yang, Jing Ma

Via arXiv

Latte: Latent Diffusion Transformer for Video Generation

Jan 05, 2024
Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Ziwei Liu, Yuan-Fang Li, Cunjian Chen, Yu Qiao

Via arXiv

Object-Centric Instruction Augmentation for Robotic Manipulation

Jan 05, 2024
Junjie Wen, Yichen Zhu, Minjie Zhu, Jinming Li, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang

Via arXiv

VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model

Jan 05, 2024
Pengying Wu, Yao Mu, Bingxian Wu, Yi Hou, Ji Ma, Shanghang Zhang, Chang Liu

Via arXiv

Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

Jan 05, 2024
Yatharth Gupta, Vishnu V. Jaddipal, Harish Prabhala, Sayak Paul, Patrick Von Platen

Via arXiv

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment

Dec 05, 2023
Brian Gordon, Yonatan Bitton, Yonatan Shafir, Roopal Garg, Xi Chen, Dani Lischinski, Daniel Cohen-Or, Idan Szpektor

Via arXiv

A Joint-Reasoning based Disease Q&A System

Jan 06, 2024
Prakash Chandra Sukhwal, Vaibhav Rajan, Atreyi Kankanhalli

Via arXiv

DocLLM: A layout-aware generative language model for multimodal document understanding

Dec 31, 2023
Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu

Via arXiv