"Text": models, code, and papers

Efficient End-to-End Visual Document Understanding with Rationale Distillation

Nov 16, 2023
Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

End-to-end Joint Rich and Normalized ASR with a limited amount of rich training data

Nov 29, 2023
Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

GenZI: Zero-Shot 3D Human-Scene Interaction Generation

Nov 29, 2023
Lei Li, Angela Dai

Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning

Nov 29, 2023
Liang Peng, Haoran Cheng, Zheng Yang, Ruisi Zhao, Linxuan Xia, Chaotian Song, Qinglin Lu, Wei Liu, Boxi Wu

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Nov 29, 2023
Daniel Geng, Inbum Park, Andrew Owens

Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model

Nov 29, 2023
Yen-Ting Lin, Yun-Nung Chen

AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents

Nov 29, 2023
Duomin Wang, Bin Dai, Yu Deng, Baoyuan Wang

Wired Perspectives: Multi-View Wire Art Embraces Generative AI

Nov 26, 2023
Zhiyu Qu, Lan Yang, Honggang Zhang, Tao Xiang, Kaiyue Pang, Yi-Zhe Song

LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis

Nov 21, 2023
Peiang Zhao, Han Li, Ruiyang Jin, S. Kevin Zhou

Fine-Grained Open Domain Image Animation with Motion Guidance

Nov 21, 2023
Zuozhuo Dai, Zhenghao Zhang, Yao Yao, Bingxue Qiu, Siyu Zhu, Long Qin, Weizhi Wang
