Zhuowen Tu

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024
Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto

HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data

Mar 18, 2024
Mengqi Zhang, Yang Fu, Zheng Ding, Sifei Liu, Zhuowen Tu, Xiaolong Wang

Bayesian Diffusion Models for 3D Shape Reconstruction

Mar 11, 2024
Haiyang Xu, Yu Lei, Zeyuan Chen, Xiang Zhang, Yue Zhao, Yilin Wang, Zhuowen Tu

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024
Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Mar 04, 2024
Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

AffordanceLLM: Grounding Affordance from Vision Language Models

Jan 12, 2024
Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li

Restoration by Generation with Constrained Priors

Dec 28, 2023
Zheng Ding, Xuaner Zhang, Zhuowen Tu, Zhihao Xia

TokenCompose: Grounding Diffusion with Token-level Supervision

Dec 06, 2023
Zirui Wang, Zhizhou Sha, Zheng Ding, Yilin Wang, Zhuowen Tu

When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

Nov 15, 2023
Tyler A. Chang, Catherine Arnett, Zhuowen Tu, Benjamin K. Bergen

Dolfin: Diffusion Layout Transformers without Autoencoder

Oct 25, 2023
Yilin Wang, Zeyuan Chen, Liangjun Zhong, Zheng Ding, Zhizhou Sha, Zhuowen Tu
