Zhuowen Tu

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Mar 04, 2024

AffordanceLLM: Grounding Affordance from Vision Language Models

Jan 12, 2024

Restoration by Generation with Constrained Priors

Dec 28, 2023

TokenCompose: Grounding Diffusion with Token-level Supervision

Dec 06, 2023

When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

Nov 15, 2023

Dolfin: Diffusion Layout Transformers without Autoencoder

Oct 25, 2023

SkeleTR: Towards Skeleton-based Action Recognition in the Wild

Sep 20, 2023

Object-Centric Multiple Object Tracking

Sep 05, 2023

Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability

Aug 29, 2023