Boqing Gong

On Discrete Prompt Optimization for Diffusion Models

Jun 27, 2024
Via arXiv

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

Jun 05, 2024
Via arXiv

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

Jun 04, 2024
Via arXiv

Automatic Jailbreaking of the Text-to-Image Generative AI Systems

May 28, 2024
Via arXiv

Large-Scale Multi-Center CT and MRI Segmentation of Pancreas with Deep Learning

May 20, 2024
Via arXiv

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Via arXiv

Distilling Vision-Language Models on Millions of Videos

Jan 11, 2024
Via arXiv

Instruct-Imagen: Image Generation with Multi-modal Instruction

Jan 03, 2024
Via arXiv

Towards A Unified Neural Architecture for Visual Recognition and Reasoning

Nov 10, 2023
Via arXiv

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

Oct 09, 2023
Via arXiv