Abhay Zala

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Apr 15, 2024
Han Lin, Jaemin Cho, Abhay Zala, Mohit Bansal

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

Mar 18, 2024
Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal

DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

Oct 18, 2023
Abhay Zala, Han Lin, Jaemin Cho, Mohit Bansal

VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

Sep 26, 2023
Han Lin, Abhay Zala, Jaemin Cho, Mohit Bansal

Visual Programming for Text-to-Image Generation and Evaluation

May 24, 2023
Jaemin Cho, Abhay Zala, Mohit Bansal

Hierarchical Video-Moment Retrieval and Step-Captioning

Mar 29, 2023
Abhay Zala, Jaemin Cho, Satwik Kottur, Xilun Chen, Barlas Oğuz, Yashar Mehdad, Mohit Bansal

CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination

Jul 08, 2022
Hyounghun Kim, Abhay Zala, Mohit Bansal

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

Feb 08, 2022
Jaemin Cho, Abhay Zala, Mohit Bansal

FixMyPose: Pose Correctional Captioning and Retrieval

Apr 04, 2021
Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments

Nov 15, 2020
Hyounghun Kim, Abhay Zala, Graham Burri, Hao Tan, Mohit Bansal
