Wanrong Zhu

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Jan 17, 2024
Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Nov 13, 2023
An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, Jianfeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

Aug 12, 2023
Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Aug 07, 2023
Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, Jenia Jitsev, Simon Kornblith, Pang Wei Koh, Gabriel Ilharco, Mitchell Wortsman, Ludwig Schmidt

Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality

Jul 18, 2023
Ziyang Wei, Wanrong Zhu, Wei Biao Wu

VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View

Jul 12, 2023
Raphael Schumann, Wanrong Zhu, Weixi Feng, Tsu-Jui Fu, Stefan Riezler, William Yang Wang

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

May 24, 2023
Weixi Feng, Wanrong Zhu, Tsu-jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang

Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation

May 18, 2023
Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
