Jing Yu Koh

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

Feb 28, 2024
Raghav Kapoor, Yash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, Ruslan Salakhutdinov

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024
Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried

Multimodal Graph Learning for Generative Tasks

Oct 12, 2023
Minji Yoon, Jing Yu Koh, Bryan Hooi, Ruslan Salakhutdinov

Generating Images with Multimodal Language Models

May 26, 2023
Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov

VQ3D: Learning a 3D-Aware Generative Model on ImageNet

Feb 14, 2023
Kyle Sargent, Jing Yu Koh, Han Zhang, Huiwen Chang, Charles Herrmann, Pratul Srinivasan, Jiajun Wu, Deqing Sun

Grounding Language Models to Images for Multimodal Generation

Jan 31, 2023
Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

Oct 06, 2022
Aishwarya Kamath, Peter Anderson, Su Wang, Jing Yu Koh, Alexander Ku, Austin Waters, Yinfei Yang, Jason Baldridge, Zarana Parekh

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Jun 22, 2022
Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, Zirui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

Simple and Effective Synthesis of Indoor 3D Scenes

Apr 06, 2022
Jing Yu Koh, Harsh Agrawal, Dhruv Batra, Richard Tucker, Austin Waters, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson
