Chris Kelly

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Mar 22, 2024
Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou

Via arXiv

VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Mar 14, 2024
Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou

Via arXiv

WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs

Mar 10, 2024
Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou

Via arXiv

UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

Nov 16, 2023
Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

Via arXiv

Large Language Models Encode Clinical Knowledge

Dec 26, 2022
Karan Singhal, Shekoofeh Azizi, Tao Tu, S. Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, Perry Payne, Martin Seneviratne, Paul Gamble, Chris Kelly, Nathaneal Scharli, Aakanksha Chowdhery, Philip Mansfield, Blaise Aguera y Arcas, Dale Webster, Greg S. Corrado, Yossi Matias, Katherine Chou, Juraj Gottweis, Nenad Tomasev, Yun Liu, Alvin Rajkomar, Joelle Barral, Christopher Semturs, Alan Karthikesalingam, Vivek Natarajan

Via arXiv