Lijuan Wang

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

Jan 30, 2024
Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan

Via arXiv

Bring Metric Functions into Diffusion Models

Jan 04, 2024
Jie An, Zhengyuan Yang, Jianfeng Wang, Linjie Li, Zicheng Liu, Lijuan Wang, Jiebo Luo

Via arXiv

COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Jan 01, 2024
Alex Jinpeng Wang, Linjie Li, Kevin Qinghong Lin, Jianfeng Wang, Kevin Lin, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou

Via arXiv

InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models

Dec 21, 2023
Bingbing Wen, Zhengyuan Yang, Jianfeng Wang, Zhe Gan, Bill Howe, Lijuan Wang

Via arXiv

Interfacing Foundation Models' Embeddings

Dec 12, 2023
Xueyan Zou, Linjie Li, Jianfeng Wang, Jianwei Yang, Mingyu Ding, Zhengyuan Yang, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee, Lijuan Wang

Via arXiv

Segment and Caption Anything

Dec 01, 2023
Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu

Via arXiv

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning

Nov 29, 2023
Chaoyi Zhang, Kevin Lin, Zhengyuan Yang, Jianfeng Wang, Linjie Li, Chung-Ching Lin, Zicheng Liu, Lijuan Wang

Via arXiv

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Nov 13, 2023
An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, Jianfeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang

Via arXiv

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Oct 30, 2023
Kevin Lin, Faisal Ahmed, Linjie Li, Chung-Ching Lin, Ehsan Azarnasab, Zhengyuan Yang, Jianfeng Wang, Lin Liang, Zicheng Liu, Yumao Lu, Ce Liu, Lijuan Wang

Via arXiv

DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design

Oct 23, 2023
Kevin Lin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Lijuan Wang

Via arXiv