
Lijuan Wang

Aligning Large Multi-Modal Model with Robust Instruction Tuning

Jun 26, 2023
Fuxiao Liu, Kevin Lin, Linjie Li, Jianfeng Wang, Yaser Yacoob, Lijuan Wang

Via arXiv

MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jun 07, 2023
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Bo Li, Ding Zhao, Lijuan Wang

Via arXiv

Neural Voting Field for Camera-Space 3D Hand Pose Estimation

May 07, 2023
Lin Huang, Chung-Ching Lin, Kevin Lin, Lin Liang, Lijuan Wang, Junsong Yuan, Zicheng Liu

Via arXiv

An Empirical Study of Multimodal Model Merging

Apr 28, 2023
Yi-Lin Sung, Linjie Li, Kevin Lin, Zhe Gan, Mohit Bansal, Lijuan Wang

Via arXiv

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Apr 14, 2023
Jaemin Cho, Linjie Li, Zhengyuan Yang, Zhe Gan, Lijuan Wang, Mohit Bansal

Via arXiv

Adaptive Human Matting for Dynamic Videos

Apr 12, 2023
Chung-Ching Lin, Jiang Wang, Kun Luo, Kevin Lin, Linjie Li, Lijuan Wang, Zicheng Liu

Via arXiv

Equivariant Similarity for Vision-Language Foundation Models

Mar 25, 2023
Tan Wang, Kevin Lin, Linjie Li, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang

Via arXiv

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation

Mar 22, 2023
Shengming Yin, Chenfei Wu, Huan Yang, Jianfeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Gong Ming, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan

Via arXiv

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

Mar 20, 2023
Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Ehsan Azarnasab, Faisal Ahmed, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang

Via arXiv

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

Feb 21, 2023
Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan

Via arXiv