Jiji Tang

Let Storytelling Tell Vivid Stories: An Expressive and Fluent Multimodal Storyteller

Mar 12, 2024
Chuanqi Zang, Jiji Tang, Rongsheng Zhang, Zeng Zhao, Tangjie Lv, Mingtao Pei, Wei Liang

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Jan 23, 2024
Siyu Zou, Jiji Tang, Yiyi Zhou, Jing He, Chaoyi Zhao, Rongsheng Zhang, Zhipeng Hu, Xiaoshuai Sun

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation

Aug 06, 2023
Haowei Wang, Jiji Tang, Jiayi Ji, Xiaoshuai Sun, Rongsheng Zhang, Yiwei Ma, Minda Zhao, Lincheng Li, Zeng Zhao, Tangjie Lv, Rongrong Ji

Structure-CLIP: Enhance Multi-modal Language Representations with Structure Knowledge

May 06, 2023
Yufeng Huang, Jiji Tang, Zhuo Chen, Rongsheng Zhang, Xinfeng Zhang, Weijie Chen, Zeng Zhao, Tangjie Lv, Zhipeng Hu, Wen Zhang

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph

Jun 30, 2020
Fei Yu, Jiji Tang, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
