"Text": models, code, and papers

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Dec 05, 2023
Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang


Photorealistic Video Generation with Diffusion Models

Dec 11, 2023
Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, Li Fei-Fei, Irfan Essa, Lu Jiang, José Lezama


An Empirical Study of Frame Selection for Text-to-Video Retrieval

Nov 01, 2023
Mengxia Wu, Min Cao, Yang Bai, Ziyin Zeng, Chen Chen, Liqiang Nie, Min Zhang


Interpretable-by-Design Text Classification with Iteratively Generated Concept Bottleneck

Oct 30, 2023
Josh Magnus Ludan, Qing Lyu, Yue Yang, Liam Dugan, Mark Yatskar, Chris Callison-Burch


3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models

Nov 09, 2023
Haibo Yang, Yang Chen, Yingwei Pan, Ting Yao, Zhineng Chen, Tao Mei


Fast Inference Through The Reuse Of Attention Maps In Diffusion Models

Dec 13, 2023
Rosco Hunter, Łukasz Dudziak, Mohamed S. Abdelfattah, Abhinav Mehrotra, Sourav Bhattacharya, Hongkai Wen


MLLMs-Augmented Visual-Language Representation Learning

Dec 01, 2023
Yanqing Liu, Kai Wang, Wenqi Shao, Ping Luo, Yu Qiao, Mike Zheng Shou, Kaipeng Zhang, Yang You


Improved Visual Grounding through Self-Consistent Explanations

Dec 07, 2023
Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez


Inversion-Free Image Editing with Natural Language

Dec 07, 2023
Sihan Xu, Yidong Huang, Jiayi Pan, Ziqiao Ma, Joyce Chai


Style Aligned Image Generation via Shared Attention

Dec 04, 2023
Amir Hertz, Andrey Voynov, Shlomi Fruchter, Daniel Cohen-Or
