"Text": models, code, and papers

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

Oct 12, 2023
Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Lijuan Wang

Via arXiv

Guide3D: Create 3D Avatars from Text and Image Guidance

Aug 18, 2023
Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

Via arXiv

Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

Aug 12, 2023
Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik

Via arXiv

Key-phrase boosted unsupervised summary generation for FinTech organization

Oct 16, 2023
Aadit Deshpande, Shreya Goyal, Prateek Nagwanshi, Avinash Tripathy

Via arXiv

Shatter and Gather: Learning Referring Image Segmentation with Text Supervision

Aug 29, 2023
Dongwon Kim, Namyup Kim, Cuiling Lan, Suha Kwak

Via arXiv

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

Aug 31, 2023
Changli Wu, Yiwei Ma, Qi Chen, Haowei Wang, Gen Luo, Jiayi Ji, Xiaoshuai Sun

Via arXiv

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Oct 07, 2023
Nina Shvetsova, Anna Kukleva, Xudong Hong, Christian Rupprecht, Bernt Schiele, Hilde Kuehne

Via arXiv

Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph prediction

Oct 25, 2023
Sebastian Koch, Pedro Hermosilla, Narunas Vaskevicius, Mirco Colosi, Timo Ropinski

Via arXiv

Exploring Multilingual Text Data Distillation

Aug 09, 2023
Shivam Sahni, Harsh Patel

Via arXiv

First-Shot Unsupervised Anomalous Sound Detection With Unknown Anomalies Estimated by Metadata-Assisted Audio Generation

Oct 22, 2023
Hejing Zhang, Qiaoxi Zhu, Jian Guan, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang

Via arXiv