"Text": models, code, and papers

Segment and Track Anything

May 11, 2023
Yangming Cheng, Liulei Li, Yuanyou Xu, Xiaodi Li, Zongxin Yang, Wenguan Wang, Yi Yang

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

May 11, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

When the Majority is Wrong: Leveraging Annotator Disagreement for Subjective Tasks

May 11, 2023
Eve Fleisig, Rediet Abebe, Dan Klein

Paint it Black: Generating paintings from text descriptions

Feb 17, 2023
Mahnoor Shahid, Mark Koch, Niklas Schneider

FDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models

Jun 01, 2023
Hao Zhang, Yanbo Xu, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang

"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning

Jun 01, 2023
Abisek Rajakumar Kalarani, Pushpak Bhattacharyya, Niyati Chhaya, Sumit Shekhar

LIV: Language-Image Representations and Rewards for Robotic Control

Jun 01, 2023
Yecheng Jason Ma, William Liang, Vaidehi Som, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman

Explainable Recommender with Geometric Information Bottleneck

May 09, 2023
Hanqi Yan, Lin Gui, Menghan Wang, Kun Zhang, Yulan He

Transferring General Multimodal Pretrained Models to Text Recognition

Dec 19, 2022
Junyang Lin, Xuancheng Ren, Yichang Zhang, Gao Liu, Peng Wang, An Yang, Chang Zhou

Question Answering as Programming for Solving Time-Sensitive Questions

May 23, 2023
Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, Yujiu Yang
