"Text": models, code, and papers

Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification

Oct 31, 2022
Jingyu Li, Yusheng Tian, Tan Lee

Via arXiv

L3Cube-HindBERT and DevBERT: Pre-Trained BERT Transformer models for Devanagari based Hindi and Marathi Languages

Nov 27, 2022
Raviraj Joshi

Via arXiv

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

Feb 14, 2022
Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona

Via arXiv

On the Importance of Image Encoding in Automated Chest X-Ray Report Generation

Nov 24, 2022
Otabek Nazarov, Mohammad Yaqub, Karthik Nandakumar

Via arXiv

Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs

Nov 15, 2022
Kyle Richardson, Ronen Tamari, Oren Sultan, Reut Tsarfaty, Dafna Shahaf, Ashish Sabharwal

Via arXiv

Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP Features

Oct 17, 2022
Gokul Karthik Kumar, Karthik Nandakumar

Via arXiv

Automatic Text Summarization Methods: A Comprehensive Review

Mar 03, 2022
Divakar Yadav, Jalpa Desai, Arun Kumar Yadav

Via arXiv

Task Residual for Tuning Vision-Language Models

Nov 18, 2022
Tao Yu, Zhihe Lu, Xin Jin, Zhibo Chen, Xinchao Wang

Via arXiv

REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory

Dec 10, 2022
Ziniu Hu, Ahmet Iscen, Chen Sun, Zirui Wang, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, Alireza Fathi

Via arXiv

ARTS: Eliminating Inconsistency between Text Detection and Recognition with Auto-Rectification Text Spotter

Oct 20, 2021
Humen Zhong, Jun Tang, Wenhai Wang, Zhibo Yang, Cong Yao, Tong Lu

Via arXiv