"Text": models, code, and papers

Attentive Mask CLIP

Dec 16, 2022
Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang


AI based approach to Trailer Generation for Online Educational Courses

Jan 10, 2023
Prakhar Mishra, Chaitali Diwan, Srinath Srinivasa, G. Srinivasaraghavan


Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension

Jan 10, 2023
Zhuosheng Zhang, Hai Zhao, Longxiang Liu


L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi

Nov 21, 2022
Ananya Joshi, Aditi Kajale, Janhavi Gadre, Samruddhi Deode, Raviraj Joshi


RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks

Jul 14, 2022
Mohammad Esmaeilpour, Nourhene Chaalia, Patrick Cardinal


Using Multiple Instance Learning to Build Multimodal Representations

Dec 11, 2022
Peiqi Wang, William M. Wells, Seth Berkowitz, Steven Horng, Polina Golland


LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

Apr 19, 2022
Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei


Text-Driven Separation of Arbitrary Sounds

Apr 12, 2022
Kevin Kilgour, Beat Gfeller, Qingqing Huang, Aren Jansen, Scott Wisdom, Marco Tagliasacchi


MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration

Apr 28, 2022
Thomas Hayes, Songyang Zhang, Xi Yin, Guan Pang, Sasha Sheng, Harry Yang, Songwei Ge, Qiyuan Hu, Devi Parikh


Contrastive Positive Sample Propagation along the Audio-Visual Event Line

Nov 18, 2022
Jinxing Zhou, Dan Guo, Meng Wang
