Alert button

"Text": models, code, and papers
Alert button

VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model

Oct 07, 2023
Yayun He, Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao

Viaarxiv icon

From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions

Oct 11, 2023
Zhengfeng Lai, Haotian Zhang, Wentao Wu, Haoping Bai, Aleksei Timofeev, Xianzhi Du, Zhe Gan, Jiulong Shan, Chen-Nee Chuah, Yinfei Yang, Meng Cao

Figure 1 for From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions
Figure 2 for From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions
Figure 3 for From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions
Figure 4 for From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions
Viaarxiv icon

Language Models As Semantic Indexers

Oct 11, 2023
Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang

Figure 1 for Language Models As Semantic Indexers
Figure 2 for Language Models As Semantic Indexers
Figure 3 for Language Models As Semantic Indexers
Figure 4 for Language Models As Semantic Indexers
Viaarxiv icon

Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach

Oct 18, 2023
Feng Luo, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang

Figure 1 for Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach
Figure 2 for Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach
Figure 3 for Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach
Figure 4 for Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach
Viaarxiv icon

Direct Neural Machine Translation with Task-level Mixture of Experts models

Oct 18, 2023
Isidora Chara Tourni, Subhajit Naskar

Viaarxiv icon

Grounded and Well-rounded: A Methodological Approach to the Study of Cross-modal and Cross-lingual Grounding

Oct 18, 2023
Timothee Mickus, Elaine Zosa, Denis Paperno

Viaarxiv icon

Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control

Sep 24, 2023
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari

Figure 1 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 2 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 3 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 4 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Viaarxiv icon

Effects of Human Adversarial and Affable Samples on BERT Generalizability

Oct 17, 2023
Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

Figure 1 for Effects of Human Adversarial and Affable Samples on BERT Generalizability
Figure 2 for Effects of Human Adversarial and Affable Samples on BERT Generalizability
Figure 3 for Effects of Human Adversarial and Affable Samples on BERT Generalizability
Figure 4 for Effects of Human Adversarial and Affable Samples on BERT Generalizability
Viaarxiv icon

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks

Oct 17, 2023
Zejun Li, Ye Wang, Mengfei Du, Qingwen Liu, Binhao Wu, Jiwen Zhang, Chengxing Zhou, Zhihao Fan, Jie Fu, Jingjing Chen, Xuanjing Huang, Zhongyu Wei

Figure 1 for ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Figure 2 for ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Figure 3 for ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Figure 4 for ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
Viaarxiv icon

Towards reducing hallucination in extracting information from financial reports using Large Language Models

Oct 16, 2023
Bhaskarjit Sarmah, Tianjie Zhu, Dhagash Mehta, Stefano Pasquali

Viaarxiv icon