"Text": models, code, and papers

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Oct 12, 2023
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter

Effects of Human Adversarial and Affable Samples on BERT Generalizability

Oct 13, 2023
Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection

Oct 13, 2023
Sehyun Choi, Tianqing Fang, Zhaowei Wang, Yangqiu Song

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding

Aug 19, 2023
Hao Feng, Zijian Wang, Jingqun Tang, Jinghui Lu, Wengang Zhou, Houqiang Li, Can Huang

SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models

Aug 14, 2023
Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar

GIST: Generating Image-Specific Text for Fine-grained Object Classification

Aug 04, 2023
Kathleen M. Lewis, Emily Mu, Adrian V. Dalca, John Guttag

Measuring reasoning capabilities of ChatGPT

Oct 08, 2023
Adrian Groza

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

Oct 02, 2023
Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu

LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech

Aug 31, 2023
Jie Chen, Xingchen Song, Zhendong Peng, Binbin Zhang, Fuping Pan, Zhiyong Wu

Evaluating Generative Models for Graph-to-Text Generation

Jul 27, 2023
Shuzhou Yuan, Michael Färber
