"Text": models, code, and papers

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Jun 30, 2022
Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang

Via arXiv

ScaleFace: Uncertainty-aware Deep Metric Learning

Sep 12, 2022
Roman Kail, Kirill Fedyanin, Nikita Muravev, Alexey Zaytsev, Maxim Panov

Via arXiv

PDO-e$\text{S}^\text{2}$CNNs: Partial Differential Operator Based Equivariant Spherical CNNs

Apr 08, 2021
Zhengyang Shen, Tiancheng Shen, Zhouchen Lin, Jinwen Ma

Via arXiv

Injecting Text in Self-Supervised Speech Pretraining

Aug 27, 2021
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro Moreno

Via arXiv

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

Nov 03, 2021
Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, Aran Komatsuzaki

Via arXiv

Multi-Tailed, Multi-Headed, Spatial Dynamic Memory refined Text-to-Image Synthesis

Oct 15, 2021
Amrit Diggavi Seshadri, Balaraman Ravindran

Via arXiv

LViT: Language meets Vision Transformer in Medical Image Segmentation

Jun 29, 2022
Zihan Li, Yunxiang Li, Qingde Li, You Zhang, Puyang Wang, Dazhou Guo, Le Lu, Dakai Jin, Qingqi Hong

Via arXiv

3D Rendering Framework for Data Augmentation in Optical Character Recognition

Sep 27, 2022
Andreas Spruck, Maximiliane Hawesch, Anatol Maier, Christian Riess, Jürgen Seiler, André Kaup

Via arXiv

Primitive Representation Learning for Scene Text Recognition

May 10, 2021
Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao

Via arXiv

Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret

May 25, 2022
Jiawei Huang, Li Zhao, Tao Qin, Wei Chen, Nan Jiang, Tie-Yan Liu

Via arXiv