"Information": models, code, and papers

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari

Blind Multimodal Quality Assessment: A Brief Survey and A Case Study of Low-light Images

Mar 18, 2023
Miaohui Wang, Zhuowei Xu, Mai Xu, Weisi Lin

Tag2Text: Guiding Vision-Language Model via Image Tagging

Mar 10, 2023
Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang

Accurate Real-time Polyp Detection in Videos from Concatenation of Latent Features Extracted from Consecutive Frames

Mar 10, 2023
Hemin Ali Qadir, Younghak Shin, Jacob Bergsland, Ilangko Balasingham

DDS3D: Dense Pseudo-Labels with Dynamic Threshold for Semi-Supervised 3D Object Detection

Mar 10, 2023
Jingyu Li, Zhe Liu, Jinghua Hou, Dingkang Liang

Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation

Mar 09, 2023
Qi Chen, Ziyang Ma, Tao Liu, Xu Tan, Qu Lu, Xie Chen, Kai Yu

RiDDLE: Reversible and Diversified De-identification with Latent Encryptor

Mar 09, 2023
Dongze Li, Wei Wang, Kang Zhao, Jing Dong, Tieniu Tan

Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking

Feb 22, 2023
Manuel Stoiber, Mariam Elsayed, Anne E. Reichert, Florian Steidle, Dongheui Lee, Rudolph Triebel

Envisioning the Next-Gen Document Reader

Feb 15, 2023
Catherine Yeh, Nedim Lipka, Franck Dernoncourt

TRUSformer: Improving Prostate Cancer Detection from Micro-Ultrasound Using Attention and Self-Supervision

Mar 03, 2023
Mahdi Gilany, Paul Wilson, Andrea Perera-Ortega, Amoon Jamzad, Minh Nguyen Nhat To, Fahimeh Fooladgar, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi
