"Text": models, code, and papers

PERF: Panoramic Neural Radiance Field from a Single Panorama

Oct 25, 2023
Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu

Via arXiv

Unified speech and gesture synthesis using flow matching

Oct 08, 2023
Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter

Via arXiv

Sparse Fine-tuning for Inference Acceleration of Large Language Models

Oct 13, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

Via arXiv

AXNav: Replaying Accessibility Tests from Natural Language

Oct 13, 2023
Maryam Taeb, Amanda Swearngin, Eldon Schoop, Ruijia Cheng, Yue Jiang, Jeffrey Nichols

Via arXiv

Text Injection for Capitalization and Turn-Taking Prediction in Speech Models

Aug 14, 2023
Shaan Bijwadia, Shuo-yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath

Via arXiv

Clustering of Spell Variations for Proper Nouns Transliterated from the other languages

Oct 12, 2023
Prathamesh Pawar

Via arXiv

CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes

Oct 15, 2023
Yulei Qin, Xingyu Chen, Yunhang Shen, Chaoyou Fu, Yun Gu, Ke Li, Xing Sun, Rongrong Ji

Via arXiv

TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution

Aug 13, 2023
Baolin Liu, Zongyuan Yang, Pengfei Wang, Junjie Zhou, Ziqi Liu, Ziyi Song, Yan Liu, Yongping Xiong

Via arXiv

Story Visualization by Online Text Augmentation with Context Memory

Aug 19, 2023
Daechul Ahn, Daneul Kim, Gwangmo Song, Seung Hwan Kim, Honglak Lee, Dongyeop Kang, Jonghyun Choi

Via arXiv

PAI-Diffusion: Constructing and Serving a Family of Open Chinese Diffusion Models for Text-to-image Synthesis on the Cloud

Sep 11, 2023
Chengyu Wang, Zhongjie Duan, Bingyan Liu, Xinyi Zou, Cen Chen, Kui Jia, Jun Huang

Via arXiv