"Text": models, code, and papers

Syntax-Guided Transformers: Elevating Compositional Generalization and Grounding in Multimodal Environments

Nov 07, 2023
Danial Kamali, Parisa Kordjamshidi

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition

Sep 19, 2023
Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen

Is it Possible to Modify Text to a Target Readability Level? An Initial Investigation Using Zero-Shot Large Language Models

Sep 22, 2023
Asma Farajidizaji, Vatsal Raina, Mark Gales

Text-to-3D using Gaussian Splatting

Sep 29, 2023
Zilong Chen, Feng Wang, Huaping Liu

Navigating Text-To-Image Customization: From LyCORIS Fine-Tuning to Model Evaluation

Sep 26, 2023
Shin-Ying Yeh, Yu-Guan Hsieh, Zhidong Gao, Bernard B W Yang, Giyeong Oh, Yanmin Gong

Synthetic Text Generation using Hypergraph Representations

Sep 06, 2023
Natraj Raman, Sameena Shah

Robust Generalization Strategies for Morpheme Glossing in an Endangered Language Documentation Context

Nov 05, 2023
Michael Ginn, Alexis Palmer

MaRU: A Manga Retrieval and Understanding System Connecting Vision and Language

Oct 22, 2023
Conghao Tom Shen, Violet Yao, Yixin Liu

Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models

Oct 02, 2023
Hyeonho Jeong, Jong Chul Ye

Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval

Sep 16, 2023
Kaiyi Luo, Xulong Zhang, Jianzong Wang, Huaxiong Li, Ning Cheng, Jing Xiao
