Prudhvi Raj Dachapally

Query-Key Normalization for Transformers

Oct 08, 2020

Alex Henry, Prudhvi Raj Dachapally, Shubham Pawar, Yuxuan Chen

Abstract: Low-resource language translation is a challenging but socially valuable NLP task. Building on recent work adapting the Transformer's normalization to this setting, we propose QKNorm, a normalization technique that modifies the attention mechanism to make the softmax function less prone to arbitrary saturation without sacrificing expressivity. Specifically, we apply $\ell_2$ normalization along the head dimension of each query and key matrix prior to multiplying them and then scale up by a learnable parameter instead of dividing by the square root of the embedding dimension. We show improvements averaging 0.928 BLEU over state-of-the-art bilingual benchmarks for 5 low-resource translation pairs from the TED Talks corpus and IWSLT'15.

* 8 pages, 2 figures, accepted at Findings of EMNLP 2020
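
The attention change described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes "along the head dimension" means normalizing each per-head query and key vector over its feature axis, and it treats the learnable scale as a fixed scalar `g`. The function names, tensor shapes, and the value `g=10.0` are illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-6):
    """Scale vectors along `axis` to unit L2 norm (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def qknorm_attention(q, k, v, g):
    """Attention with L2-normalized queries and keys.

    q, k, v: arrays of shape (heads, seq_len, head_dim) -- assumed layout.
    g: scalar scale; learnable in a real model, a plain float in this sketch.
    """
    q_hat = l2_normalize(q)
    k_hat = l2_normalize(k)
    # Dot products of unit vectors are cosine similarities in [-1, 1],
    # so the logits fed to the softmax are bounded by |g|.
    scores = g * (q_hat @ np.swapaxes(k_hat, -1, -2))
    weights = softmax(scores, axis=-1)
    return weights @ v, scores


rng = np.random.default_rng(0)
# Large-magnitude inputs: unnormalized dot products would be huge here.
q = rng.normal(size=(2, 4, 8)) * 50.0
k = rng.normal(size=(2, 4, 8)) * 50.0
v = rng.normal(size=(2, 4, 8))
out, scores = qknorm_attention(q, k, v, g=10.0)
```

Because the logits are bounded by `g` regardless of the magnitude of `q` and `k`, the softmax input range is controlled by a single parameter rather than by the norms of the projected vectors.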

In-depth Question classification using Convolutional Neural Networks

Mar 31, 2018

Prudhvi Raj Dachapally, Srikanth Ramanam

Abstract: Convolutional neural networks for computer vision are fairly intuitive. In a typical CNN used for image classification, the first layers learn edges, and the following layers learn filters that can identify an object. CNNs for natural language processing, however, are used less often and are not as intuitive. We have a good idea of what the convolution filters learn for the task of text classification, and to that end, we propose a neural network structure that can give good results in less time. We use convolutional neural networks to predict the primary, broader topic of a question, and then use a separate network for each predicted topic to classify its sub-topics.

* 4 pages, short paper
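
The two-stage routing described in the abstract (a coarse topic prediction followed by a topic-specific sub-topic classifier) could be wired together roughly as follows. This sketch shows only the control flow. The keyword-based functions are stand-ins for trained CNNs, and the labels (`LOCATION`, `NUMERIC`, `city`, `date`, and so on) are hypothetical, not taken from the paper.

```python
from typing import Callable, Dict, Tuple

Classifier = Callable[[str], str]


def make_hierarchical_classifier(
    coarse: Classifier, fine: Dict[str, Classifier]
) -> Callable[[str], Tuple[str, str]]:
    """Build a classifier that predicts a broad topic, then routes the
    question to the sub-topic classifier registered for that topic."""

    def classify(question: str) -> Tuple[str, str]:
        topic = coarse(question)
        sub_topic = fine[topic](question)
        return topic, sub_topic

    return classify


# Toy stand-ins for trained networks (hypothetical rules and labels).
def toy_coarse(question: str) -> str:
    return "LOCATION" if question.lower().startswith("where") else "NUMERIC"


toy_fine: Dict[str, Classifier] = {
    "LOCATION": lambda q: "city" if "city" in q.lower() else "other",
    "NUMERIC": lambda q: "date" if "year" in q.lower() else "count",
}

classify = make_hierarchical_classifier(toy_coarse, toy_fine)
location_result = classify("Where is the largest city in Japan?")
numeric_result = classify("What year did the war end?")
```

Each sub-topic classifier only has to separate the labels within its own topic, which is one plausible reason a design like this could train and run quickly.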

Facial Emotion Detection Using Convolutional Neural Networks and Representational Autoencoder Units

Jun 05, 2017

Prudhvi Raj Dachapally

Abstract: Because emotion is subjective, leveraging the knowledge and science behind labeled data and extracting the components that constitute it has been a challenging problem in the industry for many years. With the evolution of deep learning in computer vision, emotion recognition has become a widely tackled research problem. In this work, we propose two independent methods for this task. The first method uses autoencoders to construct a unique representation of each emotion, while the second is an 8-layer convolutional neural network (CNN). Both methods were trained on the posed-emotion dataset (JAFFE), and to test their robustness, both models were also tested on 100 random images from the Labeled Faces in the Wild (LFW) dataset, which consists of images that are candid rather than posed. The results show that with more fine-tuning and depth, our CNN model can outperform the state-of-the-art methods for emotion recognition. We also propose some exciting ideas for expanding the concept of representational autoencoders to improve their performance.

* 6 pages, 8 figures, and 3 tables
