Title: PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages

Nov 12, 2021

Mohsin Ali, Kandukuri Sai Teja, Sumanth Manduru, Parth Patwa, Amitava Das

Abstract: NLP applications for code-mixed (CM) or mix-lingual text have gained significant momentum recently, mainly because language mixing is prevalent in social media communication in multilingual societies such as India, Mexico, Europe, and parts of the USA. Word embeddings are basic building blocks of any NLP system today, yet word embeddings for CM languages remain unexplored territory. The major bottleneck for CM word embeddings is switching points, the locations where the language switches. These locations lack context, and statistical systems fail to model this phenomenon because of the high variance in the seen examples. In this paper we present our initial observations on applying switching-point-based positional encoding techniques to CM language, specifically Hinglish (Hindi-English). Results are only marginally better than SOTA, but it is evident that positional encoding could be an effective way to train position-sensitive language models for CM text.
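
To make the notion of a switching point concrete, the following is a minimal Python sketch, not the method from the paper. It assumes each token already carries a language tag (the `HI`/`EN` labels, the example sentence, and both function names are hypothetical), marks every index where the tag differs from the previous token, and then shows one possible position scheme that restarts counting at each switch. The restart rule is an illustrative assumption only; the abstract does not specify how PESTO computes its encodings.

```python
from typing import List


def switching_points(lang_tags: List[str]) -> List[int]:
    """Return every index i whose language tag differs from the tag at i - 1."""
    return [i for i in range(1, len(lang_tags)) if lang_tags[i] != lang_tags[i - 1]]


def reset_positions(lang_tags: List[str]) -> List[int]:
    """Assign position indices that restart at 0 at each switching point.

    Assumption for illustration only: this is one simple way to make
    positions depend on switching points, not the scheme used in the paper.
    """
    positions: List[int] = []
    pos = 0
    for i, tag in enumerate(lang_tags):
        if i > 0 and tag != lang_tags[i - 1]:
            pos = 0  # language changed, so restart the position counter
        positions.append(pos)
        pos += 1
    return positions


# Hypothetical Hinglish sentence with hand-assigned language tags.
tokens = ["mujhe", "yeh", "movie", "bahut", "pasand", "hai"]
tags = ["HI", "HI", "EN", "HI", "HI", "HI"]

sp = switching_points(tags)
pos_ids = reset_positions(tags)
print(sp)       # indices of tokens that begin a new language span
print(pos_ids)  # position index per token, restarting at each switch
```

In this example the tokens "movie" and "bahut" both start a new language span, so each receives position 0 under the assumed restart rule.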

* Accepted as Student Abstract at AAAI 2022
