"speech": models, code, and papers

TnT - A Statistical Part-of-Speech Tagger

Mar 13, 2000
Thorsten Brants

Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison has even shown that TnT performs significantly better for the tested corpora. We describe the basic model of TnT and the techniques used for smoothing and for handling unknown words. Furthermore, we present evaluations on two corpora.

* Proceedings of ANLP-2000, Seattle, WA 
* 8 pages 
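
The abstract names a Markov-model tagger with smoothing but gives no formulas. As a rough illustration only, the sketch below assumes a second-order model over tags whose trigram probabilities are smoothed by linear interpolation of unigram, bigram, and trigram relative frequencies. The function names and the fixed weights in `lambdas` are hypothetical placeholders, not values from the paper; in practice such weights would be estimated from data.

```python
from collections import Counter

def train_counts(tag_sequences):
    """Count tag unigrams, bigrams, and trigrams in tagged training sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for tags in tag_sequences:
        for i, t in enumerate(tags):
            uni[t] += 1
            if i >= 1:
                bi[(tags[i - 1], t)] += 1
            if i >= 2:
                tri[(tags[i - 2], tags[i - 1], t)] += 1
    return uni, bi, tri

def interpolated_prob(t1, t2, t3, counts, lambdas=(0.1, 0.3, 0.6)):
    """Estimate P(t3 | t1, t2) as a weighted mix of three relative frequencies.

    The weights are illustrative assumptions and should sum to 1.
    """
    uni, bi, tri = counts
    total = sum(uni.values())
    l1, l2, l3 = lambdas
    # Each estimate falls back to 0.0 when its conditioning context is unseen.
    p_uni = uni[t3] / total if total else 0.0
    p_bi = bi[(t2, t3)] / uni[t2] if uni[t2] else 0.0
    p_tri = tri[(t1, t2, t3)] / bi[(t1, t2)] if bi[(t1, t2)] else 0.0
    return l1 * p_uni + l2 * p_bi + l3 * p_tri
```

With counts from `train_counts([["DT", "NN", "VB"], ["DT", "NN", "NN"]])`, calling `interpolated_prob("DT", "NN", "VB", counts)` mixes all three estimates, and an unseen trigram still receives a nonzero probability through the unigram term, which is the point of the smoothing.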

A Text to Speech (TTS) System with English to Punjabi Conversion

Nov 13, 2014
Prabhsimran Singh, Amritpal Singh

The paper shows how to develop an application that converts English text into Punjabi and then converts the text to speech (TTS), i.e., pronounces it. Such an application can be especially beneficial for people with special needs.

* International Journal of Computer and Communication System Engineering, Volume 1, Issue 04, December 2014 
* 5 pages, 8 figures, 3 tables 

Massively Multilingual Adversarial Speech Recognition

Apr 03, 2019
Oliver Adams, Matthew Wiesner, Shinji Watanabe, David Yarowsky

We report on adaptation of multilingual end-to-end speech recognition models trained on as many as 100 languages. Our findings shed light on the relative importance of similarity between the target and pretraining languages along the dimensions of phonetics, phonology, language family, geographical location, and orthography. In this context, experiments demonstrate the effectiveness of two additional pretraining objectives in encouraging language-independent encoder representations: a context-independent phoneme objective paired with a language-adversarial classification objective.

* Accepted at NAACL-HLT 2019 

Online Updating of Word Representations for Part-of-Speech Tagging

Apr 02, 2016
Wenpeng Yin, Tobias Schnabel, Hinrich Schütze

We propose online unsupervised domain adaptation (DA), which is performed incrementally as data comes in and is applicable when batch DA is not possible. In a part-of-speech (POS) tagging evaluation, we find that online unsupervised DA performs as well as batch DA.

* EMNLP'2015. Released POS tagger "FLORS" for online domain adaptation 

BUT Opensat 2019 Speech Recognition System

Jan 30, 2020
Martin Karafiát, Murali Karthick Baskar, Igor Szöke, Hari Krishna Vydana, Karel Veselý, Jan "Honza" Černocký

The paper describes the BUT Automatic Speech Recognition (ASR) systems submitted to the OpenSAT evaluations in two domain categories: low-resource languages and public safety communications. The first was challenging due to the lack of training data, so various architectures and multilingual approaches were employed, and their combination led to superior performance. The second was challenging because the recordings were made in extreme conditions, such as a specific channel, speakers under stress, and high levels of noise. Data augmentation was essential to achieve reasonably good performance.


Part-of-Speech Tagging with Minimal Lexicalization

Dec 27, 2003
Virginia Savova, Leonid Peshkin

We use a Dynamic Bayesian Network to represent compactly a variety of sublexical and contextual features relevant to Part-of-Speech (PoS) tagging. The outcome is a flexible tagger (LegoTag) with state-of-the-art performance (3.6% error on a benchmark corpus). We explore the effect of eliminating redundancy and radically reducing the size of feature vocabularies. We find that a small but linguistically motivated set of suffixes results in improved cross-corpora generalization. We also show that a minimal lexicon limited to function words is sufficient to ensure reasonable performance.

* 10 pages text; 1 figure. To appear in "Current Issues in Linguistic Theory: Recent Advances in Natural Language Processing"; John Benjamins Publishers, Amsterdam 

Specifying Intonation from Context for Speech Synthesis

Jul 18, 1994
Scott Prevost, Mark Steedman

This paper presents a theory and a computational implementation for generating prosodically appropriate synthetic speech in response to database queries. Proper distinctions of contrast and emphasis are expressed in an intonation contour that is synthesized by rule under the control of a grammar, a discourse model, and a knowledge base. The theory is based on Combinatory Categorial Grammar, a formalism which easily integrates the notions of syntactic constituency, semantics, prosodic phrasing and information structure. Results from our current implementation demonstrate the system's ability to generate a variety of intonational possibilities for a given sentence depending on the discourse context.

* 18 pages 

Evolution of Part-of-Speech in Classical Chinese

Sep 23, 2020
Bai Li

Classical Chinese is a language notable for its word class flexibility: the same word may often be used as a noun or a verb. Bisang (2008) claimed that Classical Chinese is a precategorical language, where the syntactic position of a word determines its part-of-speech category. In this paper, we apply entropy-based metrics to evaluate these claims on historical corpora. We further explore differences between nouns and verbs in Classical Chinese: using psycholinguistic norms, we find a positive correlation between concreteness and noun usage. Finally, we align character embeddings from Classical and Modern Chinese, and find that verbs undergo more semantic change than nouns.
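
The abstract does not say which entropy-based metrics are used. One plausible reading, sketched here purely as an assumption, is the Shannon entropy of the part-of-speech tags observed for a single word: a word always used as a noun scores 0 bits, while a word split evenly between noun and verb scores 1 bit. The function name `pos_entropy` and the tag labels are hypothetical.

```python
import math
from collections import Counter

def pos_entropy(tags):
    """Shannon entropy (in bits) of the POS tags observed for one word."""
    n = len(tags)
    if n == 0:
        return 0.0
    counts = Counter(tags)
    # Sum p * log2(1 / p) over the observed tag categories.
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```

For example, `pos_entropy(["NOUN", "VERB"])` gives 1.0, and higher values across a corpus would indicate greater word class flexibility under this assumed metric.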

Hateminers : Detecting Hate speech against Women

Dec 17, 2018
Punyajoy Saha, Binny Mathew, Pawan Goyal, Animesh Mukherjee

With the online proliferation of hate speech, there is an urgent need for systems that can detect such harmful content. In this paper, we present the machine learning models developed for the Automatic Misogyny Identification (AMI) shared task at EVALITA 2018. We generate three types of features to represent each tweet: sentence embeddings, TF-IDF vectors, and BOW vectors. These features are then concatenated and fed into the machine learning models. Our model came first for the English Subtask A and fifth for the English Subtask B. We release our winning model for public use and it's available at

* 5 Pages, 2 Figures, 1 Table, Model Available at 
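
The feature pipeline in the abstract (three representations concatenated per tweet) can be illustrated with a minimal standard-library sketch. Everything here is an assumption for illustration: the function names are hypothetical, the TF-IDF weighting is one common smoothed variant rather than the authors' exact choice, and the sentence embedding is taken as a precomputed list of floats because the abstract does not name the embedding model.

```python
import math
from collections import Counter

def build_vocab(docs):
    """Sorted vocabulary over all tokenized documents."""
    return sorted({tok for doc in docs for tok in doc})

def bow_vector(doc, vocab):
    """Bag-of-words vector: raw count of each vocabulary term."""
    counts = Counter(doc)
    return [float(counts[w]) for w in vocab]

def tfidf_vector(doc, vocab, docs):
    """TF-IDF vector using a smoothed IDF (an assumed weighting scheme)."""
    counts = Counter(doc)
    n = len(docs)
    vec = []
    for w in vocab:
        df = sum(1 for d in docs if w in d)
        idf = math.log((1 + n) / (1 + df)) + 1.0
        tf = counts[w] / len(doc) if doc else 0.0
        vec.append(tf * idf)
    return vec

def concat_features(sentence_embedding, doc, vocab, docs):
    """Concatenate embedding, TF-IDF, and BOW features into one vector."""
    return (list(sentence_embedding)
            + tfidf_vector(doc, vocab, docs)
            + bow_vector(doc, vocab))
```

The resulting vector has length `len(sentence_embedding) + 2 * len(vocab)` and could be passed to any standard classifier; which classifiers the authors used is not stated in the abstract.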

Qualitative and Quantitative Models of Speech Translation

Aug 24, 1994
Hiyan Alshawi

This paper compares a qualitative reasoning model of translation with a quantitative statistical model. We consider these models within the context of two hypothetical speech translation systems, starting with a logic-based design and pointing out which of its characteristics are best preserved or eliminated in moving to the second, quantitative design. The quantitative language and translation models are based on relations between lexical heads of phrases. Statistical parameters for structural dependency, lexical transfer, and linear order are used to select a set of implicit relations between words in a source utterance, a corresponding set of relations between target language words, and the most likely translation of the original utterance.

* Appeared in proceedings of the ACL workshop "The Balancing Act, Combining Symbolic and Statistical Approaches to Language", Las Cruces NM, July 1994. LaTeX, 24 pages 
