
"speech": models, code, and papers

Sequence Modeling using Gated Recurrent Neural Networks

Jan 01, 2015
Mohammad Pezeshki

In this paper, we use Recurrent Neural Networks to capture and model human motion data and to generate motions by predicting the next immediate data point at each time step. Our RNN is equipped with the recently proposed Gated Recurrent Units, which have shown promising results on sequence modeling problems such as machine translation and speech synthesis. We demonstrate that this model can capture long-term dependencies in the data and generate realistic motions.
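The gated update at the heart of such a model can be sketched with a scalar GRU cell; the weights and input sequence below are illustrative, not the paper's trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One GRU update for a scalar input/state; p holds the six weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h)               # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)               # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand                    # interpolate old/new

# Illustrative (untrained) weights; the state h summarizes the sequence
# so far and would feed a layer predicting the next data point.
params = {"wz": 0.5, "uz": 0.5, "wr": 0.5, "ur": 0.5, "wh": 1.0, "uh": 1.0}
h = 0.0
for x in [0.1, 0.2, 0.3]:
    h = gru_step(x, h, params)
print(round(h, 4))
```

The update gate z lets the cell carry state across many steps unchanged, which is what allows long-term dependencies to survive.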


A Compact Architecture for Dialogue Management Based on Scripts and Meta-Outputs

Jun 09, 2000
Manny Rayner, Beth Ann Hockey, Frankie James

We describe an architecture for spoken dialogue interfaces to semi-autonomous systems that transforms speech signals through successive representations of linguistic, dialogue, and domain knowledge. Each step produces an output, and a meta-output describing the transformation, with an executable program in a simple scripting language as the final result. The output/meta-output distinction permits perspicuous treatment of diverse tasks such as resolving pronouns, correcting user misconceptions, and optimizing scripts.

* Language Technology Joint Conference ANLP-NAACL 2000. 29 April - 4 May 2000, Seattle, WA 
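The output/meta-output split can be illustrated with a toy pipeline stage; the function name and meta fields below are assumptions for illustration, not the paper's scripting language:

```python
# Each stage returns an (output, meta_output) pair: the transformed
# representation plus a record describing the transformation.
# Names and fields are illustrative, not the paper's API.
def resolve_pronouns(utterance, context):
    meta = {"stage": "pronoun_resolution",
            "substituted": "it" in utterance.split()}
    resolved = utterance.replace("it", context["last_object"])
    return resolved, meta

out, meta = resolve_pronouns("move it left", {"last_object": "the rover"})
print(out)   # -> move the rover left
print(meta)  # -> {'stage': 'pronoun_resolution', 'substituted': True}
```

Keeping the meta-output separate is what lets later stages report on user misconceptions without corrupting the executable output itself.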


Practical experiments with regular approximation of context-free languages

Oct 25, 1999
Mark-Jan Nederhof

Several methods are discussed that construct a finite automaton given a context-free grammar, including both methods that lead to subsets and those that lead to supersets of the original context-free language. Some of these methods of regular approximation are new, and some others are presented here in a more refined form with respect to existing literature. Practical experiments with the different methods of regular approximation are performed for spoken-language input: hypotheses from a speech recognizer are filtered through a finite automaton.

* 28 pages. To appear in Computational Linguistics 26(1), March 2000 
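The filtering setup can be sketched as follows: a superset approximation accepts a regular superset of the context-free language, so any recognizer hypothesis the automaton rejects can be safely discarded. The toy grammar below, a^n b^n approximated by the regular language a*b*, is my own example, not one from the paper:

```python
def dfa_accepts(transitions, start, finals, tokens):
    """Run a deterministic finite automaton over a token sequence."""
    state = start
    for tok in tokens:
        state = transitions.get((state, tok))
        if state is None:
            return False  # no transition: hypothesis filtered out
    return state in finals

# Superset approximation of the CFL {a^n b^n}: the regular language a*b*.
trans = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
print(dfa_accepts(trans, 0, {0, 1}, ["a", "a", "b", "b"]))  # True: kept
print(dfa_accepts(trans, 0, {0, 1}, ["b", "a"]))            # False: discarded
```

A subset approximation would flip the guarantee: anything accepted is certainly in the language, at the cost of rejecting some valid strings.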


Transducers from Rewrite Rules with Backreferences

Apr 15, 1999
Dale Gerdemann, Gertjan van Noord

Context sensitive rewrite rules have been widely used in several areas of natural language processing, including syntax, morphology, phonology and speech processing. Kaplan and Kay, Karttunen, and Mohri & Sproat have given various algorithms to compile such rewrite rules into finite-state transducers. The present paper extends this work by allowing a limited form of backreferencing in such rules. The explicit use of backreferencing leads to more elegant and general solutions.

* 8 pages, EACL 1999 Bergen 
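The flavor of backreferencing involved can be illustrated with a regular-expression rewrite whose replacement copies the matched span; the bracket-insertion rule below is my own toy example, not one from the paper:

```python
import re

# A rewrite rule with a backreference: the output reuses the matched
# span itself (\1), as in markup rules that bracket whatever matched.
def mark_digits(s):
    return re.sub(r"(\d+)", r"[\1]", s)

print(mark_digits("room 42, floor 3"))  # -> room [42], floor [3]
```

Without backreferencing, such a rule would have to enumerate every possible matched string on its right-hand side, which is why allowing even a limited form yields more general rules.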


Robust stochastic parsing using the inside-outside algorithm

Dec 19, 1994
Ted Briscoe, Nick Waegner

The paper describes a parser of sequences of (English) part-of-speech labels which utilises a probabilistic grammar trained using the inside-outside algorithm. The initial (meta)grammar is defined by a linguist and further rules compatible with metagrammatical constraints are automatically generated. During training, rules with very low probability are rejected yielding a wide-coverage parser capable of ranking alternative analyses. A series of corpus-based experiments describe the parser's performance.

* Revised and updated version of paper from AAAI Workshop on Probabilistically-based Natural Language Processing Techniques, 1992, 16 pages, uuencoded, compressed postscript 
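Ranking alternative analyses with a trained probabilistic grammar reduces to comparing derivation probabilities, i.e. the product of the probabilities of the rules used; the rules and numbers below are illustrative, not values learned by inside-outside training:

```python
from math import prod

# Toy rule probabilities (illustrative, not trained).
rule_prob = {
    ("S", ("NP", "VP")): 0.9,
    ("NP", ("DT", "NN")): 0.6,
    ("NP", ("NN",)): 0.3,
    ("VP", ("VB", "NP")): 0.7,
}

def derivation_prob(rules):
    """Probability of a derivation = product of its rule probabilities."""
    return prod(rule_prob[r] for r in rules)

a = derivation_prob([("S", ("NP", "VP")), ("NP", ("DT", "NN")),
                     ("VP", ("VB", "NP")), ("NP", ("DT", "NN"))])
b = derivation_prob([("S", ("NP", "VP")), ("NP", ("NN",)),
                     ("VP", ("VB", "NP")), ("NP", ("NN",))])
print(a > b)  # the first analysis is ranked higher
```

Pruning rules whose trained probability falls below a threshold, as the abstract describes, simply removes them from such a table before parsing.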


Use of Machine Learning Technique to maximize the signal over background for $H \rightarrow \tau\tau$

Jul 07, 2021
Kanhaiya Gupta

In recent years, artificial neural networks (ANNs) have won numerous contests in pattern recognition and machine learning. ANNs have been applied to problems ranging from speech recognition to prediction of protein secondary structure, classification of cancers, and gene prediction. Here, we aim to maximize the chances of finding the Higgs boson decaying to two $\tau$ leptons in a pseudo dataset, using a machine learning technique to classify the recorded events as signal or background.

* 9 pages, 14 figures 
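The signal/background decision at the core of such a classifier can be sketched with a single sigmoid unit; the features, weights, and threshold below are illustrative, not the trained network from the paper:

```python
import math

def classify(features, weights, bias, threshold=0.5):
    """Score an event with one sigmoid unit; above threshold = signal."""
    s = sum(w * f for w, f in zip(weights, features)) + bias
    p = 1.0 / (1.0 + math.exp(-s))
    return ("signal" if p > threshold else "background"), p

# Illustrative kinematic features (e.g. invariant mass, missing energy).
label, p = classify([1.2, 0.8], weights=[2.0, 1.5], bias=-2.5)
print(label, round(p, 3))
```

A full ANN stacks many such units; raising the threshold trades signal efficiency for background rejection.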


Automated Word Stress Detection in Russian

Jul 12, 2019
Maria Ponomareva, Kirill Milintsevich, Ekaterina Chernyak, Anatoly Starostin

In this study we address the problem of automated word stress detection in Russian using character-level models and no part-of-speech taggers. We use a simple bidirectional RNN with LSTM nodes and achieve an accuracy of 90% or higher. We experiment with two training datasets and show that using the data from an annotated corpus is much more efficient than using a dictionary, since it allows us to take into account word frequencies and the morphological context of the word.

* Published in Proceedings of the First Workshop on Subword and Character Level Models in NLP, pages 31–35, Copenhagen, Denmark, September 7, 2017 
* SCLeM 2017 
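The character-level input such a model consumes can be sketched as a simple character-to-id encoding; the vocabulary-building scheme below is a common convention, not necessarily the authors' exact preprocessing:

```python
def encode(word, vocab):
    """Map each character to an integer id, growing the vocab as needed."""
    return [vocab.setdefault(ch, len(vocab)) for ch in word]

vocab = {}
print(encode("молоко", vocab))  # -> [0, 1, 2, 1, 3, 1]
print(encode("корова", vocab))  # reuses ids for characters seen before
```

These integer sequences would then be embedded and fed to the bidirectional LSTM, which predicts the stressed position per character.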


How2: A Large-scale Dataset for Multimodal Language Understanding

Nov 01, 2018
Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations. We also present integrated sequence-to-sequence baselines for machine translation, automatic speech recognition, spoken language translation, and multimodal summarization. By making available data and code for several multimodal natural language tasks, we hope to stimulate more research on these and similar challenges, to obtain a deeper understanding of multimodality in language processing.


Understanding Abuse: A Typology of Abusive Language Detection Subtasks

May 30, 2017
Zeerak Waseem, Thomas Davidson, Dana Warmsley, Ingmar Weber

As the body of research on abusive language detection and analysis grows, there is a need for critical consideration of the relationships between the different subtasks that have been grouped under this label. Based on work on hate speech, cyberbullying, and online abuse, we propose a typology that captures central similarities and differences between subtasks, and we discuss its implications for data annotation and feature construction. We emphasize the practical actions researchers can take to best approach their abusive language detection subtask of interest.

* To appear in the proceedings of the 1st Workshop on Abusive Language Online. Please cite that version 
