"speech": models, code, and papers

A Maximum-Entropy Partial Parser for Unrestricted Text

Jul 17, 1998
Wojciech Skut, Thorsten Brants

This paper describes a partial parser that assigns syntactic structures to sequences of part-of-speech tags. The program uses the maximum entropy parameter estimation method, which allows a flexible combination of different knowledge sources: the hierarchical structure, parts of speech and phrasal categories. In effect, the parser goes beyond simple bracketing and recognises even fairly complex structures. We give accuracy figures for different applications of the parser.

* 9 pages, LaTeX 


Compositional Semantics in Verbmobil

Jul 30, 1996
Johan Bos, Björn Gambäck, Christian Lieske, Yoshiki Mori, Manfred Pinkal, Karsten Worm

The paper discusses how compositional semantics is implemented in the Verbmobil speech-to-speech translation system using LUD, a description language for underspecified discourse representation structures. The description language and its formal interpretation in DRT are described as well as its implementation together with the architecture of the system's entire syntactic-semantic processing module. We show that a linguistically sound theory and formalism can be properly implemented in a system with (near) real-time requirements.

* Proceedings of COLING '96 
* 6 pages, LaTeX, uses colap.sty 


Tagset Reduction Without Information Loss

May 03, 1995
Thorsten Brants

A technique for reducing a tagset used for n-gram part-of-speech disambiguation is introduced and evaluated in an experiment. The technique ensures that all information that is provided by the original tagset can be restored from the reduced one. This is crucial, since we are interested in the linguistically motivated tags for part-of-speech disambiguation. The reduced tagset needs fewer parameters for its statistical model and allows more accurate parameter estimation. Additionally, there is a slight but not significant improvement of tagging accuracy.

* 3 pages, LaTeX, to appear in proceedings of ACL-95, student session 
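
The abstract above does not spell out the merging criterion. One condition consistent with "all information can be restored" is that two tags may share a reduced tag only if no single word can carry both, so that the word together with its reduced tag identifies the original tag. The sketch below checks that condition on a toy lexicon; the lexicon, the class name `C1`, and the function names are hypothetical and are not taken from the paper.

```python
def is_lossless(lexicon, tag_to_class):
    """True if every (word, reduced tag) pair maps back to exactly one original tag.

    lexicon: dict mapping a word to the set of original tags it can carry.
    tag_to_class: dict mapping an original tag to its reduced tag;
                  tags missing from the dict are kept unchanged.
    """
    for tags in lexicon.values():
        seen = set()
        for tag in tags:
            reduced = tag_to_class.get(tag, tag)
            if reduced in seen:
                # Two tags of the same word collapsed into one reduced tag.
                return False
            seen.add(reduced)
    return True


def restore(word, reduced_tag, lexicon, tag_to_class):
    """Recover the original tag from a known word and its reduced tag."""
    candidates = [t for t in lexicon[word]
                  if tag_to_class.get(t, t) == reduced_tag]
    if len(candidates) != 1:
        raise ValueError(f"cannot restore tag for {word!r}/{reduced_tag!r}")
    return candidates[0]


# Toy lexicon (hypothetical, for illustration only).
lexicon = {
    "walk": {"NN", "VB"},
    "walks": {"NNS", "VBZ"},
    "the": {"DT"},
}

# NN and VBZ never occur on the same word, so merging them loses nothing.
safe = {"NN": "C1", "VBZ": "C1"}
# NN and VB both occur on "walk", so merging them is not reversible.
unsafe = {"NN": "C1", "VB": "C1"}

print(is_lossless(lexicon, safe))
print(is_lossless(lexicon, unsafe))
print(restore("walks", "C1", lexicon, safe))
```

The sketch only covers words present in the lexicon; how unknown words are handled is left open here.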


Nonlinear prediction with neural nets in ADPCM

Mar 22, 2022
Marcos Faundez-Zanuy, Francesc Vallverdu, Enric Monte

In recent years there has been growing interest in nonlinear speech models. Several works have shown that nonlinear techniques perform better, but little attention has been paid to implementing the nonlinear model in real applications. This work studies the behaviour of a nonlinear predictive model based on neural nets in a speech waveform coder. Our novel scheme obtains an improvement in SEGSNR between 1 and 2 dB for an adaptive quantization ranging from 2 to 5 bits.

* Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), 1998, pp. 345-348 vol.1 
* 4 pages, conference held in Seattle, WA, USA. arXiv admin note: text overlap with arXiv:2203.01818 
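
The SEGSNR figure quoted in the abstract above is a segmental signal-to-noise ratio. A minimal sketch of one common definition follows: the SNR in dB is computed per frame and then averaged over frames. The frame length, the handling of silent frames, and the function name are assumptions made for illustration; the abstract does not state the settings used in the paper.

```python
import math


def segmental_snr_db(reference, decoded, frame_len=128):
    """Average of per-frame SNR values in dB (one common SEGSNR definition).

    reference, decoded: equal-length sequences of samples.
    frame_len: samples per frame (assumed value, not taken from the paper).
    """
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")

    frame_snrs = []
    # Only complete frames are used; a trailing partial frame is ignored.
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy == 0.0 or noise_energy == 0.0:
            # Skip silent or perfectly reconstructed frames (assumed policy).
            continue
        frame_snrs.append(10.0 * math.log10(signal_energy / noise_energy))

    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


# Toy signals: every decoded sample is off by 1.0 from a reference of 2.0,
# so each frame has a signal-to-noise energy ratio of 4.
reference = [2.0] * 8
decoded = [1.0] * 8
print(segmental_snr_db(reference, decoded, frame_len=4))
```

Because the average is taken over per-frame dB values, quiet frames count as much as loud ones, which is why this measure is often preferred to a single global SNR for speech coders.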


Syllabification by Phone Categorization

Jul 15, 2018
Jacob Krantz, Maxwell Dulin, Paul De Palma, Mark VanDam

Syllables play an important role in speech synthesis, speech recognition, and spoken document retrieval. A novel, low cost, and language agnostic approach to dividing words into their corresponding syllables is presented. A hybrid genetic algorithm constructs a categorization of phones optimized for syllabification. This categorization is used on top of a hidden Markov model sequence classifier to find syllable boundaries. The technique shows promising preliminary results when trained and tested on English words.

* Jacob Krantz, Maxwell Dulin, Paul De Palma, and Mark VanDam. 2018. Syllabification by Phone Categorization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '18) 47-48 


Unified and Multilingual Author Profiling for Detecting Haters

Sep 19, 2021
Ipek Baris Schlicht, Angel Felipe Magnossão de Paula

This paper presents a unified user profiling framework to identify hate speech spreaders by processing their tweets regardless of the language. The framework encodes the tweets with sentence transformers and applies an attention mechanism to select important tweets for learning user profiles. Furthermore, the attention layer helps to explain why a user is a hate speech spreader by producing attention weights at both token and post level. Our proposed model outperformed the state-of-the-art multilingual transformer models.

* Published at CLEF 2021 
* 9 pages, 2 figures 


Refinement of a Structured Language Model

Jan 24, 2000
Ciprian Chelba, Frederick Jelinek

A new language model for speech recognition inspired by linguistic analysis is presented. The model develops hidden hierarchical structure incrementally and uses it to extract meaningful information from the word history - thus enabling the use of extended distance dependencies - in an attempt to complement the locality of currently used n-gram Markov models. The model, its probabilistic parametrization, a reestimation algorithm for the model parameters and a set of experiments meant to evaluate its potential for speech recognition are presented.

* Proceedings of the International Conference on Advances in Pattern Recognition, 1998, pp. 275-284, Plymouth, UK 
* 10 pages 


May I Ask Who's Calling? Named Entity Recognition on Call Center Transcripts for Privacy Law Compliance

Oct 29, 2020
Micaela Kaplan

We investigate using Named Entity Recognition on a new type of user-generated text: a call center conversation. These conversations combine problems from spontaneous speech with problems novel to conversational Automated Speech Recognition, including incorrect recognition, alongside other common problems from noisy user-generated text. Using our own corpus with new annotations, training custom contextual string embeddings, and applying a BiLSTM-CRF, we match state-of-the-art results on our novel task.

* Proceedings of the 2020 EMNLP Workshop W-NUT: The Sixth Workshop on Noisy User-generated Text (2020) 1-6 
* The 6th Workshop on Noisy User-generated Text (W-NUT) 2020 at EMNLP 


Automatically Identifying Language Family from Acoustic Examples in Low Resource Scenarios

Dec 01, 2020
Peter Wu, Yifan Zhong, Alan W Black

Existing multilingual speech NLP works focus on a relatively small subset of languages, and thus current linguistic understanding of languages predominantly stems from classical approaches. In this work, we propose a method to analyze language similarity using deep learning. Namely, we train a model on the Wilderness dataset and investigate how its latent space compares with classical language family findings. Our approach provides a new direction for cross-lingual data augmentation in any speech-based NLP task.


Abusive Language Detection and Characterization of Twitter Behavior

Sep 26, 2020
Dincy Davis, Reena Murali, Remesh Babu

In this work, abusive language detection in online content is performed using a Bidirectional Recurrent Neural Network (BiRNN). The main objective is to cover various forms of abusive behavior on Twitter and to detect whether a given speech is abusive or not. Results for these behaviors are compared against Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) methods, and the proposed BiRNN proves to be the better deep learning model for automatic abusive speech detection.

* International Journal of Computer Sciences and Engineering, Vol. 8, Issue 7, July 2020 
* 7 pages, 7 figures and 8 tables 
