"speech": models, code, and papers

N-dimensional nonlinear prediction with MLP

Feb 24, 2022
Marcos Faundez-Zanuy

In this paper we propose a Non-Linear Predictive Vector quantizer (PVQ) for speech coding, based on Multi-Layer Perceptrons. With this scheme we have improved the results of our previous ADPCM coder with nonlinear prediction, and we have reduced the bit rate up to 1 bit per sample.

* 2002 11th European Signal Processing Conference, 2002, pp. 1-4 
* 4 pages 

VLSI Systems for signal processing and Communications

Jun 10, 2021
Aditya Kulkarni, Atharva Kulkarni, Ankit Lad, Laksh Maheshwari, Jayant Majji

The growing advances in VLSI technology and design tools have exponentially expanded the application domain of digital signal processing over the past 10 years. This survey emphasises the architectural and performance parameters of VLSI for DSP applications such as speech processing, wireless communication, analog-to-digital converters, etc.

Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop

Feb 14, 2018
Odette Scharenborg, Laurent Besacier, Alan Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stueker, Pierre Godard, Markus Mueller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux

We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding the discovery of linguistic units (subwords and words) in a language without orthography. We study the replacement of orthographic transcriptions by images and/or translated text in a well-resourced language to help unsupervised discovery from raw speech.

* Accepted to ICASSP 2018 

Handling Sparse Data by Successive Abstraction

May 29, 1996
Christer Samuelsson

A general, practical method for handling sparse data that avoids held-out data and iterative reestimation is derived from first principles. It has been tested on a part-of-speech tagging task and outperformed (deleted) interpolation with context-independent weights, even when the latter used a globally optimal parameter setting determined a posteriori.

* Coling 96 
* 6 pages, uuencoded, gzipped PostScript 
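
The baseline named in this abstract, interpolation with context-independent weights, can be illustrated with a minimal sketch. It is not the paper's successive-abstraction method. It assumes a trigram tag model, and the function names and the fixed weights `(0.1, 0.3, 0.6)` are hypothetical; deleted interpolation would estimate the weights from data rather than fix them by hand.

```python
from collections import Counter


def train_counts(tag_sequences):
    """Collect unigram, bigram and trigram tag counts plus history counts."""
    uni, bi, tri = Counter(), Counter(), Counter()
    hist1, hist2 = Counter(), Counter()  # counts of 1-tag and 2-tag histories
    for tags in tag_sequences:
        padded = ["<s>", "<s>"] + list(tags)
        for i in range(2, len(padded)):
            t1, t2, t3 = padded[i - 2], padded[i - 1], padded[i]
            uni[t3] += 1
            bi[(t2, t3)] += 1
            tri[(t1, t2, t3)] += 1
            hist1[t2] += 1
            hist2[(t1, t2)] += 1
    return uni, bi, tri, hist1, hist2


def interpolated_prob(t1, t2, t3, counts, weights=(0.1, 0.3, 0.6)):
    """P(t3 | t1, t2) as a fixed-weight mix of relative frequencies.

    The weights are hypothetical and do not depend on the context (t1, t2).
    """
    uni, bi, tri, hist1, hist2 = counts
    w1, w2, w3 = weights
    total = sum(uni.values())
    f1 = uni[t3] / total if total else 0.0
    f2 = bi[(t2, t3)] / hist1[t2] if hist1[t2] else 0.0
    f3 = tri[(t1, t2, t3)] / hist2[(t1, t2)] if hist2[(t1, t2)] else 0.0
    return w1 * f1 + w2 * f2 + w3 * f3


# Toy training data (made up for illustration only).
counts = train_counts([["DT", "NN", "VB"], ["DT", "JJ", "NN"]])
p_seen = interpolated_prob("<s>", "<s>", "DT", counts)
p_unseen_context = interpolated_prob("VB", "VB", "NN", counts)
```

For a history never seen in training, the trigram and bigram terms drop to zero and only the unigram term contributes, which is the sparse-data weakness that smoothing methods of this kind try to address.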

Exploring Language-Independent Emotional Acoustic Features via Feature Selection

Sep 01, 2010
Arslan Shaukat, Ke Chen

We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of languages, linguistics and other factors. Experimental results suggest that the language-independent feature subset discovered yields performance comparable to that of the full feature set on various emotional speech corpora.

* 15 pages, 2 figures, 6 tables 

A Bimachine Compiler for Ranked Tagging Rules

Jul 19, 2004
Wojciech Skut, Stefan Ulrich, Kathrine Hammervold

This paper describes a novel method of compiling ranked tagging rules into a deterministic finite-state device called a bimachine. The rules are formulated in the framework of regular rewrite operations and allow unrestricted regular expressions in both left and right rule contexts. The compiler is illustrated by an application within a speech synthesis system.

* 7 pages, 3 figures; Proceedings of COLING 2004 (to appear)

SPANISH 1992 (S92): corpus-based analysis of present-day Spanish for medical purposes

Apr 15, 1994

S92 research was begun in 1987 to analyze word frequencies in present-day Spanish for making speech pathology evaluation tools. Five hundred 2,000-word samples of the language of children, adolescents and adults were entered between 1988 and 1991; calculations were done in 1992, and statistical and Lewandowski analyses were carried out in 1993.

* 20 pages 

Improving Minimal Gated Unit for Sequential Data

May 21, 2019
Kazuki Takamura, Satoshi Yamane

To obtain a model that can process sequential data related to machine translation and speech recognition faster and more accurately, we propose adopting Chrono Initializer as the initialization method of Minimal Gated Unit. We evaluated the method on two tasks: the adding task and the copy task. The experiments confirmed the effectiveness of the proposed method.

* 2 pages, 5 figures 
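
The idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the common Minimal Gated Unit formulation with a single gate `f`, and it adapts a chrono-style bias by drawing time scales uniformly from `[1, t_max - 1]` and setting the gate bias to the negative log of each. The sign convention, the weight scaling, and the names `init_mgu`, `mgu_step` and `t_max` are all assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_mgu(input_size, hidden_size, t_max, rng):
    """Random weights plus a chrono-style gate bias (assumed adaptation)."""
    scale = 1.0 / np.sqrt(hidden_size)

    def uniform(shape):
        return rng.uniform(-scale, scale, shape)

    params = {
        "W_f": uniform((hidden_size, input_size)),
        "U_f": uniform((hidden_size, hidden_size)),
        "W_h": uniform((hidden_size, input_size)),
        "U_h": uniform((hidden_size, hidden_size)),
        "b_h": np.zeros(hidden_size),
    }
    # Assumption: in this formulation f scales how much new content is
    # written, so a negative bias of -log(T) gives a memory time scale
    # of roughly T steps per unit.
    timescales = rng.uniform(1.0, t_max - 1.0, hidden_size)
    params["b_f"] = -np.log(timescales)
    return params


def mgu_step(params, x, h_prev):
    """One Minimal Gated Unit step: a single gate controls reset and update."""
    f = sigmoid(params["W_f"] @ x + params["U_f"] @ h_prev + params["b_f"])
    h_cand = np.tanh(
        params["W_h"] @ x + params["U_h"] @ (f * h_prev) + params["b_h"]
    )
    return (1.0 - f) * h_prev + f * h_cand


# Run a short random sequence through the cell (toy data for illustration).
rng = np.random.default_rng(0)
params = init_mgu(input_size=3, hidden_size=4, t_max=50, rng=rng)
h = np.zeros(4)
for x in rng.normal(size=(10, 3)):
    h = mgu_step(params, x, h)
```

With this bias, a unit whose time scale is `T` starts with a gate value near `1 / (T + 1)`, so it overwrites only a small fraction of its state per step and can carry information across long sequences such as those in the adding and copy tasks.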

Is Word Sense Disambiguation just one more NLP task?

Feb 25, 1999
Yorick Wilks

This paper compares the tasks of part-of-speech (POS) tagging and word-sense-tagging or disambiguation (WSD), and argues that the tasks are not related by fineness of grain or anything like that, but are quite different kinds of task, particularly because there is nothing in POS corresponding to sense novelty. The paper also argues for the reintegration of sub-tasks that are being separated for evaluation.

Heroes, Villains, and Victims, and GPT-3 -- Automated Extraction of Character Roles Without Training Data

May 16, 2022
Dominik Stammbach, Maria Antoniak, Elliott Ash

This paper shows how to use large-scale pre-trained language models to extract character roles from narrative texts without training data. Queried with a zero-shot question-answering prompt, GPT-3 can identify the hero, villain, and victim in diverse domains: newspaper articles, movie plot summaries, and political speeches.
