Ricardo Ribeiro

Towards Using Machine Translation Techniques to Induce Multilingual Lexica of Discourse Markers

Mar 31, 2015

António Lopes, David Martins de Matos, Vera Cabarrão, Ricardo Ribeiro, Helena Moniz, Isabel Trancoso, Ana Isabel Mata

Abstract: Discourse markers are universal linguistic events subject to language variation. Although an extensive literature has already reported language-specific traits of these events, little has been said on their cross-language behavior and on building an inventory of multilingual lexica of discourse markers. This work describes new methods and approaches for the description, classification, and annotation of discourse markers in the specific domain of the Europarl corpus. The study of discourse markers in the context of translation is crucial due to the idiomatic nature of these structures. Multilingual lexica, together with the functional analysis of such structures, are useful tools for the hard task of translating discourse markers into possible equivalents from one language to another. Using Daniel Marcu's validated discourse markers for English, extracted from the Brown Corpus, our purpose is to build multilingual lexica of discourse markers for other languages, based on machine translation techniques. The major assumption in this study is that the usage of a discourse marker is independent of the language, i.e., the rhetorical function of a discourse marker in a sentence in one language is equivalent to the rhetorical function of the same discourse marker in another language.

* 6 pages

On the Application of Generic Summarization Algorithms to Music

Jun 18, 2014

Francisco Raposo, Ricardo Ribeiro, David Martins de Matos

Abstract: Several generic summarization algorithms were developed in the past and successfully applied in fields such as text and speech summarization. In this paper, we review and apply these algorithms to music. To evaluate summarization performance, we adopt an extrinsic approach: we compare a Fado Genre Classifier's performance using truncated contiguous clips against the summaries extracted with those algorithms on two different datasets. We show that Maximal Marginal Relevance (MMR), LexRank, and Latent Semantic Analysis (LSA) all improve classification performance in both datasets used for testing.

* IEEE Signal Processing Letters, IEEE, vol. 22, n. 1, January 2015
* 12 pages, 1 table; Submitted to IEEE Signal Processing Letters
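As an illustration of one of the algorithms named in the abstract above, the following is a minimal sketch of Maximal Marginal Relevance (MMR) selection. It is not the authors' implementation. It assumes each music segment is already represented as a fixed-length numeric feature vector, uses cosine similarity, and takes the centroid of all segments as the stand-in for the "query" in generic summarization. The names `mmr_select`, `segments`, and `lam` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mmr_select(segments, k, lam=0.7):
    """Greedily pick k segment indices, trading relevance against redundancy.

    Assumption: relevance is similarity to the centroid of all segments;
    redundancy is the highest similarity to any segment already selected.
    """
    dim = len(segments[0])
    centroid = [sum(s[d] for s in segments) / len(segments) for d in range(dim)]
    selected = []
    candidates = list(range(len(segments)))

    def score(i):
        relevance = cosine(segments[i], centroid)
        redundancy = max(
            (cosine(segments[i], segments[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Two identical segments and one different one (toy feature vectors).
toy_segments = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
diverse = mmr_select(toy_segments, k=2, lam=0.5)
relevance_only = mmr_select(toy_segments, k=2, lam=1.0)
```

With `lam=0.5` the duplicate segment is penalized, so the second pick is the dissimilar segment; with `lam=1.0` redundancy is ignored and the duplicate is chosen.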

Automatic Fado Music Classification

Jun 17, 2014

Pedro Girão Antunes, David Martins de Matos, Ricardo Ribeiro, Isabel Trancoso

Abstract: In late 2011, Fado was recognized by UNESCO as part of the oral and intangible heritage of humanity. This study aims to develop a tool for automatic detection of Fado music based on the audio signal. To do this, frequency spectrum-related characteristics were captured from the audio signal: in addition to the Mel Frequency Cepstral Coefficients (MFCCs) and the energy of the signal, the signal was further analysed in two frequency ranges, providing additional information. Tests were run both in a 10-fold cross-validation setup (97.6% accuracy) and in a traditional train/test setup (95.8% accuracy). The good results reflect the fact that Fado is a very distinctive musical style.

* 4 pages, 1 figure, 5 tables

Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity

Jan 16, 2014

Ricardo Ribeiro, David Martins de Matos

Abstract: In automatic summarization, centrality-as-relevance means that the most important content of an information source, or a collection of information sources, corresponds to the most central passages, considering a representation where such a notion makes sense (graph, spatial, etc.). We assess the main paradigms and introduce a new centrality-based relevance model for automatic summarization that relies on the use of support sets to better estimate the relevant content. Geometric proximity is used to compute semantic relatedness. Centrality (relevance) is determined by considering the whole input source (and not only local information), and by taking into account the existence of minor topics or lateral subjects in the information sources to be summarized. The method consists of creating, for each passage of the input source, a support set consisting only of the most semantically related passages. Then, the most relevant content is determined by selecting the passages that occur in the largest number of support sets. This model produces extractive summaries that are generic, and language- and domain-independent. Thorough automatic evaluation shows that the method achieves state-of-the-art performance, both in written text and in automatically transcribed speech summarization, including when compared to considerably more complex approaches.

* Journal of Artificial Intelligence Research, Volume 42, pages 275-308, 2011
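The two steps described in the abstract above (build a support set for each passage, then rank passages by how many support sets contain them) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: passages are represented as bag-of-words counts, proximity is cosine distance, and every support set has the same fixed size. The names `support_set_ranking` and `support_size` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(passage):
    """Assumed representation: lowercase whitespace-token counts."""
    return Counter(passage.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - (dot / norm if norm else 0.0)


def support_set_ranking(passages, support_size=2):
    """Rank passage indices by the number of support sets they occur in."""
    vectors = [bag_of_words(p) for p in passages]
    counts = [0] * len(passages)
    for i, vi in enumerate(vectors):
        # Support set of passage i: its closest passages (assumed fixed size).
        others = [j for j in range(len(passages)) if j != i]
        others.sort(key=lambda j: cosine_distance(vi, vectors[j]))
        for j in others[:support_size]:
            counts[j] += 1
    # Most frequently supporting passages first; ties keep input order.
    return sorted(range(len(passages)), key=lambda j: -counts[j])


toy_passages = [
    "fado music",
    "fado music genre classification",
    "genre classification",
    "classification of news",
]
ranking = support_set_ranking(toy_passages, support_size=1)
```

In this toy input the second passage overlaps with most of the others, so it appears in the most support sets and is ranked first. An extractive summary would take passages from the top of `ranking` until a length budget is reached.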

Key Phrase Extraction of Lightly Filtered Broadcast News

Jun 20, 2013

Luis Marujo, Ricardo Ribeiro, David Martins de Matos, João P. Neto, Anatole Gershman, Jaime Carbonell

Abstract: This paper explores the impact of light filtering on automatic key phrase extraction (AKE) applied to Broadcast News (BN). Key phrases are words and expressions that best characterize the content of a document. Key phrases are often used to index the document or as features in further processing. This makes improvements in AKE accuracy particularly important. We hypothesized that filtering out marginally relevant sentences from a document would improve AKE accuracy. Our experiments confirmed this hypothesis. Elimination of as little as 10% of the document sentences led to a 2% improvement in AKE precision and recall. AKE is built on the MAUI toolkit, which follows a supervised learning approach. We trained and tested our AKE method on a gold standard made of 8 BN programs containing 110 manually annotated news stories. The experiments were conducted within a Multimedia Monitoring Solution (MMS) system for TV and radio news/programs, running daily, and monitoring 12 TV and 4 radio channels.

* In 15th International Conference on Text, Speech and Dialogue (TSD 2012)
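The light-filtering step described in the abstract above can be illustrated with a small sketch. It assumes a per-sentence relevance score is already available from some external scorer (the abstract does not say how relevance is computed, so the scores here are placeholders) and simply drops the lowest-scoring fraction of sentences while preserving document order. The names `light_filter` and `drop_fraction` are hypothetical.

```python
import math


def light_filter(sentences, scores, drop_fraction=0.10):
    """Remove the lowest-scoring fraction of sentences, keeping original order.

    Assumption: `scores[i]` is a relevance score for `sentences[i]`, produced
    elsewhere; higher means more relevant.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    n_drop = math.floor(len(sentences) * drop_fraction)
    by_score = sorted(range(len(sentences)), key=lambda i: scores[i])
    dropped = set(by_score[:n_drop])
    return [s for i, s in enumerate(sentences) if i not in dropped]


# Ten placeholder sentences; sentence s3 has the lowest (made-up) score.
toy_sentences = [f"s{i}" for i in range(10)]
toy_scores = [0.9, 0.8, 0.7, 0.1, 0.6, 0.5, 0.95, 0.4, 0.85, 0.3]
filtered = light_filter(toy_sentences, toy_scores, drop_fraction=0.10)
```

The filtered sentence list would then be passed to the key phrase extractor in place of the full document.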
