Luiz W. P. Biscainho

DEL/Poli & PEE/COPPE, Universidade Federal do Rio de Janeiro

Extracting accent features in spoken Brazilian Portuguese without sociolinguistic labels

Jun 02, 2026

Pedro H. L. Leite, Pedro Benevenuto Valadares, Luiz W. P. Biscainho

Abstract: Regional accent classification in Brazilian Portuguese (pt-BR) suffers from the need for reliable labeling. While large self-supervised learning (SSL) speech models are powerful, their training pipelines dilute sociophonetic information, since accent labels are generally not reliable or are not used in training objectives. This work introduces a novel workflow for feature extraction using only acoustic labels. By isolating explicit regional accent landmarks and using a phoneme-based forced aligner (ZIPA), our targeted feature set captures dialectal variance more effectively than utterance embeddings, demonstrating that localized features can outperform general-purpose architectures on accent-related tasks using minimal and objective data labels.

* This work was submitted to the XLIV Brazilian Symposium on Telecommunications and Signal Processing (SBrT 2026)


FiPA-SR -- FiLM-Conditioned Perceptually Informed Audio Super-Resolution

May 28, 2026

Wallace Abreu, Luiz W. P. Biscainho

Abstract: Audio bandwidth extension aims to reconstruct missing high-frequency content from bandlimited signals. This paper proposes FiPA-SR, a GAN-based perceptual architecture capable of handling different input bandwidths within a single model. Building upon the previous $\textrm{AEROMamba}_\textrm{P}$ framework, the proposed model incorporates FiLM layers to adapt the reconstruction process according to the respective bandwidth. Experiments on the MUSDB dataset show that FiPA-SR outperforms the state-of-the-art AudioSR model across 8, 20, and 32 kHz input sampling rates. Moreover, the proposed architecture uses approximately 3$\times$ less GPU memory and performs inference more than 60$\times$ faster than the diffusion-based baseline.

* Submitted to the XLIV Brazilian Symposium on Telecommunications and Signal Processing (SBrT 2026)
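
The FiLM conditioning mentioned in the abstract can be illustrated with a minimal sketch. It shows only the generic feature-wise affine operation, y = gamma * x + beta per channel, selected by the input sampling rate. The lookup table `FILM_PARAMS`, its values, and the function names are hypothetical and chosen for illustration; they are not taken from FiPA-SR, where the modulation parameters would presumably be learned.

```python
from typing import Dict, List

# Hypothetical table: one (gamma, beta) pair per channel for each input
# sampling rate in kHz. These constants are illustrative assumptions; in a
# trained model they would be produced by a learned conditioning network.
FILM_PARAMS: Dict[int, Dict[str, List[float]]] = {
    8: {"gamma": [1.5, 0.5], "beta": [0.0, 0.25]},
    20: {"gamma": [1.0, 1.0], "beta": [0.0, 0.0]},
    32: {"gamma": [0.5, 2.0], "beta": [-0.25, 0.0]},
}


def film(features: List[List[float]], gamma: List[float], beta: List[float]) -> List[List[float]]:
    """Scale and shift each channel of a channels-by-time feature map."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma and beta must have one entry per channel")
    return [
        [g * x + b for x in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


def condition_on_rate(features: List[List[float]], rate_khz: int) -> List[List[float]]:
    """Pick the modulation parameters for a sampling rate and apply them."""
    params = FILM_PARAMS[rate_khz]
    return film(features, params["gamma"], params["beta"])
```

With this toy table, the same feature map is modulated differently for 8, 20, and 32 kHz inputs, which is the sense in which a single set of shared layers can serve several bandwidths.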


Adapting Meter Tracking Models to Latin American Music

Apr 14, 2023

Lucas S. Maia, Martín Rocamora, Luiz W. P. Biscainho, Magdalena Fuentes

Abstract: Beat and downbeat tracking models have improved significantly in recent years with the introduction of deep learning methods. However, despite these improvements, several challenges remain. Particularly, the adaptation of available models to underrepresented music traditions in MIR is usually synonymous with collecting and annotating large amounts of data, which is impractical and time-consuming. Transfer learning, data augmentation, and fine-tuning techniques have been used quite successfully in related tasks and are known to alleviate this bottleneck. Furthermore, when studying these music traditions, models are not required to generalize to multiple mainstream music genres but to perform well in more constrained, homogeneous conditions. In this work, we investigate simple yet effective strategies to adapt beat and downbeat tracking models to two different Latin American music traditions and analyze the feasibility of these adaptations in real-world applications concerning the data and computational requirements. Contrary to common belief, our findings show it is possible to achieve good performance by spending just a few minutes annotating a portion of the data and training a model on a standard CPU machine, with the precise amount of resources needed depending on the task and the complexity of the dataset.

* Accepted at ISMIR 2022. This version was made after a bug fix in the code, which led to minor modifications in the results (updated in Figure 1 and Table 1). The paper's conclusions remain unchanged.


Mobile Sound Recognition for the Deaf and Hard of Hearing

Oct 19, 2018

Leonardo A. Fanzeres, Adriana S. Vivacqua, Luiz W. P. Biscainho

Abstract: Human perception of surrounding events is strongly dependent on audio cues. Thus, acoustic insulation can seriously impact situational awareness. We present an exploratory study in the domain of assistive computing, eliciting requirements and presenting solutions to problems found in the development of an environmental sound recognition system, which aims to assist deaf and hard of hearing people in the perception of sounds. To take advantage of smartphones' computational ubiquity, we propose a system that executes all processing on the device itself, from audio feature extraction to recognition and visual presentation of results. Our application also presents the confidence level of the classification to the user. A test of the system conducted with deaf users provided important and inspiring feedback from participants.

* 25 pages, 8 figures
