Franz Pernkopf

Graz University of Technology

Lightweight and perceptually-guided voice conversion for electro-laryngeal speech

Jan 07, 2026

Benedikt Mayrhofer, Franz Pernkopf, Philipp Aichinger, Martin Hagmüller

Abstract:Electro-laryngeal (EL) speech is characterized by constant pitch, limited prosody, and mechanical noise, reducing naturalness and intelligibility. We propose a lightweight adaptation of the state-of-the-art StreamVC framework to this setting by removing pitch and energy modules and combining self-supervised pretraining with supervised fine-tuning on parallel EL and healthy (HE) speech data, guided by perceptual and intelligibility losses. Objective and subjective evaluations across different loss configurations confirm their influence: the best model variant, based on WavLM features and human-feedback predictions (+WavLM+HF), drastically reduces character error rate (CER) of EL inputs, raises naturalness mean opinion score (nMOS) from 1.1 to 3.3, and consistently narrows the gap to HE ground-truth speech in all evaluated metrics. These findings demonstrate the feasibility of adapting lightweight voice conversion architectures to EL voice rehabilitation while also identifying prosody generation and intelligibility improvements as the main remaining bottlenecks.

* 5 pages, 5 figures. Preprint submitted to ICASSP. Audio samples available at https://spsc-tugraz.github.io/lw-elvc-icassp26/
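
The character error rate (CER) mentioned in the abstract is commonly defined as the character-level edit distance between a reference transcript and a recognized transcript, divided by the reference length. The following is a minimal sketch under that assumption; the abstract does not say which recognizer or text normalization the authors used, and the function names `levenshtein` and `cer` are illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `ref` into `hyp` (classic dynamic programme)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Because insertions are counted, this ratio can exceed 1 when the hypothesis is much longer than the reference.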

On the Multiangle Discrete Fractional Fourier Transform

May 08, 2025

Christian Oswald, Franz Pernkopf

Abstract:The efficiently computed multiangle centered discrete fractional Fourier transform (MA-CDFRFT) [1] has proven to be a useful tool for time-frequency analysis; however, its scope is limited to the centered discrete fractional Fourier transform (CDFRFT). Meanwhile, extensive research on the standard DFRFT has led to a better understanding of this transform as well as numerous possible choices of eigenvectors for implementation. In this letter, we present a simple adaptation of the MA-CDFRFT that allows us to efficiently compute its standard counterpart, which we call the multiangle DFRFT (MA-DFRFT). Furthermore, we formalize the symmetries inherent to the MA-CDFRFT and MA-DFRFT to halve the number of FFTs needed to compute these transforms, paving the way for applications in resource-constrained environments.

FMCW Radar Interference Mitigation based on the Fractional Fourier Transform

Apr 04, 2025

Christian Oswald, Franz Pernkopf

Abstract:In this paper, we propose a novel method for frequency modulated continuous wave (FMCW) radar mutual interference mitigation based on the discrete fractional Fourier transform (DFrFT). Interference chirps are detected and mitigated by compression and zeroing in the fractional domain. We provide an efficient implementation that can deal with multiple interferers, where we perform consecutive DFrFTs utilizing its angle-additivity property. For that purpose, we generalize and reduce the computational complexity of the multi-angle centered discrete fractional Fourier transform [1]. Our algorithm is designed to be simple and fast such that it can be implemented in hardware. We evaluate our algorithm on a synthetic I/Q-modulated dataset and outperform reference methods in terms of the mean squared error, signal-to-interference-plus-noise ratio, error vector magnitude, true positive rate, false alarm rate and F1-score.

Adaptive Variational Inference in Probabilistic Graphical Models: Beyond Bethe, Tree-Reweighted, and Convex Free Energies

Feb 05, 2025

Harald Leisenberger, Franz Pernkopf

Abstract:Variational inference in probabilistic graphical models aims to approximate fundamental quantities such as marginal distributions and the partition function. Popular approaches are the Bethe approximation, tree-reweighted, and other types of convex free energies. These approximations are efficient but can fail if the model is complex and highly interactive. In this work, we analyze two classes of approximations that include the above methods as special cases: first, if the model parameters are changed; and second, if the entropy approximation is changed. We discuss benefits and drawbacks of either approach, and deduce from this analysis how a free energy approximation should ideally be constructed. Based on our observations, we propose approximations that automatically adapt to a given model and demonstrate their effectiveness for a range of difficult problems.

* This work has been submitted to the Conference on Uncertainty in Artificial Intelligence (UAI) 2025 for possible publication

Function Space Diversity for Uncertainty Prediction via Repulsive Last-Layer Ensembles

Dec 20, 2024

Sophie Steger, Christian Knoll, Bernhard Klein, Holger Fröning, Franz Pernkopf

Abstract:Bayesian inference in function space has gained attention due to its robustness against overparameterization in neural networks. However, approximating the infinite-dimensional function space introduces several challenges. In this work, we discuss function space inference via particle optimization and present practical modifications that improve uncertainty estimation and, most importantly, make it applicable to large and pretrained networks. First, we demonstrate that the input samples, where particle predictions are enforced to be diverse, are detrimental to the model performance. While diversity on training data itself can lead to underfitting, the use of label-destroying data augmentation or unlabeled out-of-distribution data can improve prediction diversity and uncertainty estimates. Furthermore, we take advantage of the function space formulation, which imposes no restrictions on network parameterization other than sufficient flexibility. Instead of using full deep ensembles to represent particles, we propose a single multi-headed network that introduces a minimal increase in parameters and computation. This allows seamless integration into pretrained networks, where this repulsive last-layer ensemble can be used for uncertainty-aware fine-tuning at minimal additional cost. We achieve competitive results in disentangling aleatoric and epistemic uncertainty for active learning, detecting out-of-domain data, and providing calibrated uncertainty estimates under distribution shifts with minimal compute and memory.

Robustness of Explainable Artificial Intelligence in Industrial Process Modelling

Jul 12, 2024

Benedikt Kantz, Clemens Staudinger, Christoph Feilmayr, Johannes Wachlmayr, Alexander Haberl, Stefan Schuster, Franz Pernkopf

Abstract:eXplainable Artificial Intelligence (XAI) aims at providing understandable explanations of black box models. In this paper, we evaluate current XAI methods by scoring them based on ground truth simulations and sensitivity analysis. To this end, we used an Electric Arc Furnace (EAF) model to better understand the limits and robustness characteristics of XAI methods such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), as well as Averaged Local Effects (ALE) or Smooth Gradients (SG) in a highly topical setting. These XAI methods were applied to various types of black-box models and then scored based on their correctness compared to the ground-truth sensitivity of the data-generating processes using a novel scoring evaluation methodology over a range of simulated additive noise. The resulting evaluation shows that the capability of the Machine Learning (ML) models to capture the process accurately is, indeed, coupled with the correctness of the explainability of the underlying data-generating process. We furthermore show the differences between XAI methods in their ability to correctly predict the true sensitivity of the modeled industrial process.

* 11 pages, 3 figures, accepted at the ICML'24 Workshop ML4MS

On the Convexity and Reliability of the Bethe Free Energy Approximation

May 24, 2024

Harald Leisenberger, Christian Knoll, Franz Pernkopf

Abstract:The Bethe free energy approximation provides an effective way for relaxing NP-hard problems of probabilistic inference. However, its accuracy depends on the model parameters and particularly degrades if a phase transition in the model occurs. In this work, we analyze when the Bethe approximation is reliable and how this can be verified. We argue and show by experiment that it is mostly accurate if it is convex on a submanifold of its domain, the 'Bethe box'. For verifying its convexity, we derive two sufficient conditions that are based on the definiteness properties of the Bethe Hessian matrix: the first uses the concept of diagonal dominance, and the second decomposes the Bethe Hessian matrix into a sum of sparse matrices and characterizes the definiteness properties of the individual matrices in that sum. These theoretical results provide a simple way to estimate the critical phase transition temperature of a model. As a practical contribution we propose $\texttt{BETHE-MIN}$, a projected quasi-Newton method to efficiently find a minimum of the Bethe free energy.

* This work has been submitted to the Journal of Machine Learning Research (JMLR) for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
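
The first sufficient condition named in the abstract relies on diagonal dominance. As a generic illustration (not the authors' Bethe Hessian construction), a real symmetric matrix whose diagonal entries strictly exceed the sum of the absolute off-diagonal entries in their row is positive definite, which follows from Gershgorin's circle theorem. The function name below is hypothetical.

```python
import numpy as np


def is_strictly_diagonally_dominant(H) -> bool:
    """Return True if the symmetric matrix H has, in every row, a diagonal
    entry strictly larger than the sum of absolute off-diagonal entries.
    This is sufficient (not necessary) for positive definiteness."""
    H = np.asarray(H, dtype=float)
    if H.ndim != 2 or H.shape[0] != H.shape[1] or not np.allclose(H, H.T):
        raise ValueError("H must be a square symmetric matrix")
    diag = np.diag(H)
    off_diag_sum = np.abs(H).sum(axis=1) - np.abs(diag)
    return bool(np.all(diag > off_diag_sum))


# Passes the test, hence positive definite.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Fails the test but is still positive definite: the condition is only sufficient.
B = np.array([[1.0, 2.0],
              [2.0, 5.0]])
```

The check costs one pass over the matrix entries, whereas confirming definiteness through an eigenvalue decomposition is considerably more expensive for large models; the price is that matrices such as `B` go unrecognized.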

Rao-Blackwellising Bayesian Causal Inference

Feb 22, 2024

Christian Toth, Christian Knoll, Franz Pernkopf, Robert Peharz

Abstract:Bayesian causal inference, i.e., inferring a posterior over causal models for use in downstream causal reasoning tasks, poses a hard computational inference problem that is little explored in the literature. In this work, we combine techniques from order-based MCMC structure learning with recent advances in gradient-based graph learning into an effective Bayesian causal inference framework. Specifically, we decompose the problem of inferring the causal structure into (i) inferring a topological order over variables and (ii) inferring the parent sets for each variable. When limiting the number of parents per variable, we can exactly marginalise over the parent sets in polynomial time. We further use Gaussian processes to model the unknown causal mechanisms, which also allows their exact marginalisation. This introduces a Rao-Blackwellization scheme, where all components are eliminated from the model, except for the causal order, for which we learn a distribution via gradient-based optimisation. The combination of Rao-Blackwellization with our sequential inference procedure for causal orders yields state-of-the-art results on linear and non-linear additive noise benchmarks with scale-free and Erdos-Renyi graph structures.

* 8 pages + references + appendices (19 pages total)

End-to-End Training of Neural Networks for Automotive Radar Interference Mitigation

Dec 15, 2023

Christian Oswald, Mate Toth, Paul Meissner, Franz Pernkopf

Abstract:In this paper we propose a new method for training neural networks (NNs) for frequency modulated continuous wave (FMCW) radar mutual interference mitigation. Instead of training NNs to regress from interfered to clean radar signals as in previous work, we train NNs directly on object detection maps. We do so by performing a continuous relaxation of the cell-averaging constant false alarm rate (CA-CFAR) peak detector, which is a well-established algorithm for object detection using radar. With this new training objective we are able to increase object detection performance by a large margin. Furthermore, we introduce separable convolution kernels to strongly reduce the number of parameters and computational complexity of convolutional NN architectures for radar applications. We validate our contributions with experiments on real-world measurement data and compare them against signal processing interference mitigation methods.

* 2023 IEEE International Radar Conference (RADAR), 6 pages, 4 figures
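
To illustrate the idea of relaxing the CA-CFAR detector, here is a minimal one-dimensional sketch. The hard detector compares each cell with a threshold derived from the mean of neighbouring training cells; the soft variant replaces the comparison with a sigmoid so that the output is differentiable. The sigmoid relaxation, the parameter values, and all function names are assumptions for illustration; the abstract does not specify the relaxation the authors use.

```python
import numpy as np


def ca_cfar_threshold(power, train=4, guard=1, scale=3.0):
    """Per-cell threshold: `scale` times the mean of up to `train` training
    cells on each side, skipping `guard` cells next to the cell under test."""
    power = np.asarray(power, dtype=float)
    n = power.size
    if n < guard + 2:
        raise ValueError("signal too short for the chosen guard size")
    thr = np.empty(n)
    for i in range(n):
        left = power[max(0, i - guard - train):max(0, i - guard)]
        right = power[i + guard + 1:i + guard + 1 + train]
        thr[i] = scale * np.concatenate([left, right]).mean()
    return thr


def hard_detect(power, **kw):
    """Classic CA-CFAR decision: 1 where the cell exceeds its threshold."""
    power = np.asarray(power, dtype=float)
    return power > ca_cfar_threshold(power, **kw)


def soft_detect(power, temperature=1.0, **kw):
    """Assumed relaxation: sigmoid of the scaled margin (power - threshold).
    Written via tanh for numerical stability."""
    power = np.asarray(power, dtype=float)
    margin = (power - ca_cfar_threshold(power, **kw)) / temperature
    return 0.5 * (1.0 + np.tanh(0.5 * margin))


# Flat noise floor with a single strong target in cell 16.
signal = np.ones(32)
signal[16] = 50.0
```

Thresholding the soft map at 0.5 recovers the hard decisions, and lowering `temperature` makes the soft map approach them more closely, which is what allows a network to be trained on the detection output with gradient-based methods.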

"UWBCarGraz" Dataset for Car Occupancy Detection using Ultra-Wideband Radar

Nov 17, 2023

Jakob Möderl, Stefan Posch, Franz Pernkopf, Klaus Witrisal

Abstract:We present a data-driven car occupancy detection algorithm using ultra-wideband radar based on the ResNet architecture. The algorithm is trained on a dataset of channel impulse responses obtained from measurements at three different activity levels of the occupants (i.e., breathing, talking, moving). We compare the presented algorithm against a state-of-the-art car occupancy detection algorithm based on variational message passing (VMP). Our presented ResNet architecture is able to outperform the VMP algorithm in terms of the area under the receiver operating characteristic curve (AUC) at low signal-to-noise ratios (SNRs) for all three activity levels of the target. Specifically, for an SNR of -20 dB, the VMP detector achieves an AUC of 0.87 while the ResNet architecture achieves an AUC of 0.91 if the target is sitting still and breathing naturally. The difference in performance for the other activities is similar. To facilitate implementation in the onboard computer of a car, we perform an ablation study to optimize the tradeoff between performance and computational complexity for several ResNet architectures. The dataset used to train and evaluate the algorithm is openly accessible, which facilitates comparison in future work.

* v1 (17.11.2023). 6 pages, 5 figures
