Stefan Goetz

HRTFformer: A Spatially-Aware Transformer for Personalized HRTF Upsampling in Immersive Audio Rendering

Oct 02, 2025

Xuyi Hu, Jian Li, Shaojie Zhang, Stefan Goetz, Lorenzo Picinali, Ozgur B. Akan, Aidan O. T. Hogg

Abstract: Personalized Head-Related Transfer Functions (HRTFs) are starting to be introduced in many commercial immersive audio applications and are crucial for realistic spatial audio rendering. However, one of the main hesitations regarding their introduction is that creating personalized HRTFs is impractical at scale due to the complexity of the HRTF measurement process. To mitigate this drawback, HRTF spatial upsampling has been proposed with the aim of reducing the number of measurements required. While prior work has seen success with different machine learning (ML) approaches, these models often struggle with long-range spatial consistency and generalization at high upsampling factors. In this paper, we propose a novel transformer-based architecture for HRTF upsampling, leveraging the attention mechanism to better capture spatial correlations across the HRTF sphere. Working in the spherical harmonic (SH) domain, our model learns to reconstruct high-resolution HRTFs from sparse input measurements with significantly improved accuracy. To enhance spatial coherence, we introduce a neighbor dissimilarity loss that promotes magnitude smoothness, yielding more realistic upsampling. We evaluate our method using both perceptual localization models and objective spectral distortion metrics. Experiments show that our model surpasses leading methods by a substantial margin in generating realistic, high-fidelity HRTFs.

* 10 pages and 5 figures
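
The abstract above mentions objective spectral distortion metrics without defining them. As a rough illustration only, the sketch below computes the commonly used log-spectral distortion (LSD) between a reference and an estimated magnitude spectrum for one direction and one ear. The function name and the sample values are assumptions for illustration; the paper's exact metric and averaging scheme may differ.

```python
import math


def log_spectral_distortion(h_ref, h_est):
    """Root-mean-square log-magnitude error in dB between two spectra.

    h_ref, h_est: sequences of non-zero magnitudes (or complex bins)
    sampled at the same frequencies. Hypothetical helper, not taken
    from the paper.
    """
    if len(h_ref) != len(h_est) or len(h_ref) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_errors_db = [
        (20.0 * math.log10(abs(r) / abs(e))) ** 2
        for r, e in zip(h_ref, h_est)
    ]
    return math.sqrt(sum(squared_errors_db) / len(squared_errors_db))


if __name__ == "__main__":
    # Made-up magnitudes, purely to exercise the function.
    reference = [1.0, 0.8, 0.5, 0.25]
    estimate = [1.0, 0.7, 0.55, 0.2]
    print(f"LSD: {log_spectral_distortion(reference, estimate):.3f} dB")
```

In practice such a value is usually averaged over all measured directions, both ears, and all subjects in the test set; lower values indicate a closer spectral match.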

Three mechanistically different variability and noise sources in the trial-to-trial fluctuations of responses to brain stimulation

Dec 22, 2024

Ke Ma, Siwei Liu, Mengjie Qin, Stefan Goetz

Abstract: Motor-evoked potentials (MEPs) are among the few directly observable responses to external brain stimulation and serve a variety of applications, often in the form of input-output (IO) curves. Previous statistical models with two variability sources inherently consider the small MEPs at the low-side plateau as part of the neural recruitment properties. However, recent studies demonstrated that small MEP responses under resting conditions are contaminated and overshadowed by background noise of a mostly technical nature, e.g., caused by the amplifier, and suggested that the neural recruitment curve should continue below this noise level. This work intends to separate physiological variability from background noise and improve the description of recruitment behaviour. We developed a triple-variability-source model around a logarithmic logistic function without a lower plateau and incorporated an additional source for background noise. Compared to models with two or fewer variability sources, our approach better described IO characteristics, evidenced by lower Bayesian Information Criterion scores across all subjects and pulse shapes. The model independently extracted hidden variability information across the stimulated neural system and isolated it from background noise, which led to an accurate estimation of the IO curve parameters. This new model offers a robust tool to analyse brain stimulation IO curves in clinical and experimental neuroscience and reduces the risk of spurious results from inappropriate statistical methods. The presented model together with the corresponding calibration method provides a more accurate representation of MEP responses and variability sources, advances our understanding of cortical excitability, and may improve the assessment of neuromodulation effects.

* 11 pages, 4 figures
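
The model comparison in the abstract above relies on the Bayesian Information Criterion (BIC), defined as k ln(n) - 2 ln(L) for k free parameters, n observations, and maximized likelihood L. The sketch below shows only that comparison step. The function names, parameter counts, and log-likelihood values are hypothetical placeholders, not results from the paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_model(candidates, n_obs):
    """Return the name of the candidate with the lowest BIC.

    candidates maps a model name to (maximized log-likelihood, n_params).
    """
    return min(
        candidates,
        key=lambda name: bic(candidates[name][0], candidates[name][1], n_obs),
    )


if __name__ == "__main__":
    # Hypothetical fits of two IO-curve models to the same 50 responses.
    fits = {
        "two_sources": (-120.0, 5),
        "three_sources": (-100.0, 7),
    }
    for name, (log_l, k) in fits.items():
        print(name, round(bic(log_l, k, 50), 2))
    print("preferred:", best_model(fits, 50))
```

Because the penalty term grows with the number of parameters, a model with an extra variability source is preferred only if its likelihood improves enough to offset that penalty.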

"You still have to study" -- On the Security of LLM generated code

Aug 13, 2024

Stefan Goetz, Andreas Schaad

Abstract: We witness an increasing usage of AI assistants even for routine (classroom) programming tasks. However, the code generated on the basis of a so-called "prompt" by the programmer does not always meet accepted security standards. On the one hand, this may be due to a lack of best-practice examples in the training data. On the other hand, the actual quality of the programmer's prompt appears to influence whether generated code contains weaknesses or not. In this paper we analyse 4 major LLMs with respect to the security of generated code. We do this on the basis of a case study for the Python and JavaScript languages, using the MITRE CWE catalogue as the guiding security definition. Our results show that, using different prompting techniques, some LLMs initially generate code of which 65% is deemed insecure by a trained security engineer. On the other hand, almost all analysed LLMs will eventually generate code that is close to 100% secure with increasing manual guidance from a skilled engineer.
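
The abstract does not say which CWE entries were studied. As an assumed, generic illustration of the kind of weakness the MITRE CWE catalogue describes, the sketch below contrasts a query built by string concatenation (CWE-89, SQL injection) with a parameterized query, using Python's standard `sqlite3` module. The table layout and function names are hypothetical.

```python
import sqlite3


def demo_connection():
    """Create an in-memory database with a small hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_insecure(conn, name):
    # CWE-89: user input is concatenated into the SQL text, so quotes in
    # `name` can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_secure(conn, name):
    # The placeholder keeps `name` as data; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    payload = "x' OR '1'='1"
    print("insecure:", find_user_insecure(conn, payload))
    print("secure:  ", find_user_secure(conn, payload))
```

With the injection payload, the concatenated query matches every row, while the parameterized query matches none.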

Design and Implementation of DC-to-5 MHz Wide-Bandwidth High-Power High-Fidelity Converter

Sep 12, 2023

Jinshui Zhang, Boshuo Wang, Xiaoyang Tian, Angel Peterchev, Stefan Goetz

Abstract: Advances in power electronics have made it possible to achieve high power levels, e.g., reaching GW in grids, or alternatively high output bandwidths, e.g., beyond MHz in communication. Achieving both simultaneously, however, remains challenging. Various applications, ranging from efficient multichannel wireless power transfer to cutting-edge medical and neuroscience applications, are demanding both high power and wide bandwidth. Conventional inverters can achieve high power and high quality at grid or specific frequency ranges but lose their fidelity when reaching higher output frequencies. Resonant circuits can promise a high output frequency but only a narrow bandwidth. We overcome the hardware challenges by combining gallium-nitride (GaN) transistors with modular cascaded double-H bridge circuits and control that can manage typical timing and balancing issues. We developed a lightweight embedded control solution that includes an improved look-up-table digital synthesizer and a novel adaptive-bias-elimination nearest-level modulation. This solution effectively solves the conflict between a high power level and high output bandwidth and can, in contrast to previous approaches, in principle be scaled in both dimensions. Our prototype exhibits a frequency range from DC to 5 MHz with <18% total voltage distortion across the entire frequency spectrum, while achieving a power level of >5 kW. We conducted tests by sweeping the output frequency and two channel-mixing trials, which included a practical magnetogenetics-oriented stimulation pulse and an entertaining trial to reproduce the famous Arecibo message with the current spectrum.

* 8 pages, 11 figures
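
The abstract above names nearest-level modulation (NLM) but does not describe the adaptive-bias-elimination variant. The sketch below shows only plain, textbook NLM for a cascaded bridge converter, assuming N identical modules that can each contribute -V, 0, or +V: the reference voltage is rounded to the nearest available output level and clamped to the reachable range. All names and numbers are hypothetical and not taken from the paper.

```python
import math


def nearest_level(v_ref, v_module, n_modules):
    """Return the integer output level closest to v_ref / v_module.

    The level is clamped to [-n_modules, n_modules], the range an
    assumed cascade of n_modules identical bridges can produce.
    """
    if v_module <= 0 or n_modules <= 0:
        raise ValueError("v_module and n_modules must be positive")
    # floor(x + 0.5) rounds halves upward; Python's round() would round
    # halves to the nearest even integer instead.
    level = math.floor(v_ref / v_module + 0.5)
    return max(-n_modules, min(n_modules, level))


def synthesize(samples, v_module, n_modules):
    """Quantize a sampled reference waveform to the available levels."""
    return [nearest_level(v, v_module, n_modules) * v_module for v in samples]


if __name__ == "__main__":
    # One period of a 180 V sine reference, 16 samples, 4 modules of 50 V.
    reference = [180.0 * math.sin(2 * math.pi * k / 16) for k in range(16)]
    print(synthesize(reference, 50.0, 4))
```

Plain NLM leaves a quantization error of up to half a module voltage per sample, which is one reason refinements such as the variant named in the abstract are of interest.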
