Abstract:To evaluate the performance of audio signal processing algorithms and to train data-driven algorithms, e.g., as applied in hearing instruments, either simulated or recorded data can be used. While large batches of simulated data can be generated using mathematical models, recorded data provide a more adequate representation of real-life scenarios. Therefore, in this paper, the Hearing Instrument Dataset in Various Acoustical Scenarios (HIDVAS) is introduced. This dataset consists of both impulse responses and audio recordings using eight external loudspeakers, two external microphones, and a dummy head. On this dummy head behind-the-ear (BTE) hearing instrument shells with two microphones per shell are mounted, and in the dummy head's ears receiver-in-canal (RIC) hearing instrument loudspeakers are inserted. The dummy head also contains microphones located at its eardrum. The impulse responses have been computed from a swept-sine recording for each microphone-loudspeaker pair, and the audio recordings have been obtained by playing back audio (male and female speech, speech shaped noise, singing voice, stringed instrument, wind instrument, and percussion instrument) through each individual loudspeaker and recording simultaneously using all microphones. These recordings have been repeated for four hearing instrument domes (open, semi-open, closed, and no-RIC) in three reverberation conditions in one room (T30 = 0.09 s, T30 = 0.47 s, and T30 = 0.73 s), and in one reverberation condition in a different room (T30 = 1.48 s). The usage of the dataset as a `hearing instrument in a box' is exemplified with three example use cases.
Abstract:In public address systems and hearing aids, the maximally achievable amplification or gain is limited by acoustic feedback. Therefore, in order to be able to apply a higher gain, feedback cancellation methods are required. In addition, it is oftentimes also desirable to dereverberate a recorded signal, that is, remove the late reverberation component of the signal, before playing it back. In this paper, it is shown that under two mild conditions, the acoustic feedback signal can be written as a reverberant version of the source signal. Therefore, it is possible to treat the joint dereverberation and acoustic feedback cancellation problem as a dereverberation-only problem, meaning that dereverberation algorithms can be applied to the joint problem. Simulations corroborate this finding

Abstract:Two algorithms for combined acoustic echo cancellation (AEC) and noise reduction (NR) are analysed, namely the generalised echo and interference canceller (GEIC) and the extended multichannel Wiener filter (MWFext). Previously, these algorithms have been examined for linear echo paths, and assuming access to voice activity detectors (VADs) that separately detect desired speech and echo activity. However, algorithms implementing VADs may introduce detection errors. Therefore, in this paper, the previous analyses are extended by 1) modelling general nonlinear echo paths by means of the generalised Bussgang decomposition, and 2) modelling VAD error effects in each specific algorithm, thereby also allowing to model specific VAD assumptions. It is found and verified with simulations that, generally, the MWFext achieves a higher NR performance, while the GEIC achieves a more robust AEC performance.




Abstract:In many speech recording applications, noise and acoustic echo corrupt the desired speech. Consequently, combined noise reduction (NR) and acoustic echo cancellation (AEC) is required. Generally, a cascade approach is followed, i.e., the AEC and NR are designed in isolation by selecting a separate signal model, formulating a separate cost function, and using a separate solution strategy. The AEC and NR are then cascaded one after the other, not accounting for their interaction. In this paper, however, an integrated approach is proposed to consider this interaction in a general multi-microphone/multi-loudspeaker setup. Therefore, a single signal model of either the microphone signal vector or the extended signal vector, obtained by stacking microphone and loudspeaker signals, is selected, a single mean squared error cost function is formulated, and a common solution strategy is used. Using this microphone signal model, a multi channel Wiener filter (MWF) is derived. Using the extended signal model, an extended MWF (MWFext) is derived, and several equivalent expressions are found, which nevertheless are interpretable as cascade algorithms. Specifically, the MWFext is shown to be equivalent to algorithms where the AEC precedes the NR (AEC NR), the NR precedes the AEC (NR-AEC), and the extended NR (NRext) precedes the AEC and post-filter (PF) (NRext-AECPF). Under rank-deficiency conditions the MWFext is non-unique, such that this equivalence amounts to the expressions being specific, not necessarily minimum-norm solutions for this MWFext. The practical performances nonetheless differ due to non-stationarities and imperfect correlation matrix estimation, resulting in the AEC-NR and NRext-AEC-PF attaining best overall performance.
Abstract:In many speech recording applications, the recorded desired speech is corrupted by both noise and acoustic echo, such that combined noise reduction (NR) and acoustic echo cancellation (AEC) is called for. A common cascaded design corresponds to NR filters preceding AEC filters. These NR filters aim at reducing the near-end room noise (and possibly partially the echo) and operate on the microphones only, consequently requiring the AEC filters to model both the echo paths and the NR filters. In this paper, however, we propose a design with extended NR (NRext) filters preceding AEC filters under the assumption of the echo paths being additive maps, thus preserving the addition operation. Here, the NRext filters aim at reducing both the near-end room noise and the far-end room noise component in the echo, and operate on both the microphones and loudspeakers. We show that the succeeding AEC filters remarkably become independent of the NRext filters, such that the AEC filters are only required to model the echo paths, improving the AEC performance. Further, the degrees of freedom in the NRext filters scale with the number of loudspeakers, which is not the case for the NR filters, resulting in an improved NR performance.