In everyday life, the speech recognition performance of human listeners is influenced by diverse factors, such as the acoustic environment, the talker and listener positions, possibly impaired hearing, and optional hearing devices. Prediction models are coming closer to considering all required factors simultaneously to predict individual speech recognition performance in complex acoustic environments. While such predictions may not yet be sufficiently accurate for serious applications, they can already be performed and demand an accessible representation. In this contribution, an interactive representation of speech recognition performance is proposed, which focuses on the listener's head orientation and the spatial dimensions of an acoustic scene. An exemplary modeling toolchain, including an acoustic rendering model, a hearing device model, and a listener model, was used to generate a data set for demonstration purposes. Exploring this data set with the spatial speech recognition maps demonstrated the suitability of the approach for observing possibly relevant behavior. The proposed representation provides a suitable target to compare and validate different modeling approaches in ecologically relevant contexts. Eventually, it may serve as a tool to use validated prediction models in the design of spaces and devices which take speech communication into account.
The effect of hearing impairment on speech perception was described by Plomp (1978) as the sum of a loss of class A, due to signal attenuation, and a loss of class D, due to signal distortion. While a loss of class A can be compensated by linear amplification, a loss of class D, which severely limits the benefit of hearing aids in noisy listening conditions, cannot. Many hearing aid users continue to complain about the limited benefit of their devices in noisy environments. Recently, in an approach to model human speech recognition by means of a re-purposed automatic speech recognition (ASR) system, the loss of class D was explained by introducing a level uncertainty which reduces the individual accuracy of spectro-temporal signal levels. Based on this finding, an implementation of a patented dynamic range manipulation scheme (PLATT) is proposed, which aims to mitigate the effect of increased level uncertainty on speech recognition in noise by expanding spectral modulation patterns in the range of 2 to 4 ERB. An objective evaluation of the benefit in speech recognition thresholds in noise using an ASR-based speech recognition model suggests that more than half of the class D loss due to an increased level uncertainty might be compensable.
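The A and D decomposition mentioned above can be illustrated with a minimal numerical sketch. It assumes the commonly cited form of Plomp's model, in which the speech recognition threshold (SRT) is the power sum of a quiet term raised by A + D and a noise term raised by D alone. The function name `srt_db` and the default values for the normal-hearing threshold in quiet and the normal-hearing SNR at threshold are illustrative placeholders, not values taken from this work.

```python
import math


def srt_db(noise_level_db, a_db=0.0, d_db=0.0,
           srt_quiet_nh_db=20.0, snr_nh_db=-6.0):
    """Sketch of an A + D type threshold model (assumed form).

    noise_level_db  : level of the masking noise in dB
    a_db            : class A loss (attenuation) in dB
    d_db            : class D loss (distortion) in dB
    srt_quiet_nh_db : assumed normal-hearing SRT in quiet (placeholder)
    snr_nh_db       : assumed normal-hearing SNR at threshold (placeholder)
    """
    # Quiet-limited term: shifted by both attenuation and distortion.
    quiet_term = 10 ** ((srt_quiet_nh_db + a_db + d_db) / 10)
    # Noise-limited term: shifted by distortion only.
    noise_term = 10 ** ((noise_level_db + snr_nh_db + d_db) / 10)
    # Power summation of both terms, expressed in dB.
    return 10 * math.log10(quiet_term + noise_term)


if __name__ == "__main__":
    for a, d in [(0, 0), (30, 0), (0, 5), (30, 5)]:
        print(f"A={a:2d} dB, D={d} dB -> SRT at 80 dB noise: "
              f"{srt_db(80.0, a, d):.2f} dB")
```

In this sketch, a pure class A loss leaves the threshold at high noise levels almost unchanged, whereas a class D loss shifts it by the full amount. This corresponds to the statement that linear amplification cannot compensate a loss of class D in noise.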