Amplification and Assistive Devices (AAD)
Linda Thibodeau, PhD
faculty
University of Texas
University of Texas at Dallas
Dallas, Texas
Disclosure(s): Phonak: Consultant (Ongoing), Grant/Research Support (Ongoing), Speaker/Honoraria (Ongoing)
Despite normal auditory thresholds, some individuals have speech recognition deficits in noisy environments. One convenient solution may be the iPhone app Live Listen (LL) used with AirPods Pro (AP) as wireless receivers, provided the Bluetooth transmission delay can be tolerated. A novel method using a voice-to-text app, Otter, and a manikin, KEMAR, was used to compare the iPhone with AP (Bluetooth) to a Roger On transmitter with a Focus II receiver (proprietary digital modulation). Across noise types and signal-to-noise ratios (SNRs), the best score was obtained with Roger On (95.78%), followed by LL with AP in noise cancellation mode (92.50%) and AP alone (84.78%).
Summary:
Objectives
This study aimed to compare speech recognition in noise with Live Listen (LL) (embedded in the iPhone) to that with a Phonak Roger remote microphone (RM) system. LL was connected to one AirPods Pro (AP) receiver in two modes, and all conditions were evaluated using KEMAR and a voice-to-text (VTT) transcription method.
Rationale
Communication difficulties are common in noisy environments, even for individuals without hearing impairment. RM systems are designed to mitigate the effects of noise by increasing the signal-to-noise ratio (SNR) using a transmitter and a receiver. As presented in the 2022 AAA conference poster session, VTT (Otter), KEMAR, and RM systems can be used as an alternative method to compare speech recognition without involving human subjects. Given the complex listening conditions of real life, noise type and SNR were also of interest.
Design
The conditions were set up in an audiometric sound booth so that the speech signal was presented in front of KEMAR (0 degrees; 3 feet), with babble and speech noise presented behind it (180 degrees; 8.5 feet) at three SNRs (-5, 0, +5 dB). The iPhone was positioned 1 foot from the front speaker, and the speech was transmitted via LL to one AP receiver in KEMAR's right ear, set to one of two modes: transparency or noise cancellation (NC). The output from the AP receiver was delivered to the Zwislocki coupler in KEMAR and then fed into a laptop running Otter to transcribe the speech signals. To compare LL to an RM system designed for persons with hearing challenges, a Roger On transmitter was tested with a Roger Focus II receiver in KEMAR's right ear. Additionally, two comparison conditions were tested: one with KEMAR alone, without any devices, and another with an AP receiver alone in KEMAR's right ear, set to transparency mode and not connected to the iPhone. For each condition, the HINT list 4 sentences (N=10) were presented three times, and the average recognition score of the three trials was used.
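The scoring step described above (transcribe each sentence, score it, average over three trials) can be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the abstract does not state how transcripts were scored, so the word-level matching rule, the function names `normalize`, `word_accuracy`, and `condition_score`, and the example sentences (which are not HINT items) are all assumptions.

```python
import re
from statistics import mean


def normalize(text):
    """Lowercase and drop punctuation so that 'Home.' and 'home' match."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_accuracy(reference, transcript):
    """Percent of reference words that also appear in the transcript.

    Assumed scoring rule: each transcript word can match at most one
    reference word; word order is ignored.
    """
    ref_words = normalize(reference)
    remaining = normalize(transcript)
    hits = 0
    for word in ref_words:
        if word in remaining:
            remaining.remove(word)
            hits += 1
    return 100.0 * hits / len(ref_words) if ref_words else 0.0


def condition_score(references, trial_transcripts):
    """Average the list-level score over repeated presentations (trials)."""
    trial_scores = [
        mean(word_accuracy(ref, hyp) for ref, hyp in zip(references, transcripts))
        for transcripts in trial_transcripts
    ]
    return mean(trial_scores)


# Hypothetical sentences and transcripts, for illustration only.
references = ["The boy ran home.", "A dog barked."]
trials = [
    ["the boy ran home", "a dog barked"],  # trial 1: everything recognized
    ["the boy ran", "a dog"],              # trial 2: one word missed per sentence
    ["the boy ran home", "a dog barked"],  # trial 3
]
score = condition_score(references, trials)
```

Other scoring rules (for example, whole-sentence correct/incorrect) would slot into `word_accuracy` without changing the averaging over trials.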
Results
The three factors of interest were noise type, RM condition, and SNR. The average accuracy of the VTT transcription for each factor, irrespective of the other two, was as follows:
1. Noise type,
· 86.87% (speech noise)
· 85.69% (babble noise)
2. RM condition,
· 95.78% (Roger On/Focus II)
· 92.50% (LL/AP NC)
· 84.78% (AP alone)
· 81.22% (LL/AP transparency)
· 77.11% (KEMAR alone)
3. SNR,
· 96.10% (+5 dB)
· 88.50% (0 dB)
· 74.23% (-5 dB)
The mean accuracy of the VTT transcription was compared among three RM conditions (Roger, LL NC, and LL transparency) at a challenging SNR (-5 dB) for both noise types:
Babble: 86.67%, 87.00%, and 66.00%, respectively
Speech: 94.33%, 86.33%, and 72.00%, respectively.
As expected, the accuracy was generally lower for babble relative to speech noise.
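To make the babble-versus-speech comparison at -5 dB explicit, the reported means can be subtracted condition by condition. The sketch below uses only the six values listed above; the variable and function names are illustrative, and no statistical test is implied.

```python
# Mean VTT accuracy (%) at -5 dB SNR, as reported above.
scores_minus5 = {
    "Roger On/Focus II": {"babble": 86.67, "speech": 94.33},
    "LL/AP NC": {"babble": 87.00, "speech": 86.33},
    "LL/AP transparency": {"babble": 66.00, "speech": 72.00},
}


def babble_minus_speech(scores):
    """Return babble accuracy minus speech-noise accuracy per condition."""
    return {
        condition: round(values["babble"] - values["speech"], 2)
        for condition, values in scores.items()
    }


diffs = babble_minus_speech(scores_minus5)
for condition, diff in diffs.items():
    print(f"{condition}: {diff:+.2f} percentage points")
```

A negative difference means accuracy was lower in babble. On these values, that holds for Roger and LL transparency but not for LL NC, where the two noise types differ by less than one percentage point, consistent with "generally lower" rather than uniformly lower.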
Conclusions
Using LL on the iPhone with AP receivers in NC mode could potentially facilitate speech recognition for persons with normal auditory thresholds who experience challenges in noisy environments. However, the Bluetooth transmission delay may introduce interference that is not present with devices such as Roger RM systems, which use proprietary digital modulation.