Research (R)
Audie Gilchrist
University of Louisville
Louisville, Kentucky
Disclosure(s): No financial or nonfinancial relationships to disclose.
Emily Keller
University of Louisville
Louisville, Kentucky
Disclosure(s): No financial or nonfinancial relationships to disclose.
Yonghee Oh, PhD
Assistant Professor
Department of Otolaryngology, HNS and Communicative Disorders, School of Medicine, University of Louisville
University of Louisville
Louisville, Kentucky
Disclosure(s): No financial or nonfinancial relationships to disclose.
Speech perception abilities are enhanced by the addition of multisensory information that coincides with the speech stimulus. Our previous studies demonstrated the positive effects of visual cues on speech perception in noise. The current study explored the effects of a tactile stimulus on speech perception in noise and aimed to identify the optimal temporal characteristics (modulation rate and depth) as well as the signal-to-noise ratios (SNRs) at which listeners benefited the most. Results revealed the greatest benefit with a 10-Hz modulation rate and 75% modulation depth at intermediate SNRs (-9 to -3 dB).
Summary:
Rationale/Purpose
The inputs delivered to different sensory organs provide us with complementary information about the environment. A series of our previous studies demonstrated that presenting abstract visual information derived from speech envelopes substantially improves speech perception ability in normal-hearing (NH) listeners (Oh et al., 2022; 2023). The purpose of this study was to extend this audiovisual benefit to the tactile domain and to delineate the signal-to-noise ratio (SNR) conditions under which tactile presentations of the acoustic amplitude envelope have their greatest impact on speech perception.
Methods
Ten young NH adults (ages 20-24 years) completed speech-in-noise recognition measurements in auditory-only and auditory-tactile conditions. Harvard sentences (IEEE, 1969) served as the target speech and were presented with a simultaneous speech-shaped noise (SSN) masker. The target speech was fixed at each participant's comfortable listening level, and the masker level was varied to yield SNRs from -12 to 0 dB. Acoustic amplitude envelopes of the target signals were extracted with various modulation parameters (rate: 4, 10, and 30 Hz; depth: 0%, 25%, 50%, 75%, and 100%). For the tactile stimuli, the extracted target envelopes were mapped onto the amplitude of the vibrotactile stimulation, temporally synchronized with the target speech, using the pulse-width modulation technique (Barr, 2001). Auditory stimuli were delivered via a frequency-equalized loudspeaker positioned in the front hemifield at a distance of 1 m from the center of the listener's head. Vibrotactile stimuli were delivered via a DC vibration motor attached to the participant's left index finger. Participants were instructed to attend to the target sentences and report what they heard. Sentence recognition in the auditory-tactile condition was compared to that in the auditory-only condition.
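The envelope-extraction and PWM-mapping steps described above can be illustrated with a short sketch. This is not the authors' actual stimulus code; it is a minimal illustration assuming a Hilbert-transform envelope, a low-pass cutoff at the stated modulation rate, depth scaling about the envelope mean, and a direct envelope-to-duty-cycle mapping for the motor driver. All function names here are hypothetical.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def extract_envelope(signal, fs, rate_hz, depth):
    """Extract the amplitude envelope of `signal`, low-pass filter it at
    `rate_hz` (the modulation-rate cutoff), normalize to [0, 1], and scale
    its excursions about the mean by `depth` (0.0 = flat, 1.0 = full)."""
    env = np.abs(hilbert(signal))                      # Hilbert-transform envelope
    sos = butter(4, rate_hz / (fs / 2), btype="low", output="sos")
    env = np.clip(sosfiltfilt(sos, env), 0.0, None)    # zero-phase smoothing
    env /= env.max() or 1.0                            # normalize (guard silence)
    # Modulation-depth scaling: depth 0 collapses to the mean (no modulation)
    return env.mean() + depth * (env - env.mean())

def envelope_to_duty_cycle(env):
    """Map a [0, 1] envelope to a PWM duty cycle driving the DC motor."""
    return np.clip(env, 0.0, 1.0)

# Example: a 1-kHz carrier with 4-Hz amplitude modulation
fs = 16000
t = np.arange(fs) / fs
sig = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
env = extract_envelope(sig, fs, rate_hz=10, depth=0.75)
duty = envelope_to_duty_cycle(env)
```

In a real setup, the duty-cycle samples would be streamed to a microcontroller's PWM output in sync with audio playback; the depth parameter here directly corresponds to the 0-100% modulation-depth conditions tested.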
Results/Conclusions
Average results showed that adding a temporally synchronized tactile cue to the auditory signal provided significant improvements in speech recognition (2 to 25%) compared to audio-only stimulation. In particular, there was a zone at intermediate SNRs (-9 to -3 dB) where auditory-tactile integration yielded substantial benefits. Moreover, the maximum improvement in speech-in-noise performance relative to the audio-only condition was observed when the vibrotactile cue was synchronized with the acoustic amplitude envelope modulated at a 10-Hz rate and 75% depth. Our findings suggest that multisensory interactions are fundamentally important for speech perception in NH listeners, especially at intermediate SNR levels, and that the outcome of this multisensory processing depends strongly on the temporal coherence between the multimodal sensory inputs. This suggests that multisensory integration in speech perception requires salient temporal cues to enhance speech perception ability in noisy environments.