
Poster A3

Speech reconstruction of higher formant and dispersion dynamics predicts listeners’ ability to resolve multispeaker scenarios

Poster Session A - Saturday, April 13, 2024, 2:30 – 4:30 pm EDT, Sheraton Hall ABC

Francisco Cervantes Constantino1, Rodrigo Caramés Harcevnicow1, Ángel Caputi2; 1Universidad de la Republica, Uruguay, 2Instituto Clemente Estable

Speaker identification cues are readily available to listeners in the rich spectrotemporal information content of the human voice. It is not clear whether encoding the dynamics of its spectral modulations may assist the processes of speech intelligibility and selective attention. Here, we hypothesize that formant information, broadcast by the speaker’s laryngeal system configuration during vocal production, can be tracked by (and reliably decoded from) cortical networks during speech listening. In addition, we address whether any such encoding may impact listener behavior in a ‘cocktail-party’ task. For this, we investigate cortical activity in electroencephalogram (EEG) signals using the stimulus reconstruction technique, measuring how well instantaneous formant (F1-F5) and related formant dispersion (delta-F) variations can be decoded from the EEG. Participants (N=73) listen to brief (~9 s), independent solo speech presentations from dozens of different speakers while undergoing EEG. The neural representation of vocal modulations in the single-trial data is assessed by measuring the decodability of formant dynamics from the EEG recordings. Our results show above-chance performance of listener decoders for the vocal spectral features corresponding to higher formants F4 and F5, as well as delta-F. Furthermore, a multidimensional index of performance measures, constructed across the three spectral features, characterized participants’ success in selecting and understanding speech in additional ‘cocktail-party’ task settings. Our results suggest that decoding of these important vocal identity-related modulations from electrophysiological activity may add to the battery of objective measures of speech listening in naturalistic conditions.
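
For readers unfamiliar with stimulus reconstruction, the following is a minimal sketch of one common way such a decoder is built: a ridge-regularized linear map from time-lagged EEG to a stimulus feature, scored by the correlation between the reconstructed and actual feature on held-out data. It is not the authors' pipeline. The function names, the lag count, the ridge value, and the synthetic data are all assumptions made for illustration, and the delta-F helper assumes the simple "mean spacing between adjacent formants" definition, which may differ from the one used in the study.

```python
import numpy as np

def lagged_design(eeg, n_lags):
    """Build a design matrix from EEG (samples x channels).
    Row t holds eeg[t], eeg[t+1], ..., eeg[t+n_lags-1]: the neural response
    follows the stimulus, so the decoder looks forward in time from t."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_channels, (lag + 1) * n_channels)
        X[: n_samples - lag, cols] = eeg[lag:]
    return X

def fit_decoder(eeg, feature, n_lags, ridge):
    """Ridge-regression backward model: lagged EEG -> stimulus feature."""
    X = lagged_design(eeg, n_lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ feature)

def reconstruction_score(eeg, feature, weights, n_lags):
    """Pearson correlation between reconstructed and actual feature."""
    predicted = lagged_design(eeg, n_lags) @ weights
    return np.corrcoef(predicted, feature)[0, 1]

def formant_dispersion(formants):
    """Assumed delta-F definition: mean spacing between adjacent formants,
    i.e. (F_n - F_1) / (n - 1), computed along the last axis."""
    formants = np.asarray(formants, dtype=float)
    return (formants[..., -1] - formants[..., 0]) / (formants.shape[-1] - 1)

# Synthetic demonstration (hypothetical data, not from the study):
# each "channel" is a delayed, noisy copy of the feature time series.
rng = np.random.default_rng(0)
n_samples, n_channels, n_lags = 2000, 8, 5
feature = rng.standard_normal(n_samples)
eeg = rng.standard_normal((n_samples, n_channels))
for ch in range(n_channels):
    delay = ch % n_lags
    eeg[delay:, ch] += feature[: n_samples - delay]

# Train on the first half, evaluate on the held-out second half.
half = n_samples // 2
weights = fit_decoder(eeg[:half], feature[:half], n_lags, ridge=1.0)
score = reconstruction_score(eeg[half:], feature[half:], weights, n_lags)

# Crude chance estimate: score the same decoder against a shuffled feature.
shuffled = rng.permutation(feature[half:])
null_score = reconstruction_score(eeg[half:], shuffled, weights, n_lags)

print(f"held-out reconstruction r = {score:.2f}, shuffled r = {null_score:.2f}")
print("delta-F example (Hz):", formant_dispersion([500, 1500, 2500, 3500, 4500]))
```

In practice, "above-chance" decoding would be judged against a null distribution built from many such mismatched pairings rather than a single shuffle, and the regularization strength would be chosen by cross-validation.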

Topic Area: ATTENTION: Auditory




April 13–16  |  2024