Advances in speech prosody perception research: Integrating behavioral, neuroimaging, (neuro)genomics, and clinical techniques
Symposium Session 11: Tuesday, April 16, 2024, 1:30 – 3:30 pm EDT, Ballroom West
Chairs: Tamar Regev1, Srishti Nayak2; 1MIT, 2Vanderbilt University Medical Center
Presenters: Maya Inbar, Shir Genzer, Anat Perry, Eitan Grossman, Ayelet N. Landau, Anna Greenwald, Tamar Regev, Hee So Kim, Niharika Jhingan, Hope Kean, Colton Casto, Evelina Fedorenko, Srishti Nayak, Alyssa C. Scartozzi, Daniel E. Gustavson, Nicole Creanza, Cyrille L. Magne, Jennifer E. Below, Reyna L. Gordon
Prosody encompasses the acoustic features of spoken language (pitch, loudness, duration, timbre) that carry linguistic, emotional, and social information. Although prosody plays an essential role in human communication and has attracted significant attention in psycholinguistics, the cognitive, neural, and biological mechanisms supporting prosody perception remain unclear. This symposium seeks to spotlight the significance of prosody research in the field of human communication while presenting recent advances in understanding speech prosody perception. We will examine the major components of prosody, emphasizing rhythmic, intonational, and emotional information embedded in speech. The speakers will present recent advances enabled by diverse methodologies, including behavioral, EEG, fMRI, neurogenetics, and clinical approaches. These insights will shed light on how listeners perceive and process speech prosody, with relevance to real-world communication contexts. We will conclude with an inclusive discussion, engaging both our symposium speakers and the audience, to tackle important open questions in prosody research. Among the issues up for debate: Do distinct aspects of prosody rely on shared or unique processing mechanisms? What might the neural architecture for processing prosody in the brain look like, and how might its dysfunctions be linked to disorders in prosody perception and production? How can the study of individual differences enhance our understanding of prosody skills in the population and their relevance for language and learning?
Intonation Units in spontaneous speech evoke a neural response
Maya Inbar1, Shir Genzer1, Anat Perry1, Eitan Grossman1, Ayelet N. Landau1; 1The Hebrew University of Jerusalem
Spontaneous speech is produced in chunks called Intonation Units (IUs). IUs are defined by a set of prosodic cues and presumably occur in all human languages. Recent work has shown that across different grammatical and socio-cultural conditions IUs form rhythms of approximately one unit per second. Linguistic theory suggests that IUs pace the flow of information and serve as a window onto the dynamic focus of attention in speech processing. As a result, IUs provide a promising and hitherto unexplored theoretical framework for studying the neural mechanisms of communication. We identify a neural response unique to the boundary defined by the IU, and relate our findings to the body of research on rhythmic brain mechanisms in speech processing. We measured the EEG of participants (N=50) who listened to different speakers recounting an emotional life event in Hebrew. We analyzed the speech stimuli linguistically, and modeled the EEG response at word offset using a GLM approach. Words were categorized as either IU-final or IU-nonfinal. Additionally, we quantified an acoustic-based measure of prosodic boundary strength. We find that the EEG response to IU-final words differs from the response to IU-nonfinal words even when equating acoustic boundary strength. Finally, we study the unique contribution of IUs and acoustic boundary strength in predicting delta-band EEG. This analysis suggests that IU-related neural activity, which is tightly linked to the classic Closure Positive Shift, could be a time-locked component that captures the previously characterized delta-band neural speech tracking.
Effects of stroke on emotional prosody processing
Anna Greenwald1; 1Georgetown University Medical Center, Washington, DC, USA
After injury to the brain’s right hemisphere, patients often have lasting difficulty producing and/or comprehending emotional prosody – at least when the injury happens in adulthood. Interestingly, such difficulties are rarely observed in children and adults who had a stroke around the time of birth. This is reminiscent of the relative absence of other language impairments in this population. It is now well established that after a large left-hemisphere perinatal stroke, language functions normally supported by left perisylvian cortex can be supported by homotopic right-hemisphere regions instead. Could the inverse be true for emotional prosody after large right-hemisphere perinatal stroke? And could contralesional left-hemisphere activation also play a role in recovery from aprosodia after stroke in adulthood? Using functional MRI, we identified emotional prosody areas in 10 participants who had a right-hemisphere stroke around the time of birth, 10 participants who had a right-hemisphere stroke in adulthood, and matched controls for both groups. As expected, prosody activation was right-lateralized in controls. In contrast, most perinatal stroke survivors showed their strongest prosody activation in left perisylvian cortex, homotopic to the right perisylvian areas most strongly activated in controls. Prosody activation after adult stroke was more variable, likely due to greater variability in lesion size and location. However, left-hemisphere activation was particularly high in one participant whose stroke affected all right-hemisphere areas activated by controls. These results highlight left perisylvian activation as an important contributor to emotional prosody processing after large right-hemisphere stroke and as a potential target for aprosodia treatment.
A network of brain areas is sensitive to prosody and distinct from language and auditory areas
Tamar Regev1, Hee So Kim1, Niharika Jhingan1, Hope Kean1, Colton Casto1, Evelina Fedorenko1; 1MIT
Supra-segmental prosody refers to acoustic features of speech beyond phonetics. Prosodic features include pitch, loudness, and duration/pauses and convey linguistic, emotional, and other socially relevant information. Does the brain contain specialized areas for processing prosody or is prosodic information processed by known auditory or language areas? Previous neuroimaging studies have reported sensitivity to prosody in numerous brain regions, but have only included a few conditions, making it difficult to infer the underlying neural computations. We designed a new fMRI ‘localizer’ for prosody-sensitive areas based on a contrast between prosody-rich stimuli and stimuli with distorted prosody, and then characterized these areas with respect to diverse auditory, linguistic, and social conditions. Our prosody localizer contrast identified several temporal and frontal areas, which were strongly sensitive to prosody with or without linguistic content. We replicated this finding in two distinct experiments (n=37 participants overall). These prosody-sensitive areas were adjacent to but distinct from language areas which extract meaning from linguistic input, and from areas that support pitch perception, speech perception, and general cognitive demands. Furthermore, these areas were selective for prosody over diverse types of natural sounds, but showed some response to communicative signals, especially facial expressions and non-speech vocalizations. These results suggest that prosody is processed by a network of brain areas that lie in close proximity to language-selective areas and are broadly sensitive to non-linguistic communicative cues. This work lays a critical foundation for further investigations of the neural basis of prosodic processing and its disorders.
Genetic individual differences in speech rhythm sensitivity: implications for the cognitive neuroscience of language and learning
Srishti Nayak1, Alyssa C. Scartozzi2, Daniel E. Gustavson3, Nicole Creanza2, Cyrille L. Magne4, Jennifer E. Below1, Reyna L. Gordon1; 1Vanderbilt University Medical Center, 2Vanderbilt University, 3University of Colorado, Boulder, 4Middle Tennessee State University
Disruptions in stress and rhythm perception in speech (an aspect of prosody perception) have been linked to developmental disorders of speech and language, and to learning disorders. Individual differences in word-level stress perception have previously been linked with variability in reading skills in children and adults, and with developmental speech-language disorders. Here, we report on the first genome-wide association study (GWAS) of prosody, revealing that the genetic variant (SNP) significantly associated with prosody (p = 8.39e-10) occurs in or genetically upstream of the gene TMEM108, which is involved in brain and central nervous system development (e.g., neuronal migration), function (e.g., cellular response to BDNF), and structure (e.g., fetal brain basal ganglia). Top genes associated with prosody, while non-significant, echo these results. Further, we explored the evolutionary history of human prosody perception by investigating comparative biology of prosody perception and vocal learning in songbirds. Gene-set enrichment analyses showed that genes expressed in songbird brain Area X (a key song learning brain area, homologous to human basal ganglia) were overrepresented in human prosody-related genes (p < 7.14e-3). All analyses were corrected for multiple testing. This work highlights the biology of speech rhythm perception and the broader neurobiological and neurodevelopmental functions associated with it. Our comparative approach using songbird models provides evidence for shared evolutionary mechanisms between human speech rhythm perception and songbird vocal learning, extending similar findings about human musical rhythm (Gordon et al., 2021). These results are a key step in mapping biological relationships between prosody and other language and learning processes.
April 13–16 | 2024