Symposium Session 9 - Cognitive Insights into Attention and Cross-Modal Integration from Rapid Invisible Frequency Tagging
Chairs: Hyojin Park1, Ole Jensen2; 1University of Birmingham, 2University of Oxford
Presenters: Hyojin Park, Charlie Reynolds, Yali Pan, Ana Pesquita, Ole Jensen, Katrien Segaert, James Dowsett, Inés Martín Muñoz, Paul Taylor, Xingshan Li, Linejieqiong Huang, Qiwei Zhang, Yali Pan, Ole Jensen, Ole Jensen, Lijuan Wang, Steven Frisson, Yali Pan
A central challenge in cognitive neuroscience is to understand how sensory information is routed during attention allocation and how cross-modal inputs are integrated. Assessing the excitability of sensory regions provides an important means of investigating these processes. A recently developed method, rapid invisible frequency tagging (RIFT), offers a powerful tool for probing sensory processing with high temporal resolution. RIFT involves flickering task-relevant stimuli at very high frequencies (50–80 Hz). Although these flickers remain invisible to observers, they elicit reliable neuronal responses that can be recorded using electroencephalography (EEG) and magnetoencephalography (MEG). This symposium will showcase recent applications of RIFT in cognitive neuroscience, with a focus on attention and multisensory integration. We will present evidence demonstrating how RIFT can be used to track the allocation of covert and presaccadic attention across a range of paradigms, including cross-modal integration and reading. Furthermore, we will introduce novel interventional approaches that combine RIFT with methods designed to modulate the excitability of the early visual cortex. These advances illustrate how RIFT can provide both observational and causal insights into sensory processing. In summary, RIFT represents an emerging methodological innovation that enables researchers to measure and manipulate rapid neural dynamics underlying visual and cross-modal integration. By offering fast, non-invasive, and robust access to neuronal excitability in sensory regions, RIFT opens new opportunities for investigating the mechanisms through which attention and integration shape cognition. This symposium will highlight the promise of RIFT as a versatile tool for advancing our understanding of brain function.
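To make the tagging logic concrete, here is a minimal sketch (Python with NumPy/SciPy; the 60 Hz tag frequency, sampling rate, effect size, and noise level are illustrative assumptions, not values taken from the studies below) of how a high-frequency flicker produces a spectral response that can be quantified against neighbouring frequencies:

```python
import numpy as np
from scipy.signal import welch

# Illustrative parameters (assumptions, not taken from the studies in this symposium)
fs = 1000          # sampling rate of the simulated MEG/EEG trace (Hz)
tag_freq = 60      # flicker (tagging) frequency (Hz)
t = np.arange(0, 10.0, 1 / fs)

# Invisible flicker: high-frequency sinusoidal luminance modulation of the stimulus
flicker = 0.5 + 0.5 * np.sin(2 * np.pi * tag_freq * t)

# Simulated sensor trace: a weak copy of the flicker buried in broadband noise
rng = np.random.default_rng(0)
trace = 0.1 * (flicker - flicker.mean()) + rng.normal(scale=1.0, size=t.size)

# Tagging response: spectral power at the tagged frequency vs. neighbouring bins
freqs, psd = welch(trace, fs=fs, nperseg=2 * fs)          # 0.5 Hz resolution
tag_bin = np.argmin(np.abs(freqs - tag_freq))
neighbours = np.r_[tag_bin - 6:tag_bin - 1, tag_bin + 2:tag_bin + 7]
print(f"Tagging SNR at {tag_freq} Hz: {psd[tag_bin] / psd[neighbours].mean():.1f}")
```

In recorded data the same contrast (power or coherence at the tagged frequency relative to a baseline) would be computed per sensor and per condition, so that changes in the tagging response index the excitability of the cortical region representing the flickered stimulus.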
Presentations
Enhancing Speech Comprehension via Cross-Modal Integration with Rapid Invisible Frequency Tagging
Hyojin Park1, Charlie Reynolds1, Yali Pan1, Ana Pesquita1, Ole Jensen2, Katrien Segaert1; 1University of Birmingham, 2University of Oxford
Understanding speech in noisy environments relies on both auditory and visual cues, with lip movements providing a powerful scaffold for comprehension. We hypothesised that externally modulating visual speech signals using non-invasive rhythmic stimulation could enhance cross-modal integration and improve speech understanding. To test this, we developed a novel paradigm applying Rapid Invisible Frequency Tagging (RIFT) to naturalistic audiovisual speech. Forty participants viewed videos of a speaker under dichotic listening conditions, where one ear received speech matching the visual input (task-relevant) and the other a mismatched stream (task-irrelevant). Both auditory streams were tagged at 40 Hz. A 55 Hz invisible flicker was applied to the speaker’s mouth region, modulated either by the task-relevant or task-irrelevant auditory amplitude envelope. Behaviourally, RIFT significantly improved speech comprehension when the visual flicker was driven by the relevant auditory amplitude envelope. MEG recordings confirmed robust auditory and visual tagging responses in their respective cortices across all conditions. Critically, visual tagging was stronger when driven by relevant speech rhythms, and this enhancement predicted individual comprehension performance. These findings demonstrate that subtle modulation of visual input with task-relevant auditory rhythms can increase visual cortical excitability and promote cross-modal integration. RIFT therefore provides a promising, non-invasive approach to boosting speech intelligibility in multi-speaker environments, with potential applications for older adults, individuals with hearing loss, and populations with auditory processing disorders.
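As a rough illustration of envelope-driven tagging, the sketch below (Python/NumPy/SciPy; the frame rate, filter settings, and modulation depth are assumptions for illustration, not the authors' stimulus code) constructs a 55 Hz flicker whose depth follows a speech amplitude envelope:

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def speech_envelope(audio, audio_fs, cutoff=10.0):
    """Low-pass filtered Hilbert amplitude envelope of a speech waveform."""
    env = np.abs(hilbert(audio))
    b, a = butter(4, cutoff / (audio_fs / 2), btype="low")
    return filtfilt(b, a, env)

def envelope_modulated_flicker(envelope, env_fs, flicker_fs=1440, flicker_freq=55, depth=0.5):
    """55 Hz luminance flicker whose modulation depth follows the speech envelope.
    flicker_fs, flicker_freq and depth are illustrative assumptions."""
    t = np.arange(0, len(envelope) / env_fs, 1 / flicker_fs)
    env = np.interp(t, np.arange(len(envelope)) / env_fs, envelope)
    env = (env - env.min()) / (env.max() - env.min() + 1e-12)   # normalise to [0, 1]
    carrier = 0.5 + 0.5 * np.sin(2 * np.pi * flicker_freq * t)  # invisible 55 Hz flicker
    return (1 - depth) + depth * env * carrier                  # per-frame luminance

# Usage with a synthetic stand-in for the task-relevant speech stream
rng = np.random.default_rng(1)
audio_fs = 16000
audio = rng.normal(size=5 * audio_fs)                           # placeholder "speech"
luminance = envelope_modulated_flicker(speech_envelope(audio, audio_fs), audio_fs)
```

Driving the flicker with the task-relevant rather than the task-irrelevant envelope would then be the only difference between the two conditions.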
Decoding Real-World Visual Scenes from the Human Gamma Band with Flicker-Evoked Oscillations
James Dowsett1, Inés Martín Muñoz2, Paul Taylor3,4; 1University of Stirling, 2Technical University Munich, 3LMU Munich, 4University of Zurich
Current approaches to investigating the role of neural oscillations in natural scene processing have been limited to artificial stimuli and lengthy data collection. We present a new method for decoding which real-world scene participants are viewing from the steady-state visual evoked potentials (SSVEPs) elicited while they wear flickering LCD glasses. We discovered that SSVEP responses to real-world scenes are surprisingly complex and have distinct waveform shapes: they differ markedly across scenes and participants but are consistent within individuals, even across multiple days. Because the waveform shape varies greatly between stimuli yet remains reliable, decoding works even with a single electrode. Decoding is highly accurate with 5-10 seconds of data and remains above chance with less than a second of data. The approach works almost as well with 40 Hz visual flicker as with 10 Hz and 1 Hz, demonstrating the possibility of using high-frequency flicker as a cognitively meaningful measure in real-world mobile EEG experiments. Decomposing the SSVEPs into frequency bands showed that information about the visual scene is present across all of the harmonics of the flicker frequency, with 40 Hz (gamma band) carrying the most information across the flicker frequencies tested. These findings implicate a broad range of oscillations in encoding real-world scenes, with a particular importance for 40 Hz, and establish the SSVEP’s temporal profile as a rich source of information for decoding.
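A minimal sketch of the decoding idea, assuming a simple correlation-based template matcher applied to cycle-averaged single-electrode waveforms (the scene names, sampling rate, waveforms, and noise level are hypothetical, not the authors' classifier):

```python
import numpy as np

def cycle_average(eeg, fs, flicker_freq):
    """Average a single-electrode trace over flicker cycles to get the SSVEP waveform shape."""
    period = int(round(fs / flicker_freq))
    n_cycles = len(eeg) // period
    return eeg[:n_cycles * period].reshape(n_cycles, period).mean(axis=0)

def decode_scene(test_eeg, templates, fs, flicker_freq):
    """Pick the scene whose template waveform correlates best with the test waveform."""
    wave = cycle_average(test_eeg, fs, flicker_freq)
    corrs = {scene: np.corrcoef(wave, tmpl)[0, 1] for scene, tmpl in templates.items()}
    return max(corrs, key=corrs.get)

# Hypothetical usage: templates are cycle-averaged waveforms from training data per scene
fs, flicker_freq = 1000, 40
rng = np.random.default_rng(2)
shape_a = rng.normal(size=fs // flicker_freq)                   # waveform for scene A
shape_b = rng.normal(size=fs // flicker_freq)                   # waveform for scene B
templates = {"scene_A": shape_a, "scene_B": shape_b}
test = np.tile(shape_a, 200) + rng.normal(scale=2.0, size=200 * (fs // flicker_freq))
print(decode_scene(test, templates, fs, flicker_freq))          # expected: scene_A
```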
Top-Down Modulation of Visual Attention Resolves Word Boundary Ambiguity in Chinese Reading
Xingshan Li1, Linejieqiong Huang1, Qiwei Zhang1, Yali Pan2, Ole Jensen3; 1Chinese Academy of Sciences, 2University of Birmingham, 3University of Oxford
The absence of inter-word spaces in Chinese often creates ambiguity in word boundaries, as exemplified by overlapping ambiguous strings (OAS). For instance, in the OAS “网站台”, the first two characters form the word “网站” (website), whereas the last two form the word “站台” (platform). This is analogous to the English phrase “milk tea bag,” which can be parsed as “milk tea” + “bag” or “milk” + “tea bag.” During reading, Chinese readers must decide which word the middle character belongs to. In this study, we used a recently developed technique, rapid invisible frequency tagging (RIFT), to investigate whether covert attention contributes to word segmentation. EEG was recorded while participants read three-character strings (ABC) and performed a word segmentation task. The strings were either ambiguous (both AB and BC formed words) or unambiguous (only AB or BC formed a word). Character C was flickered at 60 Hz while participants fixated on character B. Results showed (1) stronger tagging responses when character C belonged to the preferentially segmented word, and (2) an early onset of this effect—approximately 120 ms for OAS and 50 ms for unambiguous strings. These findings indicate that attention dynamics are tightly linked to word segmentation and play a crucial role in resolving word boundaries. The early timing of the tagging response suggests that the effect likely originates from early visual areas such as V1.
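One way to obtain a fixation-locked tagging time course of this kind is sketched below (band-pass around 60 Hz followed by a Hilbert envelope; the simulated effect latency, noise level, and onset criterion are arbitrary assumptions, not the authors' analysis pipeline):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def tagging_timecourse(signal, fs, tag_freq=60.0, bandwidth=5.0):
    """Time-resolved tagging amplitude: band-pass around the flicker frequency,
    then take the Hilbert envelope."""
    b, a = butter(4, [(tag_freq - bandwidth) / (fs / 2), (tag_freq + bandwidth) / (fs / 2)],
                  btype="band")
    return np.abs(hilbert(filtfilt(b, a, signal)))

# Simulated fixation-locked traces (noise kept small for clarity; latency is arbitrary)
fs = 1000
rng = np.random.default_rng(3)
t = np.arange(-0.2, 0.6, 1 / fs)                  # seconds relative to fixation onset on B
attended = 0.3 * np.sin(2 * np.pi * 60 * t) * (t > 0.12) + rng.normal(scale=0.05, size=t.size)
ignored  = 0.1 * np.sin(2 * np.pi * 60 * t) * (t > 0.12) + rng.normal(scale=0.05, size=t.size)

diff = tagging_timecourse(attended, fs) - tagging_timecourse(ignored, fs)
onset = t[np.argmax(diff > 0.5 * diff.max())]     # crude onset estimate: first half-maximum crossing
print(f"Attention effect emerges ~{onset * 1000:.0f} ms after fixation onset")
```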
Neural Evidence for Multilevel Parafoveal Processing Supporting Natural Reading
Ole Jensen1, Lijuan Wang2, Steven Frisson2, Yali Pan2; 1University of Oxford, 2University of Birmingham
Fluent reading requires extraction of information not only from the fixated word but also from upcoming parafoveal words. A central question is whether lexical, semantic, and phonological information can be accessed and integrated from parafoveal words during natural reading. Across three MEG/eye-tracking studies, we investigated neural signatures of parafoveal processing using frequency tagging. In the first study, we applied Rapid Invisible Frequency Tagging (RIFT) to examine lexical processing. Target words were subliminally flickered at 60 Hz, and tagging responses were measured during fixations on the preceding word, when the targets were in the parafovea. Pre-target responses were stronger when the upcoming word was of low rather than high lexical frequency, and the magnitude of this parafoveal effect predicted individual reading speed, indicating lexical extraction. In the second study, we used RIFT to probe semantic processing of parafoveal words. Target words were either congruent or incongruent with sentence context. Pre-target tagging responses were weaker and delayed for incongruent words, providing evidence that semantic information is accessed and integrated parafoveally before direct fixation. Finally, we examined phonological processing by embedding orthography–phonology consistent or inconsistent target words in sentences while driving auditory cortex with a 40 Hz tagging tone. Orthography–phonology inconsistent words elicited stronger left auditory cortex coherence 94–232 ms after pre-target fixation, showing extraction of phonological information while the words were still in the parafovea. Together, these findings demonstrate that lexical, semantic, and phonological information is extracted from parafoveal words within 100–200 ms, supporting cascaded and parallel models of reading.
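For the auditory tagging analysis, a coherence estimate between the 40 Hz amplitude-modulated tone and a cortical signal can be computed along the following lines (a sketch with simulated data; the sampling rate, recording length, and effect size are assumptions, and the authors' source-level pipeline is not reproduced here):

```python
import numpy as np
from scipy.signal import coherence

# Simulated data (assumed sampling rate, duration, and effect size)
fs, tag_freq = 1000, 40
rng = np.random.default_rng(4)
t = np.arange(0, 60, 1 / fs)                                   # 60 s of "recording"

am_envelope = 0.5 + 0.5 * np.sin(2 * np.pi * tag_freq * t)     # 40 Hz amplitude modulation of the tone
auditory = (0.2 * np.sin(2 * np.pi * tag_freq * t + 0.8)
            + rng.normal(scale=1.0, size=t.size))              # simulated auditory-cortex signal

f, coh = coherence(am_envelope, auditory, fs=fs, nperseg=2 * fs)
print(f"Coherence at {tag_freq} Hz: {coh[np.argmin(np.abs(f - tag_freq))]:.2f}")
```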