Predicting the next sentence (not word) in pretraining: What model-brain alignment tells us about discourse-level comprehension

Poster Session A - Saturday, April 13, 2024, 2:30 – 4:30 pm EDT, Sheraton Hall ABC

Shaoyun Yu1, Chanyuan Gu1, Kexin Huang1, Ping Li1; 1The Hong Kong Polytechnic University

In large language models (LLMs), word prediction is a backbone task for pretraining, and it has also been examined empirically and theoretically in models of human language processing (Goldstein et al., 2022). However, recent studies have highlighted the limitations of relying on this task alone (Arana et al., 2023). Crucially, human language is acquired and represented at multiple levels, and the language user needs to integrate the meanings of words and sentences to achieve a full understanding of discourse (Li & Clariana, 2019). This study moves beyond word prediction and uses next sentence prediction (NSP; Devlin et al., 2019) to investigate the mechanisms of discourse-level comprehension. We used two independent fMRI datasets on sentence reading (Li et al., 2022; Pereira et al., 2018). We found that discourse-level pretraining with NSP enhances a model’s alignment with brain data, as evaluated with representational similarity analysis (RSA). In particular, increased model-brain alignment was observed mainly in the right-hemisphere homologues of core language regions (IFG, IFGorb, ATG) and in the multiple-demand network (left MFG, ACC, right SFG, MFGorb, PrecG), highlighting the contributions of non-classical language regions to high-level language understanding. NSP also enabled the model to better capture subjects’ comprehension speed (Caucheteux et al., 2022) and to better encode contextual information (Toneva & Wehbe, 2019). We discuss these findings in light of current neurolinguistic theories (MacGregor et al., 2022; Yang et al., 2019). This study supports NSP’s cognitive plausibility and demonstrates that model-brain alignment can be a viable approach to addressing outstanding questions in the neuroscience of language.
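For readers unfamiliar with RSA, the following is a minimal sketch of how a model-brain alignment score of this kind can be computed. It is not the authors' pipeline: the function names (`rdm`, `rsa_score`), the choice of correlation distance and rank correlation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows.

    `patterns` has one row per stimulus (e.g. sentence) and one column per
    feature (model dimension or voxel).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rank(values):
    """Ordinal ranks (ties broken by position; adequate for continuous data)."""
    return np.argsort(np.argsort(values)).astype(float)


def rsa_score(model_patterns, brain_patterns):
    """Rank correlation between the model RDM and the brain RDM."""
    model_vec = upper_triangle(rdm(model_patterns))
    brain_vec = upper_triangle(rdm(brain_patterns))
    return np.corrcoef(rank(model_vec), rank(brain_vec))[0, 1]


# Synthetic stand-ins (assumption): 20 "sentences", 64-d model embeddings.
rng = np.random.default_rng(0)
n_sentences = 20
model_embeddings = rng.normal(size=(n_sentences, 64))

# Toy "voxel" data: a noisy random linear projection of the embeddings,
# so its similarity structure should resemble the model's.
voxel_patterns = (
    model_embeddings @ rng.normal(size=(64, 500))
    + 0.1 * rng.normal(size=(n_sentences, 500))
)
# Control: voxel data unrelated to the embeddings.
unrelated_patterns = rng.normal(size=(n_sentences, 500))

aligned_score = rsa_score(model_embeddings, voxel_patterns)
baseline_score = rsa_score(model_embeddings, unrelated_patterns)
print(f"aligned: {aligned_score:.3f}, baseline: {baseline_score:.3f}")
```

In a comparison like the one described above, the same score would be computed for representations from a model pretrained with NSP and from one pretrained without it, region by region, and the two sets of scores contrasted.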

Topic Area: LANGUAGE: Other


April 13–16  |  2024