Author manuscript; available in PMC: 2025 Dec 1.
Published in final edited form as: Brain Res. 2024 Jul 19;1844:149127. doi: 10.1016/j.brainres.2024.149127

Memory representations are flexibly adapted to orthographic systems: A comparison of English and Hebrew

Erin S Isbilen, Abigail Laver, Noam Siegelman, Richard N Aslin
PMCID: PMC11411488  NIHMSID: NIHMS2014407  PMID: 39033951

Abstract

Across languages, speech unfolds in the same temporal order, constrained by the forward flow of time. But the way phonology is spatially mapped onto orthography is language-specific, with writing directions including left-to-right, right-to-left, and top-to-bottom, among others. While the direction of writing systems influences how known words are visually processed, it is unclear whether it influences learning and memory for novel orthographic regularities. The present study tested English and Hebrew speakers on an orthographic word-referent mapping task in their native orthographies (written left-to-right and right-to-left, respectively), where the onsets and offsets of words were equally informative cues to word identity. While all individuals learned orthographic word-referent mappings significantly above chance, the parts of the word that were most strongly represented varied. English monolinguals false alarmed most to competing foils that began with the same bigram as the target, representing word onsets most strongly. However, Hebrew bilinguals trained on their native orthography showed no difference between false alarm rates to onset and offset competitors, representing the beginnings and ends of words equally strongly. Importantly, Hebrew bilinguals tested on English words displayed a more English-like false alarm pattern (although not a full switch), suggesting that memory biases adapt to the opposite directionality of encountered text while retaining traces of native language biases. These findings demonstrate that experience with different writing systems influences how individuals represent novel orthographic words, starting in the earliest stages of learning.

Keywords: Statistical learning, Memory, Reading, Individual differences, Language acquisition

1. Introduction

One of the most challenging feats of everyday cognition is the process of acquiring the ability to read. Most of us take literacy for granted, but of course it is not a universal accomplishment, even in Western, Educated, Industrialized, Rich, Democratic (WEIRD) societies such as the United States, where up to 21 % of individuals lack proficient reading skills by adulthood (Goodman et al., 2013; Rampey et al., 2016). This sharply contrasts with auditory spoken language skills, which are almost universally acquired in hearing populations. Reading is not only acquired after spoken language skills have become quite sophisticated, with formal reading instruction commencing around age 5 or later, but reading is parasitic on the structural properties of spoken language (Mattingly, 1972). Children must not only map sounds to orthographic symbols, but they must do so efficiently as they move their gaze across an array of text while holding preceding items in memory.

Unlike spoken language, where the coarticulation of phonetic information (e.g., from one syllable to the next) blurs the boundaries between words (Cole & Jakimik, 1980; Saffran, 2001), orthographic information in most languages has clearly demarcated segmentation cues: the spaces between words (although see Thai and Chinese scripts for exceptions). But the advantage of this inherent segmentation cue to written word boundaries is offset by the need to learn a language-specific spatial arrangement of words: the direction in which text is ordered. That is, the temporal order of spoken language is universally constrained by time, which only moves in a single direction. By contrast the spatial order of written materials is not subject to this inherent constraint, and can follow a wide variety of directions. For example, in English and many other languages, the temporal order of speech is mapped spatially onto text in a left-to-right configuration, but in other languages the mapping of speech to text is from right-to-left (e.g., Hebrew) or from top-to-bottom (e.g., Japanese). Indeed, some scripts even map speech non-linearly, such as in Devanagari, where consonants are positioned left-to-right but vowels are arranged above, below or on either side of those consonants (Rao et al., 2017), even when the vowel precedes the consonants in speech. Thus, children must not only acquire spoken language before learning to read, but they must then learn the specific temporal-to-spatial mapping of their written language. This difference in spatial ordering may influence how written words are learned and represented across languages, particularly for individuals learning to read in multiple writing systems that feature conflicting spatial orderings (for example, left-to-right and right-to-left).

Spoken language is largely acquired by “mere exposure” to the native language available to young children in the ambient environment. Parents speak and children listen, but parents rarely, if ever, explain the structure of the language. In contrast, reading can be acquired by a mixture of mere exposure and more overt (didactic) methods. A basic mechanism that is hypothesized to play a key role in both spoken and written language acquisition is statistical learning: individuals’ sensitivity to recurrent patterns in the environment. This ability supports the acquisition of structure across a vast array of inputs spanning sensory modalities, facilitating the construction of higher-order units by grouping discrete items together based on their frequency of co-occurrence, such as tones into melodies (Saffran et al., 1999), syllables into words (Saffran et al., 1996), and words into phrases (Saffran & Wilson, 2003). In the visual domain, statistical learning is central not only to the acquisition of sequential information (e.g., items presented one after another; Fiser & Aslin, 2001), but also to spatial information (e.g., the co-occurrence of items in space; Fiser & Aslin, 2002), both key elements of reading. Indeed, in addition to enabling the discovery of text-based regularities, such as grouping letters into morphological units (Crepaldi et al., 2010; Vidal et al., 2021), statistical learning also facilitates the cross-modal mapping of phonology to orthography, and eventually to semantics (He & Tong, 2017; Sawi & Rueckl, 2019), skills that are foundational to literacy (Treiman & Kessler, 2022).

Although a remarkable breadth of research now illustrates the critical contribution of statistical learning to language, many of these studies overlook an important source of variability in the acquisition and representation of linguistic structure. The vast majority of the planet’s population is multilingual (Grosjean, 2010), yet most published statistical learning studies have been conducted with English-speaking monolinguals, who represent the minority of language learners (an issue that is symptomatic of psycholinguistic research more broadly; Christiansen et al., 2022). To illustrate, a large-scale meta-analysis on auditory-linguistic statistical learning (Isbilen & Christiansen, 2022) reported that experiments with native English speakers accounted for 354 of the 636 studies in their sample (approximately 56 %). The next most common participant groups in the meta-analysis were native Spanish and native French speakers (33 studies each), each accounting for only about 5 % of the total studies. While increasing work has begun to broaden in scope to include native speakers of typologically distinct languages, including Mandarin (e.g., Chen et al., 2017), Hebrew (e.g., Shufaniya & Arnon, 2018), and Khalkha Mongolian (LaCross, 2015), such studies still comprise only a small fraction of research on statistical language learning. This highlights a potential lack of generalizability that arises from only studying English monolinguals, and a need to diversify statistical learning research to better characterize how linguistic pattern learning varies across the world’s population.

While psycholinguistic studies often control for language background to rule it out as a potential source of variance in their experimental manipulations, recent efforts have been marshaled toward analyzing language background as a meaningful source of individual differences in statistical learning. Prior knowledge fundamentally shapes subsequent learning, with the structure of one’s native language (L1) either helping or hindering the acquisition of novel statistical patterns. For example, participants display an advantage for stimuli that adhere to high frequency syllable combinations in their L1, whereas novel items composed of low frequency combinations present a significant challenge to learning (Elazar et al., 2022; Finn & Kam, 2008; Siegelman et al., 2018). Similarly, the structure of participants’ native phonology can tune their attention to specific aspects of novel input. Speakers of Khalkha Mongolian, whose native language features vowel harmony, were able to acquire similar non-adjacent vowel dependencies in an artificial language task, whereas American English speakers were not (LaCross, 2015). The structure of participants’ native language, therefore, fundamentally biases how they process new linguistic information, and experience with multiple linguistic systems may constrain learning even further.

To explore this latter possibility, a growing body of literature compares performance between monolinguals and bilinguals to determine whether learning multiple languages invokes meaningful changes in the ability to track statistical regularities (Weiss et al., 2020), although most of these studies examine speech rather than text. Much of this literature tackles the idea of bilingual advantage: the possibility that experience using multiple languages affords gains in statistical learning skills. This hypothesis is rooted in the premise that attending to multiple linguistic systems necessitates attention to a much wider array of regularities, which may promote heightened sensitivity to statistical structure more broadly. However, there also exist arguments for a bilingual disadvantage for certain aspects of language processing, since bilinguals inevitably receive less input for each of their respective languages (e.g., Gollan et al., 2011). Indeed, statistical learning studies on the bilingual advantage have produced mixed results (see Bulgarelli et al., 2018, for a review), making it difficult to discern precisely how second language (L2) experience shapes statistical learning. For instance, bilingualism has been shown to enhance learning outcomes, with bilingual adults outperforming monolingual adults on an auditory statistical learning task comprising an artificial tonal language, even when bilingual participants had no prior exposure to tone languages (Wang & Saffran, 2014). Relatedly, others report that a combination of speaker and exemplar variability disadvantaged monolingual children in a cross-situational learning task, where participants encountered novel spoken words that could label multiple competing referents. However, bilingual children were less affected by this variability and better able to learn word-referent mappings (Crespo et al., 2023).

Conversely, several studies also reveal a distinct lack of bilingual advantage. For example, a comprehensive study by Yim and Rudoy (2013) uncovered no bilingual advantage in children for learning either auditory or visual-nonlinguistic regularities in conventional statistical learning tasks. Poepsel and Weiss (2016) report no difference in learning single auditory word-referent mappings between mono- and bilinguals, suggesting comparable statistical learning capacities. However, bilinguals in their sample did show an advantage for learning mappings where multiple words referred to the same object. While this may suggest that bilinguals could be more adept at detecting the presence of multiple structures, it is also possible that they are able to represent information more flexibly than monolinguals, due to their experience maintaining multiple statistical representations for the same concept (see Weiss et al., 2015). In either case, these results underscore the fact that there are myriad sources of variability that arise both from the structure of one’s native language and from exposure to multiple languages, which may influence how subsequent learning unfolds (Bulgarelli et al., 2018).

Rather than a focus on the question of bilingual advantage, the current paper explores how experience with different languages influences the way newly acquired statistical patterns are represented in memory. Even if learning is achieved to comparable degrees by monolinguals and bilinguals, it may be achieved in distinct manners, with participants representing novel words differently based on the structure of their native language. This hypothesis is grounded in research demonstrating that L1 word processing forms the basis of L2 word processing (Miller, 2019). Bilingual individuals tend to deploy orthographic processing strategies from their native language when reading items in their L2, suggesting that reading of the same wordforms systematically differs across individuals depending on native language background (Akamatsu, 2003). Yet, experience with a second language also influences processing, with bilinguals who are equally fluent in L1 and L2 exhibiting a “preferred language” that can bias how new exemplars are segmented (Cutler et al., 1992). Furthermore, work on early exposure to L2 followed by extensive exposure to L1 reveals that there is a trace of expertise for L2 learning—even when the L2 that participants were exposed to in early childhood was never spoken by those individuals (Au et al., 2002; Pallier et al., 2003). In the written domain, directional scanning biases accrued from one’s native language (e.g., right-to-left for Arabic and Urdu readers) can even transfer to scan habits for non-linguistic visual materials, but decreases with exposure to text that follows the opposite directionality (e.g., English, which is read left-to-right; Padakannaya et al., 2002). The processing of novel information is therefore molded by multiple, sometimes conflicting, forces from L1 and L2.

The present paper investigates how adults of different language backgrounds form novel word-referent mappings across different orthographic systems.1 Although abundant work has examined the statistical learning of spoken regularities (Isbilen & Christiansen, 2022) and visual-nonlinguistic regularities (see Ren et al., 2023; Lee et al., 2022 for meta-analyses), fewer investigate the acquisition of novel orthographic regularities. Of these, none have tested how the direction of the writing system impacts memory representations for novel orthographic words. Furthermore, while prior statistical learning studies have simulated the idea of L1 transfer to L2 by training participants sequentially on auditory regularities in two different artificial languages that share phonological features (Franco et al., 2011; Gebhart et al., 2009), none to our knowledge have tested how entrenched or flexible native L1 reading direction biases are when learning novel orthographic regularities in one’s L2, which follows the opposite spatial mapping (e.g., right-to-left vs. left-to-right). Our experimental design thus pinpoints three theoretical questions: 1) whether the spatial direction of participants’ native writing system biases how they represent novel words in their native orthographies, 2) whether bilinguals shift or maintain these spatial representation biases when presented with novel words in their L2 orthography, given that their L2 and L1 writing systems have opposite spatial directions, and 3) whether L2 proficiency mediates learning and memory representations for orthographic systems in bilingual individuals.

We recruited English monolinguals and Hebrew-English bilinguals to partake in the experiment. Participants were assigned to three conditions: English native speakers trained on novel word-referent mappings in English orthography, Hebrew native speakers trained on novel word-referent mappings in Hebrew orthography (their L1), and Hebrew native speakers trained on novel word-referent mappings in English orthography (their L2). In addition, Hebrew speakers in both conditions were assessed on their English proficiency and history, to determine how second language skills and experience mediated learning and memory.

Hebrew, like English, is an orthographically deep language, where the same letter can possess multiple phonological mappings. It is a Semitic language that uses an alphabetic abjad script, where most letters represent consonants and vowel information is generally not conveyed in print (see, e.g., Shimron, 2006). Importantly, Hebrew is read in the opposite direction from European languages such as English (right-to-left rather than left-to-right), and thereby necessitates shifts in strategies for Hebrew readers when processing Hebrew versus English text. Classic research shows that where on the word the eyes first land when reading (variously referred to as the preferred landing position or preferred viewing position) tends to be toward the left side of the word in English, at the approximate midpoint between the onset and center of the word (Rayner, 1979; see Ducrot & Pynte, 2002, for similar results in French). This “quarter of the way into the word” tendency extends to the reading of Hebrew text, although here, the beginning of the word is to the right rather than the left (Deutsch & Rayner, 1999). It would seem, then, that this onset bias is consistent across languages, regardless of the writing system’s orientation. However, other research suggests that landing positions are less asymmetrical in Hebrew and Arabic (which is also read right-to-left) than in English and other Indo-European languages, and tend to be more central than onset-biased (Brysbaert & Nazir, 2005; Nazir et al., 2004), which affords a broader view of words (e.g., onsets and offsets). Indeed, readers tend to fixate on the parts of words that they expect will minimize uncertainty and maximize information uptake, shifting from word onsets to offsets if offset cues are more informative (Carr et al., 2024), suggesting that encoding can be flexible. While most of this research focuses on how visual words are processed, what remains unclear is whether these propensities also manifest in memory differences when new words enter the lexicon.

In our orthographic statistical learning task, participants encountered novel visual objects that could be mapped to one of two competing written words (based on the design of Magnuson et al., 2003). The words were composed such that they represented analogous phonological forms in English and Hebrew script, to ensure that the only differences between the stimuli were the alphabet, and therefore the direction, in which they were written. Importantly, we elected to utilize overlapping orthographic forms where items formed neighborhoods of similarly spelled words, a relatively uncommon practice in statistical word learning studies (Escudero et al., 2016), where the words tend to be maximally distinct to facilitate learning. The use of overlapping orthographic forms enabled us to determine whether participants would tend to false alarm to (i.e., incorrectly endorse; Macmillan & Kaplan, 1985) certain types of similarly spelled foils, and thereby indicate a bias toward encoding and representing certain parts of the word over others. The study of error patterns is modeled on prior research that utilizes error regularization to tap into underlying memory representations: participants tend to incorrectly regularize their responses toward sequences that they are more familiar with, and that are therefore represented more strongly in memory (e.g., Botvinick & Bylsma, 2005).

As in Magnuson et al. (2003), the training corpus was organized into 4 orthographic neighborhoods of 4 words, where the words in each neighborhood showcased significant orthographic overlap. All of the words in each orthographic neighborhood shared the same central letter cluster (e.g., “al” in the neighborhood: dali, dalo, nali, nalo). The neighborhoods were further organized such that each word in the set possessed an onset competitor that shared the same first bigram (dali & dalo; nali & nalo), an offset competitor that shared the same final bigram (dali & nali; dalo & nalo), and one medial competitor that shared only the central bigram (dali & nalo; dalo & nali). As in Magnuson et al. (2003), the medial competitors served as a baseline foil: these items preserved the neighborhood structure of the stimuli, but we predicted no differences in false alarm rates to these items since they are the least confusable with the target (i.e., they differed by both the first and last letter rather than one or the other). The test trials comprised a four-alternative forced-choice task, where a target word was flashed on the screen before participants were presented with all four referents in the orthographic neighborhood and were prompted to choose the referent that matched the presented word. Crucially, this balanced design ensured that the first bigram and second bigram of the word (the onset and offset, respectively) were equally informative cues to word meaning: any differences in participants’ false alarm patterns at test could not be due to the structure of the artificial language, but rather the biases they bring from their native language.
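The competitor relations described above can be illustrated with a short sketch. It is not the authors' code: the function name `competitor_type` is hypothetical, and it assumes four-letter Latin-script forms in which the first two letters form the onset bigram and the last two form the offset bigram.

```python
def competitor_type(target: str, foil: str) -> str:
    """Label a four-letter foil relative to a four-letter target.

    Assumes Latin-script CVCV forms: letters 1-2 are the onset bigram,
    letters 3-4 the offset bigram, and letters 2-3 the central bigram.
    """
    if foil == target:
        return "target"
    if foil[:2] == target[:2]:
        return "onset"    # shares the first bigram (e.g., dali vs. dalo)
    if foil[2:] == target[2:]:
        return "offset"   # shares the final bigram (e.g., dali vs. nali)
    if foil[1:3] == target[1:3]:
        return "medial"   # shares only the central bigram (e.g., dali vs. nalo)
    return "unrelated"    # word from a different neighborhood


neighborhood = ["dali", "dalo", "nali", "nalo"]
for target in neighborhood:
    foils = {f: competitor_type(target, f) for f in neighborhood if f != target}
    print(target, foils)
```

Within a neighborhood, every target ends up with exactly one foil of each type, which is what makes onset and offset bigrams equally informative in this design.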

We hypothesized that learners across all conditions would exhibit significant learning of novel orthographic word-referent mappings. While we did not necessarily expect monolinguals and bilinguals to differ in their overall statistical learning abilities, we did anticipate that there might emerge potential differences in their memory representations of items, inspired by findings from the eye tracking literature showing different processing biases for English and Hebrew words. We predicted that English speakers trained on their native orthography would demonstrate a strong onset bias, false alarming most to words that shared the same first bigram as the target at test over offset competitors, similar to the pattern they demonstrate for processing both known (Allopenna et al., 1998) and novel (Magnuson et al., 2003) auditory words. For native Hebrew speakers, who have experience reading in both right-to-left (L1) and left-to-right (L2) directions and whose initial attentional bias to words may be more central to afford a better view of the whole word (Brysbaert & Nazir, 2005; Nazir et al., 2004), we anticipated smaller differences between onset and offset competitors, since an onset in one language constitutes an offset in the other. We also predicted that false alarm patterns in the Hebrew-English condition might be mediated by second language proficiency, with more advanced L2 learners exhibiting error patterns closer to English monolinguals (i.e., display a larger difference in false alarm rates between onsets and offsets). Together, these studies serve to illuminate potential differences in implicit learning across writing systems and language backgrounds, pointing to the direction of the L1 writing system as a critical factor that shapes subsequent orthographic word learning.

2. Method

2.1. Participants

Based on the methods of Escudero et al. (2016), 32 native English monolinguals (who reported no experience with other languages) and 64 native Hebrew speakers who spoke English as their L2 were recruited using Prolific (https://www.prolific.co/). English monolinguals were assigned to the English orthography condition, while Hebrew-English bilinguals were randomly assigned to either the Hebrew orthography or English orthography condition (N=32 each). The sample size for these studies was based on Magnuson et al. (2003), who utilized a sample size of 16 participants. However, due to the online format of the current studies, and the decreased ability to control for factors such as screen size, participant distance from screen, and internet connectivity, which may result in interruptions or introduce lags to stimulus presentation, we collected double this number (32 participants/condition) to ensure better statistical power. Of the 32 Hebrew speakers recruited for the English orthography condition, 4 failed to complete the study, yielding a final sample size of 28 for that condition.

Speaker status (English monolingual, Hebrew-English bilingual) was pre-screened for using the screening tools on Prolific and verified during the experiment via a language background questionnaire (see below). None of the participants had any known speech or reading disorders. Participation in all conditions was limited to students between the ages of 18–30 years, to keep the educational background of the participants similar. The demographic information for each condition is reported in Table 1. Participants were compensated at the Prolific recommended rate at the time the study was run ($8.00/hour). Participation was limited to the use of laptops and desktop computers (while barring phones and tablets), to better control participant screen size.

Table 1.

Demographic information by condition.

| Condition | N | Mean age, years (SD) | Gender distribution |
|---|---|---|---|
| English: Native Orthography | 32 | 24 (2.4) | 12 female, 18 male, 2 non-binary |
| Hebrew: Native Orthography | 32 | 25.5 (3.6) | 18 female, 14 male |
| Hebrew: Non-native Orthography | 28 | 27.4 (4) | 16 female, 12 male |

2.2. Materials

The training corpus consisted of 16 nonwords that followed a consonant-vowel-consonant-vowel (CVCV) structure, adapted from Magnuson et al. (2003). English and Hebrew written stimuli were structured such that they represented similar phonological forms in the two languages. For example, the stimulus nali (in the English orthography condition) corresponds to a parallel written nonword in Hebrew representing the same phonological form (in this case, נאלי). To avoid the homography characteristic of Hebrew orthography, where the same letter can elicit multiple phonological mappings, diacritics (“nikud”) were added to one specific ambiguous letter (the letter ו, which can mark both the /o/ and /u/ sounds) to disambiguate phonological forms. Note that while the use of diacritics is not the most ecologically valid choice for adult readers of Hebrew, their inclusion was necessary to disambiguate phonological forms in this condition and to maximize its comparability to the English orthography condition, where homography was not present. Additionally, as prior research shows that Hebrew readers process information morphemically for Hebrew words that are Semitic in origin, but not for words that are non-Semitic in origin (Velan et al., 2013), our stimuli deliberately avoided existing Semitic roots and patterns. Therefore, in the current study there was no morphological information available at the sub-lexical level: word onsets, offsets, and medial bigrams only cued orthographic neighborhood and did not encode semantic information. The 16 words comprised 4 orthographic neighborhoods, wherein each word in the set had a complementary onset, offset, and medial competitor (e.g., nali: nalo, dali, dalo). All of the words were four letters long and were presented in 50 pt Calibri font. The full lists of items are reported in Table 2.

Table 2.

Words for each condition.

English Hebrew
niga ניגא
nigu ניגוּ
diga דיגא
digu דיגוּ
nalo נאלוֹ
nali נאלי
dalo דאלוֹ
dali דאלי
guno גוּנוֹ
guni גוּני
luno לוּנוֹ
luni לוּני
goda גוֹדא
godu גוֹדוּ
loda לוֹדא
lodu לוֹדוּ
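As an illustration of how the English-orthography items in Table 2 decompose, the sketch below rebuilds the 16 words by crossing two onset bigrams with two offset bigrams per neighborhood. The bigram groupings are read off the table; the names `NEIGHBORHOODS` and `build_corpus` are hypothetical and this is not the authors' stimulus-generation code.

```python
from itertools import product

# Onset and offset bigrams per neighborhood, read off the Latin-script
# column of Table 2. Each neighborhood shares a central letter cluster.
NEIGHBORHOODS = [
    (("ni", "di"), ("ga", "gu")),  # central cluster "ig"
    (("na", "da"), ("lo", "li")),  # central cluster "al"
    (("gu", "lu"), ("no", "ni")),  # central cluster "un"
    (("go", "lo"), ("da", "du")),  # central cluster "od"
]


def build_corpus(neighborhoods):
    """Cross every onset bigram with every offset bigram within a neighborhood."""
    return [
        [onset + offset for onset, offset in product(onsets, offsets)]
        for onsets, offsets in neighborhoods
    ]


corpus = build_corpus(NEIGHBORHOODS)
for words in corpus:
    print(words)
```

Because each neighborhood is a full 2 × 2 crossing, every onset bigram and every offset bigram occurs in exactly two words, so neither position is more predictive of word identity than the other.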

The visual referents utilized in this experiment were sourced from the Novel Object and Unusual Name (NOUN) database (Horst & Hout, 2016), which has been previously used in the statistical learning literature (e.g., Hendrickson & Perfors, 2019). Each word in the experiment was assigned its own unique referent, and the referents were visually distinct from one another despite the similarity of their labels. The same referent-label combinations were used for both the Hebrew and the English conditions, to ensure that any observed differences between conditions were not due to differences in the mapping between phonological forms and specific referents. The referent images were all in color against a white background. Each image was created by the original authors at 300 dots per inch, and was constructed to be 4 in. × 4 in. in size (although the sizes of the images may have varied slightly depending on participant screen size). The full battery of objects is available online from the original authors (https://www.sussex.ac.uk/wordlab/noun), and the subset of referents utilized for the present experiments is depicted in Appendix 1.

English speakers’ native language was verified via a demographics questionnaire, where they were also asked about experience with other languages. For the native Hebrew speakers, we collected two English proficiency assessments. The first was the Lexical Test for Advanced Learners of English (LexTALE; Lemhöfer & Broersma, 2012), a highly reliable and widely used assessment of English proficiency using a lexical decision judgment (see Siegelman et al., 2023). The stimuli comprised 60 items: 40 words and 20 nonwords. Participants in this lexical decision task were asked to indicate whether they thought each word was a real English word or a nonword. The words varied between 4 and 12 letters in length (M=7.3), with their frequencies ranging between 1–26 occurrences per million (M=6.4), as estimated by the CELEX database (Baayen et al., 1995). The nonwords were pronounceable and orthotactically legal according to English spelling conventions, originally created by Lemhöfer and Broersma (2012) by recombining existing morphemes (e.g., plaudate) or changing the number of letters in an existing English word (e.g., skave). The full list of items is available online from the original authors (https://www.lextale.com), and is reported here in Appendix 2.

The second English language assessment was an abridged version of the Language Experience and Proficiency Questionnaire (LEAP-Q; Marian et al., 2007; as used in Siegelman et al., 2022), another widely used survey that assesses language background and use in bilingual and multilingual participants. This tool in particular was previously employed for studies comparing sensitivity to written English and Hebrew words (e.g., Mor & Prior, 2020). From this questionnaire, we used three self-report measures: the proportion of time participants prefer reading in English vs. Hebrew when a text is available in both languages (e.g., choose to read English 10 % of the time, read Hebrew 90 % of the time), how much they prefer to speak one language over the other with a person who is equally fluent in both (e.g., choose to speak English 40 % of the time and Hebrew 60 % of the time), and how much they generally use English vs. Hebrew (e.g., use English 30 % of the time, Hebrew 70 % of the time). The full battery of questions can be found online (https://bilingualism.northwestern.edu/leapq/).

Stimulus presentation and data collection utilized the Gorilla Experiment Builder software (https://gorilla.sc/). All stimuli are available through the Open Science Framework: (https://osf.io/6ucp8/).

2.3. Procedure

After providing informed consent, participants were trained on novel written word-referent mappings, where a single referent appeared on the screen in tandem with two novel written words (either in the Latin or Hebrew alphabet, depending on the condition). Participants were informed that they would be learning an alien language, and that their job was to match the correct written word to the object. Participants were told that they would likely be guessing at first, but to try their best to learn the correct label. These instructions were written in English for the English orthography conditions, and in Hebrew for the Hebrew orthography condition, to ensure that the instruction language did not influence learning on the word-referent mapping task (see Fitneva et al., 2009). However, it is worth noting that early piloting revealed no significant differences in learning between Hebrew-native participants as a result of the language of instruction for either Hebrew or English word-referent pairings.

Following Magnuson et al. (2003), after selecting a word on each training trial, participants were provided with feedback about the accuracy of their responses (see also Frinsel et al., 2024). As in Magnuson et al. (2003), this ensured that participants were attending to the items rather than clicking through the trials. If the response was correct, the word that the participant selected was highlighted in green and the screen automatically proceeded to the next trial. If the word they selected was incorrect, the word was highlighted in red, then disappeared from the screen. Participants were then required to click on the correct word before proceeding to the next trial. Training consisted of 96 trials, wherein each word was presented with its matching referent 6 times (paired twice with each of the other three words in its set as a distractor: e.g., nali + nalo, nali + dali, nali + dalo; Magnuson et al., 2003). The correct word appeared on the right and left side of the screen an equal number of times. The order of the trials was randomized across participants, and the 96 training trials formed 4 blocks.
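The trial structure described above can be sketched in R, the language used for the analyses. This is an illustrative reconstruction only, not the Gorilla implementation used in the study. The grouping of 16 words into four sets of four is inferred from the counts (96 trials at 6 presentations per word). The labels set2_ through set4_ are placeholder names, not actual stimuli. Two further assumptions are made for illustration: each pairing shows the target once on each side, and blocks are consecutive runs of 24 trials.

```r
# Sketch of the training list: every word is the target 6 times,
# paired twice with each of the 3 other words in its set.
set.seed(1)  # arbitrary seed so the sketch is reproducible

# The first set comes from the text; the others are hypothetical placeholders.
sets <- list(
  c("nali", "nalo", "dali", "dalo"),
  paste0("set2_word", 1:4),
  paste0("set3_word", 1:4),
  paste0("set4_word", 1:4)
)

make_set_trials <- function(words) {
  pairs <- expand.grid(target = words, distractor = words,
                       stringsAsFactors = FALSE)
  pairs <- pairs[pairs$target != pairs$distractor, ]  # 12 ordered pairs
  # Assumption: each pairing occurs once with the target on each side.
  left <- pairs
  left$target_side <- "left"
  right <- pairs
  right$target_side <- "right"
  rbind(left, right)                                  # 24 trials per set
}

trials <- do.call(rbind, lapply(sets, make_set_trials))
trials <- trials[sample(nrow(trials)), ]           # randomize trial order
trials$block <- rep(1:4, each = nrow(trials) / 4)  # 4 blocks of 24 trials
rownames(trials) <- NULL
```

Calling `table(trials$target)` on the result shows 6 presentations per word, and `table(trials$target_side)` shows 48 trials per side.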

After training, learning was tested using a four-alternative forced-choice (4AFC) task modeled on the Visual World Paradigm, where each target appeared with its full set of three distractors. On each trial the written target word appeared at the center of the screen for 500 ms before the four visual referents were presented. The disappearance of the written target word before the four visual referents appeared helped approximate the decay of acoustic words over time in the spoken version of the Visual World Paradigm. This temporally non-overlapping text and referent design also has the added feature of taxing working memory, allowing clearer insights into participants’ memory representations (Isbilen et al., 2020).

After the target word disappeared from the screen, four candidate referents appeared. In addition to the target referent (e.g., nali), the three competitor referents from its corresponding set were presented: its onset competitor (the same first bigram; nalo), its offset competitor (the same final bigram; dali), and its medial competitor (with a different initial consonant and final vowel, but the same central letter cluster; dalo). This design allowed us to analyze participants’ pattern of errors, to determine whether they were more likely to false alarm to certain competitors over others (Magnuson et al., 2003). There were 32 test trials in total, wherein each word in the training corpus appeared as the target twice. The order of the referent images was scrambled on each trial, and the order of the trials was randomized across participants. Unlike the training trials, no feedback was provided on the test trials. An example test trial is depicted in Fig. 1. The total duration of the training and test phases of a session lasted approximately 16 min.
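The competitor types defined above can be expressed as a small classification rule. The sketch below is a hypothetical helper, not the authors' scoring code. It assumes four-letter words like the example set, and it compares bigrams in reading order (the order in which characters are stored in the string).

```r
# Classify a chosen word relative to the target by shared bigrams.
# Assumption: the words of a set differ only in their first consonant and/or final vowel.
classify_response <- function(target, response) {
  n <- nchar(target)
  same_onset  <- substr(target, 1, 2) == substr(response, 1, 2)
  same_offset <- substr(target, n - 1, n) == substr(response, n - 1, n)
  if (same_onset && same_offset) {
    "target"   # both bigrams match: the correct item
  } else if (same_onset) {
    "onset"    # same first bigram only (e.g., nali vs. nalo)
  } else if (same_offset) {
    "offset"   # same final bigram only (e.g., nali vs. dali)
  } else {
    "medial"   # only the central letters are shared (e.g., nali vs. dalo)
  }
}

# Example with the set given in the text.
sapply(c("nali", "nalo", "dali", "dalo"), classify_response, target = "nali")
```

Applied to the example set with nali as the target, the helper labels nali, nalo, dali, and dalo as target, onset, offset, and medial, respectively.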

Fig. 1.

An example test trial in English (the target word would be written “נאלי” in Hebrew). The target word is flashed on the screen for 500 ms, after which participants see a new display where they must select the correct referent from among the target and its three competitors (onset, offset, and medial).

After the test trials, native English speakers completed a brief questionnaire to verify their native language and report any experience with other languages, after which the session ended. Hebrew bilinguals completed two additional tasks to gauge their English proficiency: the LexTALE Lexical Decision Task and the LEAP-Q questionnaire. Immediately after completing the test trials, Hebrew speaking participants in both orthography conditions were presented with a new screen, which explained that they would be partaking in another task. They were informed that they would see words or nonwords on the screen one at a time, and that their task was to indicate whether a presented word was a real English word. If they thought the word was a real English word, they were instructed to press a button labeled “yes.” If they thought the word was not a real English word, they were instructed to press a button labeled “no.” The target word/nonword and the yes and no buttons appeared simultaneously on the screen on each trial. Participants were informed that all items reflected American English spelling. The instructions for this task were presented in English for both the English and Hebrew orthography conditions (Lemhöfer & Broersma, 2012; Siegelman et al., 2023). The task lasted approximately 7 min. Finally, Hebrew speaking participants completed the abridged LEAP-Questionnaire, which lasted approximately 3 min.

2.4. Results

The analyses used the lme4 package (Bates et al., 2020) in the R Statistical Software, Version 4.3.2 (R Core Team, 2020). Logistic mixed effects models were used to test for differences in training and test performance across the conditions. Our outcome variables of interest were training performance accuracy, test performance accuracy, and for the error data, onset over offset selection. Our main predictor of interest was condition (English: Native Orthography, Hebrew: Native Orthography, Hebrew: Non-native Orthography). For models that focused solely on Hebrew speakers, orthography type (native vs. non-native) was the predictor. Additional predictors included block for the training data, which gauged the improvement in performance across the training task. For Hebrew speakers, we also ran several additional models using each L2 measure as a predictor of training and test performance. Each predictor was run in its own separate model. All models contain by-subject and by-item random intercepts, the maximal random effects structure that allowed the models to converge (Barr et al., 2013).2 Versions of the models with by-participant random slopes were also attempted, but several failed to converge, so the random slopes were dropped from the analyses. All data and analysis code for these experiments are available through the Open Science Framework (https://osf.io/6ucp8/).
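For reference, a logistic mixed effects model with crossed by-participant and by-item random intercepts, as described above, is conventionally written as follows. This is the standard textbook form rather than a formula taken from the article, and the symbols (i for participant, j for item, x for the single predictor) are notation chosen here.

```latex
\operatorname{logit} \Pr(y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} + u_i + w_j,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad w_j \sim \mathcal{N}(0, \sigma_w^2)
```

Here y_{ij} is the binary outcome on a trial (e.g., correct vs. incorrect), \beta_1 is the fixed effect on the log-odds scale, and u_i and w_j are the random intercepts for participant i and item j.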

2.4.1. Training accuracy

As shown in Table 3, which reports the mean training score across the entire session as well as the scores for each of the four training blocks, participants in all conditions performed well (~70 %) and were significantly above chance (50 %) at selecting the correct written word for a presented referent across the entire training session (native English speakers: t(31) = 10.25, p < .0001, d = 1.81; Hebrew speakers: native orthography: t(31) = 12.81, p < .0001, d = 2.26; Hebrew speakers: non-native orthography: t(27) = 11.78, p < .0001, d = 2.23). Logistic mixed effects models revealed a significant increase in accuracy across training blocks, with participants improving throughout the course of training in all conditions (see Table 4).
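The effect sizes reported above are consistent with the common one-sample convention d = t / √n, with n = df + 1, although the article does not state which formula was used. A quick check in R, using a helper name chosen here for illustration:

```r
# One-sample Cohen's d recovered from a t statistic, with n = df + 1.
# Assumption: this is the convention behind the reported values.
cohens_d_from_t <- function(t_value, df) t_value / sqrt(df + 1)

# t statistics and degrees of freedom as reported for training accuracy.
training <- data.frame(
  condition = c("English: Native", "Hebrew: Native", "Hebrew: Non-native"),
  t_value   = c(10.25, 12.81, 11.78),
  df        = c(31, 31, 27)
)
training$d <- round(cohens_d_from_t(training$t_value, training$df), 2)
print(training)  # d column: 1.81, 2.26, 2.23
```

The recovered values match the reported d = 1.81, 2.26, and 2.23.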

Table 3.

Mean training performance by condition.

| Condition (Native Language: Orthography type) | Block 1 of training, Mean (SD) | Block 2 of training, Mean (SD) | Block 3 of training, Mean (SD) | Block 4 of training, Mean (SD) | Total training score, Mean (SD) |
| --- | --- | --- | --- | --- | --- |
| English: Native Orthography | 63 % (12) | 71 % (13) | 72 % (17) | 74 % (17) | 70 % (11) |
| Hebrew: Native Orthography | 63 % (11) | 71 % (12) | 72 % (12) | 77 % (15) | 71 % (10) |
| Hebrew: Non-native Orthography | 63 % (12) | 72 % (14) | 76 % (12) | 79 % (14) | 73 % (10) |

Table 4.

Model results of training block by condition.

| Condition (Native Language: Orthography type) | β | SE | z | p |
| --- | --- | --- | --- | --- |
| English: Native Orthography | 0.18 | 0.04 | 5.05 | <.0001* |
| Hebrew: Native Orthography | 0.23 | 0.04 | 6.39 | <.0001* |
| Hebrew: Non-native Orthography | 0.28 | 0.04 | 6.86 | <.0001* |

Logistic mixed effect models revealed no effect of native language (English vs. Hebrew) on training performance (p = 0.34). For Hebrew speakers, there was no significant effect of orthography type (native vs. non-native) on training (p = 0.46).

2.4.2. Test accuracy

As shown in Table 5, participants in all conditions scored significantly above chance (25 %) at test (native English speakers: t(31) = 9.84, p < .0001, d = 1.74; Hebrew speakers: native orthography: t(31) = 9.88, p < .0001, d = 1.75; Hebrew speakers: non-native orthography: t(27) = 12.75, p < .0001, d = 2.41). Logistic mixed effects models revealed no significant difference between native English and Hebrew speakers at test across conditions when controlling for participants and items (p = 0.08), although Hebrew speakers trained on English orthography scored numerically higher. Similarly, there was no significant effect of orthography type (native vs. non-native) on Hebrew speakers’ test performance (p = 0.13).

Table 5.

Mean test performance by condition.

| Condition (Native Language: Orthography type) | Mean (SD) |
| --- | --- |
| English: Native Orthography | 59 % (20) |
| Hebrew: Native Orthography | 62 % (21) |
| Hebrew: Non-native Orthography | 70 % (19) |

2.4.3. Error analyses for the 4AFC task

The next set of analyses examines potential differences in false alarm patterns across conditions. Preliminary analyses of the false alarm rates to medial competitors revealed no significant differences between the three conditions (all p ≥ 0.12; see Table 6 for the means). Thus, the remaining analyses focus only on the onset vs. offset foils. All of these logistic mixed effects models include random intercepts for participants and items. Model results for each analysis are reported in Table 7.

Table 6.

Mean error selection rates for the different foil types by condition.

| Condition (Native Language: Orthography type) | Onset, Mean (SD) | Offset, Mean (SD) | Medial (Near Neighbor), Mean (SD) |
| --- | --- | --- | --- |
| English: Native Orthography | 50 % (23) | 25 % (14) | 25 % (17) |
| Hebrew: Native Orthography | 43 % (29) | 38 % (26) | 19 % (14) |
| Hebrew: Non-native Orthography | 44 % (30) | 32 % (26) | 24 % (22) |

Table 7.

Model results of onset over offset foil selection by condition.

| Condition | β | SE | z | p |
| --- | --- | --- | --- | --- |
| English vs. Hebrew (Native orthography) | −0.64 | 0.31 | −2.10 | 0.036* |
| English vs. Hebrew (Non-native orthography) | −0.47 | 0.31 | −1.52 | 0.13 |
| Hebrew Native vs. Hebrew Non-native orthography | 0.19 | 0.37 | 0.52 | 0.61 |

English speakers learning novel words in their native orthography were significantly more likely than Hebrew speakers trained on their native orthography to false alarm to onset competitors rather than offset competitors. Specifically, English speakers exhibited a 25-percentage-point difference between onset and offset foils, selecting onsets more often, whereas Hebrew speakers trained on Hebrew words exhibited only a 5-point difference for onset over offset foil selection (p = 0.036). Hebrew speakers trained on English showed a slightly larger onset-offset difference (12 points) than Hebrew speakers trained on their native orthography, again selecting onset competitors more often, but this difference was not significant (p = 0.61). Their propensity to select onsets over offsets was also not significantly different from that of English monolinguals (p = 0.13). This suggests a more graded effect for onset selection among Hebrew speakers trained on their L2 orthography, with their means falling between those of English monolinguals and Hebrew speakers trained on their native orthography (see Fig. 2).
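The p values in Table 7 can be approximately recovered from the reported z statistics, assuming the usual two-sided Wald test that lme4 prints for glmer fixed effects. The helper and the vector names below are illustrative labels, not identifiers from the authors' analysis code.

```r
# Two-sided p value for a Wald z statistic (standard normal reference).
wald_p <- function(z) 2 * pnorm(-abs(z))

# z statistics as reported in Table 7 (rounded to two decimals in the source).
z_reported <- c(EnglishNat_vs_HebrewNat    = -2.10,
                EnglishNat_vs_HebrewNonNat = -1.52,
                HebrewNat_vs_HebrewNonNat  =  0.52)

round(wald_p(z_reported), 3)
```

The results are approximately 0.036, 0.129, and 0.603, in line with the reported 0.036, 0.13, and 0.61. A small gap on the last comparison is expected, because the z values in the table are themselves rounded.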

Fig. 2.

A) Mean false alarms to onset foils over offset foils, by condition, where a higher score represents a greater propensity to select onset foils. English speakers trained on their native orthography (EnglishNat) were most likely to select onset foils over offsets. Hebrew speakers trained on their non-native English orthography (HebrewNonNat) displayed an onset preference between those of English speakers and Hebrew speakers trained on their native orthographies (HebrewNat), exhibiting a shift toward, but not a full switch to, English-like processing. Error bars reflect standard error. B) Violin plots depicting the distribution of the individual data by condition. A score of 1 reflects an individual’s propensity to choose onset foils over offset foils 100 % of the time, a score of 0 reflects an equal propensity to false alarm to onset and offset foils, and a score of −1 reflects a propensity to false alarm to offset over onset foils 100 % of the time.
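The caption describes a per-participant score bounded between −1 and 1. One index with exactly these properties is (onset − offset) / (onset + offset), computed over a participant's onset and offset false alarms. The article does not spell out the formula, so the sketch below is an assumption, and the counts in the example are made up for illustration.

```r
# Hypothetical onset-over-offset index for one or more participants.
# Returns NA when a participant made no onset or offset errors at all.
onset_offset_score <- function(n_onset, n_offset) {
  total <- n_onset + n_offset
  ifelse(total == 0, NA_real_, (n_onset - n_offset) / total)
}

# Made-up error counts: all onset, balanced, all offset.
onset_offset_score(n_onset = c(6, 3, 0), n_offset = c(0, 3, 4))  # 1, 0, -1
```

Under this definition, a participant with only onset errors scores 1, one with equal numbers scores 0, and one with only offset errors scores −1, matching the three anchor points in the caption.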

2.4.4. Effects of English language exposure on Hebrew speakers’ performance

To account for English language exposure’s impact on training and test performance, Hebrew speaking participants’ LexTALE scores and self-reported English usage from LEAP-Q were used as fixed effects in a series of logistic mixed effects models. There were no significant differences between the two Hebrew conditions in their L2 scores on either measure, nor were there any significant interactions between conditions and L2 in predicting performance on the training task or test (all p = 0.16 or greater). The data for all participants were thus pooled for the remaining analyses unless otherwise noted. Summary statistics for English usage and LexTALE lexical decision performance by condition are reported in Table 8.

Table 8.

Summary statistics for English proficiency measures by condition.

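| English proficiency measure | Hebrew: Hebrew Condition, Mean (SD) | Hebrew: Hebrew Condition, Range | Hebrew: Hebrew Condition, Skew | Hebrew: Hebrew Condition, Kurtosis | Hebrew: English Condition, Mean (SD) | Hebrew: English Condition, Range | Hebrew: English Condition, Skew | Hebrew: English Condition, Kurtosis |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LEAP-Q percent read | 34 % (22) | 0–80 % | 0.19 | −0.98 | 30 % (27) | 0–100 % | 0.89 | −0.11 |
| LEAP-Q percent speak | 10 % (12) | 0–50 % | 1.45 | 1.70 | 14 % (23) | 0–100 % | 2.28 | 5.22 |
| LEAP-Q percent use | 21 % (20) | 0–98 % | 2.35 | 6.14 | 27 % (23) | 0–99 % | 1.44 | 1.85 |
| LexTALE score | 73 % (12) | 52–92 % | −0.04 | −1.20 | 73 % (15) | 33–98 % | −0.38 | 0.30 |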

Logistic mixed effects models were run with each English proficiency measure. Each measure was used as a single predictor in its own separate model, with either training or test performance as the outcome variable. All English L2 measures influenced training (Table 9), with higher L2 proficiency predicting better training performance. All L2 measures predicted test performance as well, except for the percentage that participants prefer to speak English (Table 10), once more with better L2 proficiency predicting higher test performance. There were no significant interactions between English proficiency and condition on training or test performance (all p = 0.09 or greater). There was also no effect of L2 proficiency on false alarm patterns for models that converged (all p = 0.06 or greater); several failed to converge due to singular fit, suggesting that the models were over-specified.

Table 9.

Model results of training performance by English proficiency measures.

| English proficiency measure | β | SE | z | p |
| --- | --- | --- | --- | --- |
| LEAP-Q percent read | 0.007 | 0.003 | 2.85 | 0.004* |
| LEAP-Q percent speak | 0.01 | 0.004 | 3.24 | 0.0012* |
| LEAP-Q percent use | 0.007 | 0.003 | 2.30 | 0.022* |
| LexTALE score | 1.41 | 0.47 | 2.97 | 0.003* |

Table 10.

Model results of test performance by English proficiency measures.

| English proficiency measure | β | SE | z | p |
| --- | --- | --- | --- | --- |
| LEAP-Q percent read | 0.02 | 0.006 | 2.77 | 0.006* |
| LEAP-Q percent speak | 0.01 | 0.008 | 1.58 | 0.11 |
| LEAP-Q percent use | 0.02 | 0.007 | 2.33 | 0.02* |
| LexTALE score | 4.01 | 1.01 | 3.97 | <.0001* |

3. Discussion

Over the last thirty years, the question of how individual differences shape statistical language learning has emerged as a key topic in cognitive science (Kidd et al., 2018; Siegelman et al., 2017). While considerable progress has been made in the study of auditory linguistic (Isbilen & Christiansen, 2022) and visual non-linguistic regularities (Ren et al., 2023; Lee et al., 2022), comparatively few statistical learning studies examine the acquisition of novel regularities in natural language text. This is particularly the case for the subfield of the literature that examines mono- and bilingual differences in statistical learning, as well as the subfield that examines the influence of L1 biases on learning. This leaves open the question of how individuals of diverse language backgrounds learn and represent orthographic regularities in novel (i.e., newly learned) words, particularly for those whose native languages feature writing systems that dramatically differ from English.

Language learning is a lifelong effort, with both L1 distributional patterns and experience with two or more languages systematically shaping how individuals process new linguistic information (Miller, 2019). In three experiments, we examined how the direction of participants’ native writing system influenced memory for novel orthographic patterns—a hitherto unexplored topic in statistical learning research. Furthermore, the study of biscriptality, where participants can read in two different scripts, is relatively underrepresented in bilingualism studies more broadly (see Vaid, 2022, for a review). We focused on English and Hebrew, which feature opposing written directions, and thereby opposing time (phonology) to space (orthography) mappings that might elicit distinct biases in memory representations for novel written words. Furthermore, we evaluated whether memory representations dynamically adapted in response to the encountered input for Hebrew-English bilinguals when trained on English text, and whether this potential shift is stronger for more advanced L2 learners. Importantly, our stimuli were constructed such that the onsets and offsets of words were equally informative cues to word identity. This provided a balanced design for examining whether learners of typologically distinct language backgrounds showed a preference for representing—and presumably encoding—one part of the word over another.

English and Hebrew speakers in all three conditions attained above-chance performance within the first block of training and steadily improved throughout (Magnuson et al., 2003). Similarly, all groups scored significantly above chance at test, successfully acquiring novel orthographic word-referent mappings even when the words possessed substantial orthographic overlap. These findings thereby extend previous studies on statistical word learning of overlapping auditory wordforms (Escudero et al., 2016; Mulak et al., 2019) to the visual domain, which, to date, only a limited number of studies have investigated (e.g., Escudero et al., 2023). While these prior studies have primarily focused on the acquisition of monosyllabic words, here we expand this literature to disyllabic words. To our knowledge, this is the first study to examine such learning in a Semitic writing system.

Furthermore, performance did not differ significantly across conditions on either the training or the test trials. While Hebrew bilinguals’ test scores were slightly higher than those of English monolinguals, particularly in the Hebrew-English condition, this difference was nonsignificant. This suggests no (or limited) bilingual advantage for the acquisition of novel orthographic regularities in the current sample. These results contradict the findings of Escudero et al. (2016), where bilingual adults exhibited enhanced cross-situational learning of monosyllabic, overlapping phonological words relative to monolinguals. By contrast, these results support studies that do not show an advantage for bilingual adults in statistical learning, either in conventional triplet tasks where participants learn sequences of three auditory syllables or visual shapes (Yim & Rudoy, 2013), or in the learning of single word-referent mappings (Poepsel & Weiss, 2016). It is possible that bilingualism may influence phonological and orthographic statistical learning differently, as prior literature has almost exclusively focused on the acquisition of auditory rather than visual linguistic regularities (see Weis et al., 2020 for a review). It is also possible that bilinguals whose L1 and L2 share the same alphabet and writing system organization might differ from the bilinguals tested here. These open questions may provide fertile ground for future work on monolingual-to-bilingual comparisons.

What our results do suggest is that rather than differing in the capacity to detect statistical regularities, individuals of different linguistic backgrounds diverge in their representations of novel written words when the beginnings and ends of words are equally informative about word identity. The data revealed significant differences between the error patterns of English and Hebrew speakers on their native orthographies, and presumably, in the nature of the representations that they formed. English speakers displayed a heightened propensity to false alarm to onset over offset competitors, exhibiting a strong bias to attend to the beginning of the word. However, Hebrew speakers showed no significant difference in their false alarm rates to onset and offset competitors when trained on their native orthography. It is possible that the encoding of Hebrew text may be skewed toward the center of the word rather than toward word onsets as in English, leading to a more balanced spatial bias, consistent with the eye tracking data on preferred landing position (Brysbaert & Nazir, 2005; Nazir et al., 2004). This idea is bolstered by data showing that Hebrew speakers attend to larger temporal windows when reading Hebrew words. For example, letter transpositions (e.g., raed vs. read) have limited effect on native Hebrew speakers’ reading of English text, but similar letter transpositions in Hebrew words significantly degrade performance (Velan & Frost, 2007). This is because Hebrew words tend to code information at the morphemic level (Frost, 2012), and disrupting this information substantially changes the root meaning of Semitic words (although this morphemic encoding is not observed for non-Semitic Hebrew words; Velan et al., 2013). This suggests that the reading of alphabetic orthographies may be subject to language-specific constraints, with the structure of the language dictating how individuals read and encode novel words. This may in turn lead to memory differences in the early stages of learning novel words.

Notably, Hebrew speakers’ memory representations dynamically adapted in response to the encountered input. Hebrew bilinguals in the English orthography condition revealed a larger difference in their onset minus offset false alarm rates than Hebrew speakers trained on their native orthography, in a manner that began to approach native English speakers (see Fig. 2). This suggests that the specific characteristics of the presented stimuli influence bilingual individuals’ learning above and beyond L1 biases alone. This is even the case when the spatial organization of their L1 directly opposes that of their L2: the spatial location of an onset in one writing system comprises an offset in the other. These results parallel work demonstrating that bilinguals’ processing of phonological wordforms adjusts to the encountered input even when it conflicts with their L1 biases. For example, in a series of word reconstruction experiments where participants were prompted to minimally modify nonwords to produce real words (e.g., eltimate, which can be revised into ultimate or estimate), both L1 and L2 English-Mandarin speakers displayed a consonantal bias, changing vowels more quickly while preserving consonants, in line with English morphological patterns (Wiener, 2020). This finding is notable in light of the fact that the distributional patterns of bilinguals’ L1 would have prompted the opposite bias (i.e., a vowel bias for Mandarin), a tendency that was robust even in late L2 learners of English. Comparable results have also been demonstrated for the brain basis of morphological access, where Chinese-English and Spanish-English bilinguals engage the superior temporal gyrus more when processing spoken words in their L2 when these words were most morphologically distinct from their L1 (Sun et al., 2023). This indicates that neural responses become differentially tuned to specific aspects of L1 and L2 input even in young learners.

However, in the current dataset, false alarm rates showed a graded effect for L2 input rather than a full switch to monolingual-like processing. The onset-offset false alarm difference in the Hebrew-English condition was no longer significantly different from native English speakers, but it was also not significantly different from the onset-offset difference of the Hebrew native orthography condition. This mirrors the findings of Carr et al. (2024), where some participants tend to maintain their L1 biases about information spread when learning novel orthographic words. It may be the case that learning in the early phase of acquiring new words is more subject to native language biases, before memory representations of items are fully reinforced by repeated exposures. These results thereby speak to an interplay between the biases that individuals bring from their native language to learning new words and the input they encounter, with L2-specific memory representations emerging but not entirely overwriting L1 biases in the nascent stages of written word learning.

Counter to our predictions, L2 proficiency did not mediate error patterns for Hebrew speakers in the English orthography condition. L2 participants in this sample were relatively advanced learners of English, and it is possible that English proficiency may have had a stronger effect on error patterns in early learners of English that plateaus once a certain level of proficiency is reached. However, second language usage (LEAP-Q) and lexical decision (LexTALE) performance significantly predicted Hebrew speakers’ training and test performance, regardless of condition. Overall, these results suggest a more global effect of L2 experience on statistical learning performance. Bilingual participants who were proficient L2 learners were generally good at learning novel words, regardless of whether these items comprised their native or non-native orthography. These results support findings which demonstrate that artificial language learning performance significantly correlates with indices of both first (Isbilen et al., 2022) and second language learning (Ettlinger et al., 2016). Thus, L2 proficiency mediates overall learning, but memory representations are determined more strongly by the characteristics of the input. Nevertheless, there is evidence that native language orthography impinges to some extent on the early phase of learning words in a non-native orthography (Carr et al., 2024), as reflected in the diminished onset bias in the Hebrew-English condition relative to English monolinguals.

An important limitation of the current study is that we could not entirely decouple the influence of bilingualism from the direction of the writing system. On the one hand, native language background fundamentally shapes individuals’ ability to learn novel linguistic patterns, evidenced by studies showing that the phonological regularities of one’s native language significantly influence performance on statistical learning tasks (Elazar et al., 2022; Siegelman et al., 2018), as well as by studies showing carry-over effects in natural reading, with participants systematically differing in their reading of the same materials depending on native language background (Akamatsu, 2003; Wang et al., 2003). On the other hand, Hebrew-English bilinguals’ experience reading in two directions (right-to-left and left-to-right) may influence processing of their native orthography. Indeed, L2 learning incites changes in native language processing, both at the behavioral (Pallier et al., 2003) and neural level (Brice et al., 2022), suggesting that language processing is dynamic across the lifespan. Although English use was relatively low in these samples, it was likely still a factor, given how even minimal L2 exposure can impact subsequent learning (Au et al., 2002; Cutler et al., 1992; Oh et al., 2010; Potter et al., 2017), whether L1 and L2 are learned simultaneously or sequentially (Amengual, 2019), although simultaneous learning has been found to bear the stronger influence. This is corroborated by the fact that English L2 instruction begins relatively early for Israeli students in primary school (Aronin & Yelenevskaya, 2022), when L1 literacy is in its early developmental stages (averaging around age 7 in the present sample).

In conclusion, the current experiments examined the interplay between native language biases and the learning of novel orthographic items. We found that the direction of the writing system influenced memory representations, and error rates, in learning new object-text mappings among English and Hebrew speakers trained on their native orthographies, with the representations of native Hebrew speakers for their L2 approaching those of monolingual speakers while retaining traces of their L1 biases. Both native language structure—orthography and direction of the writing system—and second language experience fundamentally shape how novel linguistic regularities are processed in the early stages of learning new written words. These results from adults raise the prospect that in the early phases of learning to read one’s L1 in childhood, the spatial biases in the native orthography may exert even greater biases on the learning of L2 orthography.

Acknowledgements

This research was in part supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (#F32HD104542) awarded to ESI and a NIH research grant (HD-037082) awarded to Elissa Newport (subcontract to Richard Aslin).

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix

Appendix 1:

Referents for both conditions (adapted from the NOUN database; Horst & Hout, 2016).

Appendix 2:

Lexical test for advanced learners of English (LexTALE; Lemhöfer & Broersma, 2012).

Word Answer (is a word)
abergy No
ablaze Yes
alberation No
allied Yes
bewitch Yes
breeding Yes
carbohydrate Yes
celestial Yes
censorship Yes
cleanliness Yes
crumper No
cylinder Yes
destription No
dispatch Yes
eloquence Yes
exprate No
fellick No
festivity Yes
flaw Yes
fluid Yes
fray Yes
hasty Yes
hurricane Yes
ingenious Yes
interfate No
kermshaw No
kilp No
lengthy Yes
listless Yes
lofty Yes
magrity No
majestic Yes
mensible No
moonlit Yes
muddy Yes
nourishment Yes
plaintively Yes
plaudate No
proom No
pudour No
pulsh No
purrage No
quirty No
rascal Yes
rebondicate No
recipient Yes
savory Yes
scholar Yes
scornful Yes
screech Yes
shin Yes
skave No
slain Yes
spaunch No
stoutly Yes
turmoil Yes
turtle Yes
unkempt Yes
upkeep Yes

Appendix 3: R syntax.

The R syntax used for the mixed effects models reported in the Results section is given below. All models included participants and items as random intercepts.

Appendix 3a:

Models reported in Section 2.4.1. Training accuracy.

Outcome variable Predictor R syntax
Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial) Training block (1st, 2nd, 3rd, 4th) glmer(TrainingScore ~ TrainingBlock + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1). Separate models were run for each condition to test whether there was significant improvement across the course of training in all conditions.
Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial) Condition (English: Native Orthography, Hebrew: Native Orthography, Hebrew: Non-native Orthography) glmer(TrainingScore ~ Condition + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1). One model was run to test whether training performance differed between all three conditions.
Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial) Hebrew speaker condition (Native vs. Non-native Orthography) glmer(TrainingScore ~ HebrewNativeorNonNativeOrthography + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1). One model was run to test whether training performance differed across orthography types (Native vs. Non-native) for Hebrew speakers.

Appendix 3b:

Models reported in Section 2.4.2. Test accuracy.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: Condition (English: Native Orthography, Hebrew: Native Orthography, Hebrew: Non-native Orthography)
R syntax: glmer(TestScore ~ Condition + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance differed between all three conditions.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: Hebrew speaker condition (Native vs. Non-native Orthography)
R syntax: glmer(TestScore ~ HebrewNatorNonNatOrthography + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance differed across orthography types (Native vs. Non-native) for Hebrew speakers.

Appendix 3c:

Models reported in Section 2.4.3. Error analyses for the 4AFC task.

Outcome variable: Onset vs. Offset selection on incorrect trials (0 = Offset selection, 1 = Onset selection, Blank cells = medial selection)
Predictor: Condition (English: Native Orthography, Hebrew: Native Orthography, Hebrew: Non-native Orthography)
R syntax: glmer(OnsetvsOffset ~ EnglishNatvsHebrewNat + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether error selections differed between all three conditions.

Outcome variable: Onset vs. Offset selection on incorrect trials (0 = Offset selection, 1 = Onset selection, Blank cells = medial selection)
Predictor: Hebrew speaker condition (Hebrew: Native orthography vs. Hebrew: Non-native Orthography)
R syntax: glmer(OnsetvsOffset ~ HebrewNatvsHebrewNonNat + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether error selections differed across orthography types (Native vs. Non-native) for Hebrew speakers.
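The outcome coding described above (0 = offset selection, 1 = onset selection, blank = medial selection) can be sketched in base R as follows. The data frame and the FoilChosen column are hypothetical and serve only to illustrate the coding; blank cells are represented here as NA, which is an assumption about how the blanks are read into R.

```r
# Hypothetical incorrect 4AFC trials; FoilChosen is an invented column name.
errors <- data.frame(
  Participant = c("p1", "p1", "p2", "p2", "p3"),
  Item        = c("w1", "w2", "w1", "w3", "w2"),
  FoilChosen  = c("onset", "medial", "offset", "onset", "medial")
)

# 1 = onset competitor chosen, 0 = offset competitor chosen,
# NA = medial competitor chosen (the "blank cells" in the table above).
errors$OnsetvsOffset <- ifelse(
  errors$FoilChosen == "onset", 1L,
  ifelse(errors$FoilChosen == "offset", 0L, NA_integer_)
)

# Rows that would enter a binomial model of onset vs. offset selections.
model_data <- errors[!is.na(errors$OnsetvsOffset), ]
print(model_data)
```

Under R's default na.action (na.omit), rows with a missing outcome are dropped when the model frame is built, so medial selections coded this way would not contribute to the onset vs. offset comparison.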

Appendix 3d:

Models reported in Section 2.4.4. Effects of English language exposure on Hebrew speakers’ performance.

Outcome variable: Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent read
R syntax: glmer(TrainingScore ~ LEAP-Qread + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether training performance in Hebrew speakers differed as a function of how much they prefer to read in English over Hebrew.

Outcome variable: Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent speak
R syntax: glmer(TrainingScore ~ LEAP-Qspeak + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether training performance in Hebrew speakers differed as a function of how much they prefer to speak English over Hebrew.

Outcome variable: Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent use
R syntax: glmer(TrainingScore ~ LEAP-Quse + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether training performance in Hebrew speakers differed as a function of how much they prefer to use English over Hebrew.

Outcome variable: Training accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LexTALE score
R syntax: glmer(TrainingScore ~ EngLDT + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether training performance in Hebrew speakers differed as a function of their English proficiency as measured by the LexTALE lexical decision task.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent read
R syntax: glmer(TestScore ~ LEAP-Qread + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance in Hebrew speakers differed as a function of how much they prefer to read in English over Hebrew.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent speak
R syntax: glmer(TestScore ~ LEAP-Qspeak + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance in Hebrew speakers differed as a function of how much they prefer to speak English over Hebrew.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LEAP-Q percent use
R syntax: glmer(TestScore ~ LEAP-Quse + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance in Hebrew speakers differed as a function of how much they prefer to use English over Hebrew.

Outcome variable: Test accuracy (0 = incorrect response on a trial, 1 = correct response on a trial)
Predictor: LexTALE score
R syntax: glmer(TestScore ~ EngLDT + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1)
Note: One model was run to test whether test performance in Hebrew speakers differed as a function of their English proficiency as measured by the LexTALE lexical decision task.
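The eight models above share one template and differ only in outcome and predictor, so their formulas can be generated programmatically. The sketch below uses base R only; the helper make_formula and the object names are hypothetical. It also illustrates one practical point: if the data columns are literally named with a hyphen (e.g., LEAP-Qread, which is an assumption about the data file), R parses an unquoted LEAP-Qread in a formula as LEAP minus Qread, so the name has to be wrapped in backticks.

```r
outcomes   <- c("TrainingScore", "TestScore")
predictors <- c("LEAP-Qread", "LEAP-Qspeak", "LEAP-Quse", "EngLDT")

# All outcome x predictor combinations (2 x 4 = 8 models).
grid <- expand.grid(predictor = predictors, outcome = outcomes,
                    stringsAsFactors = FALSE)

# Hypothetical helper: backtick the predictor so hyphenated names stay intact,
# then add the random intercepts for participant and item.
make_formula <- function(outcome, predictor) {
  reformulate(
    termlabels = c(sprintf("`%s`", predictor),
                   "(1 | Participant)", "(1 | Item)"),
    response   = outcome
  )
}

formulas <- Map(make_formula, grid$outcome, grid$predictor)

# Variables referenced by the first generated formula.
print(all.vars(formulas[[1]]))

# For comparison: without backticks the hyphen is read as subtraction.
print(all.vars(TrainingScore ~ LEAP-Qread))
```

Each generated formula can then be passed to glmer() together with the data, family, control, and nAGQ arguments listed in the entries above.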

Footnotes

CRediT authorship contribution statement

Erin S. Isbilen: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Abigail Laver: Writing – review & editing, Software, Project administration, Methodology. Noam Siegelman: Writing – review & editing, Methodology, Formal analysis, Conceptualization. Richard N. Aslin: Writing – review & editing, Supervision, Methodology, Funding acquisition, Conceptualization.

1

Padakannaya, Georgiou, & Winksel (2022) define “script” as the visual symbols in print and “orthography” as the specific mapping rules between a language and its script. We therefore use the term orthography for our experiments, since they involve learning both the visual symbols and the rules for how they are read (left-to-right vs. right-to-left).

2

The logistic mixed-effects models used the following R syntax: glmer(Outcome ~ Predictor + (1 | Participant) + (1 | Item), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"), nAGQ = 1). A full list of the models run on the data is reported in Appendix 3.

Data availability

All de-identified data are available through the Open Science Framework (https://osf.io/6ucp8/).

References

  1. Akamatsu N, 2003. The effects of first language orthographic features on second language reading in text. Lang. Learn 53 (2), 207–231.
  2. Allopenna PD, Magnuson JS, Tanenhaus MK, 1998. Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. J. Mem. Lang 38 (4), 419–439.
  3. Amengual M, 2019. Type of early bilingualism and its effect on the acoustic realization of allophonic variants: Early sequential and simultaneous bilinguals. Int. J. Biling 23 (5), 954–970.
  4. Aronin L, Yelenevskaya M, 2022. Teaching English in multilingual Israel: Who teaches whom and how. A review of recent research 2014–2020. Lang. Teach 55 (1), 24–45.
  5. Au TKF, Knightly LM, Jun SA, Oh JS, 2002. Overhearing a language during childhood. Psychol. Sci 13 (3), 238–243.
  6. Baayen RH, Piepenbrock R, Gulikers L, 1995. The CELEX lexical database (release 2). Distributed by the Linguistic Data Consortium, University of Pennsylvania.
  7. Barr DJ, Levy R, Scheepers C, Tily HJ, 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. J. Mem. Lang 68 (3), 255–278.
  8. Bates D, Mächler M, Bolker B, Walker S, Christensen RH, Singmann H, Dai B, Scheipl F, Grothendieck G, Green P, Fox J, Bauer A, Krivitsky PN, 2020. Package ‘lme4’. R Package. Version 1.1–26. https://cran.r-project.org/web/packages/lme4/lme4.pdf.
  9. Botvinick M, Bylsma LM, 2005. Regularization in short-term memory for serial order. J. Exp. Psychol. Learn. Mem. Cogn 31 (2), 351.
  10. Brice H, Siegelman N, van den Bunt M, Frost SJ, Rueckl JG, Pugh KR, Frost R, 2022. Individual differences in L2 literacy Acquisition: Predicting reading skill from sensitivity to regularities between orthography, phonology, and semantics. Stud. Second. Lang. Acquis 44 (3), 737–758.
  11. Brysbaert M, Nazir T, 2005. Visual constraints in written word recognition: evidence from the optimal viewing-position effect. J. Res. Read 28 (3), 216–228.
  12. Bulgarelli F, Lebkuecher AL, Weiss DJ, 2018. Statistical learning and bilingualism. Lang. Speech Hear. Serv. Sch 49 (3S), 740–753.
  13. Carr JW, Fantini M, Perrotti L, Crepaldi D, 2024. Readers target words where they expect to minimize uncertainty. J. Mem. Lang 138, 104530.
  14. Chen CH, Gershkoff-Stowe L, Wu CY, Cheung H, Yu C, 2017. Tracking multiple statistics: Simultaneous learning of object names and categories in English and Mandarin speakers. Cognit. Sci 41 (6), 1485–1509.
  15. Christiansen MH, Contreras Kallens P, Trecca F, 2022. Toward a comparative approach to language acquisition. Curr. Dir. Psychol. Sci 31 (2), 131–138.
  16. Cole RA, Jakimik J, 1980. A model of speech perception. Percept. Product. Fluent Speech 133 (64), 133–142.
  17. Crepaldi D, Rastle K, Davis CJ, 2010. Morphemes in their place: Evidence for position-specific identification of suffixes. Mem. Cogn 38, 312–321.
  18. Crespo K, Vlach H, Kaushanskaya M, 2023. The effects of bilingualism on children’s cross-situational word learning under different variability conditions. J. Exp. Child Psychol 229, 105621.
  19. Cutler A, Mehler J, Norris D, Segui J, 1992. The monolingual nature of speech segmentation by bilinguals. Cogn. Psychol 24 (3), 381–410.
  20. Deutsch A, Rayner K, 1999. Initial fixation location effects in reading Hebrew words. Lang. Cognit. Process 14 (4), 393–421.
  21. Ducrot S, Pynte J, 2002. What determines the eyes’ landing position in words? Percept. Psychophys 64 (7), 1130–1144.
  22. Elazar A, Alhama RG, Bogaerts L, Siegelman N, Baus C, Frost R, 2022. When the “Tabula” is anything but “Rasa:” What determines performance in the auditory statistical learning task? Cognit. Sci 46 (2), e13102.
  23. Escudero P, Mulak KE, Fu CS, Singh L, 2016. More limitations to monolingualism: Bilinguals outperform monolinguals in implicit word learning. Front. Psychol 7, 1218.
  24. Escudero P, Smit EA, Angwin AJ, 2023. Investigating orthographic versus auditory cross-situational word learning with online and laboratory-based testing. Lang. Learn 73 (2), 543–577.
  25. Ettlinger M, Morgan-Short K, Faretta-Stutenberg M, Wong PC, 2016. The relationship between artificial and second language learning. Cognit. Sci 40 (4), 822–847.
  26. Finn AS, Kam CLH, 2008. The curse of knowledge: First language knowledge impairs adult learners’ use of novel statistics for word segmentation. Cognition 108 (2), 477–499.
  27. Fiser J, Aslin RN, 2001. Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychol. Sci 12, 499–504.
  28. Fiser J, Aslin RN, 2002. Statistical learning of higher-order temporal structure from visual shape sequences. J. Exp. Psychol. Learn. Mem. Cogn 28 (3), 458.
  29. Fitneva SA, Christiansen MH, Monaghan P, 2009. From sound to syntax: Phonological constraints on children’s lexical categorization of new words. J. Child Lang 36 (5), 967–997.
  30. Franco A, Cleeremans A, Destrebecqz A, 2011. Statistical learning of two artificial languages presented successively: how conscious? Front. Psychol 2, 229.
  31. Frinsel FF, Trecca F, Christiansen MH, 2024. The role of feedback in the statistical learning of language-like regularities. Cognitive Science 48 (3), e13419.
  32. Frost R, 2012. Towards a universal model of reading. Behav. Brain Sci 35 (5), 263–279.
  33. Gebhart AL, Aslin RN, Newport EL, 2009. Changing structures in midstream: Learning along the statistical garden path. Cognit. Sci 33 (6), 1087–1116.
  34. Gollan TH, Slattery TJ, Goldenberg D, Van Assche E, Duyck W, Rayner K, 2011. Frequency drives lexical access in reading but not in speaking: The frequency-lag hypothesis. J. Exp. Psychol. Gen 140 (2), 186–209. 10.1037/a0022256.
  35. Goodman M, Finnegan R, Mohadjer L, Krenzke T, Hogan J, 2013. Literacy, Numeracy, and Problem Solving in Technology-Rich Environments among US Adults: Results from the Program for the International Assessment of Adult Competencies 2012. First Look. NCES 2014–008. National Center for Education Statistics.
  36. Grosjean F, 2010. Bilingualism, biculturalism, and deafness. Int. J. Biling. Educ. Biling 13 (2), 133–145.
  37. He X, Tong X, 2017. Statistical learning as a key to cracking Chinese orthographic codes. Sci. Stud. Read 21 (1), 60–75.
  38. Hendrickson AT, Perfors A, 2019. Cross-situational learning in a Zipfian environment. Cognition 189, 11–22.
  39. Horst JS, Hout MC, 2016. The Novel Object and Unusual Name (NOUN) Database: A collection of novel images for use in experimental research. Behav. Res. Methods 48 (4), 1393–1409.
  40. Isbilen ES, Christiansen MH, 2022. Statistical learning of language: a meta-analysis of 25 years of research. Cognit. Sci 46, e13198.
  41. Isbilen ES, McCauley SM, Kidd E, Christiansen MH, 2020. Statistically-induced chunking recall: A memory-based approach to statistical learning. Cognit. Sci 44, e12848.
  42. Isbilen ES, McCauley SM, Christiansen MH, 2022. Individual differences in artificial and natural language statistical learning. Cognition 225, 105123. 10.1016/j.cognition.2022.105123.
  43. Kidd E, Donnelly S, Christiansen MH, 2018. Individual differences in language acquisition and processing. Trends Cogn. Sci 22, 154–169.
  44. LaCross A, 2015. Khalkha Mongolian speakers’ vowel bias: L1 influences on the acquisition of non-adjacent vocalic dependencies. Language, Cognition and Neuroscience 30 (9), 1033–1047.
  45. Lee SMK, Cui Y, Tong SX, 2022. Toward a model of statistical learning and reading: Evidence from a meta-analysis. Rev. Educ. Res 92 (4), 651–691.
  46. Lemhöfer K, Broersma M, 2012. Introducing LexTALE: A quick and valid lexical test for advanced learners of English. Behav. Res. Methods 44, 325–343.
  47. Macmillan NA, Kaplan HL, 1985. Detection theory analysis of group data: estimating sensitivity from average hit and false-alarm rates. Psychol. Bull 98 (1), 185.
  48. Magnuson JS, Tanenhaus MK, Aslin RN, Dahan D, 2003. The time course of spoken word learning and recognition: studies with artificial lexicons. J. Exp. Psychol. Gen 132 (2), 202.
  49. Marian V, Blumenfeld HK, Kaushanskaya M, 2007. The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals.
  50. Mattingly IG, 1972. Reading, the linguistic process, and linguistic awareness. In: Kavanagh JF, Mattingly IG (Eds.), Language by Ear and by Eye: the Relationships between Speech and Reading. MIT Press, Cambridge, MA, pp. 133–147.
  51. Miller RT, 2019. English orthography and reading. The TESOL Encyclopedia of English Language Teaching 1 (7).
  52. Mor B, Prior A, 2020. Individual differences in L2 frequency effects in different script bilinguals. Int. J. Biling 24 (4), 672–690.
  53. Mulak KE, Vlach HA, Escudero P, 2019. Cross-situational learning of phonologically overlapping words across degrees of ambiguity. Cognit. Sci 43 (5), e12731.
  54. Nazir TA, Ben-Boutayab N, Decoppet N, Deutsch A, Frost R, 2004. Reading habits, perceptual learning, and recognition of printed words. Brain Lang 88 (3), 294–311.
  55. Oh JS, Au TKF, Jun SA, 2010. Early childhood language memory in the speech perception of international adoptees. J. Child Lang 37 (5), 1123–1132.
  56. Padakannaya P, Devi ML, Zaveria B, Chengappa SK, Vaid J, 2002. Directional scanning effect and strength of reading habit in picture naming and recall. Brain Cogn 48 (2–3), 484–490.
  57. Pallier C, Dehaene S, Poline JB, LeBihan D, Argenti AM, Dupoux E, Mehler J, 2003. Brain imaging of language plasticity in adopted adults: Can a second language replace the first? Cereb. Cortex 13 (2), 155–161.
  58. Poepsel TJ, Weiss DJ, 2016. The influence of bilingualism on statistical word learning. Cognition 152, 9–19.
  59. Potter CE, Wang T, Saffran JR, 2017. Second language experience facilitates statistical learning of novel linguistic materials. Cognit. Sci 41, 913–927.
  60. R Core Team, 2020. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria: http://www.R-project.org/.
  61. Rampey BD, Finnegan R, Goodman M, Mohadjer L, Krenzke T, Hogan J, Provasnik S, 2016. Skills of US Unemployed, Young, and Older Adults in Sharper Focus: Results from the Program for the International Assessment of Adult Competencies (PIAAC) 2012/2014. First Look. NCES 2016–039. National Center for Education Statistics.
  62. Rao C, Vaid J, Chen HC, 2017. The processing cost for reading misaligned words is script-specific: Evidence from Hindi and Kannada/Hindi readers. Journal of Cultural Cognitive Science 1, 39–48.
  63. Rayner K, 1979. Eye guidance in reading: Fixation locations within words. Perception 8 (1), 21–30.
  64. Ren J, Wang M, Arciuli J, 2023. A meta-analysis on the correlations between statistical learning, language, and reading outcomes. Dev. Psychol 59 (9), 1626.
  65. Saffran JR, 2001. Words in a sea of sounds: The output of infant statistical learning. Cognition 81 (2), 149–169.
  66. Saffran JR, Johnson EK, Aslin RN, Newport EL, 1999. Statistical learning of tone sequences by human infants and adults. Cognition 70 (1), 27–52.
  67. Saffran JR, Wilson DP, 2003. From syllables to syntax: multilevel statistical learning by 12-month-old infants. Infancy 4 (2), 273–284.
  68. Saffran JR, Aslin RN, Newport EL, 1996. Statistical learning by 8-month-old infants. Science 274, 1926–1928.
  69. Sawi OM, Rueckl J, 2019. Reading and the neurocognitive bases of statistical learning. Sci. Stud. Read 23 (1), 8–23.
  70. Shimron J, 2006. Reading Hebrew: The language and the psychology of reading it. Routledge.
  71. Shufaniya A, Arnon I, 2018. Statistical learning is not age-invariant during childhood: Performance improves with age across modality. Cognit. Sci 42 (8), 3100–3115.
  72. Siegelman N, Bogaerts L, Christiansen MH, Frost R, 2017. Towards a theory of individual differences in statistical learning. Philos. Trans. R. Soc. B 372, 20160059.
  73. Siegelman N, Bogaerts L, Elazar A, Arciuli J, Frost R, 2018. Linguistic entrenchment: Prior knowledge impacts statistical learning performance. Cognition 177, 198–213.
  74. Siegelman N, Schroeder S, Acartürk C, Ahn HD, Alexeeva S, Amenta S, Kuperman V, 2022. Expanding horizons of cross-linguistic research on reading: The Multilingual Eye-movement Corpus (MECO). Behav. Res. Methods 54 (6), 2843–2863.
  75. Siegelman N, Elgort I, Brysbaert M, Agrawal N, Amenta S, Arsenijević Mijalković J, Chang CS, Chernova D, Chetail F, Clarke AJB, Content A, Crepaldi D, Davaabold N, Delgersuren S, Deutsch A, Dibrova V, Drieghe D, Filipović Đurđević D, Finch B, Frost R, Gattei CA, Geva E, Godfroid A, Griener L, Hernández-Rivera E, Ivanenko A, Järvikivi J, Kawaletz L, Khare A, Lee JR, Lee CE, Manouilidou C, Marelli M, Mashanlo T, Mišić K, Miwa K, Palma P, Plag I, Rezanova Z, Riimed E, Rueckl J, Schroeder S, Sekerina IA, Shalom DE, Slioussar N, Slosar NM, Taler V, Thériault K, Titone D, Tumee O, Wetering R.v.d., Verma A, Weiss AF, Wu DH, Kuperman V, 2023. Rethinking first language–second language similarities and differences in English proficiency: Insights from the English reading online (ENRO) project. Lang. Learn 10.1111/lang.12586.
  76. Sun X, Marks RA, Zhang K, Yu CL, Eggleston RL, Nickerson N, Kovelman I, 2023. Brain bases of English morphological processing: A comparison between Chinese-English, Spanish-English bilingual, and English monolingual children. Dev. Sci 26 (1), e13251.
  77. Treiman R, Kessler B, 2022. Statistical learning in word reading and spelling across languages and writing systems. Sci. Stud. Read 26 (2), 139–149.
  78. Vaid J, 2022. Biscriptality: a neglected construct in the study of bilingualism. Journal of Cultural Cognitive Science 6 (2), 135–149.
  79. Velan H, Frost R, 2007. Cambridge University versus Hebrew University: The impact of letter transposition on reading English and Hebrew. Psychon. Bull. Rev 14 (5), 913–918.
  80. Velan H, Deutsch A, Frost R, 2013. The flexibility of letter-position flexibility: Evidence from eye movements in reading Hebrew. J. Exp. Psychol. Hum. Percept. Perform 39 (4), 1143–1152. 10.1037/a0031075.
  81. Vidal Y, Viviani E, Zoccolan D, Crepaldi D, 2021. A general-purpose mechanism of visual feature association in visual word identification and beyond. Curr. Biol 31 (6), 1261–1267.
  82. Wang M, Koda K, Perfetti CA, 2003. Alphabetic and nonalphabetic L1 effects in English word identification: A comparison of Korean and Chinese English L2 learners. Cognition 87 (2), 129–149.
  83. Wang T, Saffran JR, 2014. Statistical learning of a tonal language: The influence of bilingualism and previous linguistic experience. Front. Psychol 5, 953.
  84. Weiss DJ, Poepsel T, Gerfen C, 2015. Tracking multiple inputs. Implicit and Explicit Learning of Languages 167–190.
  85. Weiss DJ, Schwob N, Lebkuecher AL, 2020. Bilingualism and statistical learning: Lessons from studies using artificial languages. Biling. Lang. Cogn 23 (1), 92–97.
  86. Wiener S, 2020. Second language learners develop non-native lexical processing biases. Biling. Lang. Cogn 23 (1), 119–130.
  87. Yim D, Rudoy J, 2013. Implicit statistical learning and language skills in bilingual children. J. Speech Lang. Hear. Res 56 (1), 310–322.
