Abstract
Reading causes widespread changes in the brain, but its effect on visual word representations is unknown. Learning to read may facilitate visual processing by forming specialized detectors for longer strings or by making word responses more predictable from single letters—that is, by increasing compositionality. We provided evidence for the latter hypothesis using experiments that compared nonoverlapping groups of readers of two Indian languages (Telugu and Malayalam). Readers showed increased single-letter discrimination and decreased letter interactions for bigrams during visual search. Importantly, these interactions predicted subjects’ overall reading fluency. In a separate brain-imaging experiment, we observed increased compositionality in readers, whereby responses to bigrams were more predictable from single letters. This effect was specific to the anterior lateral occipital region, where activations best matched behavior. Thus, learning to read facilitates visual processing by increasing the compositionality of visual word representations.
Keywords: reading, object recognition, visual search, neuroimaging, open data
Reading a word involves processing its visual form, associating it with spoken sounds, and processing its overall meaning. Consequently, learning to read alters a variety of brain systems, including the visual, auditory, and language regions (Dehaene, Cohen, Morais, & Kolinsky, 2015). In particular, reading has a profound influence on the visual regions. It leads to the formation of the visual word-form area (VWFA) in the left occipitotemporal sulcus; the VWFA is selectively activated by words of familiar scripts and by intact words over scrambled controls, and activation levels in this region predict reading fluency (Dehaene et al., 2015). But reading also causes widespread changes throughout the visual cortex, as shown by greater activation for intact words relative to scrambled controls (Dehaene & Cohen, 2011; Dehaene et al., 2010; Lochy et al., 2018; Szwed et al., 2011) as well as for familiar over unfamiliar scripts (Bai, Shi, Jiang, He, & Weng, 2011; Baker et al., 2007; Krafnick et al., 2016; Szwed, Qiao, Jobert, Dehaene, & Cohen, 2014).
Despite these insights, several fundamental questions remain regarding how reading affects letter and word representations. Does reading alter single-letter representations? Does it alter word representations beyond the effect on single letters? These questions have been difficult to answer for two reasons. First, letter representations with and without reading expertise are difficult to characterize because many Western languages use the same script, making it difficult to find subjects fluent in distinct scripts without introducing confounding factors such as phonological mapping, writing systems, and literacy (Dehaene et al., 2015). Indian languages offer a unique opportunity to investigate these issues because of their diverse alphabetic scripts with shared phonological mapping and writing systems (Nag, 2017). This makes it possible for researchers to compare subjects proficient in reading distinct scripts while holding constant other confounding factors.
Second, to characterize changes in word representations, it is critical to establish a quantitative model to relate word responses to letter responses. According to an influential account, reading facilitates visual processing through the formation of specialized local combination detectors (Dehaene, Cohen, Sigman, & Vinckier, 2005). These combination detectors respond to frequently occurring bigrams (e.g., “TH”) and longer strings. Evidence in favor of this account comes from the increased activation of the VWFA as letter strings become orthographically similar to real words (Binder, Medler, Westbury, Liebenthal, & Buchanan, 2006; Lochy et al., 2018; Szwed et al., 2011; Vinckier et al., 2007). However, these results are based on comparing letter strings equated for mean letter frequency. These matched letter strings may contain letters of disparate frequencies or medium frequencies at different positions, which could elicit different responses simply because of letter-frequency and position effects (Scaltritti, Dufau, & Grainger, 2018). Thus, local combination detectors must be invoked only if responses to bigrams cannot be explained using the constituent letters.
An alternative account is that reading might increase compositionality (i.e., make bigrams and longer strings more predictable from single letters). These two accounts make opposite predictions as to how the response to a bigram relates to the constituent letters: If reading leads to the formation of local combination detectors, the response to a bigram will be less predictable from the individual letters. If reading leads to increased compositionality, the response will be more predictable. We evaluated these predictions using a combination of behavioral and neuroimaging experiments.
Method
All subjects had normal or corrected-to-normal vision and gave written informed consent to the experimental protocols, which were approved by the Indian Institute of Science Institutional Human Ethics Committee. Subjects had similar educational status: They were all undergraduate or graduate students at the Indian Institute of Science. All subjects were fluent in English and were fluent in reading either Telugu or Malayalam (but not both).
Fluency test
Subjects were asked to perform a brief fluency test along with every experiment. In this test, a passage of text was shown to the subject in his or her known script (Telugu or Malayalam). In both languages, this passage described how the head of an Indian village introduced computers to the village and employed software professionals to train residents. This passage was prepared by translating the same English passage into both languages. Subjects were asked to silently read the passage on a computer screen and press a button after they finished reading it. After this, a dialogue box appeared, and subjects were asked to summarize the passage in English. This summary was reviewed off-line by the first author to confirm that the subjects indeed comprehended the passage. The time that elapsed before the button press served as the measure of reading fluency. All but 2 subjects from Experiment 3 participated in the fluency test. A minority of the subjects (n = 4) declared afterward that they had read the passage multiple times to memorize it, so their data were excluded from subsequent fluency analyses.
Experiment 1 (single letters)
A total of 39 subjects (28 males; age: M = 25 years, SD = 4; 19 Telugu, 20 Malayalam) participated in this experiment. Here and in all visual search experiments, we chose this sample size because previous studies from our group have obtained highly consistent data using similar sample sizes (Pramod & Arun, 2016). We did not use any stopping criterion. The stimuli consisted of 36 single letters each from the Telugu and Malayalam languages (examples are shown in Fig. 1; see Section S1 in the Supplemental Material available online). The font Nirmala UI was used because it has uniform stroke width. Subjects performed a baseline motor-response task and an oddball visual search task.
Fig. 1.
Malayalam and Telugu scripts. The Malayalam and Telugu languages are spoken in geographically distinct regions in India, highlighted on the map. The scripts have distinct letter shapes but share many phonemes (indicated above each letter). Only 16 example letters are shown here from each language; Telugu has 60 letters, and Malayalam has 53 letters. The full set of stimuli is shown in Section S1 in the Supplemental Material available online. Map courtesy of Free Vector Maps (https://freevectormaps.com/).
In the baseline task, a circle appeared on the left or right of the screen, and subjects had to indicate the side on which the circle appeared by pressing a key (“Z” for left, “M” for right). The average response time (RT) across 20 trials was taken as a measure of baseline motor speed (depicted in Fig. 2b). In the visual search task, each trial began with a fixation cross for 500 ms, followed by a 4 × 4 search array that contained one oddball target and 15 identical distractors (Fig. 2a). The exact position of each item was jittered on each trial according to a uniform distribution with a range of ±0.25° in the vertical and horizontal directions. This was done to prevent alignment cues from influencing search. The vertical dimension of all letters subtended 2° of visual angle on the screen, and the longer dimension varied depending on the letter. A vertical red line divided the screen into two halves. All stimuli were presented using custom scripts written in MATLAB (The MathWorks, Natick, MA) running the Psychophysics Toolbox (Brainard, 1997).
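The array layout described above can be sketched in a few lines. This is only an illustration, not the authors' MATLAB/Psychophysics Toolbox code: the function name `search_array_positions` and the 4° grid spacing are assumptions (the text does not state the spacing); only the 4 × 4 layout and the ±0.25° uniform jitter come from the description.

```python
import numpy as np

def search_array_positions(rng, grid=4, spacing_deg=4.0, jitter_deg=0.25):
    """Return (base, jittered) item centers in degrees for a grid x grid array.

    spacing_deg is a hypothetical value; jitter_deg follows the text.
    """
    # Grid coordinates centered on fixation, e.g. [-6, -2, 2, 6] for 4 items
    coords = (np.arange(grid) - (grid - 1) / 2) * spacing_deg
    xs, ys = np.meshgrid(coords, coords)
    base = np.column_stack([xs.ravel(), ys.ravel()])
    # Independent uniform jitter in x and y for every item
    jitter = rng.uniform(-jitter_deg, jitter_deg, size=base.shape)
    return base, base + jitter

rng = np.random.default_rng(0)
base, positions = search_array_positions(rng)
# One randomly chosen location holds the oddball target; the rest hold distractors
target_index = int(rng.integers(len(positions)))
```

With an even number of columns, half of the locations fall on each side of the vertical midline, so the left/right response is well defined for any target location.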
Fig. 2.
Example search array and results from Experiment 1. An example single-letter search array using Telugu letters is shown in (a). Average search time (b) is shown for readers and nonreaders of Telugu and Malayalam letters. The baseline response time is also shown for each group of subjects. Error bars depict standard errors of the mean across subjects, and asterisks indicate statistically significant differences between groups (p < .00005, sign-rank test across pairs). Pairwise search dissimilarity is shown separately for 630 pairs of (c) Telugu letters and (d) Malayalam letters, plotted for readers and nonreaders. Each point represents one search pair; an example easy and hard search pair are shown. The dotted line is the y = x line, and the solid line is the best-fitting line to the data. Asterisks indicate that the correlations were significant (p < .00005).
Subjects were instructed to locate the target as quickly and as accurately as possible and to respond using a key press (“Z” for left, and “M” for right). The trial timed out after 10 s. All stimuli were presented in white against a black background. In all, subjects completed two search trials corresponding to all 36C2 pairs of letters in each language, which amounted to 2,520 correct trials (36C2 pairs × 2 languages × 2 repetitions). Incorrect or missed trials appeared randomly later in the task. Only correct responses were analyzed. Any response exceeding 5 s was removed from analysis provided such a response occurred in less than 15% of the subjects. This step improved data consistency overall. We obtained qualitatively similar results without this step.
Experiment 2 (bigrams)
A total of 16 subjects (10 males; age: M = 24 years, SD = 2; 8 Telugu, 8 Malayalam) participated in this experiment. The stimuli consisted of 25 bigrams each from Telugu and Malayalam, created using all possible combinations of five single letters (shown in Section S1). The single letters were chosen such that the full stimulus set contained a few frequent bigrams in each language. In all, subjects performed searches corresponding to all possible pairs of the 25 bigrams, which amounted to 1,200 correct trials (25C2 searches × 2 languages × 2 repetitions). All other details of the procedure were identical to those in Experiment 1.
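The trial counts reported for Experiments 1 and 2 follow directly from the number of unordered stimulus pairs. The short check below reproduces that arithmetic; the helper name `correct_trials` is ours and is used only for illustration.

```python
from math import comb

def correct_trials(n_items, n_languages=2, n_repetitions=2):
    """Correct trials needed to cover every unordered pair of items."""
    return comb(n_items, 2) * n_languages * n_repetitions

exp1_pairs = comb(36, 2)           # letter pairs per language in Experiment 1
exp1_trials = correct_trials(36)   # pairs x 2 languages x 2 repetitions
exp2_pairs = comb(25, 2)           # bigram pairs per language in Experiment 2
exp2_trials = correct_trials(25)

print(exp1_pairs, exp1_trials, exp2_pairs, exp2_trials)
```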
Search RTs were averaged across repetitions and subjects to obtain a composite measure that we then converted to a dissimilarity measure (1/RT), as in our previous studies. This resulted in a total of 300 pairwise dissimilarities (25C2 = 300) between all possible pairs of bigrams. Using the approach reported in our previous study (Pramod & Arun, 2016), we modeled the pairwise dissimilarity between two bigrams, AB and CD, as a linear sum of pairwise dissimilarities between single letters at various locations. Specifically,
$$d(AB, CD) = C_{AC} + C_{BD} + X_{AD} + X_{BC} + W_{AB} + W_{CD} + c,$$

where $C_{AC}$ and $C_{BD}$ represent the distances between letters at corresponding locations in the two bigrams, $X_{AD}$ and $X_{BC}$ represent the distances between letters at opposite locations in the two bigrams, $W_{AB}$ and $W_{CD}$ represent distances between letters within each of the two bigrams, and $c$ is a constant term.
This part-sum model is extremely general in that it assumes no systematic relation between single-letter distances at corresponding locations in a bigram, across locations in a bigram, or indeed within a given bigram (referred to henceforth as “corresponding, across, and within terms”). It works because a given letter pair occurs repeatedly across bigram pairs (e.g., the pair AC is present at corresponding locations in the bigram pairs AB-CD, AD-CE, and EA-DC). Because there are 5 unique single letters, there are 10 single-letter distances (5C2 = 10) for each term type (corresponding, across, within), which amounts to a total of 31 parameters (10 of each type × 3 types + 1 constant). Because there are 300 dissimilarity measurements and only 31 parameters, the model parameters can be uniquely estimated from the data. When the above model equation is written down for all 300 pairwise dissimilarities, the set of simultaneous equations can be written as y = Xb, where y is a 300 × 1 vector containing the observed dissimilarities; X is a 300 × 31 matrix with 0, 1, or 2 as entries (depending on the absence, presence, or repetition of a particular letter pair at corresponding locations, across locations, or within the two bigrams of a given pair); and b is a 31 × 1 vector of unknowns. We estimated the model parameters using standard linear regression (the regress function in MATLAB).
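To make the structure of the regression concrete, the sketch below builds the 300 × 31 design matrix for five letters and fits it by least squares. It is a minimal illustration under stated assumptions: the letter labels A to E are placeholders, the "true" coefficients are simulated rather than taken from any data, and NumPy's `lstsq` is substituted for the MATLAB `regress` function used in the study.

```python
from itertools import combinations, product
import numpy as np

letters = "ABCDE"                                            # placeholder letter labels
bigrams = ["".join(p) for p in product(letters, repeat=2)]   # 25 bigrams
letter_pairs = list(combinations(letters, 2))                # 10 unordered letter pairs
pair_index = {frozenset(p): i for i, p in enumerate(letter_pairs)}
n_pairs = len(letter_pairs)

def design_row(b1, b2):
    """Row of X for bigram pair (AB, CD): counts of each letter pair per term type."""
    (a, b), (c, d) = b1, b2
    row = np.zeros(3 * n_pairs + 1)
    terms = [
        (0, [(a, c), (b, d)]),   # corresponding locations
        (1, [(a, d), (b, c)]),   # across (opposite) locations
        (2, [(a, b), (c, d)]),   # within each bigram
    ]
    for block, pairs in terms:
        for u, v in pairs:
            if u != v:           # identical letters have zero distance, so no term
                row[block * n_pairs + pair_index[frozenset((u, v))]] += 1
    row[-1] = 1                  # constant term
    return row

bigram_pairs = list(combinations(bigrams, 2))                # 300 bigram pairs
X = np.array([design_row(p, q) for p, q in bigram_pairs])

# Simulated coefficients (illustrative only): positive corresponding/across terms,
# negative within terms, and a small constant
rng = np.random.default_rng(0)
b_true = np.concatenate([rng.uniform(0.2, 1.0, 2 * n_pairs),
                         -rng.uniform(0.0, 0.3, n_pairs),
                         [0.1]])
y = X @ b_true                                               # noise-free dissimilarities
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_pred = X @ b_hat
```

An entry of 2 arises, for example, for the pair AB versus BA, where the same letter pair appears twice at corresponding locations and twice within bigrams.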
Experiment 3 (functional MRI)
A total of 35 subjects (31 males; age: M = 25 years, SD = 3; 17 Telugu, 18 Malayalam) participated in functional localizer runs (n = 2) and event-related runs (n = 8) that were randomly interleaved. An anatomical scan was also included for each subject at the beginning. We chose this sample size because it was similar to that used in previous studies of reading (Baker et al., 2007), and we did not use any stopping criterion.
In the functional localizer runs, subjects viewed 16-s blocks of scrambled words (in Telugu, Malayalam, and English), objects, and scrambled objects while performing a one-back task throughout. In each block, 14 stimuli were randomly selected from a pool of images. The Telugu pool comprised 8 two-letter and 38 three-letter words, and the Malayalam pool comprised 12 two-letter and 38 three-letter words. The English pool comprised 36 four-letter words and 45 five-letter words. Telugu and Malayalam letters are typically wider than English letters; thus, we used longer English words so that the overall width of the image was roughly equal for all three languages. Each word was divided into grids (8 × 4 for Indian languages and 8 × 3 for English), and scrambled words were created by randomly shuffling the grid cells. The objects pool comprised 80 human-made objects. Scrambled objects were created by scrambling the phase of the Fourier-transformed images and then reconstructing the phase-scrambled image using the inverse Fourier transform.
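Phase scrambling of the kind described for the object images can be sketched as follows. This is one common way to do it and is not necessarily the authors' exact procedure: the random phase is taken from the Fourier transform of a noise image (an assumption that keeps the output real valued), and the random array standing in for an object image is purely illustrative.

```python
import numpy as np

def phase_scramble(image, rng):
    """Return a phase-scrambled copy of a 2-D grayscale image.

    The amplitude spectrum is preserved; only the phase is randomized.
    """
    spectrum = np.fft.fft2(image)
    # Phase of a real-valued noise image: adding it preserves the conjugate
    # symmetry of the spectrum, so the inverse transform is real
    random_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
    random_phase[0, 0] = 0.0   # leave the DC term alone to keep mean luminance
    scrambled = np.abs(spectrum) * np.exp(1j * (np.angle(spectrum) + random_phase))
    return np.real(np.fft.ifft2(scrambled))

rng = np.random.default_rng(0)
image = rng.random((64, 64))   # stand-in for a grayscale object image
scrambled = phase_scramble(image, rng)
```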
All images were presented against a black background. Each block consisted of a total of 16 stimuli presented for 0.8 s with a 0.2-s blank interval, among which two randomly chosen stimuli were repeated. Each block ended with a fixation cross presented for 4 s against a blank screen. Thus, each block lasted 20 s. The size of the object images was about 4.5° along the longer dimension, whereas the vertical size of the word stimuli was 2.5°, as in the event-related runs. There were six repetitions of each block across two runs, and each run lasted for 370 s. Stimuli were presented using custom MATLAB scripts written with the Psychophysics Toolbox.
In the event-related runs, the stimuli consisted of 10 single letters and 24 bigrams each in Telugu and Malayalam, for a total of 68 stimuli. The height of the stimuli was equated to subtend 2.5° of visual angle, with the longer dimension scaled accordingly to preserve the aspect ratio. The bigrams were chosen so that each letter appeared at least four times; both high- and low-frequency bigrams were used, and the mean bigram dissimilarities were similar across the two languages (see Section S1 for all stimuli). On each trial, the stimulus was presented at the center of the screen with a black background for 300 ms, followed by a blank screen with a fixation cross for 3.7 s. In each run, all stimuli were presented once. Subjects were instructed to maintain fixation on the cross and perform a one-back task (i.e., to press a button whenever an image appeared twice in sequence). Each run contained eight trials with only a fixation cross in order to jitter the interstimulus interval, and eight randomly chosen images were repeated in a given run. Each run lasted 368 s, and there were eight runs in all, yielding eight repeats per stimulus.
Data acquisition
Subjects viewed images projected on a screen through a mirror placed above their eyes. Functional MRI (fMRI) data were acquired using a 32-channel head coil on a 3T Skyra (Siemens, Mumbai, India) at the HealthCare Global Hospital, Bengaluru. Functional scans were performed using a T2*-weighted gradient-echo-planar imaging sequence with the following parameters: repetition time (TR) = 2 s, echo time (TE) = 28 ms, flip angle = 79°, voxel size = 3 × 3 × 3 mm³, field of view = 192 × 192 mm², and 33 axial-oblique slices for whole-brain coverage. Anatomical scans were performed using T1-weighted images with the following parameters: TR = 2.30 s, TE = 1.99 ms, flip angle = 9°, voxel size = 1 × 1 × 1 mm³, field of view = 256 × 256 × 176 mm³.
Data preprocessing
The raw fMRI data were preprocessed using Statistical Parametric Mapping (SPM) software (Version 12; Wellcome Centre for Human Neuroimaging; https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Raw images were realigned, slice-time corrected, coregistered to the anatomical image, segmented, and normalized to the Montreal Neurological Institute (MNI) 305 anatomical template. Repeating the key analyses with voxel activations estimated from individual subjects yielded qualitatively similar results. Smoothing was performed only on the functional localizer blocks using a Gaussian kernel with a full width at half maximum of 5 mm. Default SPM parameters were used, and voxel size after normalization was kept at 3 × 3 × 3 mm³. The data were further processed using GLMdenoise (Version 1.4; Kay, Rokem, Winawer, Dougherty, & Wandell, 2013). GLMdenoise improves the signal-to-noise ratio in the data by regressing out the noise estimated from task-unrelated voxels. The denoised time-series data were modeled using a general linear model (GLM) in SPM after removing low-frequency drift using a high-pass filter with a cutoff of 128 s. In the main experiment, the activity of each voxel was modeled using 83 regressors (68 stimuli + 1 fixation + 6 motion regressors + 8 runs). In the localizer block, each voxel was modeled using 15 regressors (6 stimuli + 1 fixation + 6 motion regressors + 2 runs).
Regions of interest
All regions of interest (ROIs) were defined using the data from functional localizer blocks together with anatomical considerations. Early visual areas (V1–V4) were defined as the regions that responded more to scrambled objects compared with fixation. The regions identified were further parceled into V1 to V3 and V4 using anatomical masks from the SPM Anatomy Toolbox (Eickhoff et al., 2005). We grouped V1 to V3 into a single ROI because we observed qualitatively similar differences in activations for known and unknown scripts. Lateral occipital cortex was defined as the voxels that responded to objects more than scrambled objects but were restricted using anatomical masks (inferior temporal gyrus, inferior occipital gyrus, and middle occipital gyrus) created from tissue probability map labels available in SPM12. The VWFA was defined as a contiguous region in the occipitotemporal sulcus that responded more to known words (Telugu or Malayalam) compared with scrambled words. The temporal gyrus was defined as voxels in the temporal gyrus (both superior and medial portions, as well as Wernicke’s area) that responded more to known words (Telugu or Malayalam) compared with scrambled words. For each contrast, a voxel-level threshold of p < .001 (uncorrected) or cluster-level threshold of p < .05 (family-wise-error corrected) was used to define contiguous regions. However, for 6 subjects, the VWFA could not be identified, and therefore a lower threshold of p > .05 (uncorrected; the lowest-threshold p value used was .2) was used until we observed a contiguous cluster of at least 40 voxels in left occipitotemporal sulcus. The lateral occipital and VWFA voxels were further restricted to the top 200 and top 20 significant voxels, respectively (according to the t value in the functional contrast). We obtained similar results with other choices of voxel selection. 
Finally, all results were visualized on the cortical surface using the MATLAB program BSPMVIEW (http://www.bobspunt.com/bspmview/). A summary of the typical locations and numbers of voxels in each ROI is given in Section S7 in the Supplemental Material.
Neural similarity in fMRI
For each ROI, the dissimilarity between each pair of stimuli was computed as 1 – r, where r is the Spearman correlation coefficient between the activity patterns evoked across voxels by the two stimuli. The dissimilarities were z scored and then averaged across subjects.
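The computation described above can be sketched as follows. Several details are assumptions made for illustration: the data are simulated, the z scoring is taken across stimulus pairs within each subject (the text does not state the axis), and the simple ranking helper ignores ties, which is reasonable only for continuous beta values.

```python
import numpy as np

def rank(values):
    """Ranks of a 1-D array (no tie handling; assumes continuous values)."""
    order = np.argsort(values)
    ranks = np.empty(len(values))
    ranks[order] = np.arange(len(values))
    return ranks

def neural_dissimilarity(betas):
    """betas: (n_stimuli, n_voxels). Returns 1 - Spearman r for each stimulus pair."""
    ranked = np.apply_along_axis(rank, 1, betas)
    r = np.corrcoef(ranked)                      # Pearson on ranks equals Spearman
    upper = np.triu_indices(betas.shape[0], k=1)
    return 1.0 - r[upper]

def zscore(x):
    return (x - x.mean()) / x.std()

rng = np.random.default_rng(0)
n_subjects, n_stimuli, n_voxels = 5, 6, 50       # illustrative sizes only
per_subject = [
    zscore(neural_dissimilarity(rng.standard_normal((n_stimuli, n_voxels))))
    for _ in range(n_subjects)
]
group_dissimilarity = np.mean(per_subject, axis=0)   # one value per stimulus pair
```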
Voxel population model
For each bigram, we modeled the response of a population of voxels as a linear combination of the responses of those voxels to the individual letters. For example, if there were 100 voxels in a given ROI, then the response to each bigram was modeled as y = Xb, where y is a 100 × 1 vector of beta (activation) values across voxels for that bigram; X is a 100 × 3 matrix whose first two columns contain the beta values of the same voxels for the two constituent letters of the bigram and whose third column is a vector of 1s corresponding to a constant term; and b is a 3 × 1 vector of unknown summation weights.
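A minimal sketch of this fit for a single bigram is given below. The voxel responses are simulated, the generating weights (0.6, 0.3, 0.1) are arbitrary illustrative values, and the function name `fit_bigram` is ours; the sketch shows only the least-squares step, not the full analysis pipeline.

```python
import numpy as np

def fit_bigram(letter1_betas, letter2_betas, bigram_betas):
    """Least-squares weights b = [w1, w2, constant] for one bigram across voxels."""
    X = np.column_stack([letter1_betas, letter2_betas,
                         np.ones_like(letter1_betas)])
    b, *_ = np.linalg.lstsq(X, bigram_betas, rcond=None)
    return b, X @ b

rng = np.random.default_rng(0)
n_voxels = 100
letter1 = rng.standard_normal(n_voxels)          # simulated betas for the first letter
letter2 = rng.standard_normal(n_voxels)          # simulated betas for the second letter
# Simulated bigram response: weighted sum of letter responses plus noise
bigram = 0.6 * letter1 + 0.3 * letter2 + 0.1 + 0.05 * rng.standard_normal(n_voxels)

weights, predicted = fit_bigram(letter1, letter2, bigram)
```

The closer the predicted vector is to the observed bigram response, the more compositional the representation in that ROI.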
To evaluate model fit, we calculated the correlation between the observed and predicted response for each voxel. This procedure prevents the model fit from being biased by overall activation-level differences between voxels. The correlation coefficients were averaged across all bigrams to obtain an average model correlation for that ROI in a given subject. The model fit was compared between readers and nonreaders using paired-sample t tests across subject-wise model correlations.
Behavioral dissimilarity for fMRI bigrams
We estimated the behavioral dissimilarities for the bigrams used in this experiment with a reduced part-sum model. Recall that the part-sum model estimates separate letter dissimilarities for corresponding, across, and within terms, but the estimated terms were all correlated with the single-letter dissimilarities. We therefore modified the part-sum model to a highly reduced model in which single-letter dissimilarities from Experiment 1 combined linearly as follows:
$$d(AB, CD) = \alpha\,(d_{AC} + d_{BD}) + \beta\,(d_{AD} + d_{BC}) + \gamma\,(d_{AB} + d_{CD}) + c,$$

where $d_{AC}$, $d_{BD}$, $d_{AD}$, $d_{BC}$, $d_{AB}$, and $d_{CD}$ are pairwise single-letter dissimilarities observed in Experiment 1; $\alpha$, $\beta$, and $\gamma$ are unknown scaling terms for letter relations at corresponding locations, across locations, and within bigrams; and $c$ is a constant term. Thus, this model had only four free parameters that could be estimated again using linear regression. To predict the behavioral dissimilarities between the bigrams used in the fMRI experiment, we first estimated the parameters of this reduced model from the bigram searches in Experiment 2 and then used these parameters, together with the single-letter dissimilarities from Experiment 1, to generate the predicted dissimilarities for all pairs of bigrams used in Experiment 3. This was then compared with the neural similarity calculated above. We confirmed the validity of this approach by comparing these predicted dissimilarities with search dissimilarities directly estimated in an additional experiment (see Section S7 of the Supplemental Material).
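The two-step procedure (estimate the scaling terms on one set of bigram pairs, then predict dissimilarities for other pairs) can be sketched as follows. Everything numeric here is simulated: the single-letter dissimilarity matrix `d`, the letter labels, and the generating parameter values are assumptions for illustration, and a constant is included as the fourth free parameter.

```python
from itertools import combinations, product
import numpy as np

letters = "ABCDE"                                  # placeholder letter labels
idx = {ch: i for i, ch in enumerate(letters)}
bigrams = ["".join(p) for p in product(letters, repeat=2)]
bigram_pairs = list(combinations(bigrams, 2))      # 300 training pairs

def reduced_features(d, b1, b2):
    """[corresponding, across, within, constant] regressors for the pair (AB, CD)."""
    a, b = idx[b1[0]], idx[b1[1]]
    c, e = idx[b2[0]], idx[b2[1]]
    return [d[a, c] + d[b, e], d[a, e] + d[b, c], d[a, b] + d[c, e], 1.0]

# Simulated symmetric single-letter dissimilarity matrix with a zero diagonal
rng = np.random.default_rng(0)
upper = np.triu(rng.uniform(0.5, 1.5, (5, 5)), k=1)
d = upper + upper.T

# Step 1: estimate the four parameters from (simulated) bigram dissimilarities
X = np.array([reduced_features(d, p, q) for p, q in bigram_pairs])
true_params = np.array([1.0, 0.3, -0.2, 0.1])      # illustrative values only
y = X @ true_params
params, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 2: predict the dissimilarity of any other bigram pair from letter distances
predicted = np.array(reduced_features(d, "AB", "CD")) @ params
```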
Results
We compared letter and word representations in distinct groups of readers that had similar educational levels and fluency in either of two Indian languages (Telugu and Malayalam). These languages have distinctive scripts with many shared phonemes and highly similar writing systems (Fig. 1). We selected visually distinct letters with identical pronunciations from both languages (see Section S1). This design not only eliminates confounding factors due to phonology, writing systems, or literacy but also isolates the effect of reading expertise from intrinsic shape differences across the two scripts.
Experiment 1 (single letters)
We first investigated whether reading expertise modulates single-letter representations. We recruited 39 readers to perform an oddball visual search task involving Telugu letters and Malayalam letters (see Fig. S1 in the Supplemental Material). An example search array using Telugu letters is shown in Figure 2a. Subjects were equally accurate on searches involving known and unknown scripts (mean accuracy: 99% for known scripts, 98% for unknown scripts). However, they were faster for searches involving letters of known scripts (Fig. 2b). To compare letter representations, we used the reciprocal of search time as a measure of dissimilarity between letters (Arun, 2012). This can be interpreted as the underlying salience signal that accumulates during visual search (Sunder & Arun, 2016); it combines linearly across object attributes (Pramod & Arun, 2014, 2016), search types (Vighneshvel & Arun, 2013), and even top-down influences (Sunder & Arun, 2016).
For each language, we plotted the pairwise dissimilarity for readers against that of nonreaders across all letter pairs. This revealed a strong positive correlation for both Telugu letters (Fig. 2c) and Malayalam letters (Fig. 2d). These correlations were close to the consistency of the responses within each group (correlation between dissimilarities in odd- and even-numbered subjects: r = .83, 95% CI = [.80, .85], and r = .87, 95% CI = [.85, .89], for readers and nonreaders of Telugu; r = .83, 95% CI = [.80, .85], and r = .87, 95% CI = [.85, .89], respectively, for readers and nonreaders of Malayalam; all correlations: p < .00005). Reading expertise also resulted in increased dissimilarity for more similar letters, as shown by a negative correlation between baseline letter dissimilarity (as measured in nonreaders) and the increase in dissimilarity for readers over nonreaders (r = –.43, 95% CI = [–.49, –.36] for Telugu; r = –.49, 95% CI = [–.55, –.43] for Malayalam; both correlations: p < .00005). These subtle alterations did not affect the global arrangement of letters in perceptual space (see Section S2 in the Supplemental Material). Letters that co-occurred in a bigram showed greater similarity in readers (see Section S2), and their sounds were perceived as more similar (see Section S3 in the Supplemental Material). In sum, reading subtly altered letter representations through increased discrimination of similar letters.
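The dissimilarity measure and the odd/even split-half consistency referred to above can be illustrated with simulated search times. The subject count, the range of search times, and the noise model are hypothetical; only the 1/RT conversion and the odd-versus-even split follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_pairs = 20, 630                      # 630 = number of letter pairs
true_rt = rng.uniform(0.6, 3.0, n_pairs)           # hypothetical mean search time (s)
# Each subject's search times vary multiplicatively around the true value
rts = true_rt * rng.uniform(0.8, 1.2, (n_subjects, n_pairs))

def dissimilarity(rt_matrix):
    """Average search time across subjects, then take the reciprocal (1/RT)."""
    return 1.0 / rt_matrix.mean(axis=0)

odd = dissimilarity(rts[0::2])                     # 1st, 3rd, 5th, ... subjects
even = dissimilarity(rts[1::2])                    # 2nd, 4th, 6th, ... subjects
split_half_r = np.corrcoef(odd, even)[0, 1]
```

The split-half correlation sets an approximate ceiling on how strongly the dissimilarities of one group can be expected to correlate with those of another.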
Experiment 2 (bigrams)
Next, we set out to characterize how reading expertise affects the representations of longer strings. Subjects performed oddball visual search involving bigrams of either familiar or unfamiliar scripts (Fig. 3a). Readers were again faster to discriminate bigrams of known scripts over bigrams of unknown scripts (Fig. 3b). Once again, reading had a subtle effect on bigram representations, as evidenced by a strong correlation between bigram dissimilarities of readers and nonreaders (r = .80, 95% CI = [.76, .84] for Telugu; r = .83, 95% CI = [.79, .86] for Malayalam; p < .00005). These subtle alterations did not result in qualitative changes in the overall perceptual representation (see Section S4 in the Supplemental Material). However, the critical question remained: Are these changes driven solely by the increased discrimination of single letters? Or are there additional emergent properties that make readers better able to distinguish bigrams?
Fig. 3.
Example search array and results from Experiment 2. An example search array using Telugu bigrams is shown in (a). The Telugu and Malayalam letters used to create all 25 possible bigrams are shown below the search array. Average search time (b) is shown for readers and nonreaders of Telugu bigrams and Malayalam bigrams. The baseline response time (RT) is also shown for each group of subjects. Error bars depict standard errors of the mean across subjects, and asterisks indicate significant differences between groups (p < .00005). A schematic of the part-sum model is shown in (c). According to this model, the net dissimilarity (1/RT) between bigrams AB and CD can be explained using single-letter dissimilarities between letters at corresponding locations, at opposite locations in the two bigrams, and within each of the bigrams (see the text for further details). Observed bigram dissimilarity (d) is plotted against predicted bigram dissimilarity from the part-sum model for Telugu readers on Telugu bigrams. Searches with low-frequency bigrams (n = 91) and high-frequency bigrams (n = 55) are plotted separately from all other search pairs (n = 154; gray circles). Each point represents one search pair. A few example search pairs of each type are shown in the plot. The diagonal line is the y = x line. Asterisks indicate that the mean correlations were significant (p < .00005). Part-sum model parameters (averaged across 10 part relations) are shown for letter dissimilarities at corresponding locations, across locations, and within bigrams for readers and nonreaders of (e) Telugu bigrams and (f) Malayalam bigrams. Error bars indicate standard deviations. Asterisks indicate statistical significance (*p < .05, ***p < .0005, ****p < .00005 on a signed-rank test across 10 part relations between readers and nonreaders). Average search time (g) is shown for transposed-letter searches (e.g., AB among BA) and repeated-letter searches (e.g., AA vs. BB) for readers and nonreaders, averaged across Telugu and Malayalam readers. Error bars depict standard errors of the mean across subjects, and asterisks indicate significant differences between groups (p < .00005 on a rank-sum test across search times for 20 AB–BA pairs across the two languages, or across AA–BB pairs). The model equations below the graph show how smaller within-bigram terms lead to increased dissimilarity for transposed letters but not repeated letters. For transposed-letter searches, letters are identical at opposite locations, so the opposite-location terms are multiplied by zero, but the smaller within-bigram terms for readers lead to larger dissimilarities (and therefore faster searches). For repeated-letter searches, the within-bigram terms are multiplied by zero by definition, and therefore there is no benefit for readers. Partial correlation (h) is illustrated between reading fluency and each part-sum model term (after factoring out all other terms) across subjects. We used subjects’ data across multiple experiments to perform this analysis. See Section S6 in the Supplemental Material for details. The combined model is based on predicting reading fluency as a linear combination of all model terms. Error bars represent ±1 SD, and asterisks indicate significant partial correlations (*p < .05, **p < .005, ***p < .0005).
Can bigram dissimilarities be predicted from letters?
To address these issues, we drew on our finding that dissimilarities between object parts combine linearly in visual search (Pramod & Arun, 2016). Specifically, the net dissimilarity between two bigrams AB and CD is given as a linear sum of part relations at corresponding locations, part relations at opposite locations, and part relations within each bigram (Fig. 3c; also see the Method section). Given many pairwise dissimilarities between bigrams, this part-sum model attempts to recover the underlying letter–letter relations that accurately predict these data. The model works because a given letter pair, say AC, can be found at corresponding locations across multiple bigram pairs (e.g., AB–CD, AD–CE, BA–DC), allowing us to recover its contribution to the overall dissimilarity whenever A and C are present at matched locations in two bigrams. Likewise, the pair AC is present in many bigram pairs at opposite locations (e.g., AB–DC, AD–EC) and within bigrams (e.g., AC–BD, AC–DE), which allows us to recover its contribution to the net dissimilarity when it occurs at opposite locations in two bigrams or, likewise, within bigrams. This model, based on 1/RT or search dissimilarity, outperformed other models with fewer parameters, as well as models based on reaction time (see Section S4).
This model yielded excellent predictions of the data. It yielded a significant positive correlation between the observed and predicted bigram dissimilarities for Telugu readers tested on bigrams of their script (Fig. 3d). Because model coefficients represent dissimilarities between single letters, we first asked whether they were consistent with each other. This was indeed the case: We found a significant correlation between corresponding terms and across terms (r = .81, 95% CI = [.38, .95], p = .004) and a negative correlation between corresponding terms and within terms that approached significance (r = –.62, 95% CI = [–.89, .02], p = .06). The negative sign of within terms represents an effect akin to distractor heterogeneity in visual search (Pramod & Arun, 2016; Vighneshvel & Arun, 2013): When the letters in a target bigram are similar to each other, the search for that bigram among distractors is more efficient. All three types of terms contributed to the overall model fit (see Section S4). The corresponding terms were correlated with the single-letter dissimilarities observed in Experiment 1 (r = .83, 95% CI = [.41, .96], p = .003).
If reading expertise leads to the formation of specialized detectors for letter combinations, the part-sum model should be unable to predict searches involving high-frequency bigrams because it encodes single-letter dissimilarities but not bigram frequency. Note that the model can account for letter-frequency effects because it estimates the underlying single-letter dissimilarity, which in turn could depend on letter frequency. We observed no qualitative difference between model fits for high-frequency bigram pairs compared with low-frequency bigram pairs (Fig. 3d). A statistical comparison of the residual error between low- and high-frequency pairs revealed no significant difference (average model residual error: 0.07 for 91 low-frequency pairs, 0.08 for 55 high-frequency pairs; p = .96, rank-sum test). We observed similar patterns for readers of Malayalam letters (model correlation = .91, p < .0005; average residual error: 0.08 for 45 low-frequency pairs, 0.07 for 105 high-frequency pairs; p = .06).
Differences between readers and nonreaders
The part-sum model yielded excellent fits to the observed bigram dissimilarities for both readers and nonreaders (model correlations for readers and nonreaders: r = .89, 95% CI = [.87, .91], and r = .90, 95% CI = [.87, .92], for Telugu; r = .91, 95% CI = [.89, .93], and r = .92, 95% CI = [.91, .94], for Malayalam; p < .00005). If model predictions are equally good for readers and nonreaders, then what makes readers faster than nonreaders? We compared the strength of corresponding, across, and within model coefficients for readers and nonreaders for Telugu bigrams (Fig. 3e) and Malayalam bigrams (Fig. 3f). Model coefficients for corresponding and across locations were both positive, which means that dissimilar letters at these locations in the two bigrams led to larger net dissimilarity. For both languages, the within-bigram terms were systematically smaller in magnitude for readers compared with nonreaders (Figs. 3e and 3f). We note that the part-sum model directly estimated the underlying single-letter dissimilarities, so any simple change in single-letter dissimilarity would have affected all model terms and not specifically the within-bigram interactions. These reduced within-bigram interactions for readers thus represent an effect that was above that expected from increased single-letter dissimilarities.
This reduced magnitude for readers resulted in larger dissimilarities and, consequently, easier searches. To confirm that this was indeed the case, we asked whether the observed difference in RTs between readers and nonreaders could be explained by the difference between the part-sum model predictions for the two groups. This analysis revealed a positive and statistically significant correlation (r = .59, 95% CI = [.51, .66] for Telugu bigrams and r = .55, 95% CI = [.47, .62] for Malayalam bigrams; p < .00005).
We also confirmed that the first letter in the bigram was more salient than the second, consistent with the first-letter advantage observed in letter-recognition tasks (see Section S4). If reading does indeed reduce letter interactions within a bigram, then it should have no effect on bigrams with identical letters because the within-bigram dissimilarity is zero by definition. Therefore, we predicted that the dissimilarity between repeated-letter bigrams (e.g., AA and BB) should not be different for readers and nonreaders. In contrast, the dissimilarity between the transposed bigrams AB and BA should be strongly influenced by reading expertise because the within-bigram terms are nonzero whereas the across-location terms are zero. Thus, the part-sum model predicts that readers should be faster than nonreaders on transposed-letter searches (AB–BA) but not repeated-letter searches (AA–BB), even though both types of searches differ in two letters.
This was indeed the case: Readers were faster than nonreaders on transposed-bigram searches (Fig. 3g). However, they were equally fast for repeated-letter searches (Fig. 3g). The lack of effect for repeated-letter searches was not a floor effect, because there were many easier searches for both readers and nonreaders (shortest average search time for readers and nonreaders: 0.90 s and 0.89 s for Telugu letters; 0.79 s and 0.80 s for Malayalam letters). Thus, reading expertise produced increased discrimination of letter transpositions compared with repeated letters, and this effect was due to decreased letter–letter interactions within a bigram. We also tested subjects on visual search for trigrams in a separate experiment. Here, too, the part-sum model yielded excellent fits, with reduced within-trigram letter interactions for readers compared with nonreaders (see Section S5 in the Supplemental Material).
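The contrast between transposed and repeated searches can be written out from the part-sum model. The notation here is an assumption made for illustration and is not taken from the article: C, X, and W denote corresponding, across (opposite-location), and within-bigram terms, k is the constant term, each term is taken to be symmetric in its two letters, and each term is zero when the two letters are identical.

```latex
\begin{align}
d(AB, CD) &= C_{AC} + C_{BD} + X_{AD} + X_{BC} + W_{AB} + W_{CD} + k \\
d(AB, BA) &= 2\,C_{AB} + \underbrace{X_{AA} + X_{BB}}_{=\,0} + 2\,W_{AB} + k \\
d(AA, BB) &= 2\,C_{AB} + 2\,X_{AB} + \underbrace{W_{AA} + W_{BB}}_{=\,0} + k
\end{align}
```

Because the estimated within terms are negative, shrinking their magnitude raises d(AB, BA) and leaves d(AA, BB) unchanged.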
Can bigram interactions predict reading fluency?
If within-bigram letter interactions are smaller for readers than for nonreaders, could these interactions predict reading fluency? To investigate this, we estimated model parameters using the pairwise bigram dissimilarity from each subject across experiments and asked if they predicted reading fluency. To be sure that the contribution of each term was independent of the others, we performed a partial-correlation analysis. This revealed a significant partial correlation for within-bigram interactions and the constant term, but not for the others (Fig. 3h; see also Section S6 in the Supplemental Material). In other words, subjects with weaker within-bigram interactions were faster at reading. Likewise, subjects with faster motor responses (i.e., larger constant term) were also faster at reading. Combining these factors yielded a better model fit than each factor achieved separately, suggesting that they exert distinct influences on reading (Fig. 3h; see also Section S6).
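The logic of a partial correlation can be sketched by removing the influence of a control variable from both quantities and correlating what remains. This is a simplified illustration under stated assumptions: it controls for a single covariate, whereas the analysis above factors out all other model terms, and the function names and per-subject values are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residuals(y, z):
    """Part of y that a straight-line fit on z cannot explain."""
    mz, my = mean(z), mean(y)
    slope = sum((a - mz) * (b - my) for a, b in zip(z, y)) / sum(
        (a - mz) ** 2 for a in z
    )
    return [b - (my + slope * (a - mz)) for a, b in zip(z, y)]


def partial_correlation(x, y, z):
    """Correlation between x and y after factoring out z from both."""
    return pearson(residuals(x, z), residuals(y, z))


# Hypothetical per-subject values, for illustration only.
fluency = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
within_term = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
constant_term = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
r_partial = partial_correlation(fluency, within_term, constant_term)
print(round(r_partial, 3))
```

With several control variables, the same idea applies with a multiple regression in place of the straight-line fit.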
Experiment 3 (functional MRI)
Brain imaging of single letters and bigrams
So far, we have shown that reading subtly altered letter representations by making similar letters more discriminable and by reducing interactions between letters within a bigram. However, these results were based on comparing visual search for pairs of bigrams; the suggestion that interactions decreased within a bigram was only an indirect inference. In Experiment 3, we measured brain activations for single letters and bigrams and sought to relate bigram responses to single-letter responses.
On the basis of the existing literature, we defined a number of ROIs as potential loci for differences in visual processing between readers and nonreaders. We defined early visual areas (V1–V3), mid-level areas (V4), and high-level visual areas (the lateral occipital region). These are regions where previous studies have reported differences for readers and nonreaders (Baker et al., 2007; Szwed et al., 2014). We then defined the VWFA, which shows greater activations for words compared with scrambled words and objects (Dehaene et al., 2015). Finally, we selected a broad region spanning both the superior and medial temporal gyrus, which also showed greater activations to known compared with unknown scripts, and which is known to be part of the reading network (Friederici & Gierhan, 2013). All ROIs were defined using a combination of anatomical considerations and functional localizers (see the Method section). A representative subject brain with these ROIs is shown in Figure 4a. We complemented all ROI-based analyses with equivalent whole-brain searchlight analyses to provide an unbiased overview of the observed differences (see Section S7). In the main experiment, subjects viewed single letters and bigrams while performing a one-back task, which we used to obtain single-image activations for further analysis.
Fig. 4.
Neural correlates of reading expertise and results from Experiment 3. Regions of interest (ROIs) are shown in (a) for an example subject, showing V1 to V3, V4, lateral occipital (LO) region, visual word-form area (VWFA), and temporal gyrus (TG). Average activation levels in Telugu readers, Malayalam readers, and the combination of both are shown for known and unknown scripts, separately for (b) V1 to V3, (c) V4, (d) VWFA, (e) TG, and (f) LO. Error bars indicate ±1 SEM across subjects. Asterisks indicate significant differences between activation levels to known and unknown scripts (*p < .05, **p < .005, ***p < .0005, ****p < .00005, in a signed-rank test comparing subject-wise average activations). The correlation between neural dissimilarity and behavioral dissimilarity for bigrams (g) is shown for each ROI, separately for the known script (left) and unknown script (right). Error bars indicate standard deviation of the correlation between the group behavioral dissimilarity and ROI dissimilarity calculated repeatedly by resampling subjects with replacement across 1,000 iterations. Asterisks inside bars indicate that the correlation between group behavior and group ROI dissimilarity was significant (*p < .05, **p < .005, ***p < .0005, ****p < .00005). Asterisks above bars indicate the fraction of bootstrap samples in which the observed difference was violated (*p < .05, **p < .005). All significant comparisons are indicated.
Do known and unknown scripts elicit differential activations?
We first compared overall activation levels in each ROI between readers and nonreaders. This is an important question because any systematic difference would reveal which brain regions are influenced by reading expertise. For each subject, we calculated the average activation across all voxels and across all stimuli within each script (known and unknown). We compared subject-wise activation levels between known and unknown scripts (Figs. 4b–4f).
For early visual areas (V1–V3), we observed opposite effects for known and unknown scripts for Telugu and Malayalam readers, suggesting that Malayalam letters activate early visual areas more than Telugu letters for both readers and nonreaders. Indeed, comparing the activations for the two languages across all subjects, we obtained a statistically significant difference (average activations of V1–V3: 0.47 for Telugu, 0.61 for Malayalam; p < .00005 using a Wilcoxon signed-rank test on subject-wise activations). This difference, however, was highly significant in Malayalam readers (p < .0005) but not in Telugu readers (p = .09).
The larger activation of early visual cortex for Malayalam might be due to the larger size of Malayalam letters compared with Telugu letters (total letter area, measured using the number of nonzero pixels: 0.08 ± .02 for Telugu and 0.11 ± .02 for Malayalam; p = .0017, rank-sum test across single letters). To investigate whether responses to single letters were driven by low-level image properties, we calculated for each subject the correlation between the average activation of each ROI and the ink area across single letters. The average correlation with ink area was significantly different from zero only in V1 to V3 but not in any other ROI (across subjects: mean r = .2, SEM = .04, p < .00001, one-sample t test for V1–V3; mean r = .06, SEM = .04, p = .16 for V4; mean r = –.02, SEM = .06, p = .69 for lateral occipital complex; mean r = .04, SEM = .05, p = .44 for VWFA; and mean r = –.01, SEM = .05, p = .79 for temporal gyrus). We conclude that responses in early visual areas, but not other ROIs, were driven by low-level properties of letter shape. This is consistent with the known properties of early visual cortex.
We proceeded to compare activations for known and unknown scripts in other visual areas. We observed identical trends in V4, VWFA, and temporal gyrus: Known scripts consistently elicited greater activations in readers of both languages (Figs. 4c–4e). A searchlight analysis confirmed these trends but additionally revealed that this trend was evident in a nearly continuous swath of cortex along the ventral surface from V4 to the VWFA, as well as in several regions around the temporal gyrus (see Section S7).
We observed an opposite pattern of activations in the lateral occipital region. Here, known scripts elicited weaker activation compared with unknown scripts for both languages (Fig. 4f). A searchlight analysis revealed that this was true on the dorsal portion of the occipitotemporal cortex, as well as in parietal regions (see Section S7). Reading expertise thus leads to widespread changes specifically in high-level visual areas but with opposite effects in the lateral occipital region compared with V4 and the VWFA.
We performed two additional analyses using overall activation levels. First, there were differences in overall activation with bigram frequency for Telugu but not Malayalam readers, but this effect was abolished on factoring out letter-frequency effects (see Section S7). Second, we observed a positive correlation between mean VWFA activation levels and reading fluency across subjects (see Section S7), consistent with other studies (Dehaene et al., 2015).
Neural correlates of behavior
Because there were systematic effects of reading on bigrams and single letters in visual search (Experiments 1 and 2), we sought to find the underlying neural representations in the brain. We therefore compared pairwise bigram dissimilarities in behavior with corresponding neural dissimilarities in each ROI. Specifically, for each ROI in a given subject, we calculated the neural dissimilarity between pairs of images using the correlation distance between the voxel activations of the two images (1 – r) and averaged this dissimilarity across subjects. In this manner, we calculated average pairwise neural dissimilarities for all pairs of stimuli in each ROI (for the dissimilarity matrices, see Section S7). We estimated the pairwise behavioral dissimilarities for the bigrams used in fMRI. We then asked how well these pairwise dissimilarities matched behavior for known scripts and unknown scripts. The results revealed two interesting patterns. First, neural dissimilarities in a number of areas were significantly correlated with behavior for both known and unknown scripts (Fig. 4g). However, the best match with behavior for known scripts was in the lateral occipital region, whereas for unknown scripts it was in V1 to V3. A searchlight analysis confirmed these trends (see Section S7): Dissimilarities for known bigrams best matched with neural dissimilarities in occipitotemporal cortex centered around the lateral occipital region, but also with the activation of parietal and motor regions. In contrast, the dissimilarities for unknown bigrams best matched the neural dissimilarities in early visual areas. Thus, perception of known scripts was driven by neural activations in higher visual areas, whereas perception of unknown scripts was driven by neural activations in lower visual areas.
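The comparison between neural and behavioral dissimilarities can be sketched as follows. This is an illustration under stated assumptions rather than the analysis pipeline used in the study: the function names, the four-voxel activation patterns, and the behavioral values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def neural_dissimilarity(patterns):
    """Correlation distance (1 - r) for every image pair in one ROI of one subject."""
    return {
        (i, j): 1.0 - pearson(patterns[i], patterns[j])
        for i, j in combinations(sorted(patterns), 2)
    }


def average_over_subjects(per_subject):
    """Average each pairwise dissimilarity across subjects."""
    pairs = per_subject[0].keys()
    return {p: sum(s[p] for s in per_subject) / len(per_subject) for p in pairs}


# Hypothetical voxel activations (4 voxels) for three bigrams in two subjects.
subject1 = {
    "AB": [1.0, 0.2, 0.5, 0.9],
    "BA": [0.9, 0.3, 0.4, 1.0],
    "CD": [0.1, 0.8, 0.7, 0.2],
}
subject2 = {
    "AB": [0.8, 0.1, 0.6, 1.0],
    "BA": [1.0, 0.2, 0.5, 0.8],
    "CD": [0.2, 0.9, 0.6, 0.1],
}
group = average_over_subjects(
    [neural_dissimilarity(s) for s in (subject1, subject2)]
)

# Hypothetical behavioral dissimilarities (1/RT) for the same bigram pairs.
behavior = {("AB", "BA"): 0.4, ("AB", "CD"): 1.3, ("BA", "CD"): 1.1}
pairs = sorted(group)
match = pearson([group[p] for p in pairs], [behavior[p] for p in pairs])
print(round(match, 3))
```

Repeating the final step for each ROI gives one brain-behavior correlation per region, which is the quantity compared across regions in Figure 4g.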
Does reading alter the compositionality of bigram representations?
We next turned to the critical question of whether reading alters the compositionality of bigram representations. If reading reduces interactions between letters, the responses to bigrams should be more predictable from single letters in readers compared with nonreaders. By contrast, if reading leads to the formation of specialized bigram detectors, the responses to bigrams should be less predictable from single letters in readers. Distinguishing between these possibilities would require a model that predicts bigram responses using single-letter responses.
We devised a model to predict the response of a population of voxels to a given bigram using a linear sum of the population response to each individual letter in the bigram (Fig. 5a). This resulted in a separate population model for each bigram. This approach allowed the model to estimate the average compositionality across a population of correlated voxels and overcome the inherent noise in individual voxels. To evaluate model fits, we compiled model predictions for each voxel across bigrams and compared the predictions with the observed activations. This approach prevents the model fits from being biased by voxels with large activation levels. We obtained similar results on fitting a separate model to each voxel. We compared the average model fit for each subject in a given ROI for known and unknown scripts. We obtained comparable model fits for known and unknown scripts in most ROIs (Fig. 5b). The sole exception was the lateral occipital region, where bigrams of known scripts were better predicted by single-letter responses compared with bigrams of unknown scripts (Fig. 5b). A searchlight analysis revealed that this effect was localized to the anterior portion of the left lateral occipital region and to the right fusiform gyrus (see Section S7). Since these regions were identified using their higher model fit for known scripts, any direct comparison of model performance would constitute double dipping. To avoid this circularity, we performed a split-half analysis. We identified the anterior portion of the lateral occipital region using odd-numbered subjects and compared the model fits in even-numbered subjects, and vice versa. This revealed significantly larger model fits in the anterior lateral occipital region for known, compared with unknown, scripts for Telugu and Malayalam, separately as well as in both languages combined (Fig. 5c). We obtained similar results in the right fusiform gyrus (see Section S7).
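A simplified version of the voxel-population model can be sketched as follows. The assumptions are ours and not the article's: each bigram response is fitted as a weighted sum of its two letter responses with no intercept, the weights are obtained by ordinary least squares across voxels, and the function names and four-voxel data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_weights(letter1, letter2, bigram):
    """Least-squares w1, w2 with bigram ~ w1 * letter1 + w2 * letter2 across voxels."""
    a11 = sum(u * u for u in letter1)
    a22 = sum(v * v for v in letter2)
    a12 = sum(u * v for u, v in zip(letter1, letter2))
    b1 = sum(u * y for u, y in zip(letter1, bigram))
    b2 = sum(v * y for v, y in zip(letter2, bigram))
    det = a11 * a22 - a12 * a12  # 2 x 2 normal equations solved by Cramer's rule
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det


def model_fit(letters, bigrams):
    """Average per-voxel correlation between observed and predicted bigram responses."""
    names = sorted(bigrams)
    predicted = {}
    for name in names:  # one population model per bigram
        x1, x2 = letters[name[0]], letters[name[1]]
        w1, w2 = fit_weights(x1, x2, bigrams[name])
        predicted[name] = [w1 * u + w2 * v for u, v in zip(x1, x2)]
    n_voxels = len(bigrams[names[0]])
    per_voxel = [
        pearson([bigrams[n][v] for n in names], [predicted[n][v] for n in names])
        for v in range(n_voxels)
    ]
    return sum(per_voxel) / n_voxels


# Hypothetical single-letter responses across 4 voxels.
letters = {
    "A": [1.0, 0.2, 0.5, 0.9],
    "B": [0.3, 0.8, 0.1, 0.4],
    "C": [0.6, 0.4, 0.9, 0.2],
}
# Perfectly compositional bigram responses, built for illustration only.
bigrams = {
    name: [0.6 * u + 0.3 * v for u, v in zip(letters[name[0]], letters[name[1]])]
    for name in ("AB", "BC", "CA")
}
print(fit_weights(letters["A"], letters["B"], bigrams["AB"]))
print(model_fit(letters, bigrams))
```

With only four voxels and two free weights per bigram, such a fit is easy to achieve. Real ROIs contain many more voxels, so the model correlation is a more informative measure of how well bigram responses follow from single letters.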
Fig. 5.
Compositionality of neural bigram representations in Experiment 3. A schematic of the voxel-population model is shown in (a). The response of each bigram across voxels was modeled as a linear combination of the constituent letter responses. To evaluate the model fit, we calculated the correlation between observed and predicted activations for each voxel. Average model correlation across voxels (b) is presented for each of five regions of interest, separately for known and unknown scripts. The regions are V1 to V3, V4, lateral occipital (LO) region, visual word-form area (VWFA), and temporal gyrus (TG). Error bars indicate standard errors of the model correlation across subjects. The asterisk indicates a significant difference between script types (p < .05, using a signed-rank test on subject-wise model correlations between the two groups). Average model correlation in the anterior lateral occipital region (c) is shown for Telugu readers, Malayalam readers, and both groups combined, separately for known and unknown scripts. Error bars represent standard errors of the mean across subjects. Asterisks represent statistical significance, as obtained using a signed-rank test comparing average model correlations across subjects (*p < .05, **p < .005, ****p < .00005).
The increased compositionality for bigrams in object-selective cortex could be an incidental artifact of having stronger signal levels overall, which could increase the explainable variance and therefore model performance. However, this is unlikely because known scripts evoke weaker activity in the lateral occipital region, which should have led to weaker, not stronger, model predictions. We conclude that reading increases the compositionality of bigram representations specifically in object-selective cortex.
Discussion
We investigated the effect of reading expertise on letter representations by comparing Telugu and Malayalam readers using a combination of behavior and brain imaging. In behavior, subjects discriminated letters of their known script better than letters of unknown scripts. This is consistent with the increased discrimination of familiar targets observed for natural objects in visual search (Mruczek & Sheinberg, 2005). We found that the net dissimilarity between strings (bigrams and trigrams) can be accurately predicted using pairwise dissimilarities between letters in the two strings. This is consistent with our previous study in which we reported this result for objects (Pramod & Arun, 2016). This model was able to predict virtually all the explainable variation in the search data for both readers and nonreaders. Importantly, these changes in visual processing directly predicted reading fluency in readers.
If reading expertise led to the formation of specialized bigram or trigram detectors, our models, based only on single-letter responses, would have shown worse performance for known scripts than for unknown scripts and for frequent than for infrequent bigrams. We found no such effects. Thus, bigram detectors, even if present, did not contribute substantively to the observed effects. Importantly, we were able to precisely quantify the effect of reading on word representations by analyzing how model parameters varied between readers and nonreaders. Our main finding was that reading expertise made single letters more discriminable and reduced interactions between letters in a string. Our model accounted for both letter similarity and letter interactions, thereby providing a framework to compare effects of letter substitution and transposition, both widely used as measures of orthographic processing (Dehaene et al., 2015; Grainger, Dufau, Montant, Ziegler, & Fagot, 2012; Ziegler et al., 2013). Further, the reduced interactions may work at multiple scales: Conjoined words are easier to parse when they are frequent than when they are infrequent (e.g., “readingdifficulty” is easier to parse than “heliumchromate”). We propose that visual search using letter strings can be a natural and objective way to study how reading alters visual representations.
Our brain-imaging experiment (Experiment 3) further elucidated the neural basis of word representations. Our main finding is that the anterior ventral portion of the lateral occipital region is a likely locus for the effects observed in behavior. We draw this conclusion because (a) the neural representation of bigrams in the lateral occipital region matched best with behavior for readers, and (b) bigram responses were better predicted from single letters for known scripts specifically in the lateral occipital region but not in other regions. The former finding is interesting because it suggests that reading shifts the neural basis of behavior from lower to higher visual areas. The latter finding is interesting because it indicates that reading makes visual processing more efficient by making words easier to parse into letters. The increased compositionality might result from familiarity with individual letters or with letter combinations. These possibilities will require careful testing. Our findings are congruent with previous reports showing that reading-related plasticity occurs both at the level of single letters (Szwed et al., 2011; Vinckier et al., 2007) and at the level of words (Glezer, Jiang, & Riesenhuber, 2009; Glezer, Kim, Rule, Jiang, & Riesenhuber, 2015; Riesenhuber & Glezer, 2017). Importantly, our findings elucidate the nature of the plasticity that might occur at the word level, suggesting that it reduces interactions between letters, making word responses more compositional.
That the lateral occipital region could play a role in reading is consistent with evidence that alexia also induces general visual-processing deficits (Behrmann, Nelson, & Sekuler, 1998; Roberts, Lambon Ralph, & Woollams, 2010; Starrfelt, Habekost, & Gerlach, 2010) and often involves damage to regions posterior to the VWFA (Barton, 2011; Seghier et al., 2012). It is also possible that compositionality increases in other areas, such as the VWFA, but the increase is undetectable because these areas have far fewer voxels and therefore weaker statistical power. A conclusive demonstration that the anterior ventral lateral occipital region participates in reading would require perturbing its activity during reading tasks.
We also found a widespread effect of reading expertise across many high-level visual regions, in keeping with the existing literature (Dehaene et al., 2010; Szwed et al., 2014; Szwed, Ventura, Querido, Cohen, & Dehaene, 2012). But unlike in previous studies, we compared readers of closely related Indian languages (i.e., with distinct orthographies and shared phonemes) who had similar educational levels. This enabled us to establish that these effects were truly due to reading expertise and not to letter shapes or other confounding factors. We found opposite trends in different visual areas: In V4 and along the occipitotemporal sulcus up to the VWFA, we found greater activation to known scripts. This is consistent with the effects of learning observed in these regions (Clarke, Pell, Ranganath, & Tyler, 2016; Folstein, Palmeri, Van Gulick, & Gauthier, 2015; Skeide et al., 2017). In the occipitotemporal regions in and around the lateral occipital region, we observed greater activation for unknown scripts. This is consistent with the increased response to novel stimuli in the homologous region in the monkey, the inferior temporal cortex (Meyer, Walker, Cho, & Olson, 2014; Mruczek & Sheinberg, 2007). Whether these effects are specific to reading scripts or are a more general effect of familiarity in these regions can be resolved by comparing activations for familiar objects and scripts after accounting for differences in visual experience. Likewise, these effects could also arise from different effects of attention on these regions, although such attentional effects have never been proposed or reported. Distinguishing familiarity effects from attentional effects will require careful independent control of task difficulty, attention, and familiarity.
Our observations both confirm and extend our understanding of the VWFA in several ways. First, we consistently localized the VWFA for both Indian languages and observed no difference in its anatomical location across language (see Section S7). This is consistent with other studies in which the VWFA was observed at similar locations for multiple languages (Bai et al., 2011; Krafnick et al., 2016; Szwed et al., 2014). Second, we found a positive correlation between VWFA activation levels and fluency (Dehaene et al., 2015). Third, neural dissimilarity in the VWFA was significantly correlated with behavioral dissimilarity in readers but not in nonreaders (Fig. 4g). There have been surprisingly few studies on this point: Only one study has shown VWFA representations to be correlated with subjective visual dissimilarity (Rothlein & Rapp, 2014), but this could be due to explicit letter reading by subjects. Our measure of behavioral dissimilarity (visual search) did not require explicit reading and was similar for readers and nonreaders. Thus, this finding suggests that the VWFA receives letter-shape information for only known scripts. Finally, we observed concordant effects in the VWFA with both the lateral occipital and temporal gyrus regions, consistent with its status as an intermediate region between the visual and auditory processing of language (Dehaene et al., 2015; Friederici & Gierhan, 2013).
Our central finding that reading makes word responses more compositional raises the intriguing question of how compositionality could benefit reading. Here, we draw on previous work on the motor system suggesting that viewing multiple movement targets enables parallel planning (Bhutani, Sengupta, Basu, Prabhu, & Murthy, 2017; Cisek & Kalaska, 2010; McSorley, Gilchrist, & McCloy, 2019; Wu et al., 2013). Similarly, simultaneous viewing of a string of letters in a word might enable the parallel programming of the associated sounds, thereby enabling efficient reading.
Supplemental Material
Supplemental material, Arun_openpractices_disclosure for Reading Increases the Compositionality of Visual Word Representations by Aakash Agrawal, K. V. S. Hari and S. P. Arun in Psychological Science
Supplemental material, Arun_Supplemental_Material_rev for Reading Increases the Compositionality of Visual Word Representations by Aakash Agrawal, K. V. S. Hari and S. P. Arun in Psychological Science
Acknowledgments
We are grateful to Mike Tarr, John Pyles, and Elissa Aminoff for organizing an excellent functional MRI workshop at the Indian Institute of Science (IISc) and for help with standardizing scan and task parameters, which laid the groundwork for this study.
Footnotes
Action Editor: John Jonides served as action editor for this article.
Author Contributions: A. Agrawal and S. P. Arun designed the experiments. A. Agrawal collected the data, A. Agrawal and S. P. Arun analyzed the data, and all the authors interpreted the results. A. Agrawal and S. P. Arun wrote the manuscript with input from K. V. S. Hari. All the authors approved the final manuscript for submission.
ORCID iD: S. P. Arun, https://orcid.org/0000-0001-9602-5066
Declaration of Conflicting Interests: The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Funding: This study was funded by the Department of Biotechnology-Indian Institute of Science (IISc) Partnership Programme and by Intermediate and Senior Fellowships from the Wellcome Trust/DBT India Alliance (all to S. P. Arun; Grant Nos. 500027/Z/09/Z and IA/S/17/1/503081). The functional MRI workshop was funded by a Tata Trusts grant, and MRI scan time required to standardize task and scanning parameters was funded by a Carnegie Mellon University-IISc BrainHub grant (both with S. P. Arun as co-principal investigator).
Supplemental Material: Additional supporting information can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797619881134
Open Practices: All data have been made publicly available via the Open Science Framework and can be accessed at https://osf.io/wytek/. The design and analysis plans for the experiments were not preregistered. The complete Open Practices Disclosure for this article can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797619881134. This article has received the badge for Open Data. More information about the Open Practices badges can be found at http://www.psychologicalscience.org/publications/badges.
References
- Arun S. P. (2012). Turning visual search time on its head. Vision Research, 74, 86–92. doi: 10.1016/j.visres.2012.04.005
- Bai J., Shi J., Jiang Y., He S., Weng X. (2011). Chinese and Korean characters engage the same visual word form area in proficient early Chinese-Korean bilinguals. PLOS ONE, 6(7), Article e22765. doi: 10.1371/journal.pone.0022765
- Baker C. I., Liu J., Wald L. L., Kwong K. K., Benner T., Kanwisher N. (2007). Visual word processing and experiential origins of functional selectivity in human extrastriate cortex. Proceedings of the National Academy of Sciences, USA, 104, 9087–9092. doi: 10.1073/pnas.0703300104
- Barton J. J. S. (2011). Disorder of higher visual function. Current Opinion in Neurology, 24(1), 1–5. doi: 10.1097/WCO.0b013e328341a5c2
- Behrmann M., Nelson J., Sekuler E. B. (1998). Visual complexity in letter-by-letter reading: “Pure” alexia is not pure. Neuropsychologia, 36, 1115–1132.
- Bhutani N., Sengupta S., Basu D., Prabhu N. G., Murthy A. (2017). Parallel activation of prospective motor plans during visually-guided sequential saccades. The European Journal of Neuroscience, 45, 631–642. doi: 10.1111/ejn.13496
- Binder J. R., Medler D. A., Westbury C. F., Liebenthal E., Buchanan L. (2006). Tuning of the human left fusiform gyrus to sublexical orthographic structure. NeuroImage, 33, 739–748. doi: 10.1016/j.neuroimage.2006.06.053
- Brainard D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436. doi: 10.1163/156856897X00357
- Cisek P., Kalaska J. F. (2010). Neural mechanisms for interacting with a world full of action choices. Annual Review of Neuroscience, 33, 269–298. doi: 10.1146/annurev.neuro.051508.135409
- Clarke A., Pell P. J., Ranganath C., Tyler L. K. (2016). Learning warps object representations in the ventral temporal cortex. Journal of Cognitive Neuroscience, 28, 1010–1023. doi: 10.1162/jocn_a_00951
- Dehaene S., Cohen L. (2011). The unique role of the visual word form area in reading. Trends in Cognitive Sciences, 15, 254–262. doi: 10.1016/j.tics.2011.04.003
- Dehaene S., Cohen L., Morais J., Kolinsky R. (2015). Illiterate to literate: Behavioural and cerebral changes induced by reading acquisition. Nature Reviews Neuroscience, 16, 234–244. doi: 10.1038/nrn3924
- Dehaene S., Cohen L., Sigman M., Vinckier F. (2005). The neural code for written words: A proposal. Trends in Cognitive Sciences, 9, 335–341. doi: 10.1016/j.tics.2005.05.004
- Dehaene S., Pegado F., Braga L. W., Ventura P., Nunes Filho G., Jobert A., . . . Cohen L. (2010). How learning to read changes the cortical networks for vision and language. Science, 330, 1359–1364. doi: 10.1126/science.1194140
- Eickhoff S. B., Stephan K. E., Mohlberg H., Grefkes C., Fink G. R., Amunts K., Zilles K. (2005). A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. NeuroImage, 25, 1325–1335. doi: 10.1016/j.neuroimage.2004.12.034
- Folstein J., Palmeri T. J., Van Gulick A. E., Gauthier I. (2015). Category learning stretches neural representations in visual cortex. Current Directions in Psychological Science, 24, 17–23. doi: 10.1177/0963721414550707
- Friederici A. D., Gierhan S. M. E. (2013). The language network. Current Opinion in Neurobiology, 23, 250–254. doi: 10.1016/j.conb.2012.10.002
- Glezer L. S., Jiang X., Riesenhuber M. (2009). Evidence for highly selective neuronal tuning to whole words in the “visual word form area.” Neuron, 62, 199–204. doi: 10.1016/j.neuron.2009.03.017
- Glezer L. S., Kim J., Rule J., Jiang X., Riesenhuber M. (2015). Adding words to the brain’s visual dictionary: Novel word learning selectively sharpens orthographic representations in the VWFA. The Journal of Neuroscience, 35, 4965–4972. doi: 10.1523/JNEUROSCI.4031-14.2015
- Grainger J., Dufau S., Montant M., Ziegler J. C., Fagot J. (2012). Orthographic processing in baboons (Papio papio). Science, 336, 245–248. doi: 10.1126/science.1218152
- Kay K. N., Rokem A., Winawer J., Dougherty R. F., Wandell B. A. (2013). GLMdenoise: A fast, automated technique for denoising task-based fMRI data. Frontiers in Neuroscience, 7, Article 247. doi: 10.3389/fnins.2013.00247
- Krafnick A. J., Tan L. H., Flowers D. L., Luetje M. M., Napoliello E. M., Siok W. T., . . . Eden G. F. (2016). Chinese character and English word processing in children’s ventral occipitotemporal cortex: fMRI evidence for script invariance. NeuroImage, 133, 302–312. doi: 10.1016/j.neuroimage.2016.03.021
- Lochy A., Jacques C., Maillard L., Colnat-Coulbois S., Rossion B., Jonas J. (2018). Selective visual representation of letters and words in the left ventral occipito-temporal cortex with intracerebral recordings. Proceedings of the National Academy of Sciences, USA, 115, E7595–E7604. doi: 10.1073/pnas.1718987115
- McSorley E., Gilchrist I. D., McCloy R. (2019). The programming of sequences of saccades. Experimental Brain Research, 237, 1009–1018. doi: 10.1007/s00221-019-05481-7
- Meyer T., Walker C., Cho R. Y., Olson C. R. (2014). Image familiarization sharpens response dynamics of neurons in inferotemporal cortex. Nature Neuroscience, 17, 1388–1394. doi: 10.1038/nn.3794
- Mruczek R. E. B., Sheinberg D. L. (2005). Distractor familiarity leads to more efficient visual search for complex stimuli. Perception & Psychophysics, 67, 1016–1031. doi: 10.3758/BF03193628
- Mruczek R. E. B., Sheinberg D. L. (2007). Context familiarity enhances target processing by inferior temporal cortex neurons. The Journal of Neuroscience, 27, 8533–8545. doi: 10.1523/JNEUROSCI.2106-07.2007
- Nag S. (2017). Learning to read alphasyllabaries. In Cain K., Compton D. L., Parrila R. K. (Eds.), Theories of reading development (pp. 75–98). Amsterdam, The Netherlands: John Benjamins. doi: 10.1075/swll.15.05nag
- Pramod R. T., Arun S. P. (2014). Features in visual search combine linearly. Journal of Vision, 14(4), Article 6. doi: 10.1167/14.4.6
- Pramod R. T., Arun S. P. (2016). Object attributes combine additively in visual search. Journal of Vision, 16(5), Article 8. doi: 10.1167/16.5.8
- Riesenhuber M., Glezer L. S. (2017). Evidence for rapid localist plasticity in the ventral visual stream: The example of words. Language, Cognition and Neuroscience, 32, 286–294. doi: 10.1080/23273798.2016.1210178
- Roberts D. J., Lambon Ralph M. A., Woollams A. M. (2010). When does less yield more? The impact of severity upon implicit recognition in pure alexia. Neuropsychologia, 48, 2437–2446. doi: 10.1016/j.neuropsychologia.2010.04.002
- Rothlein D., Rapp B. (2014). The similarity structure of distributed neural responses reveals the multiple representations of letters. NeuroImage, 89, 331–344. doi: 10.1016/j.neuroimage.2013.11.054
- Scaltritti M., Dufau S., Grainger J. (2018). Stimulus orientation and the first-letter advantage. Acta Psychologica, 183, 37–42. doi: 10.1016/J.ACTPSY.2017.12.009
- Seghier M. L., Neufeld N. H., Zeidman P., Leff A. P., Mechelli A., Nagendran A., . . . Price C. J. (2012). Reading without the left ventral occipito-temporal cortex. Neuropsychologia, 50, 3621–3635. doi: 10.1016/j.neuropsychologia.2012.09.030
- Skeide M. A., Kumar U., Mishra R. K., Tripathi V. N., Guleria A., Singh J. P., . . . Huettig F. (2017). Learning to read alters cortico-subcortical cross-talk in the visual system of illiterates. Science Advances, 3(5), Article e1602612. doi: 10.1126/sciadv.1602612
- Starrfelt R., Habekost T., Gerlach C. (2010). Visual processing in pure alexia: A case study. Cortex, 46, 242–255. doi: 10.1016/j.cortex.2009.03.013
- Sunder S., Arun S. P. (2016). Look before you seek: Preview adds a fixed benefit to all searches. Journal of Vision, 16(15), Article 3. doi: 10.1167/16.15.3
- Szwed M., Dehaene S., Kleinschmidt A., Eger E., Valabrègue R., Amadon A., Cohen L. (2011). Specialization for written words over objects in the visual cortex. NeuroImage, 56, 330–344. doi: 10.1016/j.neuroimage.2011.01.073
- Szwed M., Qiao E., Jobert A., Dehaene S., Cohen L. (2014). Effects of literacy in early visual and occipitotemporal areas of Chinese and French readers. Journal of Cognitive Neuroscience, 26, 459–475. doi: 10.1162/jocn_a_00499
- Szwed M., Ventura P., Querido L., Cohen L., Dehaene S. (2012). Reading acquisition enhances an early visual process of contour integration. Developmental Science, 15, 139–149. doi: 10.1111/j.1467-7687.2011.01102.x
- Vighneshvel T., Arun S. P. (2013). Does linear separability really matter? Complex visual search is explained by simple search. Journal of Vision, 13(11), Article 10. doi: 10.1167/13.11.10
- Vinckier F., Dehaene S., Jobert A., Dubus J. P., Sigman M., Cohen L. (2007). Hierarchical coding of letter strings in the ventral stream: Dissecting the inner organization of the visual word-form system. Neuron, 55, 143–156. doi: 10.1016/j.neuron.2007.05.031
- Wu E. X. W., Gilani S. O., van Boxtel J. J. A., Amihai I., Chua F. K., Yen S.-C. (2013). Parallel programming of saccades during natural scene viewing: Evidence from eye movement positions. Journal of Vision, 13(12), Article 17. doi: 10.1167/13.12.17
- Ziegler J. C., Hannagan T., Dufau S., Montant M., Fagot J., Grainger J. (2013). Transposed-letter effects reveal orthographic processing in baboons. Psychological Science, 24, 1609–1611. doi: 10.1177/0956797612474322