Abstract
Computer programming is a cornerstone of modern society, yet little is known about how the human brain enables this recently invented cultural skill. According to the neural recycling hypothesis, cultural skills (e.g., reading, math) repurpose preexisting neural “information maps.” Alternatively, such maps could emerge de novo during learning, as they do in artificial neural networks. Representing and manipulating logical algorithms, such as “for” loops and “if” conditionals, is key to programming. Are representations of these algorithms acquired when people learn to program? Alternatively, do they predate instruction and get “recycled”? College students (n = 22, 11 females and 11 males) participated in a functional magnetic resonance imaging study before and after their first programming course (Python) and completed a battery of behavioral tasks. After a one-semester Python course, reading Python functions (relative to working memory control) activated an independently localized left-lateralized frontoparietal reasoning network. This same network was already engaged by pseudocode, plain English descriptions of Python, even before the course. Critically, multivariate population codes in this frontoparietal network distinguished “for” loops and “if” conditional algorithms, both before and after. Representational similarity analysis revealed shared information in the frontoparietal network before and after instruction. Programming recycles preexisting representations of logical algorithms in frontoparietal cortices, supporting the recycling framework of cultural skill acquisition.
Keywords: algorithm, cultural skill, fMRI, neural recycling, programming, reasoning
Significance Statement
Computer programming is a foundational skill in modern society, yet its neural basis remains poorly understood. The neural recycling hypothesis proposes that new cultural abilities like reading and math emerge by repurposing preexisting neural representations. We tested this hypothesis in programming by tracking brain activity before and after individuals learned to code. Using functional magnetic resonance imaging, we found that a left-lateralized frontoparietal reasoning network represents core programming algorithms (“for” loops and “if” conditionals) even before formal instruction. After learning Python, this network continued to encode these algorithms, consistently representing algorithms before and after instruction. These findings support the idea that programming recycles preexisting cognitive structures for logical reasoning, providing a neural basis for how culture builds upon biological foundations.
Introduction
Modern life depends on computer programming and its applications, including artificial intelligence. Yet, coding was invented only 80 years ago, precluding evolutionary adaptation. How does the human brain enable coding?
Programming engages multiple cognitive processes, but logical reasoning is central (Ambrosio et al., 2014; Farghaly and El-Kafrawy, 2021). Programmers manipulate logical algorithms, such as “for” loops and “if” conditionals (e.g., chars = [x for x in a_string if x in “abcde”]; Pennington, 1987; Perrenet et al., 2005). Where do representations of code-relevant algorithms come from? One possibility is that they are acquired de novo through flexible general-purpose learning (McClelland, 1988; Elman, 1989; Quartz and Sejnowski, 1997; Heyes, 2018; Birch and Heyes, 2021). Such accounts are bolstered by the advance in artificial neural networks, which learn to code in various languages despite lacking preexisting algorithm representations (e.g., CodeBERT; Feng et al., 2020).
Alternatively, the “neural recycling” hypothesis proposes that cultural skills reuse and modify phylogenetically and ontogenetically older neural information maps (Dehaene and Cohen, 2007). Reading is thought to recycle visual object representations in the ventral stream (McCandliss et al., 2003; Dehaene et al., 2015; Dehaene-Lambertz et al., 2018), and mathematics is thought to recycle parietal approximate number representations (Feigenson et al., 2004; Cantlon and Brannon, 2007; Amalric and Dehaene, 2016). For programming, the recycling framework predicts that coding modifies preexisting neural population codes of logical algorithms.
In adults, frontoparietal networks support logical deduction, e.g., “If X then Y. Not Y, therefore not X,” and rules (Badre et al., 2010; Woolgar et al., 2016; Coetzee and Monti, 2018). Primitive forms of logic are present in preverbal infants (Cesana-Arlotti et al., 2018, 2020), and in nonhuman primates, prefrontal neurons encode conditional rules (Hoshi et al., 2000; Wallis et al., 2001). Here, we tested the hypothesis that learning to program recycles representations of logical algorithms in frontoparietal circuits. Indirect evidence comes from behavioral studies showing that logical reasoning abilities are one of the best predictors of coding performance (Anderson et al., 1984; McCoy and Burton, 1988; Shute, 1991; Graafsma et al., 2023).
Programming provides a unique test of the recycling framework, since unlike reading and math, it is often learned in adulthood, enabling neural measurement before and after instruction. Expert programmers recruit left-lateralized frontoparietal reasoning systems, rather than the language systems, during code comprehension and generation (Ivanova et al., 2020; Krueger et al., 2020; Liu et al., 2020; Endres et al., 2021; Ikutani et al., 2021; Xu et al., 2021). Frontoparietal population codes distinguish between algorithm types, such as “for” versus “if” (Liu et al., 2020; Ikutani et al., 2021; Srikant et al., 2022). Yet the neural basis of program comprehension in novices is not known.
To test whether code-relevant algorithm representations emerge as a result of experience, we scanned (fMRI) the same 22 Johns Hopkins undergraduates before and after their first Python course. A challenge of studying cultural skills prior to instruction is novices' lack of familiarity with the symbol system, e.g., programming “syntax.” To circumvent this, we presented code-naive students with custom-made “pseudocode” that describes Python algorithms in plain English (Fig. 1). Participants saw the same algorithms as real Python code after the course, and they saw pseudocode both before and after the course, enabling within-subject comparison.
Figure 1.
Example stimuli and task design. The example stimuli were presented to participants during the “reading” phase of a trial (20 s). During the “input” phase (6 s), the stimuli remained on the screen with an extra line. For pseudocode reading trials, the extra line was in the format “Now is the word ‘________’”. For pseudocode control trials, the extra line was a sentence. During the “question” phase (4 s), a single line was presented. For code or pseudocode reading trials, participants answered whether the line indicated the correct output of the algorithm. For control trials, participants answered whether the line was present during the “input” phase. Each trial began with a 0.5 s fixation cross. “Input” and “question” phases were also separated by a 0.5 s fixation cross. ITI was 5 s.
If code-relevant algorithm representations are acquired through instruction, we would expect different neural systems to be engaged before and after learning, e.g., reliance on the language system before instruction and on frontoparietal networks after learning to program, or different frontoparietal subnetworks involved before and after learning. On the other hand, the neural recycling framework predicts that the same neural populations support pseudocode before the course and code after the course.
We functionally identified code-responsive frontoparietal networks in each individual after instruction and tested for decoding between “for” and “if” algorithms in this network both before and after instruction. Additionally, we used representational similarity analysis (RSA) to compare representations before and after instruction and to test detailed representational content beyond the for/if algorithm distinction.
Materials and Methods
Participants
Participants were 22 Johns Hopkins undergraduates (11 women, 11 men; age range, 18–24; mean age, 19; SD, 1.46) who were programming-naive at the beginning of the study and enrolled in an introductory programming course “Gateway Computing: Python” at Johns Hopkins University. All participants completed the course in full. Participants took part in both MRI scanning sessions. None of the participants had any history of neurological conditions (screened through self-report). Informed consent was obtained from each participant in accordance with the Johns Hopkins Medicine Institutional Review Boards.
Experimental design
At the beginning of the semester, prior to the acquisition of programming knowledge, participants underwent an MRI scanning session (the “PRE” scan). At the end of the semester, after the last class of the programming course, another scanning session was administered (the “POST” scan).
During the PRE scan, participants completed a localizer task to identify their logical reasoning and language networks. During the same scan, participants also completed a “pseudocode” reading task and its corresponding control task. During the POST scan, participants completed another pseudocode reading task alongside a code reading task and their corresponding control tasks. The code reading task was conceptually identical to the task we administered in a previous study, with minor modifications (Liu et al., 2020). The “pseudocode” reading task was derived from the code reading task.
In the following paragraphs, we introduce the code and “pseudocode” stimuli, and their corresponding control stimuli (i.e., “scrambled code” and “nonsensical passage”, respectively).
Code stimuli: Python functions
A code comprehension trial consisted of three phases: “reading,” “input,” and “question.” During the “reading” phase, a Python function was presented. During the “input” phase, in addition to the function, an additional line was presented underneath the function to show the input to the function, based on which the participant derived the output. During the “question” phase, a single string of characters was presented, and participants answered whether it was the correct output of the function given the input. An example Python function is shown in Figure 1.
Code control stimuli: scrambled code
A code control stimulus contained all the words, symbols, and indentation structure of a Python function but scrambled at the level of individual words and symbols such that it did not describe any executable algorithm. Participants were instructed to remember the lines in the scrambled function.
A code control trial also consisted of three phases. During the first “reading” phase, the scrambled function was presented. During the “input” phase, an additional scrambled line was presented underneath the scrambled function, which participants also had to remember. During the “question” phase, a single line of scrambled programming code elements was presented. Participants answered whether it was identical to one of the lines they had just seen during the preceding phases. An example scrambled function is shown in Figure 1.
Pseudocode stimuli: plain English descriptions of algorithms
Each “pseudocode” stimulus was a short passage written in plain English to describe the algorithm implemented by a code stimulus (Python function). There is a one-to-one mapping between each element in a Python function and the words or phrases used in its corresponding pseudocode passage. An example pseudocode passage is shown in Figure 1.
The design and the task of a pseudocode reading trial were identical to those of a code reading trial. A pseudocode passage was presented during the “reading” phase. During the “input” phase, an additional sentence in the format “Now input is the word ‘______’” was presented underneath the pseudocode passage, where “______” was replaced with a specific character string. Participants performed the algorithm described by the pseudocode based on the information contained in the additional sentence. During the “question” phase, a single character string was presented, and participants answered whether it was the correct result of performing the algorithm.
The pseudocode passages underwent multiple rounds of pilot-testing and modification to ensure comprehensibility and unambiguous wordings.
Please note that the “pseudocode” passages used in this experiment were different from the “pseudocode” used by software engineers. In the sense of software engineering, a pseudocode is a conceptual outline of a program. Although it does not have to be expressed in any particular programming language, usually it still resembles a programming script with the use of symbols, indentations, and nonlinguistic expressions. In this experiment, we borrowed the term “pseudocode,” but not its engineering style, to refer to the plain English descriptions of algorithms.
Pseudocode control stimuli: nonsensical passages
Like the code control task, the pseudocode control task was a memory task. However, instead of randomly scrambling all the words in a pseudocode passage (as was done to create the code control stimuli), we rearranged the words to create nonsensical passages containing grammatically correct sentences which, as a whole, did not implement an executable algorithm. Specifically, each nonsensical passage and the additional line presented during the “input” phase of the trial were created by pooling the words from two pseudocode passages (plus their additional lines) and manually creating sentences out of the pooled words. An example nonsensical passage is shown in Figure 1.
All the experimental stimuli are available in an Open Science Framework repository (https://osf.io/2ncfm/), including Python functions, scrambled functions, pseudocode passages, and nonsensical passages, in both text format and the picture format that was presented to participants inside the MRI scanner.
Localizer tasks and stimuli
The localizer task was identical to the one used in our previous study (Liu et al., 2020). It aimed to identify brain responses specific to formal logic, symbolic mathematics, and language comprehension, with each condition serving as a control for the others. This combined language/math/logic localizer task design was adapted from previous studies (Monti et al., 2009, 2012; Kanjlia et al., 2016).
During the language trials, participants determined whether two visually presented sentences, one in active voice and one in passive voice, conveyed the same meaning. For instance, they compared sentences like “The child that the babysitter chased ate the apple” with “The apple was eaten by the babysitter that the child chased.” During the math trials, participants assessed whether the variable X had an identical value across two equations. For example, they compared equations such as “X minus twenty-five equals forty-one” with “X minus fifty-four equals twelve.” For the formal logic trials, participants evaluated whether two logical statements were consistent, wherein both statements were valid inferences from each other and thus had the same truth table. For instance, they compared statements like “If either not Z or not Y, then X” with “If not X, then both Z and Y.”
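The consistency of the example logic and math pairs above can be checked mechanically. The following is a minimal sketch, not part of the study materials: the function names are ours, and “either ... or” is assumed to be an inclusive disjunction.

```python
from itertools import product


def statement_a(x: bool, y: bool, z: bool) -> bool:
    """'If either not Z or not Y, then X' (inclusive 'or' is an assumption)."""
    antecedent = (not z) or (not y)
    # Material implication: "if P then Q" is false only when P holds and Q fails.
    return (not antecedent) or x


def statement_b(x: bool, y: bool, z: bool) -> bool:
    """'If not X, then both Z and Y'."""
    antecedent = not x
    return (not antecedent) or (z and y)


def same_truth_table(f, g, n_vars: int = 3) -> bool:
    """Two statements are consistent if they agree on every truth assignment."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=n_vars)
    )


logic_consistent = same_truth_table(statement_a, statement_b)

# Math example: "X minus twenty-five equals forty-one" and
# "X minus fifty-four equals twelve" are solved for X and compared.
x_first = 41 + 25
x_second = 12 + 54
math_consistent = x_first == x_second
```

Under these assumptions both example pairs are consistent: the two logical statements are contrapositives of each other, and both equations yield the same value of X.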
Consistent with previous works (Monti et al., 2009, 2012; Kanjlia et al., 2016), in the current study, the language network was localized by contrasting the neural response during language trials against math trials. The logical reasoning network was localized by contrasting logic trials against language trials.
Procedure
The PRE scan (pseudocode + localizer) and the POST scan (pseudocode + code) were separated by one semester (3.5 months). At least a day prior to each scanning session, participants took part in a behavioral practice session to familiarize themselves with the stimuli.
Each code or pseudocode reading comprehension trial began with a 0.5 s fixation cross. The three phases of each trial (“reading,” “input,” and “question”) lasted for 20, 6, and 4 s, respectively, with a 0.5 s fixation cross between the “input” and the “question” phases. During the “question” phase, once the participant made a choice by pressing a button to indicate “true” or “false,” the remaining time of the question phase was skipped, and a 5 s intertrial interval (ITI) began. Including the ITI, each trial took at most 36 s.
The PRE scan consisted of nine runs. In each run, participants completed eight pseudocode trials and four pseudocode control trials. We did not present real Python code during the PRE scan, as completely naive participants might interpret such stimuli as unfamiliar ciphers rather than meaningful algorithms. This could lead to idiosyncratic or disengaged responses, compromising task engagement and interpretability of the neural data. Using pseudocode instead ensured that participants could understand and engage with the stimuli consistently.
The POST scan also consisted of nine runs. In each run, participants completed eight pseudocode trials, eight code trials, four pseudocode control trials, and four code control trials. The pseudocode/code stimuli involved two conditions: half of them described “for” loop algorithms, where an operation is done repeatedly across a collection of items. The other half described “if” conditional algorithms, where an operation is done only if a certain criterion is met. Within each run, the correct answer to half of the trials was “yes,” while the correct answer to the other half was “no.” The order of trials was counterbalanced pseudorandomly across participants.
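The trial timing and per-run trial counts described above imply upper bounds on trial and run durations. The arithmetic is sketched below for reference; the constant and function names are ours, and the run-level totals ignore any lead-in or lead-out periods, which are not specified.

```python
# Phase durations in seconds, as stated in the Procedure.
FIXATION_S = 0.5
READING_S = 20.0
INPUT_S = 6.0
QUESTION_MAX_S = 4.0
ITI_S = 5.0

# Trials per run (task trials plus memory-control trials).
PRE_TRIALS_PER_RUN = 8 + 4           # pseudocode + pseudocode control
POST_TRIALS_PER_RUN = 8 + 8 + 4 + 4  # pseudocode, code, and both controls


def trial_duration(response_time_s=None):
    """Duration of one trial; the question phase ends early on a response."""
    if response_time_s is None:
        question = QUESTION_MAX_S
    else:
        question = min(response_time_s, QUESTION_MAX_S)
    return FIXATION_S + READING_S + INPUT_S + FIXATION_S + question + ITI_S


def max_run_duration(trials_per_run):
    """Upper bound on run length if no response ever shortens a trial."""
    return trials_per_run * trial_duration()


max_trial_s = trial_duration()                        # 36.0
pre_run_max_s = max_run_duration(PRE_TRIALS_PER_RUN)    # 432.0
post_run_max_s = max_run_duration(POST_TRIALS_PER_RUN)  # 864.0
```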
During the PRE scan session, participants underwent four localizer runs, during which they completed the language/math/logic tasks. The beginning of each trial was marked by a 1 s fixation cross. For each pair of sentences/equations/logical statements, one of them appeared first, with the other following 3 s later. The whole pair remained visible on the screen for a duration of 16 s. Participants indicated their response as true or false by pressing either of two buttons. A 5 s ITI began after participants made a response.
Each of the four localizer runs comprised eight trials each of the language, math, and logic tasks and six randomly distributed 5 s rest periods. In half of the trials, the correct response was “yes,” and the order of trials was counterbalanced across participants.
MRI data acquisition and preprocessing
All functional and structural MRI data were acquired at the F.M. Kirby Research Center for Functional Brain Imaging on a 3 T Philips dStream Achieva scanner. T1-weighted structural images were collected in 150 axial slices with 1 mm isotropic voxels using a magnetization-prepared rapid gradient-echo sequence. Functional T2*-weighted BOLD scans were collected using a gradient echoplanar imaging sequence with the following parameters: 36 sequential ascending axial slices; repetition time, 2 s; echo time, 0.03 s; flip angle, 70°; field of view matrix, 76 × 70; slice thickness, 2.5 mm; interslice gap, 0.5 mm; slice-coverage FH, 107.5 mm; voxel size, 2.53 × 2.47 × 2.50 mm; PE direction, L/R; first-order shimming. Six dummy scans were collected at the beginning of each run but were not saved.
During functional scans, stimuli were presented with custom scripts written in PsychoPy (Peirce et al., 2019). The stimuli were presented visually on a rear projection screen, cut to fit the scanner bore. The participant viewed the screen via a front-silvered, 45° inclined mirror attached to the top of the head coil. The stimuli were projected with an Epson PowerLite 7350 projector. The resolution of the projected image was 1,600 × 1,200. Due to the hardware update in the scanning facility, for 4 out of the 22 participants, the visual stimuli were presented on an MRI-compatible display (Cambridge Research Systems BOLDscreen 32 UHD LCD displays) with a resolution of 3,840 × 2,160.
We used FSL, FreeSurfer, the HCP workbench, and in-house Python and R scripts to analyze MRI data. During preprocessing, functional data were motion-corrected, high-pass filtered with a 128 s cutoff, and resampled to the cortical surface for each participant using the standard FreeSurfer pipeline (Dale et al., 1999; Smith et al., 2004; Glasser et al., 2013). The surface data were then smoothed with a 6 mm FWHM Gaussian kernel and prewhitened to remove temporal autocorrelation. Note that smoothing was performed on the surface, rather than in the volume, and that 6 mm smoothing on the surface corresponds to approximately 3 mm smoothing in the volume (Hagler et al., 2006). Cerebellar and subcortical structures were excluded from the analysis.
For multivariate pattern analyses (MVPA), including support vector machine (SVM) decoding and RSA, we used surface data processed with the same pipeline, except that no smoothing was applied.
Statistical analysis
Univariate contrasts derived from general linear models (GLMs)
In this analysis, we sought to reveal the neural networks with greater response to code or pseudocode than to their corresponding memory controls. For each participant, separate GLMs were constructed for the PRE pseudocode reading scan, the POST pseudocode + code reading scan, and the localizer scan. In each of these GLMs, we included a separate nuisance regressor to model out time points with excessive motion, defined as time points at which the root mean square of framewise displacement was >1.5 mm (Kim et al., 2017). White matter signal and cerebrospinal fluid signal were also included in each of the individual subject GLMs as nuisance regressors.
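One common way to model out high-motion time points is to add an indicator column to the design matrix for each flagged volume. The sketch below illustrates this convention under the 1.5 mm threshold stated above. Whether our GLMs used one column per flagged time point is not specified here, so that layout, the function name, and the argument names should be read as assumptions.

```python
import numpy as np


def motion_outlier_regressors(fd_rms_mm, threshold_mm=1.5):
    """Build indicator regressors for time points with excessive motion.

    Returns an (n_timepoints, n_flagged) matrix in which each column is 1 at
    exactly one flagged time point and 0 elsewhere (an assumed layout).
    """
    fd = np.asarray(fd_rms_mm, dtype=float)
    # Strict inequality: a time point exactly at the threshold is kept.
    flagged = np.flatnonzero(fd > threshold_mm)
    regressors = np.zeros((fd.size, flagged.size))
    regressors[flagged, np.arange(flagged.size)] = 1.0
    return regressors


# Hypothetical displacement trace (mm) for five volumes.
example_fd = np.array([0.1, 2.0, 0.3, 1.6, 1.5])
example_regressors = motion_outlier_regressors(example_fd)
```

In this example, the second and fourth volumes are flagged, so the result has two columns.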
In the GLM for the PRE pseudocode reading scan, eight regressors, with temporal derivatives, were included after convolving with a canonical double-gamma hemodynamic response function. Three of the regressors modeled the 20 s stimulus “reading” phase of the three trial conditions (FOR pseudocode, IF pseudocode, nonsensical pseudocode memory control), respectively. Another three of the regressors modeled the “input” phase of the three conditions, respectively. One regressor modeled the “question” phase (proposed output and participant response), and the last regressor modeled the ITI.
In the GLM for the POST pseudocode + code reading scan, 14 regressors were included. Six modeled the stimulus “reading” phase of each trial condition (FOR pseudocode, IF pseudocode, nonsensical pseudocode memory control, FOR code, IF code, scrambled code memory control), respectively. Six modeled the input phase for each trial condition, one modeled the “question” phase, and one modeled the ITI.
In the GLM for the localizer scan, four regressors were included to model the language trials, logic trials, math trials, and the resting periods, respectively.
In all these GLMs, each run was modeled separately, and runs were combined within each participant using a fixed-effect model (Dale et al., 1999; Smith et al., 2004). Random-effect models were applied to conduct group-level analysis across participants. Models for group-level analyses were corrected for multiple comparisons using a nonparametric permutation test with a cluster-forming threshold of p < 0.01 and family-wise error rate (FWER) controlled at p < 0.05 (Nichols and Holmes, 2002; Winkler et al., 2014; Eklund et al., 2016, 2019).
MVPA: “for” versus “if” decoding in regions of interest (ROIs)
In this analysis, we used SVM classifiers to decode the neural representation of algorithms in three ROIs in the left hemisphere: the intraparietal sulcus (IPS), the lateral prefrontal cortex (PFC), and the primary auditory cortex (A1), which served as a control.
The IPS and the PFC ROI search spaces were generated by combining those cortical parcels of the 400-parcel map reported by Schaefer et al. (2018) that encompassed the vertices activated in the group code reading > memory control contrast in our previous study (Liu et al., 2020). The A1 ROI was anatomically defined as the transverse temporal portion of a gyral-based atlas (Morosan et al., 2001; Desikan et al., 2006).
We used a functional ROI (fROI) approach for vertex selection within each ROI in each participant. The fROI approach addresses the variations in neural responses among individuals when performing a cognitive task (Kanwisher et al., 1997; Nieto-Castañón and Fedorenko, 2012). This approach allowed us to pinpoint the neural populations specific to each participant that were activated during the experimental conditions, all within a broader search space defined for the group as a whole. For each participant, in each ROI search space, the top 350 responsive vertices were selected based on the fixed-effect univariate code reading > memory control contrasts of the participant collected during the POST scan. The selected vertices constitute the fROI of that particular participant.
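The vertex selection step can be sketched as follows. This is an illustration rather than the in-house script: it assumes the participant's code reading > memory control contrast is available as one statistic per surface vertex and the search space as a boolean mask; `select_froi` and its arguments are hypothetical names.

```python
import numpy as np


def select_froi(contrast_stat, search_space_mask, n_vertices=350):
    """Return indices of the most responsive vertices inside a search space.

    contrast_stat: 1D array, one contrast statistic per surface vertex.
    search_space_mask: 1D boolean array marking the group-level search space.
    """
    contrast_stat = np.asarray(contrast_stat, dtype=float)
    candidates = np.flatnonzero(search_space_mask)
    if candidates.size < n_vertices:
        raise ValueError("search space has fewer vertices than requested")
    # Rank candidate vertices from most to least responsive.
    order = np.argsort(contrast_stat[candidates])[::-1]
    return np.sort(candidates[order[:n_vertices]])
```

Because the selection is made per participant, the resulting fROIs can differ across individuals while remaining inside the same group-level search space.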
For each participant, for each stimulus set (PRE pseudocode, POST pseudocode, or POST code), a separate GLM was constructed using the unsmoothed data to model the 20 s stimulus presentation phase of each trial. The resultant spatial patterns of z-statistics of the beta parameter estimates for each code/pseudocode reading trial were used to train a decoding classifier. Specifically, we trained and tested an SVM classifier on the 350-vertex spatial patterns within each ROI search space for each participant. The SVM classifier was implemented with the PyMVPA toolbox and was trained and tested using the default parameters (Hanke et al., 2009).
To ensure uniform signal strength throughout the MRI scanning sessions, we applied data normalization within each scanning run (Lee and Kable, 2018; Stehr et al., 2023). Within each run, in every fROI, the mean and standard deviation were calculated across trials and vertices and used for normalization, setting the mean to 0 and the standard deviation to 1. To avoid the dependency between trials from the same run artificially inflating the decoding accuracy, we performed a ninefold leave-one-run-out cross-validation (Etzel et al., 2011; Mumford et al., 2014; Valente et al., 2021). In each cross-validation fold, the classifier was trained on the data from eight out of the nine task runs and tested on the left-out run. The resulting nine accuracy values were averaged to derive the observed accuracy for a single participant in one fROI. Statistical significance of the group mean decoding accuracy across all 22 participants was tested with the Wilcoxon signed-rank test against the chance level of 50%.
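The normalization, cross-validation, and group test described above can be sketched as follows. The analysis itself used PyMVPA with default parameters. This sketch substitutes a scikit-learn linear SVM with `C=1.0` and a one-sided Wilcoxon test, both of which are our assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.svm import SVC


def zscore_within_run(patterns, runs):
    """Normalize each run using the mean and SD across its trials and vertices."""
    patterns = np.asarray(patterns, dtype=float)
    normalized = np.empty_like(patterns)
    for run in np.unique(runs):
        in_run = runs == run
        block = patterns[in_run]
        normalized[in_run] = (block - block.mean()) / block.std()
    return normalized


def leave_one_run_out_accuracy(patterns, labels, runs):
    """Mean accuracy over folds, each holding out all trials of one run."""
    patterns = zscore_within_run(patterns, runs)
    fold_accuracies = []
    for held_out in np.unique(runs):
        test = runs == held_out
        classifier = SVC(kernel="linear", C=1.0)  # stand-in for PyMVPA defaults
        classifier.fit(patterns[~test], labels[~test])
        fold_accuracies.append(classifier.score(patterns[test], labels[test]))
    return float(np.mean(fold_accuracies))


def group_p_value(accuracies, chance=0.5):
    """Wilcoxon signed-rank test of participant accuracies against chance."""
    differences = np.asarray(accuracies, dtype=float) - chance
    # The one-sided alternative is an assumption; the sidedness is not stated.
    _, p_value = wilcoxon(differences, alternative="greater")
    return float(p_value)
```

Here `patterns` would be a trials-by-vertices array for one participant and fROI (72 trials by 350 vertices in this design), `labels` the “for”/“if” condition of each trial, and `runs` the run index of each trial.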
PRE pseudocode versus POST code RSA
The goal of this analysis was to explore whether there were consistent neural representations of the same algorithms before and after programming instruction. As in other analyses in this study, we used a within-subject design to maximize sensitivity to whether neural populations that support code comprehension after learning already encoded algorithm-relevant information prior to instruction.
In this study, participants saw the same 72 algorithms written as pseudocode during the PRE scan and as Python code during the POST scan. One participant was excluded from this analysis because he saw different pseudocode and code algorithms due to an error in experiment administration. For each of the remaining 21 participants, within each fROI, we created a symmetrical 72-by-72 representational similarity matrix (RSM) based on the spatial neural response patterns to PRE pseudocode and the POST code, respectively. The spatial neural response patterns were derived from the GLM created for the “for” versus “if” decoding, as introduced in the previous subsection. Each entry in the RSM was the additive inverse of the Euclidean distance between the activation patterns for a pair of stimuli, such that a higher value indicated greater similarity between a pair of stimuli.
The correlation between the lower triangles of the PRE pseudocode RSM and the POST code RSM was measured with Kendall's tau (Kriegeskorte et al., 2008b; Nili et al., 2014). Statistical significance was computed via a bootstrapping process where we computed the correlation between null RSMs with randomly permuted item labels. The permutation was performed 1,000 times to generate a distribution of null group mean correlation values across participants. The statistical significance of the real group mean was evaluated as the probability of observing a null value greater than the real value, under the null distribution.
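The RSM construction and the PRE versus POST comparison can be sketched as follows. This is a simplified illustration with hypothetical function names. Note that `scipy.stats.kendalltau` computes the tau-b variant by default, and that permuting the item labels of only one RSM per participant is our assumption about how the null RSMs were formed.

```python
import numpy as np
from scipy.stats import kendalltau


def negative_euclidean_rsm(patterns):
    """RSM whose entries are the negated Euclidean distances between items."""
    patterns = np.asarray(patterns, dtype=float)
    differences = patterns[:, None, :] - patterns[None, :, :]
    return -np.sqrt((differences ** 2).sum(axis=-1))


def lower_triangle(rsm):
    """Off-diagonal lower-triangle entries of a symmetric matrix."""
    rows, cols = np.tril_indices_from(rsm, k=-1)
    return rsm[rows, cols]


def rsm_correlation(rsm_a, rsm_b):
    tau, _ = kendalltau(lower_triangle(rsm_a), lower_triangle(rsm_b))
    return float(tau)


def group_permutation_test(pre_rsms, post_rsms, n_permutations=1000, seed=0):
    """Compare the observed group mean tau with a label-permutation null."""
    rng = np.random.default_rng(seed)
    pairs = list(zip(pre_rsms, post_rsms))
    observed = float(np.mean([rsm_correlation(a, b) for a, b in pairs]))
    n_items = pre_rsms[0].shape[0]
    null_means = np.empty(n_permutations)
    for i in range(n_permutations):
        null_values = []
        for pre, post in pairs:
            shuffled = rng.permutation(n_items)
            # Permute rows and columns together, i.e., relabel the items.
            null_values.append(rsm_correlation(pre[np.ix_(shuffled, shuffled)], post))
        null_means[i] = np.mean(null_values)
    # Proportion of null group means greater than the observed group mean.
    p_value = float(np.mean(null_means > observed))
    return observed, p_value
```

In this design, each participant would contribute one 72-by-72 PRE pseudocode RSM and one 72-by-72 POST code RSM per fROI, with items in the same order in both.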
We also conducted this PRE pseudocode versus POST code RSA across the whole cortical surface using the searchlight method (Kriegeskorte et al., 2008a; Su et al., 2012). RSMs were constructed based on the spatial activation patterns in the “searchlight” surrounding each cortical vertex. A searchlight corresponding to a vertex encompassed all the vertices located within an 8-mm-diameter circle (as determined by geodesic distance) centered on that vertex (Kriegeskorte et al., 2006; Glasser et al., 2013). Searchlights containing any subcortical vertex were excluded.
To control for the FWER, we applied a cluster-based permutation correction method (Su et al., 2012; Regev et al., 2013; Schreiber and Krekelberg, 2013; Elli et al., 2019; Musz et al., 2022; Liu et al., 2023). This correction involved a cluster-forming threshold of uncorrected p < 0.001 and a cluster-wise FWER threshold of p < 0.05 (Winkler et al., 2014; Eklund et al., 2016, 2019).
Whole-brain searchlight RSA between neural activation and feature properties
In this analysis, we sought to identify the nature of the algorithm-related information encoded in the brain. We compared the second-order similarity structures (i.e., the RSM) between neural responses and the stimuli along some feature dimensions. For each stimulus set (PRE pseudocode, POST pseudocode, POST code), a neural RSM was constructed in each searchlight on the cortical surface, as introduced in the previous subsection. On the other hand, nine RSMs were created to capture the similarity structure of the stimuli with regard to nine features: control structures (“for” or “if”); targets of the control structures (the loops in “for” functions can iterate through each letter in a character string, each item in a list, etc.; the conditionals in “if” functions can be based on the identity of the first letter in a string, the length of a string, etc.); the data type of the result of the algorithm (list vs character string); operations taken within the control structures (e.g., repeat three times, add some characters, reverse); the objects derived from the operations, relative to the input string (a single letter from the input string, a new string created from combining an element from the input string and some new characters, etc.); the “semantics” of the algorithms represented in large language models (code, Microsoft CodeBERT; pseudocode, OpenAI text-embedding-ada-002); token count; character count; and the pixel layout of the visual presentation of the stimuli (Fig. 4; see Supplemental Methods for details on the operational definitions of similarity along these dimensions).
Figure 4.
Example model matrices for RSA. Top, the nine representational similarity matrices (RSM) for the nine features of the code stimuli. The “visual” RSM was left out as a confound for partial correlation, while the other eight RSMs were submitted to PCA to derive two composite feature RSMs. Bottom, Feature loadings for the two principal components. Component 1 is dominated by features related to the superficial appearance of the stimuli; Component 2 has contributions from features determining the “meaning” of the algorithms. The color of each bar segment in the feature loading plot matches the background color of the corresponding feature name. This experiment included two batches of algorithms, each involved pseudocode and code versions. This figure uses the code stimuli from Batch 1 as an example. Please see Figure S4 for the RSMs and PCA results for both batches and stimuli types.
Because a difference in any feature dimension led to a visual difference, the visual RSM was reserved as a confound for the correlation between neural RSMs and feature RSMs, which was addressed using partial correlation. The other eight feature RSMs were submitted to a principal component analysis (PCA) to derive two orthogonal composite RSMs through linear combination. Figure S4b shows the weights of each feature for each component. Consistently across the stimulus sets, Component 1 was dominated by character count and token count, capturing the “appearance” of the stimuli, whereas Component 2 had high contributions from the features that defined the algorithms, thus capturing their “meanings.”
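The reduction of several feature RSMs to two orthogonal composites can be sketched as follows. This is a toy version under stated assumptions: it uses three made-up vectorized RSMs rather than the eight real ones, standardizes each feature before PCA, and extracts components by power iteration with deflation. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
import math

def standardize(values):
    """Center a feature vector and scale it to unit sample variance."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def correlation_matrix(features):
    """Feature-by-feature correlation matrix of standardized vectors."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / (n - 1) for fj in features]
            for fi in features]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector (symmetric matrix)."""
    k = len(matrix)
    vec = [float(i + 1) for i in range(k)]  # arbitrary non-degenerate start
    for _ in range(iterations):
        new = [sum(matrix[r][c] * vec[c] for c in range(k)) for r in range(k)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    value = sum(vec[r] * sum(matrix[r][c] * vec[c] for c in range(k))
                for r in range(k))
    return value, vec

def first_two_components(features):
    """Return (loadings, composite vector) for the first two principal components."""
    z = [standardize(f) for f in features]
    corr = correlation_matrix(z)
    k, n = len(corr), len(z[0])
    val1, load1 = leading_eigenpair(corr)
    # Deflation: subtract the first component, then extract the next one.
    deflated = [[corr[r][c] - val1 * load1[r] * load1[c] for c in range(k)]
                for r in range(k)]
    _, load2 = leading_eigenpair(deflated)

    def project(load):
        return [sum(load[f] * z[f][p] for f in range(k)) for p in range(n)]

    return (load1, project(load1)), (load2, project(load2))

# Toy vectorized RSMs (upper triangles over six stimulus pairs); values are made up.
char_count = [1, 2, 3, 4, 5, 6]
token_count = [2, 1, 4, 3, 6, 5]
control_structure = [1, 0, 0, 0, 0, 1]

(load1, comp1), (load2, comp2) = first_two_components(
    [char_count, token_count, control_structure])
print([round(w, 3) for w in load1], [round(w, 3) for w in load2])
```

In this toy data the first component loads on the two count features and the second on the control-structure feature, mirroring the “appearance” versus “meanings” split described above; the composite vectors are uncorrelated by construction.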
In each searchlight on the cortical surface, we first computed the zero-order (no partial) correlation between the neural RSM and each of the two component RSMs. Next, we computed the partial correlation with the visual RSM included as a confound (please see Supplemental Methods for implementation details). We also computed the difference between the zero-order and the partial correlation values to highlight the brain region(s) most affected by the exclusion of visual information. All the resulting brain maps were controlled for FWER using the same method as the whole-cortex analysis described in the previous subsection, with a cluster-forming threshold of uncorrected p < 0.001 and a cluster-wise FWER threshold of p < 0.05.
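The three quantities computed per searchlight (zero-order correlation, partial correlation, and their difference) can be sketched for a single searchlight as below. The sketch assumes Kendall's tau-b as the correlation measure and the standard first-order partial correlation formula applied to tau; the actual implementation is described in the Supplemental Methods and may differ. All input vectors are invented.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (tie-corrected)."""
    concordant = discordant = tied_x = tied_y = n_pairs = 0
    for i, j in combinations(range(len(x)), 2):
        n_pairs += 1
        dx, dy = x[i] - x[j], y[i] - y[j]
        tied_x += dx == 0
        tied_y += dy == 0
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    return (concordant - discordant) / math.sqrt(
        (n_pairs - tied_x) * (n_pairs - tied_y))

def partial_tau(x, y, z):
    """Correlation between x and y with the confound z partialled out."""
    t_xy = kendall_tau_b(x, y)
    t_xz = kendall_tau_b(x, z)
    t_yz = kendall_tau_b(y, z)
    return (t_xy - t_xz * t_yz) / math.sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))

# Made-up vectorized RSMs for one searchlight (six stimulus pairs).
neural = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]
component = [1, 3, 2, 6, 4, 5]   # composite feature RSM
visual = [2, 1, 3, 5, 4, 6]      # pixel-overlap RSM (confound)

zero_order = kendall_tau_b(neural, component)
partial = partial_tau(neural, component, visual)
visual_effect = zero_order - partial  # value entering the difference map
print(zero_order, partial, visual_effect)
```

A large positive difference at a searchlight would indicate that much of the zero-order correlation there was carried by low-level visual similarity.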
Results
Participants self-reported no prior programming experience before the introductory Python programming course. Before starting the course, participants came for their first functional magnetic resonance imaging (fMRI) scan (the PRE scan). In this first scan, they viewed pseudocode passages describing “for” and “if” programming snippets. After the semester, they completed a second scan (POST scan), where they viewed Python code snippets and a new set of pseudocode passages. Participants also completed a localizer task to identify frontoparietal logical reasoning networks and frontotemporal language networks in individual participants (Monti et al., 2009, 2012; Kanjlia et al., 2016). All analyses were conducted within individual participants, comparing PRE and POST neural responses in each individual.
Behavioral results
At the end of the semester, the average course grade was a B, and none of the participants failed the course (mean, 86.8%; SD, 8.9%; range, 61.6–96.1%). We assessed participants’ learning outcomes with a multiple-choice test of Python “syntax” knowledge (accuracy mean, 70%; SD, 12.1%; random-guess chance level, 20%), as well as a challenging open-ended fill-in-the-blank test that required applying programming knowledge to implement algorithms (accuracy mean, 31.1%; SD, 16.4%; chance level, 0%). In-scanner performance on code, pseudocode, and control tasks was above 80% in both the PRE and POST scans. Behavioral performance indicated that the memory control tasks were more difficult than the corresponding pseudocode or code reading tasks. See Supplemental Results and Figure S1 for details.
A left-lateralized frontoparietal reasoning network responds to code before and after initial programming instruction: univariate analysis
After a single semester of Python instruction, a left-lateralized frontoparietal network responded more to the Python code than to a memory control task (laterality index, mean, −0.56; SD, 0.35; signed-rank test against 0, W = 11; p = 2.6 × 10−5; Fig. 2a; see Table 1 for activated clusters; see Fig. S2d for laterality index). This network is highly similar to what has previously been observed for expert programmers and, as in experts, includes a posterior left middle temporal component (Siegmund et al., 2014; Floyd et al., 2017; Castelhano et al., 2018; Ivanova et al., 2020; Liu et al., 2020; Ikutani et al., 2021; Xu et al., 2021). As previously observed in experts, the Python comprehension network of beginning programmers overlapped more with frontoparietal circuits involved in logical reasoning than with the frontotemporal language network (Fig. 2b; overlap with logic measured in Dice coefficients, mean, 0.44; SD, 0.14; overlap with language, mean, 0.17; SD, 0.07; logic vs language overlap, nonparametric permutation test p = 2 × 10−5).
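For readers unfamiliar with the two summary measures used here, a minimal sketch follows. The laterality index is written as (R − L)/(R + L) over counts of active vertices, a sign convention assumed because it yields negative values for left-lateralized responses, as reported above; the exact definition used in the study is not stated in this section. The vertex counts and vertex sets are invented.

```python
def laterality_index(n_left, n_right):
    """(R - L) / (R + L): -1 is fully left-lateralized, +1 fully right-lateralized.

    The sign convention is an assumption chosen to match the negative
    values reported for a left-lateralized network.
    """
    return (n_right - n_left) / (n_right + n_left)

def dice_coefficient(vertices_a, vertices_b):
    """Dice overlap between two sets of active vertices: 2|A and B| / (|A| + |B|)."""
    a, b = set(vertices_a), set(vertices_b)
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical counts of suprathreshold vertices in each hemisphere.
li = laterality_index(n_left=780, n_right=220)

# Hypothetical vertex indices for a code-responsive map and a logic localizer map.
code_vertices = range(0, 100)
logic_vertices = range(60, 140)
overlap = dice_coefficient(code_vertices, logic_vertices)
print(li, overlap)
```

The Dice coefficient ranges from 0 (no shared vertices) to 1 (identical maps), so it can be compared directly between the logic and language networks within each participant.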
Figure 2.
Univariate responses. a, From top row to bottom row: pseudocode (pcode) > memory control contrast during the PRE scan; pseudocode > memory control contrast during the POST scan; code > memory control contrast during the POST scan. Logical reasoning network is outlined with green; language network is in blue. All activation maps were corrected for FWER with cluster-forming threshold of p < 0.01 and cluster-wise p < 0.05. b, Overlap (Dice coefficient) between spatial activation patterns in a with logic or language (lang) networks localized within the same individuals. Error bars indicate standard error. Horizontal black lines on each bar indicate chance level overlap given the observed number of active vertices. Each dot represents a single participant. In each pair of bars, dots representing the same participant are connected. Statistical significance of each bar was tested against respective chance level. ***p < 0.001. For the activation maps with medial view, and with logic and language network overlaid, please see Figure S2.
Table 1.
Clusters revealed in the group maps of univariate contrasts and MVPA
| Cluster description | Peak MNI X | Peak MNI Y | Peak MNI Z | Cluster size (vertices) | Cluster size (mm²) | Peak p |
|---|---|---|---|---|---|---|
| Univariate code > control (POST) | ||||||
| Left hemisphere | ||||||
| IPS, angular gyrus | −35.2 | −66.7 | 45.1 | 1,693 | 2,578.04 | 1.24 × 10−10 |
| Lateral PFC: extending from the precentral sulcus to frontal pole | −42.1 | 11.4 | 43.9 | 1,629 | 3,550.79 | 1.16 × 10−10 |
| Precuneus | −4.5 | −71.5 | 40.6 | 1,129 | 2,012.91 | 2.16 × 10−9 |
| Posterior middle temporal gyrus | −58.9 | −39.8 | −15.4 | 370 | 886.6 | 3.30 × 10−7 |
| Calcarine sulcus | −4.9 | −78.3 | 4.7 | 326 | 807.95 | 1.45 × 10−8 |
| Medial superior frontal gyrus | −6 | 35.9 | 44.8 | 151 | 439.86 | 1.65 × 10−6 |
| Right hemisphere | ||||||
| Calcarine sulcus | 11.4 | −84.6 | 0.6 | 549 | 1,589.92 | 3.23 × 10−11 |
| IPS, angular gyrus | 45.1 | −56.6 | 45.4 | 559 | 934.2 | 8.71 × 10−7 |
| Precuneus | 3.9 | −65.4 | 38.9 | 379 | 692.25 | 6.73 × 10−8 |
| Posterior cingulate gyrus | 4.4 | −30.7 | 29.3 | 257 | 397.79 | 4.74 × 10−7 |
| Lingual gyrus | 13.2 | −78 | −10.8 | 151 | 407.32 | 8.12 × 10−9 |
| Univariate pcode > control (PRE) | ||||||
| Left hemisphere | ||||||
| IPS, angular gyrus, posterior superior temporal sulcus | −39.7 | −67.1 | 47.6 | 1,255 | 1,806 | 9.64 × 10−9 |
| Precuneus | −5.4 | −59.3 | 36.6 | 769 | 1,523.78 | 2.29 × 10−7 |
| Lateral PFC: middle frontal gyrus, inferior frontal sulcus | −40.8 | 21.4 | 41.1 | 590 | 1,265.57 | 1.19 × 10−6 |
| Posterior cingulate gyrus | −5.7 | −41.2 | 25.6 | 229 | 354.93 | 4.94 × 10−6 |
| Posterior middle temporal gyrus | −63.8 | −42.7 | −3.3 | 219 | 501.36 | 6.80 × 10−6 |
| Right hemisphere | ||||||
| Calcarine sulcus | 11.6 | −78.1 | 4.2 | 763 | 2,048.15 | 1.78 × 10−7 |
| Univariate pcode > control (POST) | ||||||
| Left hemisphere | ||||||
| IPS, angular gyrus, posterior superior temporal sulcus | −39.7 | −67.1 | 47.6 | 1,255 | 1,806 | 9.64 × 10−9 |
| Precuneus | −5.4 | −59.3 | 36.6 | 769 | 1,523.78 | 2.29 × 10−7 |
| Lateral PFC: middle frontal gyrus, inferior frontal sulcus | −40.8 | 21.4 | 41.1 | 590 | 1,265.57 | 1.19 × 10−6 |
| Posterior cingulate gyrus | −5.7 | −41.2 | 25.6 | 229 | 354.93 | 4.94 × 10−6 |
| Posterior middle temporal gyrus | −63.8 | −42.7 | −3.3 | 219 | 501.36 | 6.80 × 10−6 |
| Right hemisphere | ||||||
| Calcarine sulcus | 11.6 | −78.1 | 4.2 | 763 | 2,048.15 | 1.78 × 10−7 |
| RSA: PRE pcode versus POST code | ||||||
| Left hemisphere | ||||||
| Lingual gyrus, calcarine sulcus | −13.8 | −98.1 | −13.1 | 160 | 397.68 | 1.38 × 10−6 |
| IPS | −28.2 | −64.9 | 41 | 78 | 77.51 | 2.36 × 10−6 |
| Precentral sulcus | −39.8 | 2.1 | 38.3 | 64 | 96.41 | 2.78 × 10−5 |
| Inferior frontal gyrus | −48.3 | 25.9 | 4.5 | 61 | 124.24 | 5.16 × 10−8 |
| Right hemisphere | ||||||
| Posterior superior temporal sulcus | 46.8 | −53.6 | 27.4 | 254 | 408.76 | 2.26 × 10−6 |
| Lingual gyrus, calcarine sulcus | 15.5 | −71.6 | 6.8 | 197 | 603.03 | 1.03 × 10−6 |
| Superior occipital gyrus | 15.8 | −88.4 | 34.8 | 172 | 468.64 | 6.62 × 10−8 |
| Inferior frontal gyrus | 50.5 | −3.7 | 5.4 | 129 | 314.29 | 2.42 × 10−9 |
| Angular gyrus | 48 | −56.5 | 46.8 | 81 | 136.93 | 3.16 × 10−7 |
| Inferior temporal pole | 26.6 | −3.8 | −36.4 | 76 | 168.25 | 3.51 × 10−7 |
| Posterior middle temporal gyrus | 55.7 | −61.6 | 3.1 | 76 | 147 | 1.89 × 10−6 |
| Calcarine sulcus | 24.9 | −65.2 | 6.6 | 74 | 112.99 | 3.75 × 10−6 |
| Superior parietal gyrus | 16.1 | −56.5 | 62.1 | 66 | 105.92 | 8.73 × 10−5 |
| Inferior frontal sulcus | 39.9 | 30.5 | 14.8 | 53 | 111.38 | 2.33 × 10−6 |
| Posterior lateral fissure | 38.3 | −28.4 | 20.6 | 60 | 74.01 | 2.46 × 10−5 |
| IPS | 19.5 | −60.9 | 54.7 | 51 | 70.5 | 3.66 × 10−5 |
| RSA: POST code “appearance” component (partial correlation with visual) | ||||||
| Left hemisphere | ||||||
| Calcarine sulcus, occipital pole | −20.6 | −76.4 | −7.8 | 943 | 2,269.57 | 1.70 × 10−10 |
| Right hemisphere | ||||||
| Calcarine sulcus, lingual gyrus | 11.4 | −84.6 | 0.6 | 320 | 979.22 | 8.83 × 10−11 |
| Occipitotemporal gyrus | 26 | −79.2 | 18.9 | 16 | 33.94 | 6.16 × 10−5 |
| RSA: POST code “meanings” component (partial correlation with visual) | ||||||
| Left hemisphere | ||||||
| The union of multiple subregions in posterior lateral occipitotemporoparietal cortex | −28.7 | −64 | 41.9 | 3,233 | 5,420.22 | 8.23 × 10−8 |
| Calcarine sulcus, occipitotemporal gyrus | −14.5 | −83.6 | −13.7 | 895 | 2,223.58 | 6.84 × 10−10 |
| Inferior frontal gyrus | −53.9 | 24.2 | 15.9 | 413 | 932.36 | 4.61 × 10−7 |
| Precuneus | −5.5 | −66.2 | 49.9 | 267 | 488.56 | 9.02 × 10−7 |
| Precentral gyrus | −46 | −1.9 | 46.9 | 266 | 640.61 | 9.94 × 10−6 |
| Middle frontal gyrus | −37.2 | 35.9 | 15.5 | 168 | 359.06 | 1.30 × 10−5 |
| Posterior dorsal cingulate gyrus | −11 | −53.2 | 28.5 | 137 | 286.68 | 1.57 × 10−5 |
| Calcarine sulcus | −18.6 | −66.8 | 8.3 | 128 | 230.76 | 3.29 × 10−6 |
| Superior parietal gyrus | −33.7 | −45.9 | 57.7 | 99 | 198.86 | 1.87 × 10−5 |
| Posterior lateral fissure | −43.3 | −35.7 | 8.6 | 89 | 150.28 | 3.04 × 10−5 |
| Orbital gyrus | −20.6 | 30.1 | −16.2 | 61 | 153.45 | 7.28 × 10−7 |
| IPS | −22.1 | −73.9 | 36.8 | 56 | 122.21 | 1.88 × 10−5 |
| Subcentral cortex | −45.7 | −20.4 | 14.6 | 56 | 126.3 | 1.29 × 10−6 |
| Occipitoparietal sulcus | −12.6 | −64.9 | 28.3 | 51 | 97.2 | 1.85 × 10−6 |
| Right hemisphere | ||||||
| Angular gyrus | 49.2 | −60.3 | 35.8 | 720 | 1,473.75 | 1.38 × 10−7 |
| Calcarine sulcus, lingual gyrus | 9.2 | −77.4 | 4.5 | 560 | 1,509.98 | 8.05 × 10−7 |
| Anterior cingulate cortex | 12.9 | 32.2 | 22.2 | 218 | 474.17 | 1.13 × 10−6 |
| Precuneus | 13.6 | −64 | 31.3 | 195 | 394.82 | 2.49 × 10−6 |
| Inferior frontal gyrus | 49.6 | 35 | −9.7 | 215 | 441.31 | 1.10 × 10−5 |
| Inferior occipital cortex | 43.4 | −62.5 | −1.6 | 165 | 286.35 | 9.06 × 10−6 |
| Middle occipital cortex | 29.3 | −83.7 | 5.6 | 137 | 285.75 | 4.86 × 10−7 |
| Cuneus gyrus | 15.7 | −63.7 | 11.3 | 127 | 334.16 | 5.14 × 10−5 |
| Anterior superior temporal sulcus | 54.6 | 3.2 | −19 | 106 | 244.37 | 4.23 × 10−7 |
| IPS | 35.5 | −54.8 | 39.8 | 113 | 135.8 | 2.54 × 10−5 |
| Supramarginal gyrus | 59.3 | −43.9 | 24.8 | 90 | 128.56 | 3.83 × 10−7 |
| Superior temporal sulcus | 49.2 | −29 | 9.3 | 95 | 211.48 | 4.49 × 10−6 |
| Middle frontal gyrus | 47.8 | 34.2 | 26.6 | 85 | 211.1 | 2.13 × 10−6 |
| Inferior frontal sulcus | 39.9 | 19.7 | 34.1 | 90 | 156.87 | 8.41 × 10−5 |
| RSA: PRE pcode “appearance” component (partial correlation with visual) | ||||||
| Left hemisphere | ||||||
| Calcarine sulcus, cuneus gyrus | −3.2 | −89.1 | 8.1 | 267 | 623.3 | 5.35 × 10−9 |
| Lingual gyrus | −6.5 | −94.5 | −9.1 | 65 | 145.24 | 1.75 × 10−6 |
| Right hemisphere | ||||||
| Calcarine sulcus, lingual gyrus, cuneus gyrus | 15.4 | −80.4 | 7.6 | 325 | 912.47 | 2.27 × 10−8 |
| RSA: PRE pcode “meanings” component (partial correlation with visual) | ||||||
| Left hemisphere | ||||||
| The union of multiple subregions in posterior lateral occipitotemporoparietal cortex and primary visual cortex | −9.5 | −91 | 6.1 | 4,533 | 8,693.31 | 1.67 × 10−9 |
| Inferior frontal gyrus | −41.1 | 25.7 | 4.9 | 858 | 1,763.52 | 8.12 × 10−7 |
| Middle frontal gyrus | −26 | 3.9 | 50.5 | 477 | 976.62 | 2.20 × 10−7 |
| Dorsomedial frontal gyrus, anterior cingulate | −9.2 | 9.4 | 67 | 454 | 1,022.45 | 2.18 × 10−6 |
| Precuneus | −7.8 | −58.2 | 30.8 | 356 | 588.7 | 5.38 × 10−7 |
| Anterior superior temporal sulcus | −57.3 | −5.6 | −30.1 | 272 | 743.01 | 2.45 × 10−8 |
| Occipital pole | −19 | −92.4 | −10.7 | 130 | 311.95 | 3.09 × 10−5 |
| Posterior cingulate cortex | −5.9 | −30.4 | 41.6 | 122 | 224.99 | 1.37 × 10−5 |
| Temporal pole | −29.4 | 9.2 | −40.7 | 127 | 361.84 | 2.58 × 10−5 |
| Postcentral sulcus | −56.9 | −18.6 | 35.3 | 115 | 150.36 | 2.22 × 10−5 |
| Middle frontal sulcus | −23.9 | 52.3 | 24.4 | 102 | 286.05 | 1.41 × 10−7 |
| Precentral gyrus | −47 | −9.3 | 51.3 | 97 | 238.56 | 1.82 × 10−5 |
| Temporal pole | −43.6 | 13.3 | −33.1 | 72 | 265.77 | 3.79 × 10−5 |
| Superior parietal gyrus | −9.2 | −63.1 | 59 | 55 | 74.32 | 1.23 × 10−4 |
| Right hemisphere | ||||||
| The union of multiple subregions in posterior lateral occipitotemporoparietal cortex and primary visual cortex | 12.9 | −78.3 | 12.4 | 5,206 | 10,896.5 | 2.16 × 10−10 |
| Precentral sulcus | 36.7 | 14.5 | 35 | 473 | 875.32 | 2.64 × 10−7 |
| Anterior middle frontal cortex | 30.8 | 43 | 20.7 | 380 | 791.94 | 6.75 × 10−7 |
| Precuneus | 7.6 | −50.7 | 48.7 | 302 | 475.71 | 5.59 × 10−7 |
| Inferior frontal gyrus | 51.4 | 27.7 | 13.9 | 245 | 530.45 | 1.73 × 10−6 |
| Superior frontal sulcus | 22.4 | 4.9 | 51.1 | 140 | 220.84 | 1.67 × 10−5 |
| Posterior cingulate cortex | 5.9 | −2.4 | 42.4 | 102 | 166.86 | 4.76 × 10−5 |
| Superior frontal gyrus | 13.7 | 34.2 | 51.6 | 99 | 291.69 | 3.57 × 10−5 |
| Occipitoparietal sulcus | 14.8 | −62.9 | 28.3 | 78 | 154.66 | 1.58 × 10−5 |
| Inferior insula sulcus | 38.4 | −3.4 | −5.7 | 76 | 135.87 | 2.35 × 10−6 |
| Superior frontal gyrus | 5 | 4.6 | 64.8 | 77 | 191.11 | 2.35 × 10−5 |
| Central sulcus | 43.8 | −19.1 | 39.1 | 74 | 116.02 | 9.07 × 10−6 |
| Precentral sulcus | 16.2 | −11.8 | 62.4 | 67 | 129 | 2.91 × 10−4 |
| Postcentral sulcus | 47.9 | −23.6 | 42.6 | 54 | 68.61 | 7.19 × 10−6 |
All clusters are corrected for FWER with cluster-forming threshold p < 0.01 and cluster-wise threshold p < 0.05.
Even prior to programming instruction (i.e., during the PRE scan), a similar left-lateralized frontoparietal network was already engaged during the comprehension of plain English pseudocode algorithms (compared with memorizing sentences containing the same words but without executable algorithms; laterality index mean, −0.61; SD, 0.33; W = 5; p = 1.2 × 10−4). Like real code after instruction, pseudocode before instruction exhibited greater overlap with the logic than the language network, despite being presented in written English (overlap with logic, mean, 0.36; SD, 0.16; against chance p < 0.001; with language, mean, 0.18; SD, 0.11; against chance p < 0.001; logic vs language p = 7.3 × 10−4). Similar but weaker neural responses to pseudocode were observed during the POST scan (Supplemental Results; Fig. S2a–c). All three experiments also produced activation in midline regions, the left precuneus and the left posterior cingulate gyrus, partially overlapping with canonical locations of the default-mode network (DMN; Spreng et al., 2008).
In a direct contrast comparing code after instruction to pseudocode before instruction, responses were observed throughout the same frontoparietal network activated by code, suggesting an increase in univariate activity for code after instruction. A similar pattern was observed when comparing code to pseudocode in the POST scan. However, real code in the POST scan activated the frontal pole, unlike pseudocode either before or after instruction (Fig. 2a; Table 1). The frontal pole region has been implicated in symbolic integration and nested reasoning (Monti et al., 2007, 2009). In a direct contrast, activation in the frontal pole was also greater for code than for both PRE pseudocode (paired within-participant second-order contrast of [POST code > control] > [PRE pseudocode > control]) and POST pseudocode (within-participant within-run fixed-effect contrast between code and pseudocode in the POST scan; Fig. S2e,f). These results are consistent with an effect of time (POST > PRE) and format (code > pseudocode) on univariate neural responses in the frontoparietal network in general and the frontal pole in particular.
Neural populations in the reasoning network represent Python-relevant algorithms before and after programming instruction: “for” versus “if” MVPA decoding
As previously shown for coding experts, in Python students with one semester of experience, code-responsive IPS and lateral PFC showed sensitivity to the distinction between “for” loop and “if” conditional algorithms in Python code during the POST scan (Wilcoxon signed-rank test against chance, IPS W = 218.0; p = 1.8 × 10−4; PFC W = 223.0; p = 9.2 × 10−5; control region primary auditory cortex A1 W = 108.5; p = 0.73; see Supplemental Results for mean and SD of decoding accuracy). Decoding accuracy in the IPS and the PFC was significantly greater than in the control region A1 (IPS vs A1, W = 179.5; p = 0.0027; PFC vs A1, W = 189; p = 0.00085; Fig. 3a).
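To make the reported W statistics easier to read, the sketch below computes a signed-rank statistic for decoding accuracies tested against chance. It assumes that W is the sum of the ranks of positive differences, that zero differences are dropped, and that ties receive average ranks; these conventions are not stated in the text. Under them, the maximum attainable W with 22 participants is 22 × 23 / 2 = 253. The accuracies are invented and expressed in whole percent to avoid floating-point ties.

```python
def signed_rank_w_plus(values, null_value):
    """Sum of ranks of positive differences from null_value.

    Assumed conventions: zero differences are discarded and tied absolute
    differences share their average rank.
    """
    diffs = [v - null_value for v in values if v != null_value]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run over all entries tied with the current one.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)

# Hypothetical per-participant decoding accuracies (percent); chance is 50.
accuracies = [55, 60, 45, 70, 65, 50, 58, 45]
w_plus = signed_rank_w_plus(accuracies, null_value=50)
print(w_plus)
```

Half-integer values such as W = 108.5 arise naturally under the average-rank convention when participants have tied absolute differences.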
Figure 3.
MVPA. a, Binary decoding between FOR loop and IF conditional algorithms, with a chance level of 50%. Inset, masks for the ROI search spaces used in MVPA. Within each ROI search space, participant-specific fROI were selected based on their respective code > memory control contrast in the POST scan. Each dot represents the result from one participant. b, Trial-wise representational similarity (Kendall's tau) between pseudocode (PRE scan) and code (POST scan) spatial responses. Each dot represents a single participant. *p < 0.05; **p < 0.01; ***p < 0.001.
Neural populations that respond to code after programming instruction already represented the “for”-versus-“if” distinction even prior to instruction. Parietal (IPS) and prefrontal (PFC) vertices that responded to code (over the memory control task) in the POST scan showed multivariate sensitivity to the distinction between “for” and “if” algorithms in the plain English pseudocode during the PRE scan (IPS W = 227.5; p = 4.9 × 10−5; PFC W = 248.0; p = 2.0 × 10−6; control region A1 W = 84.5; p = 0.66; comparison against A1: IPS vs A1, W = 234.5; p = 7.3 × 10−6; PFC vs A1, W = 253.0; p < 1 × 10−7; Fig. 3a). The “for”-versus-“if” distinction in pseudocode could also be decoded during the POST scan (Fig. S3a). Since the frontal pole showed increased univariate responses to code after instruction, we looked specifically at for-versus-if decoding in this region. We found weak but significant decoding in the frontal pole that was similar before and after instruction, suggesting recycling of representations in this region (Supplemental Results; Fig. S3a).
To determine whether decoding accuracy could be driven by superficial linguistic cues rather than meaningful algorithm representations, we conducted the same decoding analyses on control stimuli, which preserved the words and symbols of the original code or pseudocode but did not implement any algorithm. Consistent with our previous findings (Liu et al., 2024), decoding accuracy in both the PFC and IPS was at chance across all three stimulus sets: PRE pseudocode, POST pseudocode, and POST code. The only exception was a modest accuracy of 54.9% (p = 0.04) in the IPS during the POST code condition, which may reflect a false positive (see Supplemental Results for details). We thus show that above-chance decoding of real stimuli reflects sensitivity to algorithmic content rather than keyword presence.
Representation of Python code “semantics” in frontoparietal reasoning network before and after programming instruction: RSA
First, we asked whether vertices that go on to become involved in code comprehension represent similar information before and after instruction, without assuming a representational model. Each participant saw the same algorithms presented as pseudocode during the PRE scan and as code during the POST scan, allowing us to directly compare representational content before and after programming instruction and probe its content. Focusing on vertices that would eventually respond to programming code after instruction, we computed the RSM among neural responses to code algorithms (after instruction), as well as the similarity among neural responses to pseudocode algorithms (before instruction) in the IPS, the PFC, and the A1, which serves as a control region.
The code RSM and the pseudocode RSM were significantly correlated in the IPS and the PFC, suggesting shared information in the frontoparietal network before and after instruction. The value in the A1 was numerically lower but also reached statistical significance (Kendall's tau, IPS, mean, 0.0162; SD, 0.019; nonparametric permutation test p = 6.9 × 10−4; PFC, mean, 0.0196; SD, 0.023; p = 5 × 10−4; A1, mean, 0.0130; SD, 0.016; p = 0.002; nonparametric sign tests, IPS vs A1, p = 0.30; PFC vs A1, p = 0.13; Fig. 3b). The similarity value in the A1 could potentially be due to perceptual consistency in ambient noise during MRI scans, as A1 does not exhibit sensitivity to code content in any other analysis. Whole-cortex searchlight RSA revealed significant representational similarity between PRE pseudocode and POST code in the left IPS and PFC as well as a small cluster in V1 (Fig. S3b), while no significant similarity was observed in A1.
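The logic of comparing two neural RSMs and testing the result by permutation can be sketched as follows. This is only one plausible scheme: it assumes Kendall's tau-a on the upper triangles and a null distribution built by shuffling stimulus labels of one RSM (rows and columns together). The text does not specify the permutation scheme, and the toy RSM values are invented.

```python
import random
from itertools import combinations

def from_upper(values, n):
    """Build a symmetric n x n RSM from its upper-triangle values."""
    rsm = [[1.0] * n for _ in range(n)]
    for (i, j), v in zip(combinations(range(n), 2), values):
        rsm[i][j] = rsm[j][i] = v
    return rsm

def upper_triangle(rsm):
    return [rsm[i][j] for i, j in combinations(range(len(rsm)), 2)]

def kendall_tau_a(x, y):
    """Kendall's tau-a (no tie correction)."""
    score = sum(((x[i] > x[j]) - (x[i] < x[j])) * ((y[i] > y[j]) - (y[i] < y[j]))
                for i, j in combinations(range(len(x)), 2))
    return score / (len(x) * (len(x) - 1) / 2)

def permute_stimuli(rsm, order):
    """Relabel stimuli: reorder rows and columns with the same permutation."""
    return [[rsm[a][b] for b in order] for a in order]

def rsm_permutation_test(rsm_a, rsm_b, n_perm=1000, seed=0):
    """Observed tau and one-sided permutation p value (assumed scheme)."""
    rng = random.Random(seed)
    vec_b = upper_triangle(rsm_b)
    observed = kendall_tau_a(upper_triangle(rsm_a), vec_b)
    order = list(range(len(rsm_a)))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        shuffled = upper_triangle(permute_stimuli(rsm_a, order))
        if kendall_tau_a(shuffled, vec_b) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Toy RSMs over five algorithms: pseudocode (PRE) and code (POST).
rsm_pre = from_upper([.9, .1, .2, .3, .15, .25, .05, .8, .6, .7], 5)
rsm_post = from_upper([.8, .2, .1, .3, .05, .25, .15, .9, .5, .7], 5)
tau, p_value = rsm_permutation_test(rsm_pre, rsm_post)
print(tau, p_value)
```

Shuffling stimulus labels preserves the distribution of similarity values within each RSM while breaking the correspondence between algorithms, which is the property the null hypothesis targets.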
Next, we used whole-cortex RSA to look for neural sensitivity to the “meanings” of algorithms, beyond the “if” versus “for” distinction and to separate the “meanings” of code/pseudocode algorithms from the potentially confounded “appearance” (or perceptual) similarity. We measured the similarity of code/pseudocode functions along six “meanings” dimensions and three “appearance” dimensions and used these to predict neural responses to code/pseudocode across the whole cortex.
“Meanings” dimensions included “if” versus “for” plus five others. Four were generated by experimenters and used to create the code functions (e.g., data type, character string or list; operation performed; result type; see Materials and Methods, Fig. 4, and Supplemental Methods for details). The final “meanings” dimension was generated using pretrained large language models: CodeBERT for Python (Feng et al., 2020) and OpenAI text-embedding-ada-002 for pseudocode (OpenAI, 2022). “Appearance” similarity of code/pseudocode functions was measured in terms of the number of characters in the function, the number of “words” (tokens) in the function, and low-level visual similarity computed as pixel overlap.
In whole-cortex searchlight RSA, we used PCA to reduce the dimensions to two orthogonal components: Component 1 captured the appearance similarity of the functions, while Component 2 captured the meanings of the code algorithms (Fig. 4; also see Materials and Methods). Next, we used partial correlation to remove the effect of pixel overlap on neural activity and the feature components.
Whole-cortex partial correlation revealed neural sensitivity to the “meanings” component in the left-lateralized frontoparietal reasoning network and in frontotemporal networks extending into inferior temporal and lateral occipital cortices, as well as the precuneus (Fig. 5; also see Fig. S5a for the results without partial correlation with visual pixel overlap RSM). In contrast, the neural representations of the “appearance” component were concentrated in the primary visual cortex. To assess the influence of the low-level visual (pixel overlap) similarity, we subtracted the partial correlation maps from the zero-order (no partial) correlation maps. Through the subtraction, the primary visual cortex emerged as the sole area where the application of partial correlation substantially diminished the correlation value between the neural RSMs and the feature RSMs (Fig. S5c). These results suggest that the “meanings” of algorithms are represented in frontoparietal and frontotemporal networks and that these representations go beyond perceptual (visual) information.
Figure 5.
Whole-cortex searchlight representational similarity (Kendall's tau) between PRE pseudocode (top two rows) or POST code (bottom two rows) spatial activation patterns and either composite feature. Partial correlation was performed with visual RSM as the confound. Corrected for FWER with cluster-forming threshold of p < 0.001 and cluster-wise p < 0.05. Clusters with <50 vertices are not displayed. Please see Figure S5b for the maps for POST pseudocode. For maps of zero-order (no partial) correlation and the difference between zero-order and partial correlation, please see Figure S5a,c.
Finally, to evaluate whether programming instruction changes representations of logical algorithms, we compared the RSA “meanings” maps of PRE pseudocode and POST code within participants. There were no significant differences between PRE and POST, except in the primary visual cortex, where there was more similarity for code POST. These findings suggest that, beyond perceptual processing, the representations of algorithm “meanings” measured in the current study remain largely stable before and after learning to program.
Discussion
We find that acquisition of programming skills involves rapid recycling of representations of logical algorithms that exist, prior to code exposure, in a left-lateralized frontoparietal network that supports logical reasoning. Code-relevant logical algorithms are represented in this frontoparietal network even before programming instruction. Activity patterns in frontoparietal neural networks that go on to respond to Python code after instruction distinguish between “for” and “if” algorithms in the same participants before instruction. In other words, even before instruction, we could decode the difference between plain English descriptions of executable “for” and “if” Python algorithms in a left-lateralized frontoparietal reasoning network.
This early presence of structured algorithm representations suggests that, unlike for modern AI models, task-relevant representations of code in the human brain do not emerge de novo as a result of instruction. Rather, the present results support the idea that programming “recycles” preexisting neural representations of logical algorithms (Liu et al., 2020; Dehaene et al., 2022). Our results are consistent with individual difference behavioral studies, which show that across programming languages, reasoning abilities (including fluid intelligence, working memory, mathematical problem-solving, and logical deduction) are the strongest predictors of programming learning outcome in first-time programming students (Anderson et al., 1984; McCoy and Burton, 1988; Shute, 1991; Prat et al., 2020; Farghaly and El-Kafrawy, 2021; Graafsma et al., 2023). Reasoning abilities outperform linguistic abilities as predictors of programming learning outcomes (Farghaly and El-Kafrawy, 2021; Graafsma et al., 2023). Neuroimaging studies also find that reasoning, including logical deduction (e.g., “If X then Y. Not Y. Therefore, not X”), engages a frontoparietal network similar to the one observed in the current study (Monti et al., 2007, 2009; Monti and Osherson, 2012; Pischedda et al., 2017; Coetzee and Monti, 2018; Wertheim and Ragni, 2020; Holyoak and Lu, 2021). Together with prior evidence, the current results suggest that there is a specific relationship between frontoparietal reasoning systems and programming and that logical algorithm representations are recycled by programming instruction.
Open questions remain about what neural representational changes occur during learning. How does programming instruction recycle logical algorithms? One possibility is that learners mentally map code stimuli onto preexisting neural representations of algorithms, which are also accessible through pseudocode, with minimal change to the algorithm representations themselves. On this view, most of what is learned during programming instruction is a way to tap into preexisting logical representations via a novel symbol system (i.e., Python syntax). We hypothesize instead that programming instruction both creates a novel access route and modifies and elaborates preexisting logical representations for code-specific needs. The old and new representations overlap enough for recycling. We failed to find clear evidence of representational change in the current study. However, we used simple Python code designed for programming beginners. It may be that processing these simple functions mostly requires reuse of preexisting algorithm representations. Advanced programmers acquire more sophisticated and specialized algorithmic knowledge, which may involve modifying the same frontoparietal algorithm representations. Some support for this idea comes from a study with proficient coders taking part in a coding competition “AtCoder” (Ikutani et al., 2021). Experts with better performance on the in-scanner programming task, and thus more expertise, showed stronger algorithm decoding in a subset of the frontoparietal network recruited by programming.
Understanding the plastic changes in neural structures and representations of logical algorithms that occur during programming instruction is an important direction for future work (Parnin et al., 2017; Hongo et al., 2022; Hishikawa et al., 2023). The frontal pole and other parts of the frontoparietal network showed the largest univariate response to code after instruction. Although we did not find enhancement in multivariate decoding in the current study, future work comparing experts with varying degrees of proficiency using more advanced code stimuli is warranted. Neuroplastic changes may also occur outside the frontoparietal network. For example, we observed activation and algorithm-related representational structure in midline regions overlapping with the DMN, which was not observed in our prior study with programming experts (Liu et al., 2020).
In addition to changes within and beyond the reasoning network, programming instruction may also shape how the frontoparietal system interacts with other domain-specific networks, such as language. Some evidence suggests that language systems may be involved in representing the surface form of code, i.e., the language-like symbols (Coetzee and Monti, 2018; Srikant et al., 2022; Liu et al., 2024). Interestingly, similar to the canonical language network, responses to Python code in frontoparietal circuits are highly left lateralized, both in new learners and in expert programmers (Siegmund et al., 2014; Floyd et al., 2017; Castelhano et al., 2018; Ivanova et al., 2020; Krueger et al., 2020; Liu et al., 2020; Ikutani et al., 2021; Xu et al., 2021). This observation raises a question to be tested in future work: whether connectivity between frontoparietal and language systems is enhanced when people learn to code.
A further open question concerns the ontogenetic origins of frontoparietal algorithm representations. Our participants were young adults, who, even prior to programming instruction, may have acquired algorithm representations through education or other life experience. Frontoparietal reasoning systems undergo protracted postnatal development and remain plastic throughout adolescence and to some degree adulthood (Huttenlocher and Dabholkar, 1997; Gogtay et al., 2004). Alternatively, algorithm representations recycled by code might be innate or emerge in infancy. Logical reasoning abilities are a human universal present across cultures in a wide variety of contexts, from programming to hunting and invention of new tools (Liebenberg, 2013; Pinker, 2022). Some recent evidence suggests that precursors of reasoning abilities (e.g., disjunctive logical deduction in nonverbal visual tasks) emerge in early infancy (Cesana-Arlotti et al., 2018, 2020; Feiman et al., 2022). There is also evidence that frontoparietal systems are “online” from early infancy (Raz and Saxe, 2020) and engaged during simple rule learning in 8-month-olds (Werchan et al., 2016). Some evidence suggests that reconfiguration of frontoparietal connectivity underlies the development of reasoning ability (Mackey et al., 2013; Wendelken et al., 2015, 2017). Future neuroimaging work with children is needed to test the developmental origins of logical representations that go on to enable programming.
Altogether, this study positions programming as a valuable model for understanding neural recycling, in which preexisting reasoning systems are adapted for new cognitive functions. By investigating how instruction adapts and refines existing algorithmic representations, future research can further clarify how the brain supports the acquisition of culturally invented symbol systems like programming.
Data Availability
Preprocessed neuroimaging data, behavioral data, and some data analysis scripts are available in an OSF repository: https://osf.io/2ncfm/.
References
- Amalric M, Dehaene S (2016) Origins of the brain networks for advanced mathematics in expert mathematicians. Proc Natl Acad Sci U S A 113:4909–4917. doi:10.1073/pnas.1603205113
- Ambrosio AP, Almeida LS, Macedo J, Franco AHR (2014) Exploring core cognitive skills of computational thinking.
- Anderson JR, Farrell R, Sauers R (1984) Learning to program in LISP. Cogn Sci 8:87–129. doi:10.1207/s15516709cog0802_1
- Badre D, Kayser AS, D'Esposito M (2010) Frontal cortex and the discovery of abstract action rules. Neuron 66:315–326. doi:10.1016/j.neuron.2010.03.025
- Birch J, Heyes C (2021) The cultural evolution of cultural evolution. Philos Trans R Soc Lond B Biol Sci 376:20200051. doi:10.1098/rstb.2020.0051
- Cantlon JF, Brannon EM (2007) Basic math in monkeys and college students. PLoS Biol 5:e328. doi:10.1371/journal.pbio.0050328
- Castelhano J, Duarte IC, Ferreira C, Duraes J, Madeira H, Castelo-Branco M (2018) The role of the insula in intuitive expert bug detection in computer code: an fMRI study. Brain Imaging Behav 13:623–637. doi:10.1007/s11682-018-9885-1
- Cesana-Arlotti N, Martín A, Téglás E, Vorobyova L, Cetnarski R, Bonatti LL (2018) Precursors of logical reasoning in preverbal human infants. Science 359:1263–1266. doi:10.1126/science.aao3539
- Cesana-Arlotti N, Kovács ÁM, Téglás E (2020) Infants recruit logic to learn about the social world. Nat Commun 11:5999. doi:10.1038/s41467-020-19734-5
- Coetzee J, Monti M (2018) At the core of reasoning: dissociating deductive and non-deductive load. Hum Brain Mapp 39:1850–1861. doi:10.1002/hbm.23979
- Dale AM, Fischl B, Sereno MI (1999) Cortical surface-based analysis: I. Segmentation and surface reconstruction. Neuroimage 9:179–194. doi:10.1006/nimg.1998.0395
- Dehaene-Lambertz G, Monzalvo K, Dehaene S (2018) The emergence of the visual word form: longitudinal evolution of category-specific ventral visual areas during reading acquisition. PLoS Biol 16:e2004103. doi:10.1371/journal.pbio.2004103
- Dehaene S, Cohen L (2007) Cultural recycling of cortical maps. Neuron 56:384–398. doi:10.1016/j.neuron.2007.10.004
- Dehaene S, Cohen L, Morais J, Kolinsky R (2015) Illiterate to literate: behavioural and cerebral changes induced by reading acquisition. Nat Rev Neurosci 16:234–244. doi:10.1038/nrn3924
- Dehaene S, Al Roumi F, Lakretz Y, Planton S, Sablé-Meyer M (2022) Symbols and mental programs: a hypothesis about human singularity. Trends Cogn Sci 26:751–766. doi:10.1016/j.tics.2022.06.010
- Desikan RS, et al. (2006) An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31:968–980. doi:10.1016/j.neuroimage.2006.01.021
- Eklund A, Nichols TE, Knutsson H (2016) Cluster failure: why fMRI inferences for spatial extent have inflated false-positive rates. Proc Natl Acad Sci U S A 113:7900–7905. doi:10.1073/pnas.1602413113
- Eklund A, Knutsson H, Nichols TE (2019) Cluster failure revisited: impact of first level design and physiological noise on cluster false positive rates. Hum Brain Mapp 40:2017–2032. doi:10.1002/hbm.24350
- Elli GV, Lane C, Bedny M (2019) A double dissociation in sensitivity to verb and noun semantics across cortical networks. Cereb Cortex 29:4803–4817. doi:10.1093/cercor/bhz014
- Elman JL (1989) Representation and structure in connectionist models.
- Endres M, Karas Z, Hu X, Kovelman I, Weimer W (2021) Relating reading, visualization, and coding for new programmers: a neuroimaging study. In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pp 600–612.
- Etzel JA, Valchev N, Keysers C (2011) The impact of certain methodological choices on multivariate analysis of fMRI data with support vector machines. Neuroimage 54:1159–1167. doi:10.1016/j.neuroimage.2010.08.050
- Farghaly AA, El-Kafrawy PM (2021) Exploring the use of cognitive tests to predict programming performance: a systematic literature review. In: 2021 31st International Conference on Computer Theory and Applications (ICCTA), pp 40–48. IEEE.
- Feigenson L, Dehaene S, Spelke E (2004) Core systems of number. Trends Cogn Sci 8:307–314. doi:10.1016/j.tics.2004.05.002
- Feiman R, Mody S, Carey S (2022) The development of reasoning by exclusion in infancy. Cogn Psychol 135:101473. doi:10.1016/j.cogpsych.2022.101473
- Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D (2020) CodeBERT: a pre-trained model for programming and natural languages. arXiv preprint arXiv:2002.08155.
- Floyd B, Santander T, Weimer W (2017) Decoding the representation of code in the brain: an fMRI study of code review and expertise. In: Proceedings of the 39th International Conference on Software Engineering, pp 175–186. IEEE Press.
- Glasser MF, et al. (2013) The minimal preprocessing pipelines for the human connectome project. Neuroimage 80:105–124. doi:10.1016/j.neuroimage.2013.04.127
- Gogtay N, Giedd JN, Lusk L, Hayashi KM, Greenstein D, Vaituzis AC, Nugent TF 3rd, Herman DH, Clasen LS, Toga AW (2004) Dynamic mapping of human cortical development during childhood through early adulthood. Proc Natl Acad Sci U S A 101:8174–8179. doi:10.1073/pnas.0402680101
- Graafsma IL, Robidoux S, Nickels L, Roberts M, Polito V, Zhu JD, Marinus E (2023) The cognition of programming: logical reasoning, algebra and vocabulary skills predict programming performance following an introductory computing course. J Cogn Psychol 35:364–381. doi:10.1080/20445911.2023.2166054
- Hagler DJ, Saygin AP, Sereno MI (2006) Smoothing and cluster thresholding for cortical surface-based group analysis of fMRI data. Neuroimage 33:1093–1103. doi:10.1016/j.neuroimage.2006.07.036
- Hanke M, Halchenko YO, Sederberg PB, Hanson SJ, Haxby JV, Pollmann S (2009) PyMVPA: a python toolbox for multivariate pattern analysis of fMRI data. Neuroinformatics 7:37–53. doi:10.1007/s12021-008-9041-y
- Heyes C (2018) Cognitive gadgets. Cambridge, MA: Harvard University Press.
- Hishikawa K, Yoshinaga K, Togo H, Hongo T, Hanakawa T (2023) Changes in functional brain activity patterns associated with computer programming learning in novices. Brain Struct Funct 228:1691–1701. doi:10.1007/s00429-023-02674-3
- Holyoak K, Lu H (2021) Emergence of relational reasoning. Curr Opin Behav Sci 37:118–124. doi:10.1016/j.cobeha.2020.11.012
- Hongo T, Yakou T, Yoshinaga K, Kano T, Miyazaki M, Hanakawa T (2022) Structural neuroplasticity in computer programming beginners. Cereb Cortex 33:5375–5381. doi:10.1093/cercor/bhac425
- Hoshi E, Shima K, Tanji J (2000) Neuronal activity in the primate prefrontal cortex in the process of motor selection based on two behavioral rules. J Neurophysiol 83:2355–2373. doi:10.1152/jn.2000.83.4.2355
- Huttenlocher PR, Dabholkar AS (1997) Regional differences in synaptogenesis in human cerebral cortex. J Comp Neurol 387:167–178. doi:10.1002/(SICI)1096-9861(19971020)387:2<167::AID-CNE1>3.0.CO;2-Z
- Ikutani Y, Kubo T, Nishida S, Hata H, Matsumoto K, Ikeda K, Nishimoto S (2021) Expert programmers have fine-tuned cortical representations of source code. eNeuro 8:ENEURO.0405-20.2020. doi:10.1523/ENEURO.0405-20.2020
- Ivanova AA, Srikant S, Sueoka Y, Kean HH, Dhamala R, O'Reilly U-M, Bers MU, Fedorenko E (2020) Comprehension of computer code relies primarily on domain-general executive brain regions. Elife 9:e58906. doi:10.7554/eLife.58906
- Kanjlia S, Lane C, Feigenson L, Bedny M (2016) Absence of visual experience modifies the neural basis of numerical thinking. Proc Natl Acad Sci U S A 113:11172–11177. doi:10.1073/pnas.1524982113
- Kanwisher N, McDermott J, Chun MM (1997) The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci 17:4302–4311. doi:10.1523/JNEUROSCI.17-11-04302.1997
- Kim JS, Kanjlia S, Merabet LB, Bedny M (2017) Development of the visual word form area requires visual experience: evidence from blind braille readers. J Neurosci 37:11495–11504. doi:10.1523/JNEUROSCI.0997-17.2017
- Kriegeskorte N, Goebel R, Bandettini P (2006) Information-based functional brain mapping. Proc Natl Acad Sci U S A 103:3863–3868. doi:10.1073/pnas.0600244103
- Kriegeskorte N, Mur M, Bandettini P (2008a) Representational similarity analysis - connecting the branches of systems neuroscience. Front Syst Neurosci 2:249. doi:10.3389/neuro.06.004.2008
- Kriegeskorte N, Mur M, Ruff DA, Kiani R, Bodurka J, Esteky H, Tanaka K, Bandettini PA (2008b) Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60:1126–1141. doi:10.1016/j.neuron.2008.10.043
- Krueger R, Huang Y, Liu X, Santander T, Weimer W, Leach K (2020) Neurological divide: an fMRI study of prose and code writing. In: 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
- Lee S, Kable JW (2018) Simple but robust improvement in multivoxel pattern classification. PLoS One 13:e0207083. doi:10.1371/journal.pone.0207083
- Liebenberg L (2013) The origin of science. Cape Town: CyberTracker.
- Liu Y-F, Kim J, Wilson C, Bedny M (2020) Computer code comprehension shares neural resources with formal logical inference in the fronto-parietal network. Elife 9:e59340. doi:10.7554/eLife.59340
- Liu Y-F, Rapp B, Bedny M (2023) Reading braille by touch recruits posterior parietal cortex. J Cogn Neurosci 35:1593–1616. doi:10.1162/jocn_a_02041
- Liu Y-F, Wilson C, Bedny M (2024) Contribution of the language network to the comprehension of Python programming code. Brain Lang 251:105392. doi:10.1016/j.bandl.2024.105392
- Mackey A, Miller Singley A, Bunge S (2013) Intensive reasoning training alters patterns of brain connectivity at rest. J Neurosci 33:4796–4803. doi:10.1523/JNEUROSCI.4141-12.2013
- McCandliss BD, Cohen L, Dehaene S (2003) The visual word form area: expertise for reading in the fusiform gyrus. Trends Cogn Sci 7:293–299. doi:10.1016/S1364-6613(03)00134-7
- McClelland JL (1988) Connectionist models and psychological evidence. J Mem Lang 27:107–123. doi:10.1016/0749-596X(88)90069-1
- McCoy LP, Burton JK (1988) The relationship of computer programming and mathematics in secondary students.
- Monti M, Osherson D, Martinez M, Parsons L (2007) Functional neuroanatomy of deductive inference: a language-independent distributed network. Neuroimage 37:1005–1016. doi:10.1016/j.neuroimage.2007.04.069
- Monti M, Parsons L, Osherson D (2009) The boundaries of language and thought in deductive inference. Proc Natl Acad Sci U S A 106:12554–12559. doi:10.1073/pnas.0902422106
- Monti M, Parsons L, Osherson D (2012) Thought beyond language: neural dissociation of algebra and natural language. Psychol Sci 23:914–922. doi:10.1177/0956797612437427
- Monti M, Osherson D (2012) Logic, language and the brain. Brain Res 1428:33–42. doi:10.1016/j.brainres.2011.05.061
- Morosan P, Rademacher J, Schleicher A, Amunts K, Schormann T, Zilles K (2001) Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. Neuroimage 13:684–701. doi:10.1006/nimg.2000.0715
- Mumford JA, Davis T, Poldrack RA (2014) The impact of study design on pattern estimation for single-trial multivariate pattern analysis. Neuroimage 103:130–138. doi:10.1016/j.neuroimage.2014.09.026
- Musz E, Loiotile R, Chen J, Bedny M (2022) Naturalistic audio-movies reveal common spatial organization across “visual” cortices of different blind individuals. Cereb Cortex 33:1–10. doi:10.1093/cercor/bhac048
- Nichols TE, Holmes AP (2002) Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp 15:1–25. doi:10.1002/hbm.1058
- Nieto-Castañón A, Fedorenko E (2012) Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. Neuroimage 63:1646–1669. doi:10.1016/j.neuroimage.2012.06.065
- Nili H, Wingfield C, Walther A, Su L, Marslen-Wilson W, Kriegeskorte N (2014) A toolbox for representational similarity analysis. PLoS Comput Biol 10:e1003553. doi:10.1371/journal.pcbi.1003553
- OpenAI (2022) New and improved embedding model.
- Parnin C, Siegmund J, Peitek N (2017) On the nature of programmer expertise. In: Psychology of Programming Interest Group Workshop.
- Peirce J, Gray JR, Simpson S, MacAskill M, Höchenberger R, Sogo H, Kastman E, Lindeløv JK (2019) PsychoPy2: experiments in behavior made easy. Behav Res Methods 51:195–203. doi:10.3758/s13428-018-01193-y
- Pennington N (1987) Stimulus structures and mental representations in expert comprehension of computer programs. Cogn Psychol 19:295–341. doi:10.1016/0010-0285(87)90007-7
- Perrenet J, Groote JF, Kaasenbrood E (2005) Exploring students’ understanding of the concept of algorithm: levels of abstraction. ACM SIGCSE Bull 37:64–68. doi:10.1145/1151954.1067467
- Pinker S (2022) Rationality: what it is, why it seems scarce, why it matters. New York, NY: Penguin.
- Pischedda D, Görgen K, Haynes J-D, Reverberi C (2017) Neural representations of hierarchical rule sets: the human control system represents rules irrespective of the hierarchical level to which they belong. J Neurosci 37:12281–12296. doi:10.1523/JNEUROSCI.3088-16.2017
- Prat CS, Madhyastha TM, Mottarella MJ, Kuo C-H (2020) Relating natural language aptitude to individual differences in learning programming languages. Sci Rep 10:3817. doi:10.1038/s41598-020-60661-8
- Quartz SR, Sejnowski TJ (1997) The neural basis of cognitive development: a constructivist manifesto. Behav Brain Sci 20:537–556. doi:10.1017/S0140525X97001581
- Raz G, Saxe R (2020) Learning in infancy is active, endogenously motivated, and depends on the prefrontal cortices. Annu Rev Dev Psychol 2:247–268. doi:10.1146/annurev-devpsych-121318-084841
- Regev M, Honey C, Simony E, Hasson U (2013) Selective and invariant neural responses to spoken and written narratives. J Neurosci 33:15978–15988. doi:10.1523/JNEUROSCI.1580-13.2013
- Schaefer A, Kong R, Gordon EM, Laumann TO, Zuo X-N, Holmes AJ, Eickhoff SB, Yeo BTT (2018) Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb Cortex 28:3095–3114. doi:10.1093/cercor/bhx179
- Schreiber K, Krekelberg B (2013) The statistical analysis of multi-voxel patterns in functional imaging. PLoS One 8:e69328. doi:10.1371/journal.pone.0069328
- Shute VJ (1991) Who is likely to acquire programming skills? J Educ Comput Res 7:1–24. doi:10.2190/VQJD-T1YD-5WVB-RYPJ
- Siegmund J, Kästner C, Apel S, Parnin C, Bethmann A, Leich T, Saake G, Brechmann A (2014) Understanding understanding source code with functional magnetic resonance imaging. In: Proceedings of the 36th International Conference on Software Engineering, pp 378–389. ACM.
- Smith SM, et al. (2004) Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23:S208–S219. doi:10.1016/j.neuroimage.2004.07.051
- Spreng RN, Mar RA, Kim ASN (2008) The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis. J Cogn Neurosci 21:489–510. doi:10.1162/jocn.2008.21029
- Srikant S, Lipkin B, Ivanova AA, Fedorenko E, O’Reilly U-M (2022) Convergent representations of computer programs in human and artificial neural networks. In: 36th Conference on Neural Information Processing Systems.
- Stehr DA, Garcia JO, Pyles JA, Grossman ED (2023) Optimizing multivariate pattern classification in rapid event-related designs. J Neurosci Methods 387:109808. doi:10.1016/j.jneumeth.2023.109808
- Su L, Fonteneau E, Marslen-Wilson W, Kriegeskorte N (2012) Spatiotemporal searchlight representational similarity analysis in EMEG source space. In: 2012 Second International Workshop on Pattern Recognition in NeuroImaging, pp 97–100.
- Valente G, Castellanos AL, Hausfeld L, De Martino F, Formisano E (2021) Cross-validation and permutations in MVPA: validity of permutation strategies and power of cross-validation schemes. Neuroimage 238:118145. doi:10.1016/j.neuroimage.2021.118145
- Wallis JD, Anderson KC, Miller EK (2001) Single neurons in prefrontal cortex encode abstract rules. Nature 411:953–956. doi:10.1038/35082081
- Wendelken C, Ferrer E, Whitaker KJ, Bunge SA (2015) Fronto-parietal network reconfiguration supports the development of reasoning ability. Cereb Cortex 26:2178–2190. doi:10.1093/cercor/bhv050
- Wendelken C, Ferrer E, Ghetti S, Bailey SK, Cutting L, Bunge SA (2017) Frontoparietal structural connectivity in childhood predicts development of functional connectivity and reasoning ability: a large-scale longitudinal investigation. J Neurosci 37:8549. doi:10.1523/JNEUROSCI.3726-16.2017
- Werchan DM, Collins AG, Frank MJ, Amso D (2016) Role of prefrontal cortex in learning and generalizing hierarchical rules in 8-month-old infants. J Neurosci 36:10314–10322. doi:10.1523/JNEUROSCI.1351-16.2016
- Wertheim J, Ragni M (2020) The neurocognitive correlates of human reasoning: a meta-analysis of conditional and syllogistic inferences. J Cogn Neurosci 32:1061–1078. doi:10.1162/jocn_a_01531
- Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE (2014) Permutation inference for the general linear model. Neuroimage 92:381–397. doi:10.1016/j.neuroimage.2014.01.060
- Woolgar A, Jackson J, Duncan J (2016) Coding of visual, auditory, rule, and response information in the brain: 10 years of multivoxel pattern analysis. J Cogn Neurosci 28:1433–1454. doi:10.1162/jocn_a_00981
- Xu S, Li Y, Liu J (2021) The neural correlates of computational thinking: collaboration of distinct cognitive components revealed by fMRI. Cereb Cortex 31:5579–5597. doi:10.1093/cercor/bhab182