eLife. 2020 Dec 15;9:e58906. doi: 10.7554/eLife.58906

Figure 2. Main experimental results.

(A) Candidate brain systems of interest. The areas shown represent the ‘parcels’ used to define the MD and language systems in individual participants (see Materials and methods and Figure 3—figure supplement 1). (B, C) Mean responses to the language localizer conditions (SR – sentence reading, NR – nonword reading) and to the critical task (SP – sentence problems, CP – code problems) in the systems of interest across programming languages (B – Python, C – ScratchJr). In the MD system, we see strong responses to code problems in both hemispheres and for both programming languages; the fact that this response is stronger than the response to content-matched sentence problems suggests that it reflects activity evoked by code comprehension per se rather than just by problem content. In the language system, code problems elicit a response that is substantially weaker than that elicited by sentence problems; further, only in Experiment 1 are responses to code problems reliably stronger than responses to the language localizer control condition (nonword reading). Here and elsewhere, error bars show the standard error of the mean across participants, and dots show responses of individual participants.


Figure 2—figure supplement 1. Behavioral results.


(A) Python code problems had mean accuracies of 85.1% and 86.2% for the English-identifier (CP_en) and Japanese-identifier (CP_jap) conditions, respectively, and sentence problems (SP) had a mean accuracy of 81.5%. There was no main effect of condition (CP_en, CP_jap, SP), problem structure (seq – sequential, for – for loops, if – if statements), or problem content (math vs. string); however, there was a three-way interaction among Condition (sentence problems > code with English identifiers), Problem Type (string > math), and Problem Structure (for loop > sequential; p=0.02). Accuracy data from one participant had to be excluded due to a bug in the script. (B) ScratchJr code problems had a mean accuracy of 78.0%, and sentence problems had a mean accuracy of 87.8% (the difference was significant: p=0.006). (C) Python problems with English identifiers had a mean response time (RT) of 17.56 s (SD = 9.05), Python problems with Japanese identifiers had a mean RT of 19.39 s (SD = 10.1), and sentence problems had a mean RT of 21.32 s (SD = 11.6). Problems with Japanese identifiers took longer to answer than problems with English identifiers (β = 3.10, p=0.002), and so did sentence problems (β = 6.12, p<0.001). There was also an interaction between Condition (sentence problems > code with English identifiers) and Program Structure (for > seq; β = −5.25, p<0.001), as well as between Condition (CP_jap > CP_en) and Program Structure (if > seq; β = −2.83, p=0.04). There was no significant difference in RTs between math and string manipulation problems. (D) ScratchJr code problems had a mean RT of 1.14 s (SD = 0.86), and sentence problems had a mean RT of 1.03 s (SD = 0.78); the difference was not significant. RTs are reported relative to video offset. Items for which >50% of participants chose the incorrect answer in the (easy) verbal condition were excluded from accuracy calculations. (E) Mean accuracies for all Python participants were above chance. (F) Mean accuracies for all ScratchJr participants were above chance.

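For readers unfamiliar with the task, the following hypothetical snippets illustrate the kinds of Python code problems described above, crossing problem structure (sequential, for loop, if statement) with problem content (math vs. string). These are illustrative examples only, not the actual study stimuli; in the experiment, participants read such problems and predicted their output.

```python
# Hypothetical examples of Python "code problem" stimuli (not the
# actual experimental items). Participants predict the printed value.

# Sequential structure, math content:
x = 4
y = x + 3
print(y)  # a participant would answer: 7

# For-loop structure, string content (reverses the string):
word = ""
for ch in "code":
    word = ch + word
print(word)  # a participant would answer: "edoc"

# If-statement structure, math content:
n = 10
if n % 2 == 0:
    n = n // 2
else:
    n = n * 3
print(n)  # a participant would answer: 5
```

In the Japanese-identifier condition, variable names such as `x` and `word` would instead be transliterated Japanese words, leaving the program logic unchanged.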
Figure 2—figure supplement 2. Random-effects group-level analysis of Experiment 1 data (Python, code problems > sentence problems contrast).


Similar to analyses reported in the main text, code-evoked activity is bilateral and recruits fronto-parietal but not temporal regions. Cluster threshold: p<0.05, cluster-size FDR-corrected; voxel threshold: p<0.001, uncorrected.
Figure 2—figure supplement 3. Random-effects group-level analysis of Experiment 2 data (ScratchJr, code problems > sentence problems contrast).


Similar to analyses reported in the main text, ScratchJr-evoked activity shows a small right-hemisphere bias. Cluster threshold: p<0.05, cluster-size FDR-corrected; voxel threshold: p<0.001, uncorrected.