PLOS ONE. 2020 Nov 5;15(11):e0240909. doi: 10.1371/journal.pone.0240909

The role of explicit memory in syntactic persistence: Effects of lexical cueing and load on sentence memory and sentence production

Chi Zhang 1,*, Sarah Bernolet 2, Robert J Hartsuiker 1
Editor: Michael P. Kaschak
PMCID: PMC7643978  PMID: 33151975

Abstract

Speakers’ memory of sentence structure can persist and modulate the syntactic choices of subsequent utterances (i.e., structural priming). Much research on structural priming has posited a multifactorial account by which an implicit learning process and a process related to explicit memory jointly contribute to the priming effect. Here, we tested two predictions from that account: (1) that lexical repetition facilitates the retrieval of sentence structures from memory; (2) that priming is partly driven by a short-term explicit memory mechanism with limited resources. In two pairs of structural priming and sentence structure memory experiments, we examined the effects of structural priming and its modulation by lexical repetition as a function of cognitive load in native Dutch speakers. Cognitive load was manipulated by interspersing the prime and target trials with easy or difficult mathematical problems. Lexical repetition boosted both structural priming (Experiments 1a and 2a) and memory for sentence structure (Experiments 1b and 2b) and did so with a comparable magnitude. In Experiment 1, there were no load effects, but in Experiment 2, with a stronger manipulation of load, both the priming and memory effects were reduced under a larger cognitive load. The findings support an explicit memory mechanism in structural priming that is cue-dependent and attention-demanding, consistent with a multifactorial account of structural priming.

Introduction

A central issue in language production is the role of working memory in the production process. Speaking often proceeds at a fast rate, suggestive of a view by which speakers do not always need working memory to control the formulation of utterances. In fact, in his classic book Speaking, Levelt [1] argued that other than in determining the message that speakers intend to utter, most of the processes involved in speaking function “in a reflex-like, automatic way”. However, speakers do sometimes show effects of limited memory capacity at stages later than message planning. For example, the number of words speakers are able to plan in advance is reduced when they are distracted by a secondary task [2]. Such evidence suggests that speakers might store a certain number of lexical items in their short-term memory before their utterance starts and that they have to assign attentional resources to maintain these memory traces. Importantly for our purposes, some accounts of syntactic encoding argue that memory retrieval also facilitates subsequent syntactic choices (e.g., [3]). Here we tested two predictions of that account: (1) that speakers make use of lexical cues to reinforce an effect of memory retrieval on syntactic encoding; (2) that such a memory effect is constrained by the limited memory capacity of speakers.

Grammatical encoding is the process in which speakers map concepts onto ordered sequences of words that carry grammatical functions (e.g., subject, object). Sentence production models often envision grammatical encoding as an automatic process that requires little involvement of working memory (e.g., [1, 3]). However, recent evidence has demonstrated that certain aspects of syntactic processing in sentence production are constrained by limited memory capacity (e.g., [2, 4–6]). Because only a limited amount of information can be kept in working memory [7], a process can be shown to require working memory if a concurrent task that also occupies working memory capacity interferes with this process. This phenomenon is often referred to as the cognitive load effect. Such load effects have been found in the grammatical encoding processes of sentence production. For example, when speakers formulated sentences while their working memory was occupied by a word list they had to remember, the accessibility of the previously memorized lexical information was reduced, resulting in a change of word-order preference [6]. Similarly, speakers produced more subject-verb number agreement errors in sentence production when there was a cognitive load [5]. Hartsuiker and Moors [8] argued against a strictly dichotomous view of automatic vs. non-automatic processes and suggested a gradual view by which processes like syntactic processing in sentence production have some but not all "automaticity features" (such as being constrained by working memory).

This non-binary view of automaticity in syntactic processing meshes nicely with studies on structural priming, which also show both automatic and non-automatic effects. These studies tap into the persistence of syntactic structure, in particular how recently experienced sentence structures influence subsequent syntactic choices ([9, 10]). As an experimental paradigm, structural priming gives psycholinguists a useful window onto the comprehension, production, and acquisition mechanisms that involve syntactic representations [11]. However, there is still debate about the underlying mechanisms of structural priming. Some accounts posited that the syntactic structure persists automatically (e.g., [12]), whereas others argued that some processes in structural priming are constrained by limited memory capacity (e.g., [13, 14]). In particular, the latter account proposes that structural priming is driven by an automatic process of implicit learning and a non-automatic, explicit memory-related process. The explicit memory-related process would involve retrieval and adaptation of the previous sentence, and such retrieval would be more likely to occur if words are repeated between prime and target sentences (lexical cueing effect). The goal of the current study is to pinpoint the non-automatic (i.e., explicit memory-related) components in syntactic persistence. We asked whether lexical overlap elicits cueing effects on the retrieval of sentence structure from memory and whether the effects of structural priming vary with the limited capacity of memory.

Bock [9] first discovered structural priming in an experimental setting: In a series of tasks disguised as a recognition memory test, speakers were more likely to choose a passive structure (e.g., The church is being struck by lightning) over an active structure (e.g., The lightning is striking the church) to describe a picture after they heard a passive sentence. Structural priming is often considered an effect that entails the autonomous repetition of syntactic structures independent of processing at other linguistic levels such as lexical access ([9, 15, 16], but see [17]). Nevertheless, it was also found that lexical overlap considerably enhances the magnitude of structural priming (i.e., the lexical boost). Pickering and Branigan [12] demonstrated that, in a sentence completion task, the chance that a speaker completed a target sentence fragment (e.g., The patient showed…) with an [-NP-NP] argument structure increased after completing prime fragments like The racing driver showed the helpful mechanic… or The racing driver gave the helpful mechanic…, both of which force an [-NP-NP] completion. The priming effect with the lexically overlapping prime fragment (showed—showed) was much larger than that with the non-overlapping prime (gave—showed). Similar effects were found in numerous studies (e.g., [14, 18–21]). In a recent meta-analysis of structural priming effects in sentence production, the priming effect with lexical overlap was twice as large as that without lexical overlap [10], which underlines the crucial interaction between lexical access and syntactic encoding in structural priming.

The debate about how such a structural priming effect comes about began several decades ago. Pickering and Branigan [12] accounted for structural priming and the lexical boost in terms of a lexicalist model of production [22]. In this model, syntactic information is represented in the lemma stratum, wherein a lemma node is linked to combinatorial nodes that represent the syntactic structures licensed by the word. When a speaker comprehends or produces a sentence, the lemma node and the combinatorial node specific to the context are activated and the connection between these nodes is strengthened. The residual activation of the combinatorial node transiently enhances the preference for the primed structure in subsequent processing, which results in structural priming. If the head of the prime sentence is activated again, more activation flows from the lemma node to the combinatorial node. This reinforced activation at the combinatorial node further boosts the preference for the primed structure, causing a lexical boost effect. This residual activation model of structural priming predicts that the effects of structural priming and the lexical boost should decay rapidly. In the lexicalist model [22] that formed the basis of the residual activation model, activation decays automatically as a power-law function of time. There is no principled reason for the residual activation model to predict that the decay of activation is modulated by the assignment of attention.

On the other hand, Chang and colleagues [13] posited an implicit learning model of structural priming. The account assumes that speakers make predictions about upcoming utterances. They adjust their syntactic preferences by tuning the weight of the form-meaning mapping each time a prediction error occurs. This adaptation of syntactic preference will be incorporated into the statistical distribution of the form-meaning mapping, which consolidates over time as implicit syntactic knowledge. The implicit learning process in structural priming persists over multiple filler trials (i.e., long-term structural priming; [23–25]). However, the initial implicit learning model of structural priming could not predict the lexical boost effect on structural priming. Chang and colleagues [13] thus tentatively assumed that there is also a mechanism related to explicit memory that is orthogonal to the implicit learning of abstract syntax. This view was later developed into a multifactorial account of structural priming [13, 14, 26].

In such a multifactorial account, apart from the implicit learning processes that essentially underlie lexical-independent priming, speakers also temporarily store the surface structure and the wording of the prime sentences in explicit memory, possibly for the sake of maintaining and monitoring coherence in the conversation (see [27] for discussion). The encoded prime sentence can be retrieved in subsequent production tasks to facilitate syntactic processing, thus contributing to the general structural priming effect. When lexical items are repeated between prime and target, speakers would take the repeated item as a retrieval cue that tracks and reconstructs information from the prime sentence, which further enhances the retrieval of the prime structure from explicit memory. Thus, the lexical boost effect is mainly modeled as a lexical cueing effect of explicit memory retrieval. Consistent with the non-binary view of automaticity in language production [8], the multifactorial account predicts that the persistence of syntactic structure involves a tacit, incidental, and automatic procedure (i.e., implicit learning) and an effortful, non-automatic procedure (i.e., an explicit memory-related process).

In Chang and colleagues’ initial model, the explicit memory process is mainly proposed as an alternative mechanism that compensates for the insensitivity of the implicit learning model to lexical overlap. Thus, the validity of this postulated explicit memory mechanism in structural priming was initially called into question (e.g., [28]). Nevertheless, a number of studies provide evidence for the possible involvement of explicit memory in structural priming ([14, 29–36]). Importantly, several studies demonstrated that some effects of structural priming show characteristics that are analogous to a short-term memory effect ([14, 31, 36]). Hartsuiker et al. [14] examined the effect of structural priming and the lexical boost in written and spoken dialogue. In two experiments, the number of filler trials between prime and target (i.e., lag) was manipulated. They found that structural priming remained robust for up to 6 intervening trials. However, the effect of lexical overlap, which was significant at Lag 0, quickly diminished when prime and target were separated by two or more filler trials. The rapid decay of the lexical boost effect is analogous to the short-lived memory of sentence structure (e.g., [37]). Branigan and McLean [36] replicated these findings in three- to four-year-old speakers.

Other studies approached a similar question by investigating the effect of structural priming and the lexical boost in people with aphasia ([33, 38, 39]). The basic assumption of these studies is that if lexical overlap functions as a retrieval cue, patients with impaired verbal short-term memory might show more difficulty in maintaining the explicit memory of sentence structure, which should lead to a smaller lexical boost effect in a structural priming task. Man and colleagues’ study provided evidence for this assumption: people with aphasia showed preserved lexical-independent priming but no lexical boost, whether the prime and target were separated by zero or two fillers ([33], but see [38]). These findings suggest that at least the lexically specific component of structural priming is driven by a rapidly decaying mechanism that is constrained by the limited capacity of short-term memory.

Note that it is possible that the retrieval process is not solely dependent on repeated lexical items. Speakers can reasonably be expected to recognize similarities other than lexical repetition (e.g., the events, the configuration of the pictures, and the event structure) between the prime and the target. They might likewise employ such repeated representations as retrieval cues for prime sentences. If this is the case, the decay of explicit memory might also manifest itself in lexical-independent structural priming. This hypothesis has been supported by Bernolet and colleagues [29]. In three pairs of experiments, the researchers investigated the lexical-independent structure repetition of Dutch transitives, datives, and word order in relative clauses. The structure repetition in the experiment was either spontaneous (priming experiment) or instructed (structure memory experiment). They found that the effect of structural priming dropped quickly at Lag 2 and Lag 6, which was comparable to the memory decay of sentence structures. As no lexical item was repeated between prime and target, this fast decay suggested that explicit memory contributes not only to lexical-dependent but also to lexical-independent structural priming. Bernolet and colleagues’ finding therefore extends the view of the so-called cueing effect on structural priming. The lexical boost effect cannot be taken as the only index of the postulated explicit memory process in structural priming. Instead, even for minimally related sentences, as long as certain representations are shared between the prime and the target, speakers could retrieve a fast-decaying representation of the prime sentence from memory to facilitate their syntactic choices.

In sum, previous studies suggested two essential properties of the explicit memory process in structural priming: lexical cueing, namely that the stored sentence structure can be better retrieved when there is lexical overlap between prime and target, and short longevity, namely that explicit memory effects rapidly dissipate over longer lags. However, although previous studies used these properties to pinpoint the explicit memory process in structural priming, it has not yet been empirically tested whether these two properties are intrinsically driven by explicit memory. First, the explicit memory hypothesis proposes that lexical repetition should act as a lexical cue in structural priming that facilitates the memory retrieval of sentence structure. However, to our knowledge, no study has tested the effect of lexical repetition on recalling prior sentence structures. Therefore, it is not clear whether lexical repetition can indeed function as a retrieval cue. Second, the explicit memory hypothesis assumes that the short-term decay of the priming effects is related to speakers’ limited capacity to maintain syntactic information in explicit memory (see [13, 14]). Although some evidence was found in studies on structural priming in people with aphasia (e.g., [33, 38, 39]), no evidence has been gathered regarding whether limited memory capacity constrains structural priming in healthy speakers.

In response to these lacunae in the evidence we discussed above, the current study asks two questions: First, is the lexical boost effect driven by the lexical cueing on sentence structure retrieval? Second, is the short-term effect of structural priming a function of the capacity of speakers to maintain memory traces of sentence structure? We will briefly discuss these two questions further in the sections below.

Lexical cueing in syntactic persistence

There is a long history of studies on how lexical cues function in memory retrieval (e.g., [40–50]). A general finding of these studies is that the presence of a word that is semantically or phonologically related to the to-be-recalled item facilitates the retrieval of the encoded memory traces (e.g., [41, 42, 48, 49]). The lexical cue is particularly effective when it occurs in both encoding and retrieval processes (e.g., [46, 50]). These findings are consistent with models of cued recognition/recall in that the features that are shared between a cue and a test item can be employed to facilitate the selection of output during memory retrieval (e.g., [51]).

However, it is still an open question whether lexical cues facilitate the retrieval of sentence structures. Although some studies showed better sentence recall when a lexical-semantic cue was presented ([45, 47]), these studies did not directly investigate how syntactic choices in sentence recall were influenced by lexical overlap. A precondition of the explicit memory hypothesis is that lexical overlap serves as a lexical cue for speakers to retrieve syntactic memory traces that are possibly stored in short-term explicit memory. Thus, it is important to examine the role of lexical overlap in a task that directly taps into memory retrieval of sentence structure (e.g., sentence structure recall). A facilitation effect of lexical overlap on sentence structure recall would support an important precondition for an account by which such a lexical cueing effect explains the lexical boost effect on structural priming. One purpose of the current study is to test this precondition of the explicit memory hypothesis, namely that lexical overlap facilitates the recall of sentence structures.

We therefore conducted two sentence structure memory experiments and two structural priming experiments. In the sentence structure memory experiments, participants were instructed to memorize the structure of the sentences that a confederate of the experimenters produced (i.e., to-be-recalled sentences) and reuse it in the subsequent production task. We manipulated both the structure of the to-be-recalled sentences and the overlap of the head noun in prime/to-be-recalled and target sentences in all four experiments in order to investigate the effect of lexical overlap on the memory retrieval of sentence structure.

Limited memory capacity in the decay of structural priming

It is well established that the maintenance of memory traces stored in short-term memory is constrained by the assignment of attentional resources ([52–55]). Barrouillet and colleagues argued that the maintenance of memory traces is a time-based process that requires attention. Items encoded in short-term explicit memory can be refreshed when attention is directed to them, but when attention is switched away to processing, memory suffers from time-related decay. As the central bottleneck only allows one process at a time, the sharing of attentional resources is realized by constant switching between processing and memory maintenance. Crucially, Barrouillet and colleagues’ model acknowledges the variability of memory decay within a fixed time window: even when the duration of the processing is controlled, a more attention-demanding task might yield a greater detrimental effect on memory maintenance.
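For concreteness, this time-based resource-sharing idea is often summarized with a simple expression for cognitive load; the sketch below is our own illustration of that idea (the symbols N, a, and T are our shorthand, roughly following Barrouillet and colleagues), not a formula from the present study.

```latex
% Rough formalization of cognitive load in a time-based resource-sharing
% framework (illustrative shorthand, not the authors' notation):
%   N = number of attention-capturing processing steps in the secondary task
%   a = average time attention is captured by each step
%   T = total time available between encoding and recall
CL = \frac{N \cdot a}{T}
% The closer CL is to 1, the less of the interval T remains for attention
% to refresh the stored memory traces, so the more the memory decays.
```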

If the short-term persistence of syntactic structure is indeed a short-term explicit memory effect, it is reasonable to assume that the decay of this short-term effect depends on the limited resources that can be assigned to the maintenance of the memory of sentence structures ([13, 14]). In this case, the rapid decay of the priming effect is not only a function of the time lag or the number of fillers between prime and target but also of how much of that time attention is available for memory maintenance. On the basis of Barrouillet and colleagues’ model, two more predictions can be made. First, there will be more decay of the priming effect when prime and target are separated by a secondary task that requires more attention, even when the time lag between prime and target is fixed. Second, such modulation of priming by cognitive load should occur regardless of the nature of the secondary task, so the priming effect should decay even when no new sentence material is encountered. It is possible that an effect of cognitive load would affect lexically mediated priming in particular, given that lexical overlap might make the process of sentence structure retrieval from explicit memory more likely. But if, as suggested by [29], the repeated lexical item is not the only trigger of the memory retrieval process in structural priming, we would predict that both lexical-dependent and lexical-independent structural priming would be susceptible to the load manipulation.

To test these predictions, we investigated the lexical overlap effect and cognitive load effect on structural priming and sentence structure recall of Dutch genitives (of-genitive vs. s-genitive, see [56, 57]). We employed Dutch genitives as the target structure because the priming of Dutch genitive alternation is a well-established effect with substantial magnitude [10]. Two computer-paced structural priming experiments were conducted, along with two corresponding sentence structure memory experiments ([29, 39]). All experiments had the same design.

Importantly, we inserted a secondary arithmetic problem solving task between prime and target in each trial. We controlled the duration and manipulated the difficulty of the problem solving task. We used an arithmetic problem solving task as the secondary task for three main reasons. First, a plethora of studies have demonstrated that the processing load of arithmetic problems is a function of problem difficulty (see [58, 59] for a review). Second, arithmetic problem solving (or the operation solving task) is one of the most frequently used secondary tasks in studies of working memory (e.g., [52, 55]). Third, it has been established that the difficulty of a concurrent arithmetic problem influences the preparation time of sentence production [60]. It is possible that the cognitive load imposed by a difficult arithmetic problem would similarly influence the syntactic choices in sentence production.

In sum, based on the rationale above, we tested two predictions. First, lexical overlap enhances the magnitude of both structural priming and sentence structure recall. Second, structural priming and sentence recall are reduced by a secondary task with high vs. low cognitive load.

Experiment 1a: Structural priming

Method

Participants

Forty Ghent University students, all native Dutch speakers, participated in exchange for course credit (33 females and 7 males, average age 19.35 years). All participants reported normal color vision and right-handedness and had normal or corrected-to-normal vision. A 22-year-old female native Dutch speaker acted as confederate. The study is in line with the “General Ethical Protocol for Scientific Research at the Faculty of Psychology and Educational Sciences of Ghent University” and was approved by the Ethical Committee of the Faculty of Psychology and Educational Sciences, Ghent University. Informed consent was obtained for experimentation with human subjects.

Materials

A verification set of 120 pictures and a description set of 96 pictures for participants were adopted from Bernolet et al. [29]. All pictures showed black-and-white line drawings of two figures. The participant’s description set contained 48 critical description pictures and 48 filler pictures. The critical pictures were designed to elicit genitive expressions (see Fig 1). On each critical picture, the figures were holding an object, wearing an object, or standing next to an object, indicating the status of ownership. One object in the picture was colored; the rest of the picture was in black and white. This way, the referential expression the speakers could choose for the colored object should contain information of the ownership of the object (e.g., the duck of the boy/the boy’s duck is red). The filler pictures contained no objects but two figures, with one fully colored and the other in black and white. All the figures in the participant’s description set were chosen equally often from a boy, a girl, a nurse, a wizard, a pirate, a nun, a priest, and a witch. Four colors (blue, green, red, yellow) were used equally often for the different objects and figures in the pictures. The participant’s verification set contained 72 critical pictures and 48 filler pictures. Among the critical pictures in the verification set, 48 pictures matched with the confederate’s description (24 for the descriptions that shared the object with the corresponding target picture; 24 for the descriptions containing the object that differed from the corresponding target picture) and 24 pictures differed from the description. The configuration of the pictures was the same as the description set.

Fig 1. Example of a target picture.


A description set of 240 sentences for the confederate was created. Half of the sentences matched with the participants’ pictures from the verification set. The confederate’s description set contained 192 critical prime sentences that corresponded to 48 critical description pictures for participants and were counterbalanced between prime conditions and head noun conditions. Half of these critical sentences were s-genitive sentences (e.g., 1a and 1b), the other half were of-genitive sentences (e.g., 2a and 2b). The critical prime sentence in the Same Head Noun conditions (e.g., 1a and 2a) contained a head noun (e.g., eend, meaning duck) that matched with the target object of the corresponding target picture (e.g., a red duck). In the Different Head Noun conditions, the semantic and phonological overlap between the head noun of the prime sentence (e.g., kaas in 1b and 2b, meaning cheese) and the target object (e.g. a red duck) was kept to a minimum. The prime nouns and their non-overlap controls had the same number of syllables and were matched for prosody. The target objects in the prime and target descriptions had the same color; the owner of the object was different in prime and target descriptions. The remaining 48 sentences in the confederate’s description set were filler sentences that could be used to describe the filler items in the participant’s verification set. The confederate’s verification set contained 96 further pictures that were the same as the pictures in the participant’s description set. All materials are listed in S2 Appendix.

(1a) De jongen zijn eend is rood. (s-genitive, Same Head Noun)

  [Literally: The boy his duck is red.]

(1b) De jongen zijn kaas is rood. (s-genitive, Different Head Noun)

  [Literally: The boy his cheese is red.]

(2a) De eend van de jongen is rood. (of-genitive, Same Head Noun)

  [The duck of the boy is red.]

(2b) De kaas van de jongen is rood. (of-genitive, Different Head Noun)

  [The cheese of the boy is red.]

Furthermore, 144 addition problems (S3 Appendix) were constructed for the participant’s arithmetic problem solving task. Each problem was composed of two addends and a plus sign. The problem set contained 96 critical problems and 48 filler problems. The problems in the Easy Problem condition were composed of a two-digit addend and a one-digit addend that was either 1 or 2 (e.g., 35 + 2). The problems in the Difficult Problem condition were composed of two two-digit addends (e.g., 35 + 42). In the critical trials, the addition never involved a carrying operation. The units digits of all two-digit addends ranged from 1 to 7, and the tens digits of the smaller two-digit addends ranged from 1 to 4. The order of the addends in each problem was counterbalanced. The filler problems contained 16 additions of a two-digit number and 1 or 2, 16 additions of two two-digit numbers without carrying, and 16 additions of two two-digit numbers with carrying. The design of the problems for the confederate was the same as for the participant, but the problems in each trial differed between the two.
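As an illustration of these constraints (the actual items are listed in S3 Appendix), the following R sketch shows one way the two difficulty levels could be generated; the function names and the no-carrying check are ours, not the authors’ stimulus-construction script.

```r
# Illustrative sketch only: generate addition problems that satisfy the
# constraints described above (units digits 1-7, tens digit of the smaller
# two-digit addend 1-4, no carrying). The actual items are in S3 Appendix.
set.seed(1)

make_easy <- function() {
  a <- sample(1:9, 1) * 10 + sample(1:7, 1)  # two-digit addend, units digit 1-7
  b <- sample(1:2, 1)                        # one-digit addend: 1 or 2
  sprintf("%d + %d", a, b)
}

make_difficult <- function() {
  repeat {
    units <- sample(1:7, 2, replace = TRUE)  # units digits between 1 and 7
    a <- sample(1:4, 1) * 10 + units[1]      # smaller addend: tens digit 1-4
    b <- sample(1:9, 1) * 10 + units[2]      # second two-digit addend
    # keep only problems without carrying (assumed to apply to both columns)
    if (sum(units) <= 9 && (a %/% 10 + b %/% 10) <= 9) {
      return(sprintf("%d + %d", a, b))
    }
  }
}

easy_problems      <- replicate(48, make_easy())
difficult_problems <- replicate(48, make_difficult())
```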

A critical trial was defined as a pairing of the confederate’s critical prime sentences and the participant’s critical description pictures, which were separated by the arithmetic problem solving task. Thus, we had a 2 (prime condition) x 2 (head noun condition) x 2 (problem difficulty) design; all factors were manipulated within items and participants. We constructed eight counterbalanced pseudo-random lists so that each target object was preceded by the same object in four lists (Same Head Noun conditions) and by an unrelated control object (Different Head Noun conditions) in the other four lists. In the Same Head Noun condition and Different Head Noun condition the target picture was preceded by an s-genitive in four lists and by an of-genitive in the other four lists. In four lists, the prime and the target were separated by an easy problem, and in the other four lists, the prime and the target were separated by a difficult problem. For each of the eight lists, the trials were presented in the same pseudo-random order. There were four fillers at the beginning of each list; critical trials were separated by 0 to 6 filler trials. Each participant was presented with one of these eight lists.

Procedure

Both the participant and the confederate sat in front of a PC, which ran the experimental program in E-Prime (version 2.0.10.356). They could not see each other’s screens. They were told that they would cooperate with their partner to perform a series of tasks as quickly and accurately as possible. In the picture description tasks, one would describe pictures and the other would judge whether the picture on his/her screen matched the description made by the partner. In the arithmetic problem solving task, one would solve an arithmetic problem and the other would judge the correctness of the result. At the beginning of the experiment, the participant and the confederate read the instructions describing the series of tasks they would perform and the possible pictures they would see during the experiment. They then familiarized themselves with the procedure in a practice session. The practice session included five filler trials.

The procedure of the main test in Experiments 1a and 1b is illustrated in Fig 2. Note that the timing of each task was strictly controlled. The program was synchronized between the participant and the confederate and was set up so that the confederate always took the first turn. At the beginning of each trial, the confederate started by “describing the picture”, while she actually read sentences directly from the screen. The participant pressed “1” if the description matched the picture and “2” otherwise. After the key was pressed, the picture remained on the screen for 6500 ms. Then an arithmetic problem appeared on the participant’s screen. The participant typed in the answer and then verbally reported it. After hearing the answer, the confederate verified it by pressing “1” or “2”. This task lasted 4500 ms, and a visual signal appeared on the screen for 2000 ms at the end of the task to remind the participant to verbally report the result. The participant then saw a picture on the screen and described it to the confederate. The confederate judged whether the description matched the picture. The task lasted 6500 ms. Upon responding, the participant saw feedback about his/her average accuracy in the arithmetic problem solving task. The feedback was presented on the screen for 800 ms. Finally, the confederate solved an arithmetic problem and verbally reported the number to the participant. The participant verified the reported result. The experimental session took about 50 minutes.

Fig 2. The procedure in Experiment 1a.


Each trial for participants consisted of the following events: After a 1000 ms interval the prime sentence was presented auditorily (by the confederate) and simultaneously a verification picture was presented (6500 ms). After another 1000 ms interval an arithmetic problem was presented (6500 ms), followed by a 1000 ms interval, a target picture (6500 ms), and feedback (1000 ms). After another 1000 ms interval the verification number was presented (6500 ms). The series of events in Experiment 1b was nearly identical. The only difference was that a parity judgment task (1000 ms) was presented for the confederate instead of an interval after the confederate’s verification picture.

Scoring

The latency of the first key press by the participant in the arithmetic problem solving task was recorded as an index of processing time. Participants’ responses in the picture description task were coded as s-genitives, of-genitives, or ‘others’. A response was coded as an s-genitive if the possessor preceded the possessed object and the appropriate possessive morpheme (zijn for a male possessor/haar for a female possessor) was added between the possessor and the object. A response was coded as an of-genitive if the sentence began with the possessed object, followed by the preposition van, and ended with the possessor. If a different preposition or any other construction was used, the response was counted as an ‘other’ response.
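The R sketch below illustrates how these coding rules could be automated for transcribed responses; the helper code_response() is hypothetical (responses in the study were presumably coded from transcriptions by the authors), and the pattern matching only approximates the word-order criteria stated above.

```r
# Hypothetical helper illustrating the scoring rules (a simplification: it
# checks the diagnostic markers, not the full word-order criteria).
code_response <- function(utterance) {
  u <- tolower(utterance)
  if (grepl("\\b(zijn|haar)\\b", u, perl = TRUE) && !grepl("\\bvan\\b", u, perl = TRUE)) {
    "s-genitive"    # possessor + zijn/haar + possessed object
  } else if (grepl("\\bvan\\b", u, perl = TRUE)) {
    "of-genitive"   # possessed object + van + possessor
  } else {
    "other"         # any other preposition or construction
  }
}

code_response("De jongen zijn eend is rood")    # "s-genitive"
code_response("De eend van de jongen is rood")  # "of-genitive"
```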

Results

Data and analysis scripts for all experiments reported in this paper are available online (on the Open Science Framework at https://osf.io/6utkf/). Critical trials in which participants produced no response or an ‘other’ response were excluded from the analyses (8.6% of the data). The final data set contained 1755 target responses, among which were 274 s-genitive responses (15.6%) and 1481 of-genitive responses (84.4%).

We first report the results of the arithmetic problem solving task. The average time spent on problem solving in the Easy Problem condition was 1311 ms shorter than in the Difficult Problem condition (Easy: 1548 ms vs. Difficult: 2851 ms, Cohen’s d = -3.085), indicating a clear difference in cognitive load between the two difficulty levels of the secondary task.
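For reference, a pooled-standard-deviation Cohen’s d for such a comparison can be computed as in the sketch below; the exact variant the authors used (e.g., pooled SD over by-participant means vs. the SD of difference scores) is not reported, so the vectors rt_easy and rt_difficult are hypothetical.

```r
# Hedged sketch: Cohen's d for the Easy vs. Difficult solution times,
# assuming by-participant mean RTs and a pooled standard deviation.
# rt_easy and rt_difficult are hypothetical numeric vectors (ms).
cohens_d <- function(x, y) {
  pooled_sd <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
                      (length(x) + length(y) - 2))
  (mean(x) - mean(y)) / pooled_sd
}
# cohens_d(rt_easy, rt_difficult)  # negative: easy problems were solved faster
```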

The descriptive data for s-genitive production in each prime condition x head noun condition combination are summarized in Table 1. The overall s-genitive production was 26.0% after an s-genitive prime and 5.1% after an of-genitive prime, yielding a 20.9% structural priming effect. More s-genitives were produced in the Same Head Noun condition than in the Different Head Noun condition (20.3% vs. 10.6%), while there was no difference in s-genitive production between the Easy Problem and Difficult Problem conditions (15.8% vs. 15.0%). The priming effect was 31.2% in the Same Head Noun condition and 10.2% in the Different Head Noun condition, resulting in a 21.1% lexical boost effect on structural priming. The difference in the priming effect between the Easy Problem and Difficult Problem conditions was negligible in both the Same Head Noun condition (1.1%) and the Different Head Noun condition (0.6%). The priming effect for each overlap x problem difficulty condition is illustrated in Fig 3.

Table 1. Proportions of s-genitive responses out of all the s-genitive and of-genitive responses for each prime condition*head noun condition combination in each experiment.

Experiment  Head noun condition  S-genitive  Of-genitive  Structure repetition
EXP1a       Same                 .360        .047         .313
EXP1a       Different            .157        .055         .102
EXP1b       Same                 .784        .040         .744
EXP1b       Different            .530        .056         .474
EXP2a       Same                 .404        .022         .382
EXP2a       Different            .204        .041         .163
EXP2b       Same                 .865        .019         .846
EXP2b       Different            .641        .028         .613

“S-genitive” and “Of-genitive” in the table header indicate the levels of the prime condition in Experiments 1a and 2a and the levels of the to-be-recalled structure in Experiments 1b and 2b.

“Structure repetition” in the last column refers to the structural priming effect in Experiments 1a and 2a and to the structure memory effect in Experiments 1b and 2b. It is the proportion of s-genitive responses in the s-genitive condition (column 3) minus that in the of-genitive condition (column 4).

Fig 3. The priming effect (s-genitive production in the s-genitive condition minus that in the of-genitive condition) as a function of head noun condition and problem difficulty in Experiment 1a and 1b.


Error bars reflect standard errors calculated for a by-participants analysis.

The participants’ responses were fitted with a generalized linear mixed model (GLMM), using the lme4 package (version 1.1.23) in R (version 3.4.0). The model predicted the logit-transformed likelihood of an s-genitive response. Prime condition, head noun condition, and problem difficulty were included in the model as fixed factors. These predictors were entered into the model in mean-centered form (deviation coding). We also included the critical trial number (normalized) as an independent variable, as a number of studies suggest a cumulative priming effect, such that the likelihood of producing the least frequent structure gradually increases over the course of the experiment toward the statistical distribution of the current language environment (e.g., [61, 62]). For this analysis (and all analyses thereafter), we employed the maximal random effects structure justified by the design [63]. Specifically, we included in the model by-subject and by-item random intercepts as well as random slopes for all main effects and interactions in the fixed part of the model. If the maximal model did not converge or showed singularity, we first dropped the random correlation terms. If the model without random correlations did not converge either, we dropped one random term at a time, starting from the most complex terms in the random part of the model. When there were multiple terms of the same complexity, we compared the variances of the random effects in the last model and dropped the term that accounted for the least variance. We repeated this step until the model converged and no warning of a singular fit was reported.
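The analysis scripts are available on OSF (link above); as a rough sketch of the kind of lme4 call this procedure describes, the code below uses hypothetical column names and should not be read as the authors’ exact script.

```r
# Sketch with hypothetical column names; the authors' actual scripts are on OSF.
library(lme4)

d <- read.csv("exp1a_responses.csv")  # hypothetical file of coded responses
center <- function(x) x - mean(x)     # deviation (mean-centered) coding

d$prime      <- center(ifelse(d$prime_structure == "s-genitive", 0.5, -0.5))
d$head_noun  <- center(ifelse(d$head_condition  == "same",       0.5, -0.5))
d$difficulty <- center(ifelse(d$problem         == "difficult",  0.5, -0.5))
d$trial_z    <- as.numeric(scale(d$critical_trial))   # normalized trial number

# Maximal random-effects structure justified by the design
m_max <- glmer(
  s_genitive ~ prime * head_noun * difficulty + trial_z +
    (1 + prime * head_noun * difficulty | subject) +
    (1 + prime * head_noun * difficulty | item),
  data = d, family = binomial
)

# If m_max fails to converge or is singular: drop the random correlations
# first, then remove random slopes one at a time (most complex terms first).
# The final model reported for Experiment 1a corresponds to:
m_final <- glmer(
  s_genitive ~ prime * head_noun * difficulty + trial_z +
    (1 + prime + head_noun || subject) +
    (1 + prime + head_noun + difficulty || item),
  data = d, family = binomial
)
summary(m_final)   # fixed-effect z-tests as reported in Table 2
```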

The final model included a random intercept, a random slope of prime condition, and a random slope of head noun condition for subjects, as well as a random intercept, a random slope of prime condition, a random slope of head noun condition, and a random slope of problem difficulty for items. The random correlations were dropped. We report the significance of the effects based on the fixed-effect estimates of the GLMM, because with contrast coding the fixed effects of the model are informative about the main effects and the interactions [64]. The summary of the fixed effects of the model is listed in Table 2. Alpha was set at .05. The significant negative intercept (pz < .001) indicates that, out of all the s-genitive and of-genitive responses, the proportion of s-genitives was significantly below 50%. As expected, we found a significant main effect of prime condition (pz < .001). The significant interaction between prime condition and head noun condition (pz < .001) demonstrated a clear lexical boost effect. Additionally, we found a significant main effect of critical trial number (pz < .001). The negative coefficient indicated that the overall likelihood of s-genitive production decreased over the course of the experiment. However, there was no significant two-way interaction between prime condition and problem difficulty (pz = .623), nor was there a three-way interaction among prime condition, head noun condition, and problem difficulty (pz = .280). Thus, there was no evidence that secondary task difficulty modulated structural priming or the lexical boost. We also found a significant main effect of head noun condition (pz = .011). Although this effect was not predicted, we argue that it might be a by-product of the lexical boost effect: the facilitatory effect of head noun overlap on the persistence of sentence structure may have increased the overall production of the least frequent structure (s-genitive), resulting in a higher likelihood of s-genitive production overall when there was head noun overlap. The remaining effects were not significant (pzs > .1).

Table 2. Fixed effect estimates (in log odds units), Experiment 1a.

Fixed effect  Estimate  SE  z  p-value
(Intercept) -3.157 0.343 -9.203 < .001
Prime condition 2.918 0.288 10.121 < .001
Head noun condition 0.659 0.260 2.537 .011
Problem difficulty 0.240 0.230 1.042 .298
Critical trial number -0.379 0.096 -3.949 < .001
Prime condition: Head noun condition 2.083 0.470 4.432 < .001
Prime condition: Problem difficulty -0.223 0.454 -0.491 .623
Head noun condition: Problem difficulty 0.691 0.454 1.522 .128
Prime condition: Head noun condition: Problem difficulty -0.968 0.896 -1.081 .280

Prime condition (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level) were in mean-centered form. Critical trial number was normalized.

There was no effect of cognitive load on structural priming. One possibility is that the arithmetic problems were generally too easy, even in the “difficult” condition, to exert a substantial cognitive load. However, considerable variation can be expected among our subjects and items, so there may have been a relatively strong load on a subset of the trials. In an exploratory analysis, we therefore considered the reaction times as a proxy for cognitive load in a subset of the data. We predicted that in the Difficult Problem condition, longer reaction times (indicating that attention was captured for longer) would be associated with a smaller subsequent priming effect (cf. the negative relation between processing time and recall accuracy [53]). We did not make the same prediction in the Easy Problem condition, since solving a simple arithmetic problem like “12 + 1” requires little attention [59]. We fitted a further model that predicted the likelihood of s-genitive production in the subset of data from the Difficult Problem condition, using prime condition, head noun condition, and processing time (normalized) as predictors. The final model included a random intercept for subjects as well as a random intercept, a random slope of prime condition, and a random slope of head noun condition for items. The random correlations were dropped. The summary of the fixed effects of the model is provided in S1 Appendix. The intercept and the main effect of prime condition were significant (pzs < .001). The interaction between prime condition and head noun condition was marginally significant (pz = .096). We now found a significant interaction between prime condition and processing time (pz = .006). The negative coefficient estimate indicated a negative relation between the two: the longer the reaction time in the secondary task, the smaller the subsequent priming effect. However, the three-way interaction among prime condition, head noun condition, and processing time was not significant (pz = .990). Unexpectedly, we found a significant main effect of processing time (pz = .019). The remaining effects were not significant (pzs > .1).
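A sketch of this exploratory model, under the same hypothetical column names as above, might look as follows; it restricts the data to the Difficult Problem trials and enters the normalized solution time as a continuous predictor.

```r
# Sketch (hypothetical column names): does longer problem-solving time in the
# Difficult Problem condition predict a smaller subsequent priming effect?
d_diff      <- subset(d, problem == "difficult")
d_diff$rt_z <- as.numeric(scale(d_diff$solve_rt))   # normalized processing time

m_rt <- glmer(
  s_genitive ~ prime * head_noun * rt_z +
    (1 | subject) +
    (1 + prime + head_noun || item),
  data = d_diff, family = binomial
)
summary(m_rt)  # a negative prime:rt_z estimate = less priming after longer
               # (more attention-demanding) problem solving
```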

Experiment 1b: Sentence structure memory

Method

Participants

Forty further Ghent University students participated in exchange for course credit (34 females and 6 males, average age 19.85 years). All participants were native Dutch speakers, reported normal color vision and right-handedness, and had normal or corrected-to-normal vision. The same Dutch speaker as in Experiment 1a acted as confederate.

Materials

The materials were the same as in Experiment 1a. Because the stimuli in Experiment 1b (and also Experiment 2b) did not serve as primes, we use the term ‘to-be-recalled sentence’ to refer to the sentences provided by the confederate.

Procedure

The procedure was similar to that of Experiment 1a, except that the participant and confederate were told that they would each perform an extra memory task in parallel with one of the tasks. At the beginning of the experiment, the experimenter assigned the structure memory task to the participant and the number memory task to the confederate, making it appear as if the tasks were randomly assigned. The participant was then told that in each trial he/she should memorize the sentence structure used by the confederate for picture description and reuse the same sentence structure when describing the picture in the same trial. The experimenter did not further explain what “sentence structure” referred to, in order to avoid exposing participants to explicit metalinguistic knowledge about syntax. However, if a participant failed to reuse the sentence structure (in this case an s-genitive or an of-genitive) in the practice session, the experimenter gave corrective feedback until he/she began to reuse sentence structures. Meanwhile, in order to balance the workload between participant and confederate, the confederate was instructed to memorize the parity (i.e., odd or even) of the number reported by the participant and to recall the parity in the subsequent task within the same trial. The participant and confederate familiarized themselves with the procedure in a five-trial practice session.

The program was set up so that the confederate always took the first turn and was synchronized between the participant and the confederate. At the beginning of each trial, the confederate started by “describing the pictures”. The participant pressed the button to judge whether the picture matched the description, while memorizing the sentence structure the confederate had used. Then an arithmetic problem appeared on the participant’s screen. The participant typed in the answer and then verbally reported it. After hearing the answer, the confederate judged the correctness of the answer, while memorizing the parity of the reported number. The participant then saw a picture on the screen and was instructed to describe it by reusing the memorized sentence structure. In the filler trials, there was no structure alternation, so the participant was expected to repeat an intransitive structure. The confederate judged whether the description matched the picture. Then, the confederate recalled the parity of the reported number by pressing “1” for odd numbers and “2” for even numbers. The recall task lasted 1000 ms. Finally, the confederate solved an arithmetic problem and verbally reported the number to the participant. The participant verified the reported result. The duration and other settings (including the timing of each task) were the same as in Experiment 1a.

Scoring

The scoring was the same as in Experiment 1a.

Results

Critical trials in which participants produced no response or an ‘other’ response in the picture description task were excluded (1.4% of the data). The final dataset contained 1893 target responses, among which were 663 s-genitive responses (35.0%) and 1230 of-genitive responses (65.0%).

In Experiment 1b, the average time occupied by problem solving in the Easy Problem condition was 1310 ms shorter than the average time of problem solving in the Difficult Problem condition (Easy: 1643 ms vs. Difficult: 2952 ms, Cohen’s d = -3.403), again clearly reflecting a difference in Problem difficulty. The descriptive data are reported in Table 1. The s-genitive production was 65.8% after an s-genitive to-be-recalled sentence and 4.8% after an of-genitive, yielding a 61.0% sentence structure memory effect. Again, the s-genitive production in the Same Head Noun condition was higher than that in the Different Head Noun condition (41.1% vs. 29.2%) whereas there was no difference in s-genitives between the Easy Problem condition and the Difficult Problem condition (35.3% vs. 35.1%). The structure memory effect was 74.5% in the Same Head Noun condition and 47.4% in the Different Head Noun condition, yielding a 27.0% head noun overlap effect on sentence structure memory. The difference between the recall performance in the Easy Problem and the Difficult Problem conditions was 1.4% in the Same Head Noun condition and 2.5% in the Different Head Noun condition. The recall performance in each head noun condition and problem difficulty condition is illustrated in Fig 3.

A GLMM predicting the likelihood of s-genitive production was fitted. Note that, alternatively, we could have taken the number of correct responses as the dependent variable. However, we decided to use the same dependent variable (the number of s-genitives) in all GLMMs in the current study in order to make the analyses of the priming and memory experiments comparable. The final model included a random intercept, a random slope of to-be-recalled structure, and a random slope of critical trial number for subjects as well as for items. The random correlations were dropped. The fixed effects are reported in Table 3. There was a significant intercept (pz < .001). In line with our expectation, there was a significant main effect of the to-be-recalled structure (pz < .001), indicating that speakers followed the instruction to use the s-genitive structure if the preceding to-be-recalled structure was an s-genitive (i.e., a structure memory effect). There was also a significant interaction between to-be-recalled structure and head noun condition (pz < .001), indicating a head noun overlap effect on the recall of s-genitive structures. However, there was no significant two-way interaction between to-be-recalled structure and problem difficulty (pz = .770), nor was there a three-way interaction among to-be-recalled structure, head noun condition, and problem difficulty (pz = .499). Unlike in Experiment 1a, there was no main effect of critical trial number (pz = .784). As in Experiment 1a, there was a main effect of head noun condition (pz < .001), which might again reflect that the head noun overlap effect on structure memory led to an overall increase in the least frequent structure (s-genitive) when there was head noun overlap. The remaining effects were not significant (pzs > .1).

Table 3. Fixed effect estimates (in log odds units), Experiment 1b.

Fixed effect  Estimate  SE  z  p-value
(Intercept) -1.692 0.279 -6.073 < .001
To-be-recalled structure 5.574 0.469 11.881 < .001
Head noun condition 0.674 0.190 3.547 < .001
Problem difficulty 0.053 0.189 0.283 .777
Critical trial number 0.033 0.121 0.274 .784
To-be-recalled structure: Head noun condition 2.242 0.381 5.883 < .001
To-be-recalled structure: Problem difficulty 0.110 0.377 0.293 .770
Head noun condition: Problem difficulty 0.301 0.383 0.787 .431
To-be-recalled structure: Head noun condition: Problem difficulty -0.510 0.755 -0.676 .499

To-be-recalled structure (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level) were in mean-centered form. Critical trial number was in normalized form.

As in Experiment 1a, we ran an exploratory analysis on the Difficult Problem condition that used processing time as a proxy for cognitive load. Again, a GLMM was fitted that predicted the likelihood of s-genitive production after a difficult secondary task. The final model included a random intercept, a random slope of to-be-recalled structure, and a random slope of head noun condition for subjects as well as a random intercept and a random slope of to-be-recalled structure for items. The summary of the fixed effects of the model is provided in S1 Appendix. There were a significant intercept and a significant main effect of the to-be-recalled structure (pzs < .001). This time there was no two-way interaction between to-be-recalled structure and processing time (pz = .852), nor was the three-way interaction among to-be-recalled structure, head noun condition, and processing time significant (pz = .112). There was also a main effect of head noun condition (pz = .015). The remaining effects were not significant (pzs > .1).

Cross-experiment analysis of structural priming effects and lexical boost effects in Experiment 1a and 1b

To further compare the magnitudes of the structural priming effects, lexical boost effects, and cognitive load effects between Experiments 1a and 1b, we fitted a GLMM that predicted the likelihood of s-genitive production across Experiments 1a and 1b. Prime condition (this factor corresponds to the prime condition in Experiment 1a and to the to-be-recalled structure in Experiment 1b), head noun condition, problem difficulty, and experiment were included in the model as fixed factors (all in mean-centered form). The final model included a random intercept, a random slope of prime condition, and a random slope of head noun condition for subjects as well as a random intercept, a random slope of prime condition, a random slope of head noun condition, and a random slope of experiment for items. The random correlations were dropped.

The summary of the fixed effects of the model is listed in Table 4. The intercept was significant (pz < .001). There was a main effect of experiment (pz < .001), indicating that the overall proportion of s-genitive was much higher in Experiment 1b than in Experiment 1a. The main effect of prime condition was significant (pz < .001) and the two-way interaction between prime condition and experiment was significant (pz < .001), suggesting that the effect of structure repetition in Experiment 1b was significantly stronger than that in Experiment 1a. There was a significant interaction between prime condition and head noun condition (pz < .001), but the three-way interaction among prime condition, head noun condition, and experiment was not significant (pz = .915). The two-way interaction between prime condition and problem difficulty and the three-way interaction among prime condition, problem difficulty and experiment were not significant (pzs > .1). The three-way interaction among prime condition, head noun condition, and problem difficulty as well as the four-way interaction among all four predictors were not significant (pzs > .1). There was also a main effect of head noun condition (pz < .001). Unexpectedly, there was a significant interaction between head noun condition and problem difficulty (pz = .048). The rest of the effects were not significant (pzs > .1).

Table 4. Fixed effect estimates (in log odds units), cross experiment analysis (1a and 1b).
Fixed effect  Estimate  SE  z  p-value
(Intercept) -2.323 0.211 -10.992 < .001
Prime condition 4.144 0.279 14.862 < .001
Head noun condition 0.648 0.153 4.224 < .001
Problem difficulty 0.134 0.142 0.946 .344
Experiment -1.518 0.414 -3.669 < .001
Prime condition: Head noun condition 2.101 0.288 7.295 < .001
Prime condition: Problem difficulty -0.041 0.283 -0.146 .884
Prime condition: Experiment -2.456 0.475 -5.174 < .001
Head noun condition: Problem difficulty 0.558 0.283 1.974 .048
Head noun condition: Experiment -0.039 0.289 -0.135 .892
Problem difficulty: Experiment 0.230 0.287 0.803 .422
Prime condition: Head noun condition: Problem difficulty -0.693 0.565 -1.227 .220
Prime condition: Head noun condition: Experiment -0.062 0.582 -0.106 .915
Prime condition: Problem difficulty: Experiment -0.487 0.573 -0.849 .396
Head noun condition: Problem difficulty: Experiment 0.532 0.571 0.932 .351
Prime condition: Head noun condition: Problem difficulty: Experiment -0.730 1.137 -0.642 .521

Prime condition (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level), experiment (Experiment 1b as the baseline level) were in mean-centered form.

In addition, the theoretically interesting interactions that involved the contrast between the two experiments were examined by estimating the inverse Bayes factor (BF10) using the Bayesian Information Criterion (BIC). This Bayes factor compares the fit of the data under the alternative hypothesis (i.e., a model with experiment as an interaction term) to the fit under the null hypothesis (i.e., a model without experiment as an interaction term). Based on the standard interpretation of inverse Bayes factors as evidence for the alternative hypothesis [65], a BF10 between 1 and 3 can be taken as weak evidence for the alternative hypothesis. The higher the BF10, the more evidence in support of the alternative hypothesis (3–20: positive evidence; 20–150: strong evidence; > 150: very strong evidence).
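The computation itself is not spelled out in the text, but the standard BIC-based approximation of the Bayes factor, BF10 ≈ exp((BIC_null − BIC_alternative) / 2), is easy to illustrate. The snippet below is a minimal sketch with hypothetical BIC values; the helper name is ours.

```python
# Sketch of a BIC-based Bayes factor approximation (hypothetical BIC values).
import math

def bf10_from_bic(bic_null: float, bic_alt: float) -> float:
    """Approximate BF10 (evidence for the alternative over the null)
    as exp((BIC_null - BIC_alt) / 2)."""
    return math.exp((bic_null - bic_alt) / 2.0)

# Example: the alternative model (with the experiment interaction) has a
# lower BIC than the null model, yielding evidence for the alternative.
bic_null, bic_alt = 2510.4, 2498.1        # hypothetical values
print(bf10_from_bic(bic_null, bic_alt))   # ~469; BF10 > 150 = very strong evidence
```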

In the Bayesian analysis, we found an estimated BF10 for the two-way interaction between prime condition and experiment that suggested the data were 1017 times more likely to occur under a model including the two-way interaction than under a model without it. There was thus very strong evidence for the alternative hypothesis, namely that there was a difference in the effects of structure repetition between the two experiments. An estimated BF10 for the three-way interaction among prime condition, head noun condition, and experiment indicated that the data were only 0.017 times more likely to occur under a model including the three-way interaction than under a model without it. This suggested strong evidence against the alternative hypothesis that the magnitude of the head noun overlap effect on structure repetition was different between the two experiments. Similarly, strong evidence against the alternative hypothesis was also found for the three-way interaction among prime condition, problem difficulty, and experiment (BF10 = 0.023) and for the four-way interaction among prime condition, head noun condition, problem difficulty, and experiment (BF10 = 0.020). Taken together, the cross-experiment analysis showed that while the overall proportion of the s-genitive responses and the tendency to repeat a previously experienced structure were significantly different between Experiment 1a and 1b, the magnitude of the head noun overlap effect (and other interactions under test) was not different between the two experiments.

Discussion of Experiment 1a and Experiment 1b

In Experiment 1a, we found a structural priming effect in dialogue. Specifically, the likelihood that a speaker spontaneously produced an s-genitive sentence was higher after an s-genitive prime than after an of-genitive prime (21.2% priming effect). Structural priming survived a non-linguistic intervening task that mostly taps into cognitive processes independent of language processing. These findings, in combination with previous findings of long-term structural priming effects over linguistic intervening tasks (e.g., [14, 23, 24, 66]), suggest that the structural priming effect can persist over at least one intervening task. Additionally, we found a significant effect of the to-be-recalled structure (60.9%) in Experiment 1b, demonstrating that the participants typically reused the sentence structure as instructed. However, the participants did not perform perfectly in the memory experiment: the overall accuracy of sentence structure memory was 80.5%, indicating that misrecall occurred regularly. This is consistent with the findings of Bernolet et al. [29], in which the accuracy of immediate structure recall ranged from 73.3% to 89.6%.

One possible locus of the syntactic persistence in both experiments is that speakers memorized the gendered possessive pronouns in the s-genitives (e.g., zijn, haar) and used these function words as a pointer to guide sentence structure formulation in the production task. If so, we would expect the structural priming (and memory) effect to be larger when the gender of the possessive pronouns was consistent between prime and target sentences. However, adding the interaction between prime structure and gender consistency did not improve the fit of the GLM model for the full data set in either Experiment 1a (χ2 = 0.16, df = 1, p = .900) or Experiment 1b (χ2 = 0.416, df = 1, p = .519). Thus, it is unlikely that the priming effect in Experiment 1a and the memory effect in Experiment 1b were driven by a sentence formulation process guided by the retrieval of the primed possessive pronouns.

As expected, there was an effect of head noun overlap on structural priming (21.1% lexical boost effect). This lexical boost effect with Dutch genitives is in line with the numerous studies that observed lexical overlap as a modulating factor of structural priming (e.g., [12, 18]). More importantly, we found a similar effect of lexical overlap on sentence structure recall (27.0% lexical overlap effect). Despite the evident difference in magnitude between the structural priming effect and the sentence structure memory effect, the boost caused by lexical overlap was very similar in the two experiments. This suggests that the extent to which lexical overlap facilitates spontaneous syntactic repetition is comparable to the extent to which it enhances the retrieval of sentence structure from explicit memory.

One unexpected result in the priming experiment was a negative correlation between the critical trial number and the likelihood of s-genitive production. In contrast with the prediction of the implicit learning account of structural priming that the likelihood of the least frequent structure would increase over the course of the experiment, the number of s-genitive sentences decreased with increasing trial number. We will briefly discuss this cumulative effect in the General Discussion.

In both experiments, the average time needed to process a difficult problem was significantly longer than the time needed for an easy problem, indicating that the difficult problems exerted substantial cognitive load on the participants. We did not find direct evidence for an effect of secondary task difficulty on structural priming. Nevertheless, the negative correlation between the structural priming effect and the time spent on the arithmetic task in the Difficult Problem condition suggests that the magnitude of priming can be affected by the cognitive load experienced between prime and target processing. Furthermore, the average chance of successful s-genitive structure repetition was close to chance level in the Different Head Noun condition (52.6%), which, for a recall task, was unexpectedly low. This led us to further suspect that the memory traces of sentence structure can easily dissipate when speakers process the secondary task, regardless of the difficulty of the task (i.e., an across-the-board load effect). Given these results, we predicted that the secondary task imposed a cognitive load on maintaining the memory traces of sentence structure, thus reducing the priming effect and hindering sentence recall, and that a considerably more difficult secondary task might lead to a further reduction of priming and recall. We thus designed two further experiments (Experiments 2a and 2b) with a more difficult secondary task, namely addition problems that involved carry operations. Another change, made for practical reasons, was that instead of having an on-site confederate interact face-to-face with the participants, we used recordings of the confederate from Experiments 1a and 1b as the prime stimuli in the picture verification task.

Experiment 2a: Structural priming

Method

Participants

Forty-eight further Ghent University students participated in exchange for course credit (38 females and 10 males, average age 18.56 years). All participants were native Dutch speakers, reported to be non-color-blind and right-handed, and had normal or corrected-to-normal vision.

Materials

The materials were similar to those of Experiment 1a. Two major changes were made. First, the difficulty of the arithmetic problems in the difficult condition was increased. Ninety-six addition problems were constructed for the critical trials of the participants' arithmetic problem solving task. The problems were similar to those of Experiment 1a, but the addition in the Difficult problems now always involved carrying. The unit digits of all the two-digit addends in the Difficult problems ranged from 3 to 8.
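To make these constraints concrete, the sketch below generates addition problems of the kind described for the Difficult condition: two-digit addends whose unit digits lie between 3 and 8 and whose unit digits sum to more than 9, so that a carry is always required. This is only an illustration of the stated constraints, not the authors' stimulus-construction script; the range of the tens digits is an assumption made for the example.

```python
# Illustrative generator (not the authors' script) for Difficult-condition
# addition problems: two-digit addends, unit digits between 3 and 8, and a
# unit-digit sum above 9 so that the addition always involves a carry.
# The tens-digit range (1-9) is an assumption made for this example.
import random

def make_carry_problem(rng: random.Random) -> tuple[int, int]:
    while True:
        a_units, b_units = rng.randint(3, 8), rng.randint(3, 8)
        if a_units + b_units > 9:                      # force a carry operation
            a = rng.randint(1, 9) * 10 + a_units
            b = rng.randint(1, 9) * 10 + b_units
            return a, b

rng = random.Random(42)
problems = [make_carry_problem(rng) for _ in range(96)]  # 96 critical problems
for a, b in problems[:3]:
    print(f"{a} + {b} = ?")
```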

Second, instead of scripted sentences read on-site by the confederate, we used 240 audio clips recorded during Experiment 1a as prime stimuli. This made the priming manipulation very similar to that of Experiment 1a (the same sentences, spoken by the same confederate in a near-identical experimental context) but without the need for the confederate to be present. Each clip contained one prime sentence uttered by the confederate. The prime sentence set contained 192 critical prime sentences and 48 filler sentences. The duration of each audio clip was 2000 ms. The intensity of the clips was normalized to 75 dB. The pictures and prime sentences were the same as those of Experiment 1a.

Procedure

The procedure was very similar to that of Experiment 1a, but with a few minor changes. First, participants now used headphones to listen to prime sentences. Second, as we did not employ an on-site interlocutor, we no longer included tasks for the confederate (solving a mathematical problem, judging the correctness of the participants’ problem solving, and judging the matching of the pictures). Third, as the difficulty of the secondary task in the Difficult Problem condition was increased, we prolonged the duration of the arithmetic problem solving task from 4500 ms to 5000 ms.

The participants were told that they would perform a series of tasks. In the picture verification task, they would hear descriptions of pictures from a previous participant and judge whether each description matched the picture on the screen. In the arithmetic problem solving task, they should solve the mathematical problem and type in the answer as fast and accurately as possible. At the beginning of the experiment, the participants adjusted the headphones to a comfortable position without removing them from their ears. Next, the participants completed five practice trials.

At the beginning of each trial, the participant heard, via headphones, an utterance that described a picture and made a judgment. The picture judgment task lasted 4000 ms. Then an arithmetic problem appeared on the participant's screen and the participant typed in the answer; this task lasted 5000 ms. Finally, the description task lasted 6500 ms. The experiment took about 30 minutes.

Scoring

The scoring was the same as in Experiment 1a.

Results

Critical trials in which participants produced no response or an ‘other’ response in the picture description task were excluded (2.2% of the data). The final dataset contained 2252 target responses, among which were 370 s-genitive responses (16.4%) and 1882 of-genitive responses (83.6%).

In Experiment 2a, the average time occupied by problem solving in the Easy Problem condition was 1937 ms shorter than the average problem solving time in the Difficult Problem condition (Easy: 1444 ms vs. Difficult: 3380 ms, Cohen's d = -4.983), indicating an evident difference of cognitive load between levels that was descriptively much larger than in Experiment 1a. The descriptive data are illustrated in Table 1. The s-genitive production was 30.6% after an s-genitive prime and 3.2% after an of-genitive prime, yielding a 27.5% structure repetition effect. The s-genitive production in the Same Head Noun condition was higher than that in the Different Head Noun condition (21.2% vs. 12.2%), whereas there was no difference in s-genitives between the Easy Problem condition and the Difficult Problem condition (16.5% vs. 16.9%). The structural priming effect was 38.2% in the Same Head Noun condition and 16.3% in the Different Head Noun condition; thus there was a 21.9% head noun overlap effect on structural priming. Descriptively, there was 3.8% more priming in the Easy Problem than in the Difficult Problem condition. This difference amounted to 3.2% in the Same Head Noun conditions and 4.7% in the Different Head Noun conditions (Fig 4).
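As a worked example of how these descriptive effects follow from the condition means reported above, the short sketch below recomputes the overall priming effect and the head noun overlap (lexical boost) effect from the cell percentages given in the text; the small discrepancy with the reported 27.5% reflects rounding in the reported condition means.

```python
# Worked example: deriving the descriptive effects reported for Experiment 2a
# from the condition percentages given in the text (values in percent).
s_gen_after_s_prime  = 30.6   # % s-genitive responses after an s-genitive prime
s_gen_after_of_prime = 3.2    # % s-genitive responses after an of-genitive prime
priming_same_head    = 38.2   # priming effect, Same Head Noun condition
priming_diff_head    = 16.3   # priming effect, Different Head Noun condition

overall_priming = s_gen_after_s_prime - s_gen_after_of_prime
lexical_boost   = priming_same_head - priming_diff_head

print(f"overall priming effect: {overall_priming:.1f}%")           # 27.4% (reported as 27.5%)
print(f"head noun overlap (lexical boost): {lexical_boost:.1f}%")   # 21.9%
```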

Fig 4. The priming effect (s-genitive production in s-genitive condition minus that in of-genitive condition) as a function of head noun condition and problem difficulty in Experiment 2a and 2b.


Error bars reflect standard errors calculated for a by-participants analysis.

We fitted a GLM model that predicts the likelihood of s-genitive production. The model was fitted in the same way as in Experiment 1a. The final model included a random intercept, a random slope of prime condition, a random slope of head noun condition, and a random slope of critical trial number for subjects as well as a random intercept, a random slope of prime condition, and a random slope of head noun condition for items. The random correlations were dropped. The summary of the fixed effects of the model is listed in Table 5. There were a significant intercept, a significant main effect of prime condition, and a significant interaction between prime condition and head noun condition (pzs < .001). Importantly, there was a significant two-way interaction between prime condition and problem difficulty (pz = .006) as well as a marginal three-way interaction among prime condition, head noun condition, and problem difficulty (pz = .071). In addition, there was a main effect of critical trial number (pz < .001); the negative coefficient indicated a decrease in s-genitive production toward the end of the experiment. Unexpectedly, we also found a significant main effect of problem difficulty (pz = .035) and a significant two-way interaction between problem difficulty and head noun condition (pz = .012). There was no clear theoretical reason for these effects, and similar effects were not found in any other experiment. We refrain from speculating about these unexpected findings, which might well be the result of Type I errors. The rest of the effects were not significant (pzs > .1).

Table 5. Fixed effect estimates (in log odds units), Experiment 2a.

Predictor Estimate SE z p-value
(Intercept) -3.632 0.346 -10.506 < .001
Prime condition 4.108 0.423 9.713 < .001
Head noun condition 0.196 0.309 0.633 .527
Problem difficulty -0.611 0.289 -2.113 .035
Critical trial number -0.610 0.154 -3.951 < .001
Prime condition: Head noun condition 3.001 0.588 5.102 < .001
Prime condition: Problem difficulty 1.571 0.577 2.725 .006
Head noun condition: Problem difficulty -1.452 0.578 -2.512 .012
Prime condition: Head noun condition: Problem difficulty 2.081 1.152 1.807 .071

Prime condition (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level) were in mean-centered form. Critical trial number was normalized.

Experiment 2b: Sentence structure memory

Method

Participants

Forty-eight further Ghent University students participated in exchange for course credit (39 females and 9 males, average age 19.06 years). All participants were native Dutch speakers, reported to be non-color-blind and right-handed, and had normal or corrected-to-normal vision.

Materials

The materials were the same as those of Experiment 2a.

Procedure

The procedure was similar to that of Experiment 2a. The only change was that the participants were told to memorize the sentence structure used in the audio clip in order to reuse the same structure in the subsequent picture description task. In contrast to Experiment 1b, no parity recall task was presented, as this task had been performed only by the confederate in Experiment 1b.

Scoring

The scoring was the same as in Experiment 1a.

Results

Critical trials in which participants produced no response or an ‘other’ response in the picture description task were excluded (0.7% of the data). The final dataset contained 2287 target responses, among which were 823 s-genitive responses (38.9%) and 1287 of-genitive responses (61.1%).

In Experiment 2b, the average time occupied by problem solving in the Easy Problem condition was 1923 ms shorter than the average problem solving time in the Difficult Problem condition (Easy: 1427 ms vs. Difficult: 3350 ms, Cohen's d = -5.690). Again, there was an evident difference between cognitive load conditions, which was descriptively much larger than in Experiment 1b. The descriptive data on the proportion of s-genitive responses are reported in Table 1. The s-genitive production was 75.4% after an s-genitive to-be-recalled sentence and 2.3% after an of-genitive, yielding a 73.0% sentence structure memory effect. Again, the s-genitive production in the Same Head Noun condition was higher than that in the Different Head Noun condition (44.2% vs. 33.4%), whereas there was no difference in s-genitives between the Easy Problem condition and the Difficult Problem condition (39.4% vs. 38.2%). The structure memory effect was 84.6% in the Same Head Noun condition and 61.4% in the Different Head Noun condition; thus there was a 23.2% head noun overlap effect on sentence structure memory. Descriptively, sentence structure recall was better after easy problems than after difficult problems. This difference amounted to 7.1% in the Same Head Noun conditions and 2.3% in the Different Head Noun conditions (Fig 4).

Again, we fitted a GLM model that predicts the likelihood of s-genitive production. The final model included a random intercept, a random slope of to-be-recalled structure, and a random slope of critical trial number for subjects as well as a random intercept, a random slope of to-be-recalled structure, a random slope of problem difficulty, and a random slope of critical trial number for items. The random correlations were dropped. The summary of the fixed effects of the model is listed in Table 6. There were a significant intercept, a significant main effect of the to-be-recalled structure, and a significant interaction between to-be-recalled structure and head noun condition (pzs < .001). Similar to Experiment 2a, there was also a significant two-way interaction between the to-be-recalled structure and the problem difficulty (pz = .015). There was a significant three-way interaction among to-be-recalled structure, head noun condition, and problem difficulty (pz = .037). The main effect of the head noun condition was once again significant (pz < .001). In addition, we found a main effect of critical trial number (pz = .048), which had a positive slope. The rest of the effects were not significant (pzs > .1).

Table 6. Fixed effect estimates (in log odds units), Experiment 2b.

Predictor Estimate SE z p-value
(Intercept) -1.861 0.317 -5.861 < .001
To-be-recalled structure 7.823 0.598 13.092 < .001
Head noun condition 0.925 0.238 3.884 < .001
Problem difficulty -0.017 0.233 -0.074 .941
Critical trial number 0.283 0.143 1.978 .048
To-be-recalled structure: Head noun condition 2.853 0.475 6.007 < .001
To-be-recalled structure: Problem difficulty 1.121 0.460 2.439 .015
Head noun condition: Problem difficulty -0.095 0.464 -0.204 .839
To-be-recalled structure: Head noun condition: Problem difficulty 1.934 0.930 2.081 .037

To-be-recalled structure (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level) were in mean-centered form. Critical trial number was in normalized form.

To further explore the three-way interaction among to-be-recalled structure, head noun condition, and problem difficulty, we divided the dataset by head noun condition and fitted one GLM model for each subset. Each GLM model predicted the likelihood of s-genitive production, with the to-be-recalled structure and the problem difficulty (both in mean-centered form) as predictors. The summary of the fixed effects of the models is listed in S1 Appendix. The final model for the Same Head Noun subset included a random intercept for subjects. The random correlation was dropped. The intercept was significant (pz = .001). We found a significant main effect of the to-be-recalled structure (pz < .001) and a significant two-way interaction between to-be-recalled structure and problem difficulty (pz = .012). The rest of the effects were not significant (pzs > .1). The final model for the Different Head Noun subset included a random intercept and a random slope of to-be-recalled structure for subjects as well as a random intercept, a random slope of to-be-recalled structure, and a random slope of problem difficulty for items. The intercept was significant (pz < .001). We found a significant main effect of the to-be-recalled structure (pz < .001), but no significant two-way interaction between to-be-recalled structure and problem difficulty (pz = .712). The rest of the effects were not significant (pzs > .1).
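For illustration, the sketch below shows the shape of such a subset (simple effects) analysis in Python: the trial-level data are split by head noun condition and a separate logistic model is fitted in each subset, after which the structure-by-difficulty interaction is inspected within each half. As in the earlier sketch, this is a simplified fixed-effects analogue with invented column names and simulated data, not the mixed-effects models reported here.

```python
# Sketch of a subset (simple effects) analysis: split the trial-level data by
# head noun condition and fit a separate logistic model per subset.
# Column names and the simulated data are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "s_genitive": rng.integers(0, 2, n),   # 1 = s-genitive response
    "prime":      rng.integers(0, 2, n),   # 1 = s-genitive to-be-recalled structure
    "head_noun":  rng.integers(0, 2, n),   # 1 = Same Head Noun
    "difficulty": rng.integers(0, 2, n),   # 1 = Easy Problem
})
for col in ["prime", "difficulty"]:
    df[col + "_c"] = df[col] - df[col].mean()

for label, level in [("Same Head Noun", 1), ("Different Head Noun", 0)]:
    subset = df[df["head_noun"] == level]
    fit = smf.logit("s_genitive ~ prime_c * difficulty_c", data=subset).fit(disp=False)
    # Inspect the structure x problem difficulty interaction within this subset.
    print(label, fit.params["prime_c:difficulty_c"], fit.pvalues["prime_c:difficulty_c"])
```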

Cross-experiment analysis of structural priming effects and lexical boost effects in Experiment 2a and 2b

To further compare the magnitude of structural priming effects and lexical boost effects between Experiments 2a and 2b, we fitted a GLM model that predicts the likelihood of s-genitive production in Experiments 2a-b. The fixed factors were the same as the ones in the cross-experiment analysis for Experiments 1a-b. The final model included a random intercept, a random slope of prime condition, and a random slope of head noun condition for subjects as well as a random intercept, a random slope of prime condition, a random slope of head noun condition, and a random slope of experiment for items.

The summary of the fixed effects of the model is listed in Table 7. The intercept was significant (pz < .001). Again, there was a main effect of experiment (pz < .001). The main effect of prime condition was significant and the two-way interaction between prime condition and experiment was significant (pzs < .001). There was again a significant interaction between prime condition and head noun condition (pz < .001), but the three-way interaction among prime condition, head noun condition, and experiment was not significant (pz = .925). The interaction between prime condition and problem difficulty across Experiments 2a and 2b was significant (pz < .001), and the three-way interaction among prime condition, problem difficulty, and experiment was not significant (pz = .491). The three-way interaction among prime condition, head noun condition, and problem difficulty was significant (pz = .010), but the interaction among all four predictors was not significant. There was also a significant main effect of head noun condition (pz = .012). Unexpectedly, there was a marginal main effect of problem difficulty (pz = .077), a significant interaction between head noun condition and problem difficulty (pz = .029), and a marginally significant interaction among head noun condition, problem difficulty, and experiment (pz = .097). The rest of the effects were not significant (pzs > .1).

Table 7. Fixed effect estimates (in log odds units), cross experiment analysis (2a and 2b).
Predictor Estimate SE z p-value
(Intercept) -2.603 0.225 -11.553 < .001
Prime condition 5.643 0.348 16.205 < .001
Head noun condition 0.498 0.199 2.503 .012
Problem difficulty -0.321 0.181 -1.770 .077
Experiment -1.640 0.445 -3.686 < .001
Prime condition: Head noun condition 2.838 0.370 7.668 < .001
Prime condition: Problem difficulty 1.286 0.363 3.548 < .001
Prime condition: Experiment -3.454 0.590 -5.851 < .001
Head noun condition: Problem difficulty -0.792 0.362 -2.188 .029
Head noun condition: Experiment -0.581 0.375 -1.550 .121
Problem difficulty: Experiment -0.576 0.363 -1.586 .113
Prime condition: Head noun condition: Problem difficulty 1.870 0.725 2.581 .010
Prime condition: Head noun condition: Experiment 0.070 0.737 0.095 .925
Prime condition: Problem difficulty: Experiment 0.499 0.724 0.689 .491
Head noun condition: Problem difficulty: Experiment -1.204 0.727 -1.657 .097
Prime condition: Head noun condition: Problem difficulty: Experiment 0.247 1.446 0.171 .864

Prime condition (Of-genitive as the baseline level), head noun condition (Different Head Noun as the baseline level), problem difficulty (Difficult Problem as the baseline level), experiment (Experiment 2b as the baseline level) were in mean-centered form.

In the Bayesian analysis, we found an estimated BF10 for the two-way interaction between prime condition and experiment that suggested the data were 460 times more likely to occur under a model including the two-way interaction than under a model without it. There was thus very strong evidence for the alternative hypothesis that there was a difference in the effects of structure repetition between the two experiments. An estimated BF10 for the three-way interaction among prime condition, head noun condition, and experiment indicated that the data were only 0.015 times more likely to occur under a model that includes the three-way interaction than under a model without it. This suggested strong evidence against the alternative hypothesis that the magnitude of the head noun overlap effect on structure repetition was different between the two experiments. Similarly, strong evidence against the alternative hypothesis was also found for the three-way interaction among prime condition, problem difficulty, and experiment (BF10 = 0.018) and for the four-way interaction among prime condition, head noun condition, problem difficulty, and experiment (BF10 = 0.015). Once again, in a cross-experiment analysis of Experiments 2a and 2b, we found that whereas the structure memory experiment showed a larger overall proportion of s-genitive responses and a greater tendency toward structure repetition than the structural priming experiment, the effect of head noun overlap on structure repetition, as well as the other interactions under test, did not differ between the two experiments.

Discussion of Experiment 2a and 2b

In Experiments 2a and 2b, we replicated the findings of Experiments 1a and 1b and obtained clearer evidence for a cognitive load effect on structural priming. We used a more difficult secondary task, which, as we expected, considerably enlarged the difference in reaction times between difficulty conditions (roughly 1300 ms in Experiments 1a-b vs. roughly 1900 ms in Experiments 2a-b). There were significant effects of sentence structure in both the structural priming experiment (27.0%) and the structure memory experiment (73.1%). Once again, despite the evident difference in the effects of structure repetition, we observed comparable effects of lexical overlap in the structural priming experiment and the structure memory experiment. The comparable lexical overlap effect between structural priming and sentence structure memory supports the prediction from the multifactorial account that the lexical boost effect is driven by a short-term explicit memory mechanism that can be magnified by lexical cueing.

In Experiments 2a and 2b, we once again found that processing a difficult problem was much slower than processing an easy problem, indicating that solving carry problems taxed a considerable amount of cognitive resources. Most importantly, we found that the effect of structural priming was reduced when the target trial was preceded by a difficult secondary task. The maintenance of the memory traces of sentence structure was a function of the difficulty of the secondary task (which was directly correlated with the time that speakers' attention was switched away for concurrent processing), and so was the effect of structural priming. Because the duration of the problem solving task was controlled, our study is, to our knowledge, the first to show a cognitive load effect on sentence structure memory within a fixed time window. This indicates that time lag is not the only factor modulating memory decay in structural priming; decay is also subject to speakers' limited capacity to maintain the memory traces of sentence structure.

In Experiment 2a, there was a marginally significant three-way interaction among prime condition, head noun condition, and problem difficulty. However, the numerical difference in structural priming effects between problem difficulties in the Same Head Noun condition (3.2%) was very similar to that in the Different Head Noun condition (4.7%), so this interaction should be interpreted with caution. The corresponding three-way interaction was significant in Experiment 2b. The subset analyses showed that the to-be-recalled structure x problem difficulty interaction was significant in the Same Head Noun condition, but not in the Different Head Noun condition.

The significant effect of critical trial number was replicated in Experiment 2a: the likelihood of s-genitive production decreased over the course of the experiment. In contrast, in the memory experiment (Experiment 2b), the likelihood of s-genitive production increased as the experiment progressed, possibly indicating a practice effect. Participants in the sentence structure memory experiment performed the tasks with the additional goal of accurately repeating the previously experienced sentence structure. It is possible that the accuracy of structure repetition increased as the participants became familiarized with the task, resulting in an increase in the production of the least frequent structure (s-genitive) over the course of the experiment.

Solving an arithmetic problem mainly taps into the components of the working memory system that assign attentional resources as well as the components that store information ([60, 67]). Some authors have argued that solving two-digit arithmetic problems may involve phonological rehearsal (see [68] for a review), especially when an extra carry operation is involved [69]. It is therefore possible that in the current experiments, arithmetic problem solving exerted a load not only on the assignment of attention but also on the phonological short-term memory of linguistic representations. However, such a phonological account would predict stronger effects of load in our sentence memory experiments (which involved the explicit instruction to memorize sentence structure and might have involved phonological rehearsal) than in our priming experiments (which involved no such instruction). Yet the cognitive load effect was comparable across the priming and memory experiments. But even if the phonological account were correct, it would not undermine our hypothesis that structural priming and sentence structure recall are constrained by limited memory capacity.

General discussion

In two pairs of experiments, we demonstrated the effects of lexical overlap and cognitive load on structural priming and sentence structure retrieval. In Experiments 1a and 2a, we found stronger structural priming when the head noun overlapped between prime and target, which replicated previous findings of a lexical boost effect on structural priming. The same pattern of effects was also found in Experiments 1b and 2b: The correct recall of the target structure was increased when the head noun was repeated from the encoded sentence. Although the effect of sentence structure was clearly much larger in the sentence structure memory experiments than in the structural priming experiments, the effects of head noun overlap on structure repetition in each pair were comparable. The experiments further showed that the priming effects interacted with cognitive load. In the difficult problem condition of Experiment 1a, the priming effect decreased when the processing time between prime and target increased. More importantly, Experiments 2a-2b demonstrated an effect of cognitive load on both structural priming and sentence memory: there was less priming and poorer recall when the addition problems were more difficult. Below we first discuss the lexical effects on priming and memory, followed by the cognitive load effects. We end with a discussion about the mechanisms of structural priming.

Lexical cueing effect on sentence structure recall and structural priming

In our sentence structure memory experiments, we set out to test one of the preconditions of the explicit memory hypothesis of structural priming, namely that the memory retrieval of sentence structure can be facilitated by lexical overlap. In both memory experiments, we found that speakers indeed correctly recalled the sentence structure more often when the head noun of the target task repeated that of the prime task. The current study is, to our knowledge, the first to find an effect of lexical overlap on sentence structure recall. Given that in the current study and previous studies (e.g., [29, 39]) the chance of structural repetition in sentence structure memory experiments was substantially higher than that in structural priming, it is reasonable to assume that sentence structure recall is driven first and foremost by explicit memory of the sentence (and not, for instance, by a process by which the participant bases retrieval on whether a structure is primed). Therefore, the facilitation effect of lexical overlap on sentence recall implies that speakers made use of the repeated lexical item to help them retrieve the representations with which it was associated. This lexical cueing effect on sentence structure retrieval converges with previous findings of lexical cueing effects on lexical retrieval (e.g., [46, 50]) and on the retrieval of sentential information (e.g., [45, 47]).

These findings provide an important constraint for theoretical accounts of structural priming. They support one precondition of the explicit memory hypothesis of the lexical boost, namely that lexical overlap facilitates the recall of sentence structures. In line with the assumptions of the multifactorial account of structural priming (e.g., [13, 14]), we argue that this facilitation is most likely a cue-based memory retrieval effect. The retrieval of lexical items interacts with the retrieval of sentence structures in such a way that retrieving the items from the encoded sentence reinforces the retrieval of the sentence structure that was strongly associated with these items during sentence encoding.

In line with previous findings on the effect of verb-particle overlap on idiom priming [32] and the cumulative lexical boost effect [35], we speculate that syntactic information is stored in explicit memory in the form of lexicalized chunks. This interweaving of lexical and syntactic information might explain the substantial modulating effect of lexical overlap on the memory retrieval of syntactic chunks.

The current study further showed that a similar effect of lexical overlap is also found on structural priming (i.e., the lexical boost effect). The priming and memory experiments were the same in all respects except for the extent to which explicit memory was taxed. Clearly, the memory experiments required the speakers to retrieve the syntactic structure from short-term explicit memory, whereas in the priming experiments speakers were not asked to retrieve the structural information of the sentence. Thus, the most plausible explanation for the resemblance between the lexical overlap effects in the two types of experiments is that structural priming shares a similar cue-dependent memory retrieval process with sentence structure memory retrieval.

Limited memory capacity in the persistence of sentence structure

The current study supports the assumption that an attention-demanding short-term memory process contributes to the structural priming effect. First, in the sentence structure memory experiment with a more difficult secondary task (Experiment 2b), we found that the chance of successfully recalling a target structure in the Same Head Noun condition was reduced when the production task was preceded by a difficult arithmetic problem. This is compatible with our prediction that the memory traces of lexical-syntactic information suffer from the detrimental effect of a secondary task that demands longer processing time between memory encoding and retrieval. Thus, the finding supports the view that the maintenance of the memory traces of sentence structure is constrained by limited attentional capacity.

Second, the results further showed a similar effect of cognitive load on structural priming: the general priming effect was reduced by a more attention-consuming secondary task (Experiment 2a). In Experiment 1a, although no significant effect of task difficulty was found, we demonstrated that the priming effect was reduced when speakers' processing time for a difficult problem increased. This implied that there might be a negative association between priming and cognitive load. In Experiment 2a, a load effect on structural priming manifested itself once an extra carry operation was involved in the difficult problems. To our knowledge, the current study might be the first to show a cognitive load effect on structural priming in healthy speakers. The cognitive load effect on structural priming indicates that, similar to the recall of sentence structure, a short-term memory mechanism contributes to the persistent effect of the prime structure. Such a mechanism is contingent upon the limited capacity of speakers to maintain memory traces of sentence structure: the longer attention is switched away for concurrent processing, the more the memory traces of sentence structure decay.

The results of Experiment 2a suggested that the detrimental effects exerted by cognitive load were numerically similar between the two lexical conditions, so we cannot conclude that the cognitive load effect on structural priming is lexical-dependent. It is somewhat surprising that the cognitive load effect on structural priming was not predominantly lexical-specific. Nevertheless, as Bernolet and colleagues [29] suggested that explicit memory processes may also occur in lexical-independent priming, it is reasonable that a higher cognitive load yielded larger interfering effects on the maintenance of sentence structures even when the head nouns were not repeated.

Furthermore, the lexical boost might be driven by the availability of the primed lexical item stored in explicit memory [27]. As memory for the content of a sentence is more robust than memory for its form [37], it is possible that after an intervening task the memory of the primed words stays relatively robust, whereas the memory of sentence structure is vulnerable to the load manipulation. Thus, it is conceivable for a difficult secondary task to be detrimental to general structural priming while the lexical boost effect remains unaffected. In a sentence memory experiment, by contrast, speakers primarily maintained the memory traces of sentence structure, so that relatively fewer attentional resources could be assigned to the maintenance of the lexical traces, including the primed head noun. As a result, in the memory tasks the lexical-specific memory of the sentence structure might be more susceptible to the load effect.

Taken together, the current findings are consistent with previous studies on the explicit memory processes in structural priming, in that an explicit memory-related process contributes to both lexical-dependent structural priming ([14, 33, 36]) and lexical-independent structural priming [29].

Combining the findings of cognitive load from two priming experiments, we suggest that structural priming is affected by cognitive load only when the load manipulation is strong enough. This hypothesis might relate the current findings to previous studies that failed to find cognitive load effects on structural priming. In particular, Branigan and colleagues [66] found that the effect of lexical-dependent priming did not differ when prime and target were separated by an intervening sentence as compared to a pure time delay, which might imply that the priming effect stayed robust, regardless of the difference in the cognitive load of the intervening task. One possible reason for the null effect of the intervening sentence is that this secondary task was rather easy (i.e., completing an intransitive sentence) and so it might not have put a strong burden on attention. Thus, the time assigned to maintain the memory traces of sentence structure might not be intrinsically different between an intervening task and a time delay.

This constraint of limited memory capacity on structural priming provides a new perspective on findings of the rapid decay of short-term syntactic persistence ([14, 20, 29, 36]). Different from previous studies that examined the time course of structural priming and the lexical boost, the current study did not manipulate the number of fillers between prime and target. Instead, we presented only one filler task with a fixed duration and manipulated its difficulty. The cognitive load effect on structural priming found in the current study suggests that the priming effect is not automatically attenuated with the passage of time but is rather affected by the cognitive load exerted by the processing of materials between prime and target. This is in line with working memory models that assume that memory load within a given time window is modulated by limited capacity [52–54]. Such a load effect might covary with the time lag, which would then lead to the decay of the priming effect over time. In most previous studies, the processing load of the filler tasks (or chunks) between prime and target was conceivably homogeneous: each filler item could be assumed to require a fixed amount of time during which attention was drawn away to concurrent processing. Thus, the priming effect quickly dissipated over the fillers due to the accumulation of cognitive load.

The mechanism of structural priming

In a recent study, Branigan and McLean [36] nicely recounted the putative cognitive processes of structural priming in children, suggesting that both automatic and non-automatic processes underlie structural priming, including implicit learning, explicit memory, and residual activation. The present paper does not focus on the implicit learning component of structural priming, but we note that in both structural priming experiments the production of the least frequent structure (s-genitive) gradually decreased as the experiment progressed. This finding seems incompatible with accounts that propose an implicit learning process in structural priming (e.g., [62, 70]). These accounts posit that cumulative recent experience influences speakers' structure preference in such a way that the likelihood of producing the least frequent structure gradually increases as more exemplars of that structure are encountered. One speculative explanation of the reversed cumulative effect is that the speakers were most surprised to encounter the s-genitive structures at the beginning of the experiment. They therefore showed the highest likelihood of producing the s-genitive structure early on, because of the relatively large prediction error. As the experiment progressed, the participants gradually regressed to the more default expression, perhaps to alleviate the processing load in sentence production, resulting in the decreasing pattern of the least frequent structure. Note that similar patterns were also found in some cross-linguistic structural priming studies (e.g., [57, 71]). A question for further research is whether it is possible to reconcile implicit learning accounts with such reversed cumulative priming effects.

A main goal of our study was to pinpoint the non-automatic components of structural priming. Our finding of a cognitive load effect on structural priming is best explained by a short-term explicit memory mechanism. Evidence for such an effect was found in both a dialogue experiment and a monologue experiment, which indicates that this short-term memory process in structural priming is context-general. These findings, in combination with the observation of a lexical modulation effect on sentence structure retrieval, suggest that both lexical cueing and fast decay in structural priming can be at least partially explained by a short-term explicit memory effect. We do not necessarily argue against a possible role of residual activation in structural priming: there may well be residual activation of lemma nodes and combinatorial nodes that briefly boosts the chance that speakers spontaneously repeat syntactic choices, alongside the explicit memory of sentence structure. Nevertheless, we champion a multifactorial mechanism of structural priming that at least incorporates lexically dependent short-term explicit memory processes as an essential contributor to the general priming effect (e.g., [13, 14, 36]).

Conclusions

The current study addressed two questions with respect to the explicit memory mechanism of structural priming. First, our experiments found lexical cueing effects on sentence structure recall, with comparable magnitude to that of a lexical boost effect on structural priming. This supports a crucial precondition for the account that the lexical boost effect in structural priming is driven by cue-dependent memory retrieval. Second, when the manipulation of the cognitive load was strong enough, there was a cognitive load effect on structural priming. This pinpoints a resource-consuming memory maintenance process in structural priming. The findings jointly suggest that the lexical boost effect and short-term decay in structural priming entail the involvement of an explicit memory-related process, supporting an account of structural priming that subsumes multiple memory mechanisms. More broadly, our findings are compatible with the view that speakers can use a capacity-constrained explicit memory to temporarily store a sentence’s structure and recycle that structure in sentence production.

Supporting information

S1 Appendix. Summary of fixed effects in LME models in the subset analyses of Experiment 1a-b and 2a-b.

(DOCX)

S2 Appendix. Primes and targets used in each experiment.

The first line depicts the content of the target picture; the possessor of the colored object and the object that is owned are mentioned. In the following lines, the s-genitive (a) and of-genitive (b) primes are given in Dutch. In each prime sentence, the noun used in the Same Head Noun condition is given before the slash and the noun used in the Different Head Noun condition after the slash.

(DOCX)

S3 Appendix. Arithmetic problems used in each experiment.

The appendix displays each problem with the addends ordered such that the larger addend is listed first. In the reported experiments, the order of the addends was counterbalanced.

(DOCX)

Acknowledgments

We would like to thank Nolwenn Dierck for her assistance in data collection.

Data Availability

All the relevant data and scripts are available from the Open Science Framework (DOI https://doi.org/10.17605/OSF.IO/6UTKF).

Funding Statement

CZ was supported by China Scholarship Council (No. 201606280023). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Levelt WJ. Speaking: From intention to articulation: MIT press; 1993. [Google Scholar]
  • 2.Wagner V, Jescheniak JD, Schriefers H. On the flexibility of grammatical advance planning during sentence production: Effects of cognitive load on multiple lexical access. J Exp Psychol Learn Mem Cogn. 2010;36(2):423 10.1037/a0018619. [DOI] [PubMed] [Google Scholar]
  • 3.Bock K. Toward a cognitive psychology of syntax: Information processing contributions to sentence formulation. Psychol Rev. 1982;89(1):1 10.1037/0033-295X.89.1.1. [DOI] [Google Scholar]
  • 4.Allen KV, Pickering MJ, Zammitt NN, Hartsuiker RJ, Traxler MJ, Frier BM, et al. Effects of acute hypoglycemia on working memory and language processing in adults with and without type 1 diabetes. Diabetes Care. 2015;38(6):1108–15. 10.2337/dc14-1657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Hartsuiker RJ, Barkhuysen PN. Language production and working memory: The case of subject-verb agreement. Language and Con. 2006;21(1–3):181–204. 10.1080/01690960400002117. [DOI] [Google Scholar]
  • 6.Slevc LR. Saying what’s on your mind: Working memory effects on sentence production. J Exp Psychol Learn Mem Cogn. 2011;37(6):1503 10.1037/a0024350. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Cowan N. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behav Brain Sci. 2001;24(1):87–114. 10.1017/S0140525X01003922. [DOI] [PubMed] [Google Scholar]
  • 8.Hartsuiker RJ, Moors A. On the automaticity of language processing. Entrenchment and the psychology of language learning: How we reorganize and adapt linguistic knowledge. 2017:201–25. 10.1037/15969-010. [DOI] [Google Scholar]
  • 9.Bock K. Syntactic persistence in language production. Cognitive Psychol. 1986;18(3):355–87. 10.1016/0010-0285(86)90004-6. [DOI] [Google Scholar]
  • 10.Mahowald K, James A, Futrell R, Gibson E. A meta-analysis of syntactic priming in language production. J Mem Lang. 2016;91:5–27. 10.1016/j.jml.2016.03.009. [DOI] [Google Scholar]
  • 11.Branigan HP, Pickering MJ. An experimental approach to linguistic representation. Behav Brain Sci. 2017;40 10.1017/S0140525X16002028. [DOI] [PubMed] [Google Scholar]
  • 12.Pickering MJ, Branigan HP. The representation of verbs: Evidence from syntactic priming in language production. J Mem Lang. 1998;39(4):633–51. 10.1006/jmla.1998.2592. [DOI] [Google Scholar]
  • 13.Chang F, Dell GS, Bock K. Becoming syntactic. Psychol Rev. 2006;113(2):234 10.1037/0033-295X.113.2.234. [DOI] [PubMed] [Google Scholar]
  • 14.Hartsuiker RJ, Bernolet S, Schoonbaert S, Speybroeck S, Vanderelst D. Syntactic priming persists while the lexical boost decays: Evidence from written and spoken dialogue. J Mem Lang. 2008;58(2):214–38. 10.1016/j.jml.2007.07.003. [DOI] [Google Scholar]
  • 15.Bock K, Loebell H. Framing sentences. Cognition. 1990;35(1):1–39. 10.1016/0010-0277(90)90035-I. [DOI] [PubMed] [Google Scholar]
  • 16.Scheepers C. Syntactic priming of relative clause attachments: Persistence of structural configuration in sentence production. Cognition. 2003;89(3):179–205. 10.1016/S0010-0277(03)00119-7. [DOI] [PubMed] [Google Scholar]
  • 17.Ziegler J, Snedeker J, Wittenberg E. Event structures drive semantic structural priming, not thematic roles: Evidence from idioms and light verbs. Cognitive Sci. 2018;42(8):2918–49. 10.1111/cogs.12687. [DOI] [PubMed] [Google Scholar]
  • 18.Branigan HP, Pickering MJ, Cleland AA. Syntactic co-ordination in dialogue. Cognition. 2000;75(2):B13–B25. 10.1016/S0010-0277(99)00081-5. [DOI] [PubMed] [Google Scholar]
  • 19.Cleland AA, Pickering MJ. The use of lexical and syntactic information in language production: Evidence from the priming of noun-phrase structure. J Mem Lang. 2003;49(2):214–30. 10.1016/S0749-596X(03)00060-3. [DOI] [Google Scholar]
  • 20.Gries ST. Syntactic priming: A corpus-based approach. J Psycholinguist Res. 2005;34(4):365–99. 10.1007/s10936-005-6139-3. [DOI] [PubMed] [Google Scholar]
  • 21.Traxler MJ, Tooley KM, Pickering MJ. Syntactic priming during sentence comprehension: Evidence for the lexical boost. J Exp Psychol Learn Mem Cogn. 2014;40(4):905 10.1037/a0036377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Levelt WJ, Roelofs A, Meyer AS. A theory of lexical access in speech production. Behav Brain Sci. 1999;22(1):1–38. 10.1017/S0140525X99001776. [DOI] [PubMed] [Google Scholar]
  • 23.Bock K, Dell GS, Chang F, Onishi KH. Persistent structural priming from language comprehension to language production. Cognition. 2007;104(3):437–58. 10.1016/j.cognition.2006.07.003. [DOI] [PubMed] [Google Scholar]
  • 24.Bock K, Griffin ZM. The persistence of structural priming: Transient activation or implicit learning? J Exp Psychol Gen. 2000;129(2):177 10.1037/0096-3445.129.2.177. [DOI] [PubMed] [Google Scholar]
  • 25.Kaschak MP, Borreggine KL. Is long-term structural priming affected by patterns of experience with individual verbs? J Mem Lang. 2008;58(3):862–78. 10.1016/j.jml.2006.12.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Chang F, Janciauskas M, Fitz H. Language adaptation and learning: Getting explicit about implicit learning. Lang Linguist Compass. 2012;6(5):259–78. 10.1002/lnc3.337. [DOI] [Google Scholar]
  • 27.Reitter D, Keller F, Moore JD. A computational cognitive model of syntactic priming. Cognitive Sci. 2011;35(4):587–637. 10.1111/j.1551-6709.2010.01165.x. [DOI] [PubMed] [Google Scholar]
  • 28.Pickering MJ, Ferreira VS. Structural priming: A critical review. Psychol Bull. 2008;134(3):427 10.1006/jmla.1998.2592. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Bernolet S, Collina S, Hartsuiker RJ. The persistence of syntactic priming revisited. J Mem Lang. 2016;91:99–116. 10.1016/j.jml.2016.01.002. [DOI] [Google Scholar]
  • 30.Bock K, Loebell H, Morey R. From conceptual roles to structural relations: Bridging the syntactic cleft. Psychol Rev. 1992;99(1):150 10.1037/0033-295X.99.1.150. [DOI] [PubMed] [Google Scholar]
  • 31.Konopka AE, Bock K. Helping syntax out: How much do words do? CUNY Human Sentence Processing Conference; 2005.
  • 32.Konopka AE, Bock K. Lexical or syntactic control of sentence formulation? Structural generalizations from idiom production. Cognitive Psychol. 2009;58(1):68–101. 10.1016/j.cogpsych.2008.05.002. [DOI] [PubMed] [Google Scholar]
  • 33.Man G, Meehan S, Martin N, Branigan H, Lee J. Effects of Verb Overlap on Structural Priming in Dialogue: Implications for Syntactic Learning in Aphasia. J Speech Lang Hear R. 2019:1–18. 10.1044/2019_JSLHR-L-18-0418. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Rowland CF, Chang F, Ambridge B, Pine JM, Lieven EV. The development of abstract syntax: Evidence from structural priming and the lexical boost. Cognition. 2012;125(1):49–63. 10.1016/j.cognition.2012.06.008. [DOI] [PubMed] [Google Scholar]
  • 35.Scheepers C, Raffray CN, Myachykov A. The lexical boost effect is not diagnostic of lexically-specific syntactic representations. J Mem Lang. 2017;95:102–15. 10.1016/j.jml.2017.03.001. [DOI] [Google Scholar]
  • 36.Branigan HP, McLean JF. What children learn from adults’ utterances: An ephemeral lexical boost and persistent syntactic priming in adult–child dialogue. J Mem Lang. 2016;91:141–57. 10.1016/j.jml.2016.02.002. [DOI] [Google Scholar]
  • 37.Sachs JS. Recognition memory for syntactic and semantic aspects of connected discourse. Percept Psychophys. 1967;2(9):437–42. 10.3758/BF03208784. [DOI] [Google Scholar]
  • 38.Yan H, Martin RC, Slevc LR. Lexical overlap increases syntactic priming in aphasia independently of short-term memory abilities: Evidence against the explicit memory account of the lexical boost. J Neurolinguistics. 2018;48:76–89. 10.1016/j.jneuroling.2017.12.005. [DOI] [Google Scholar]
  • 39.Hartsuiker RJ, Kolk HH. Syntactic facilitation in agrammatic sentence production. Brain Lang. 1998;62(2):221–54. 10.1006/brln.1997.1905. [DOI] [PubMed] [Google Scholar]
  • 40.Aue WR, Criss AH, Fischetti NW. Associative information in memory: Evidence from cued recall. J Mem Lang. 2012;66(1):109–22. 10.1016/j.jml.2011.08.002. [DOI] [Google Scholar]
  • 41.Craik FI, Tulving E. Depth of processing and the retention of words in episodic memory. J Exp Psychol Gen. 1975;104(3):268 10.1037/0096-3445.104.3.268. [DOI] [Google Scholar]
  • 42.Fisher RP, Craik FI. Interaction between encoding and retrieval operations in cued recall. J Exp Psychol Hum Learn. 1977;3(6):701 10.1037/0278-7393.3.6.701. [DOI] [Google Scholar]
  • 43.Freund JS, Underwood BJ. Storage and retrieval cues in free recall learning. J Exp Psychol. 1969;81(1):49 10.1037/h0027463. [DOI] [Google Scholar]
  • 44.Hayne H, Herbert J. Verbal cues facilitate memory retrieval during infancy. J Exp Child Psychol. 2004;89(2):127–39. 10.1016/j.jecp.2004.06.002. [DOI] [PubMed] [Google Scholar]
  • 45.Masson ME, Miller JA. Working memory and individual differences in comprehension and memory of text. J Educ Psychol. 1983;75(2):314 10.1016/S0022-5371(79)90109-9. [DOI] [Google Scholar]
  • 46.Thomson DM, Tulving E. Associative encoding and retrieval: Weak and strong cues. J Exp Psychol. 1970;86(2):255 10.1037/h0029997. [DOI] [Google Scholar]
  • 47.Till RE. Sentence memory prompted with inferential recall cues. J Exp Psychol Hum Learn. 1977;3(2):129 10.1037/0278-7393.3.2.129. [DOI] [Google Scholar]
  • 48.Tulving E, Pearlstone Z. Availability versus accessibility of information in memory for words. J Verbal Learning Verbal Behav. 1966;5(4):381–91. 10.1016/S0022-5371(66)80048-8. [DOI] [Google Scholar]
  • 49.Tulving E, Osler S. Effectiveness of retrieval cues in memory for words. J Exp Psychol. 1968;77(4):593 10.1037/h0026069. [DOI] [PubMed] [Google Scholar]
  • 50.Tulving E, Thomson DM. Encoding specificity and retrieval processes in episodic memory. Psychol Rev. 1973;80(5):352 10.1037/h0020071. [DOI] [Google Scholar]
  • 51.Shiffrin RM, Steyvers M. A model for recognition memory: REM—retrieving effectively from memory. Psychon B Rev. 1997;4(2):145–66. 10.3758/BF03209391. [DOI] [PubMed] [Google Scholar]
  • 52.Barrouillet P, Bernardin S, Camos V. Time constraints and resource sharing in adults’ working memory spans. J Exp Psychol Gen. 2004;133(1):83 10.1037/0096-3445.133.1.83. [DOI] [PubMed] [Google Scholar]
  • 53.Barrouillet P, Bernardin S, Portrat S, Vergauwe E, Camos V. Time and cognitive load in working memory. J Exp Psychol Learn Mem Cogn. 2007;33(3):570. 10.1037/0278-7393.33.3.570. [DOI] [PubMed] [Google Scholar]
  • 54.Barrouillet P, Camos V. The time-based resource-sharing model of working memory. The cognitive neuroscience of working memory. 2007;455:59–80. 10.1093/acprof:oso/9780198570394.003.0004. [DOI] [Google Scholar]
  • 55.Oberauer K, Lewandowsky S. Modeling working memory: A computational implementation of the Time-Based Resource-Sharing theory. Psychon B Rev. 2011;18(1):10–45. 10.3758/s13423-010-0020-6. [DOI] [PubMed] [Google Scholar]
  • 56.Bernolet S, Hartsuiker RJ, Pickering MJ. Effects of phonological feedback on the selection of syntax: Evidence from between-language syntactic priming. Biling: Lang Cogn. 2012;15(3):503–16. 10.1017/S1366728911000162. [DOI] [Google Scholar]
  • 57.Bernolet S, Hartsuiker RJ, Pickering MJ. From language-specific to shared syntactic representations: The influence of second language proficiency on syntactic sharing in bilinguals. Cognition. 2013;127(3):287–306. 10.1016/j.cognition.2013.02.005. [DOI] [PubMed] [Google Scholar]
  • 58.Ashcraft MH. Cognitive arithmetic: A review of data and theory. Cognition. 1992;44(1–2):75–106. 10.1016/0010-0277(92)90051-I. [DOI] [PubMed] [Google Scholar]
  • 59.Zbrodoff NJ, Logan GD. What everyone finds: The problem-size effect. In: Campbell JID, editor. Handbook of mathematical cognition. New York: Psychology Press; 2005. [Google Scholar]
  • 60.Ferreira F, Swets B. How incremental is language production? Evidence from the production of utterances requiring the computation of arithmetic sums. J Mem Lang. 2002;46(1):57–84. 10.1006/jmla.2001.2797. [DOI] [Google Scholar]
  • 61.Hartsuiker RJ, Kolk HH. Syntactic persistence in Dutch. Lang Speech. 1998;41(2):143–84. 10.1177/002383099804100202. [DOI] [PubMed] [Google Scholar]
  • 62.Kaschak MP, Kutta TJ, Jones JL. Structural priming as implicit learning: Cumulative priming effects and individual differences. Psychon B Rev. 2011;18(6):1133–9. 10.3758/s13423-011-0157-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for confirmatory hypothesis testing: Keep it maximal. J Mem Lang. 2013;68(3):255–78. 10.1016/j.jml.2012.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Schad DJ, Vasishth S, Hohenstein S, Kliegl R. How to capitalize on a priori contrasts in linear (mixed) models: A tutorial. J Mem Lang. 2020;110:104038. 10.1016/j.jml.2019.104038. [DOI] [Google Scholar]
  • 65.Raftery AE. Bayesian model selection in social research. Sociol Methodol. 1995;25:111–63. 10.2307/271063. [DOI] [Google Scholar]
  • 66.Branigan HP, Pickering MJ, Stewart AJ, McLean JF. Syntactic priming in spoken production: Linguistic and temporal interference. Mem Cognition. 2000;28(8):1297–302. 10.3758/BF03211830. [DOI] [PubMed] [Google Scholar]
  • 67.Anderson JR. ACT: A simple theory of complex cognition. Am Psychol. 1996;51(4):355 10.1037/0003-066X.51.4.355. [DOI] [Google Scholar]
  • 68.DeStefano D, LeFevre JA. The role of working memory in mental arithmetic. Eur J Cogn Psychol. 2004;16(3):353–86. 10.1080/09541440244000328. [DOI] [Google Scholar]
  • 69.Imbo I, De Rammelaere S, Vandierendonck A. New insights in the role of working memory in carry and borrow operations. Psychol Belg. 2005;45(2):101–21. 10.5334/pb-45-2-101. [DOI] [Google Scholar]
  • 70.Kaschak MP, Kutta TJ, Schatschneider C. Long-term cumulative structural priming persists for (at least) one week. Mem Cognition. 2011;39(3):381–8. 10.3758/s13423-011-0157-y. [DOI] [PubMed] [Google Scholar]
  • 71.Kootstra GJ, Doedens WJ. How multiple sources of experience influence bilingual syntactic choice: Immediate and cumulative cross-language effects of structural priming, verb bias, and language dominance. Biling: Lang Cogn. 2016;19(4):710–32. 10.1017/S1366728916000420. [DOI] [Google Scholar]

Decision Letter 0

Michael P Kaschak

7 Apr 2020

PONE-D-20-04033

The role of explicit memory in syntactic persistence: effects of lexical cuing and load on sentence memory and sentence production

PLOS ONE

Dear Mr. Zhang,

I am writing with regard to your manuscript, "The role of explicit memory in syntactic persistence: effects of lexical cuing and load on sentence memory and sentence production" (PONE-D-20-04033). I have received comments from two experts in this area of research, and I have also reviewed your manuscript myself. The reviewers and I find that your experiments are tackling a critical and timely question about the mechanisms that underlie structural priming. These results will be of interest to many in this research community, and as such I would like to see them appear in the literature.

The reviewers have raised a number of issues that will need to be addressed in a revision of your manuscript. The most critical of these is raised by Reviewer 1. The reviewer notes that there seem to be inconsistencies between the way that the analyses are described in the manuscript and the analyses that were actually done. Further, the reviewer has a question about the results of Experiment 2a. To add to the reviewer's comments, you might consider generating estimated means from the model for Experiment 2a to see how well they match the observed means. As always, you should also address the reviewers' other comments in some manner -- either through a revision of the manuscript, or through your response letter. 

-------------------

We would appreciate receiving your revised manuscript by May 22 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Mike Kaschak

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please upload a copy of the Supporting Information files S1 Fig, S2 Fig, S3 Fig, and S4 Fig, which you refer to in your text on page 52.

3. Your ethics statement must appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please also ensure that your ethics statement is included in your manuscript, as the ethics section of your online submission will not be published alongside your manuscript.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper investigates the contribution of explicit memory to structural priming. The authors describe four experiments in which they investigate the relationship between structural priming, lexical boost, and the difficulty of a concurrent memory load. In two experiments (one dialogue, one monologue), participants performed a standard structural priming task, describing pictures using one of two genitive alternations. In the other two experiments (again one dialogue, one monologue), participants were explicitly told to repeat the sentence structure of the preceding sentence. In all experiments, in between the prime and target sentence, participants solved an addition problem, which was easy or hard; the difference in difficulty between the two types of math problems was increased for the third and fourth experiment.

A robust structural priming and lexical boost effect was found. The authors also found some evidence of task difficulty – i.e., degree of cognitive load – interacting with degree of syntactic priming and lexical boost, to the effect that subjects were less likely to repeat the prime structure when the cognitive load was more difficult, either because they spent longer engaged in the cognitive load task (exploratory RT analyses) or it was a more difficult math problem.

This is an interesting and well-written paper which addresses a theoretically relevant question about the mechanisms underlying structural priming and its reliance on explicit memory. The research questions are well-grounded in prior research and the experiments seem well-designed. The discussion and interpretation is well-reasoned from the results and interesting as well. However, I think there are issues with the statistics which call into question the main result – that higher cognitive load causes reduced structural priming/recall – and thus I do not think much can be inferred from the results and concomitant conclusions of this paper. I believe a complete re-analysis of the statistics is required, and that may (in my opinion, likely will) completely change the central conclusion of the paper.

Major comments:

-Table 1: This table seems to show that in the Same Head Noun condition, 36.0% of the subjects’ productions were s-genitive and 4.7% were of-genitive, and 15.7% and 5.5%, respectively in the Different Head Noun condition.

(a) Why are these values so low? That means that subjects were producing non-s/of-genitive structures on 60-80% of the trials (depending on condition), correct? What were these other structures?

(b) I can’t reconcile these data with the earlier statements that only 8.6% of responses were “other” and that 84.4% of responses were of-genitive (it looks like from Table 1 that ~5% in each condition were of-genitive).

(c) If subjects are producing a different structure on such a large proportion of the trials, please give some assurance that their behavior is indicative of a larger priming processing effect given how many trials were UNaffected by the prime manipulation.

-The authors write (pg. 20, line 473-476): “we included in the model the by-subject and by-item random intercepts as well as random slopes for all main effects and interactions in the fixed model. The full model (and all the other full models in this paper) converged.” But when I look at the code posted on OSF (thank you!), this does not seem to be the case. There are a ton of models there, but it looks like the full model that is being used in the model comparison ANOVAs is DP0122, which has all random slopes removed (i.e., everything except the intercept). It has the full set of fixed effects, but the random effects are just:

(1 | Subject) + (1 | Item)

Similarly, all of the sub-models that were run, with the fixed effect in question removed, also have the random effects structure of just (1 | Subject) + (1 | Item).

(a) The authors shouldn’t write that all random slopes were included, as it appears that was not the case for the actual model comparisons. It should be clear as to the model structure that was used for the statistics that are reported.

(b) If the models that included all the random slopes converged, then why is that model not being used for the statistics? Removing random slopes can make a model output anti-conservative and may skew the results.

(c) Note that this point is relevant to all of the experiments, as the code that was posted said it was used for all four experiments.

-Likely related to the previous point, I find the results for Experiment 2a to be rather surprising, given the figures and data. I’m not sure what the error bars represent, but assuming they’re standard error, I find it quite unlikely that those would produce an interaction between Prime Condition x Difficulty or a main effect of Difficulty, and yet those are robustly significant.

When I run the code using the model with no random slopes (DP0122), there is indeed a significant Prime Condition x Difficulty interaction. However, I could not get the full model to converge when I ran it myself (the maximal model with all random slopes + intercepts which the text said had converged). So to try to approximate that model, I ran an F1 ANOVA, which, given that there was a balanced design and minimal data loss (2.2%), and I wouldn’t expect big item differences in your stimuli, the ANOVA should show pretty similar results to the LMER. Some results were very similar, as expected (Prime Condition, Prime Condition x Head Noun Condition) but all of the effects involving Task Difficulty were extremely different, e.g. the main effect of Task Difficulty in the ANOVA was F < 1, p = .92; Prime Condition x Task Difficulty was F = 2.5, p = .12; Prime Condition x Head Noun Condition x Task Difficulty was F < 1, p = .75, and these results seem much more in line with the descriptive data. I am not suggesting that the authors run ANOVAs instead of LMERs; there are many good reasons to run LMERs for this type of data. However, it’s important to make sure the results make sense given what the raw data show. As is noted in the next paragraph: “Unexpectedly, we also found a significant main effect of task difficulty (χ2 = 4.966, df = 1, p = .025) and a significant two-way interaction between task difficulty and lexical overlap (χ2 = 7.213, df = 1, p = .007). There was no clear theoretical reason for these effects, and similar effects were not found in any other experiments. We decided to refrain from speculation about these unexpected findings, which might well be a result of type I error.” Given that Task Difficulty shows a tiny effect in the numerical data, but yet shows a significant (or marginal) result for all of the fixed effects that it's involved in, plus none of them were significant in the sample ANOVA that I ran, and you have the minimal random effects structure in your models, and you even note that some of the Task Difficulty effects were unexpected and might be type I error – I think there is something wrong in the LMERs as conducted. It just doesn’t pass the eyeball test. As I cannot get the maximal models to converge on my machine, I can’t see what result that returns myself, so the advice I have is two-fold:

(a) look into the analyses and really make sure they’re being done correctly and that the statistical results match common-sense intuition based on the numerical data

(b) run the model-comparison stats on the maximal model that was reported to have converged, and see if the effects still hold as currently reported.

-pg. 38, line 924-925 says: “The non-significant interaction between cognitive load and lexical overlap on structural priming was somewhat surprising…” and then discusses the implications of this non-effect. I assume that is referring to the interaction between Task Difficulty x Head Noun Condition in Experiment 2a. However, when reporting the Exp 2a results, the text says (line 740): “Unexpectedly, we also found … a significant two-way interaction between task difficulty and lexical overlap (χ2 = 7.213, df = 1, p = .007).” Please resolve this conflict or clarify which comparison you are referring to in the GD.

Minor comments:

-pg. 25, line 593: The exploratory analysis of Exp 1b investigating processing time (I assume also by using RT, as in Exp 1a) x to-be-recalled structure has df = 0.42. I do not understand how doing a model-comparison test can produce a non-integer degrees of freedom. (Unfortunately I can’t look at the model output to help me understand as I can’t find it in the submission.)

-There are a few times throughout the paper when results between experiments are compared numerically, but not statistically (e.g., pg. 26, comparing the size of the lexical boost effect in Exps 1a vs 1b, etc.). Please make those comparisons statistically explicit and report the significance/non-significance of the cross-experiment comparison.

-I wonder how difficult even the most difficult math problems were. Is carrying in addition really such a challenging task? Subjects are only spending 3 seconds doing the math problem in the hardest case. Please talk a little about why you think this might be enough of a cognitive load to disrupt processing.

-In only Exp 2b is there the expected context effect, whereby production of the less-common structure increased over the course of the experiment; in the other experiments, this effect goes in the opposite direction (the less-common structure decreases over the experiment). This is surprising, given that it’s the opposite direction as would be expected. The authors write that the increase effect (the expected direction) is “possibly indicating a practice effect.” Please discuss this further. Why would there be a practice effect in Exp 2b but not the other experiments with (approximately) the same task? Why would the other experiments show the opposite direction as previously-attested context effects?

Textual/editing comments:

-The references are not all standardized – some use the journal abbreviation (J Mem Lang) and others the full name (Journal of experimental psychology: Learning, memory, and cognition).

-pg. 14: references example stimuli sentences before the sentences are produced in the text. This makes that entire paragraph difficult to follow; I suggest moving the sample sentences before the paragraph describing the stimuli conditions (i.e., at line 330).

-pg. 17-18, lines 419-427: There’s a strange paragraph, maybe it’s supposed to be the figure caption for Figure 2? It’s presented in the middle of the text (with no accompanying figure, which is only at the end of the text body).

-pg. 21, line 501 (same for Exp 1b): The authors write that the model for the supplementary analysis investigating RT of Exp 1a is in the Supplementary Material, but it’s not in Appendix A, B, or C (which are the only attached materials), so please include an explicit pointer to where I can find these results.

-Please note in the figure captions what the error bars are: standard error, standard deviation, confidence interval, etc.

-Please make the y-axis for the two figures on the same scale. It makes it easier to compare the results across experiments.

-Grammar/wording comments:

-pg. 5, line 102-3: “It has been several decades for the debate about how such structural priming effect comes about.” -> “It has been several decades since the debate began about…” (for example)

-pg. 7, line 164: “tried to answer A similar question”

-pg. 9, line 224: “These findings are coherent with…” consistent with?

-pg. 13, line 310, 318: “figurines” -> figures

-pg. 27, line 651: “we predicted that the secondary task posited a cognitive load” -> imposed a cognitive load?

-pg. 30, line 726: “minuses” -> minus

-pg. 35, line 856: “more often correctly” -> delete “correctly” perhaps?

-pg. 37, line 881: intertwinement -> entwining

-pg. 38, line 915-956: “similar as for the recall of sentence structure” -> “similar TO” ?

-pg. 39, line 932: “whereas the memory of sentence structure is prune to the load manipulation.” -> I am not sure what this sentence intends.

Reviewer #2: • I appreciate the availability of the authors’ data.

• Overall, I agree with the authors that it is interesting to further explore cognitive load factors that affect priming. I believe the manuscript presents a nice beginning to this. I am, however, not convinced by Experiment 1a in this manuscript. The authors employ a confederate, which I appreciate. I think it is prudent to consider social factors. Perhaps some participants were high self-monitors and simply took more time answering questions to not be judged poorly by their experimental partner. If this were a factor, it may be that they were paying less attention to their partner’s descriptions in anticipation of their turns.

• Regarding the second series of experiments – I found it interesting that the authors decided to make two major changes – 1) the difficulty of the problems and 2) the dismissal of the confederate. I agree that the arithmetic problems, especially the “easy” ones, likely presented minimal, if any cognitive load on the participants. However, making two changes at once leads to questionable conclusions – how can you be sure which manipulation led to the change in significance?

• I think this is a useful addition to the literature. However, I would recommend running a third study with a confederate and the more difficult problems. It would round off these series of experiments nicely and address the concerns I (and the authors even noted) have about the manuscript.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0240909. doi: 10.1371/journal.pone.0240909.r002

Author response to Decision Letter 0


19 May 2020

Reviewer #1: This paper investigates the contribution of explicit memory to structural priming. The authors describe four experiments in which they investigate the relationship between structural priming, lexical boost, and the difficulty of a concurrent memory load. In two experiments (one dialogue, one monologue), participants performed a standard structural priming task, describing pictures using one of two genitive alternations. In the other two experiments (again one dialogue, one monologue), participants were explicitly told to repeat the sentence structure of the preceding sentence. In all experiments, in between the prime and target sentence, participants solved an addition problem, which was easy or hard; the difference in difficulty between the two types of math problems was increased for the third and fourth experiment.

A robust structural priming and lexical boost effect was found. The authors also found some evidence of task difficulty – i.e., degree of cognitive load – interacting with degree of syntactic priming and lexical boost, to the effect that subjects were less likely to repeat the prime structure when the cognitive load was more difficult, either because they spent longer engaged in the cognitive load task (exploratory RT analyses) or it was a more difficult math problem.

This is an interesting and well-written paper which addresses a theoretically relevant question about the mechanisms underlying structural priming and its reliance on explicit memory. The research questions are well-grounded in prior research and the experiments seem well-designed. The discussion and interpretation is well-reasoned from the results and interesting as well. However, I think there are issues with the statistics which call into question the main result – that higher cognitive load causes reduced structural priming/recall – and thus I do not think much can be inferred from the results and concomitant conclusions of this paper. I believe a complete re-analysis of the statistics is required, and that may (in my opinion, likely will) completely change the central conclusion of the paper.

Major comments:

#1 Table 1: This table seems to show that in the Same Head Noun condition, 36.0% of the subjects’ productions were s-genitive and 4.7% were of-genitive, and 15.7% and 5.5%, respectively in the Different Head Noun condition.

(a) Why are these values so low? That means that subjects were producing non-s/of-genitive structures on 60-80% of the trials (depending on condition), correct? What were these other structures?

(b) I can’t reconcile these data with the earlier statements that only 8.6% of responses were “other” and that 84.4% of responses were of-genitive (it looks like from Table 1 that ~5% in each condition were of-genitive).

(c) If subjects are producing a different structure on such a large proportion of the trials, please give some assurance that their behavior is indicative of a larger priming processing effect given how many trials were UNaffected by the prime manipulation.

>>Author’s response: These comments stem from a misinterpretation of Table 1, presumably due to our unclear title of the table and labeling of the columns. The Table lists the proportion of s-genitives out of all s-genitives and of-genitives, for each experiment, level of lexical overlap condition (rows), and level of prime condition (columns). Thus, the value of 36% the reviewer is referring to means that in the Same Head Noun, s-genitive prime condition, 36% of valid responses were s-genitives (and so 64% were of-genitives). But in the Same Head Noun, of-genitive prime condition, only 4.7% of the responses were s-genitives (and so 95.3% were of-genitives). Thus, there was a priming effect of about 30%. This of course also means that the number of others was really 8.6% - the proportions listed in the Table are based only on non-others (s-genitives and of-genitives). We have revised the title and the header of the table. We hope this clarifies the table.<<
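For concreteness, the roughly 30% priming effect described in this response follows directly from the two proportions quoted above; a minimal R illustration, using only the numbers already given in the text, is:

# Same Head Noun condition: proportion of s-genitive responses
# out of all valid (s-genitive or of-genitive) responses
p_after_s_prime  <- 0.360   # following an s-genitive prime
p_after_of_prime <- 0.047   # following an of-genitive prime
priming_effect <- p_after_s_prime - p_after_of_prime
priming_effect              # 0.313, i.e. a priming effect of about 30 percentage points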

#2 The authors write (pg. 20, line 473-476): “we included in the model the by-subject and by-item random intercepts as well as random slopes for all main effects and interactions in the fixed model. The full model (and all the other full models in this paper) converged.” But when I look at the code posted on OSF (thank you!), this does not seem to be the case. There are a ton of models there, but it looks like the full model that is being used in the model comparison ANOVAs is DP0122, which has all random slopes removed (i.e., everything except the intercept). It has the full set of fixed effects, but the random effects are just:

(1 | Subject) + (1 | Item)

Similarly, all of the sub-models that were run, with the fixed effect in question removed, also have the random effects structure of just (1 | Subject) + (1 | Item).

(a) The authors shouldn’t write that all random slopes were included, as it appears that was not the case for the actual model comparisons. It should be clear as to the model structure that was used for the statistics that are reported.

(b) If the models that included all the random slopes converged, then why is that model not being used for the statistics? Removing random slopes can make a model output anti-conservative and may skew the results.

(c) Note that this point is relevant to all of the experiments, as the code that was posted said it was used for all four experiments.

>>Author’s response: We thank the reviewer for pointing this out. In fact, the script in the OSF data repository contained an older version of our analysis, in which we employed backward selection for the models. In the version of the analyses that we reported in the manuscript we did employ the maximal random model suggested by Barr et al. (2013). We are very sorry that we did not update the OSF files in time. We have now put the correct R scripts in the OSF repository, one for each pair of experiments. In the scripts we included all the necessary code for the descriptive and inferential analyses in the manuscript.<<
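For illustration, the difference between the two random-effects specifications under discussion can be sketched in lme4 as follows. This is a hedged sketch with hypothetical names (a data frame d with columns s_genitive, prime, overlap, difficulty, Subject, Item), not the authors' actual OSF script:

library(lme4)

# Intercepts-only random structure (as in the outdated script):
m_intercepts <- glmer(s_genitive ~ prime * overlap * difficulty +
                        (1 | Subject) + (1 | Item),
                      data = d, family = binomial)

# Maximal random structure in the sense of Barr et al. (2013):
# by-subject and by-item random slopes for all fixed effects and interactions
m_maximal <- glmer(s_genitive ~ prime * overlap * difficulty +
                     (1 + prime * overlap * difficulty | Subject) +
                     (1 + prime * overlap * difficulty | Item),
                   data = d, family = binomial)

With a design of this size, the maximal model can fail to converge or can produce a singular fit, which is exactly the issue taken up in the response to the next comment.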

#3 Likely related to the previous point, I find the results for Experiment 2a to be rather surprising, given the figures and data. I’m not sure what the error bars represent, but assuming they’re standard error, I find it quite unlikely that those would produce an interaction between Prime Condition x Difficulty or a main effect of Difficulty, and yet those are robustly significant.

>>Author's response: The error bars denote the standard error of the mean from a by-participants analysis; this has now been added to the Figure caption. In interpreting the figure, please keep in mind that the Y-axis represents the priming effect. Thus, the interaction between prime condition and difficulty is reflected in a difference between the easy and difficult conditions. The main effect of difficulty (which is of no theoretical interest) cannot be gauged from the figure.<<

When I run the code using the model with no random slopes (DP0122), there is indeed a significant Prime Condition x Difficulty interaction. However, I could not get the full model to converge when I ran it myself (the maximal model with all random slopes + intercepts which the text said had converged). So to try to approximate that model, I ran an F1 ANOVA, which, given that there was a balanced design and minimal data loss (2.2%), and I wouldn’t expect big item differences in your stimuli, the ANOVA should show pretty similar results to the LMER. Some results were very similar, as expected (Prime Condition, Prime Condition x Head Noun Condition) but all of the effects involving Task Difficulty were extremely different, e.g. the main effect of Task Difficulty in the ANOVA was F < 1, p = .92; Prime Condition x Task Difficulty was F = 2.5, p = .12; Prime Condition x Head Noun Condition x Task Difficulty was F < 1, p = .75, and these results seem much more in line with the descriptive data. I am not suggesting that the authors run ANOVAs instead of LMERs; there are many good reasons to run LMERs for this type of data. However, it’s important to make sure the results make sense given what the raw data show. As is noted in the next paragraph: “Unexpectedly, we also found a significant main effect of task difficulty (χ2 = 4.966, df = 1, p = .025) and a significant two-way interaction between task difficulty and lexical overlap (χ2 = 7.213, df = 1, p = .007). There was no clear theoretical reason for these effects, and similar effects were not found in any other experiments. We decided to refrain from speculation about these unexpected findings, which might well be a result of type I error.” Given that Task Difficulty shows a tiny effect in the numerical data, but yet shows a significant (or marginal) result for all of the fixed effects that it's involved in, plus none of them were significant in the sample ANOVA that I ran, and you have the minimal random effects structure in your models, and you even note that some of the Task Difficulty effects were unexpected and might be type I error – I think there is something wrong in the LMERs as conducted. It just doesn’t pass the eyeball test. As I cannot get the maximal models to converge on my machine, I can’t see what result that returns myself, so the advice I have is two-fold:

(a) look into the analyses and really make sure they’re being done correctly and that the statistical results match common-sense intuition based on the numerical data

(b) run the model-comparison stats on the maximal model that was reported to have converged, and see if the effects still hold as currently reported.

>>Author’s response: We appreciate that the reviewer went to the trouble of checking the analyses and even conducted an ANOVA as an approximation. We apologize again that the scripts on OSF were outdated. We have checked all analyses and observed that our statistical inferences regarding task difficulty do hold - although these reanalyses did bring up a new issue, namely that of singular fit, which led to a number of changes to the inferential statistics, and for one interaction a difference in significance (see below). We respond in more detail below:

(a) Model convergence

In the data analyses employed in the manuscript we kept the maximal random model for each linear mixed model. We accepted model results that produced a singular-fit warning as long as the model converged. However, experts on linear mixed models have cautioned against accepting singular fits, arguing that singularity may indicate that an overly complicated model is overfitted and may result in mis-convergence (Bates et al., 2015). Thus, in the current version of the data analyses, we followed Barr et al. (2013) by keeping the most complex random model that did not result in a singular fit. To do this, we started with the maximal random model and simplified the random model only when the model failed to converge or generated a singular fit. If the model showed singularity, we first dropped the random correlations and then dropped one random factor at a time, starting from the most complex interaction term, until a non-singular model was fitted. We updated the modelling method as well as the structure and fixed effects of the final models in the manuscript and appendices.
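A schematic version of this simplification sequence, reduced to two predictors and using the same hypothetical names as above (a sketch of the strategy only, not the authors' script):

library(lme4)

# Step 1: maximal random structure
m <- glmer(s_genitive ~ prime * difficulty +
             (1 + prime * difficulty | Subject) +
             (1 + prime * difficulty | Item),
           data = d, family = binomial)

# Step 2: if the fit is singular (or fails to converge), drop the random correlations;
# note that || fully separates terms only for numeric or contrast-coded predictors
if (isSingular(m)) {
  m <- glmer(s_genitive ~ prime * difficulty +
               (1 + prime * difficulty || Subject) +
               (1 + prime * difficulty || Item),
             data = d, family = binomial)
}

# Step 3: if still singular, drop random slopes one at a time,
# starting with the highest-order interaction
if (isSingular(m)) {
  m <- glmer(s_genitive ~ prime * difficulty +
               (1 + prime + difficulty || Subject) +
               (1 + prime + difficulty || Item),
             data = d, family = binomial)
}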

(b) The three-way interaction in Experiment 2a

The only place where there might be a discrepancy between the observed data and the model estimation was in the three-way interaction between prime condition, head noun condition, and problem difficulty in Experiment 2a. Descriptively, the difficult problems reduced the numerical size of the structural priming effect to a similar extent in the Same Head Noun condition and in the Different Head Noun condition. However, in the latest data analysis, the three-way interaction became significant in the model comparison analysis (i.e., model comparison between the reduced model and the full model). A possible interpretation of the three-way interaction is that the two-way interaction between prime condition and problem difficulty differed in the subsets divided by head noun condition. Although we did not make a strong argument out of this significant three-way interaction, we nevertheless fitted two additional models that predicted the likelihood of s-genitive production in the head noun condition subsets. We found that the interaction between prime condition and problem difficulty was only significant in the Same Head Noun subset. The results of the subset models were added to the manuscript, and the fixed effects of the models were put in Appendix A. <<

#4 pg. 38, line 924-925 says: “The non-significant interaction between cognitive load and lexical overlap on structural priming was somewhat surprising…” and then discusses the implications of this non-effect. I assume that is referring to the interaction between Task Difficulty x Head Noun Condition in Experiment 2a. However, when reporting the Exp 2a results, the text says (line 740): “Unexpectedly, we also found … a significant two-way interaction between task difficulty and lexical overlap (χ2 = 7.213, df = 1, p = .007).” Please resolve this conflict or clarify which comparison you are referring to in the GD.

>>Author’s response: By the first interaction the reviewer refers to, we meant the three-way interaction between prime condition, head noun condition, and problem difficulty. We are sorry we did not make this clear. We have now changed the expression to “It is somewhat surprising that the cognitive load effect on structural priming was not predominantly head-specific.” Thanks for pointing this out.<<

Minor comments:

#5 pg. 25, line 593: The exploratory analysis of Exp 1b investigating processing time (I assume also by using RT, as in Exp 1a) x to-be-recalled structure has df = 0.42. I do not understand how doing a model-comparison test can produce a non-integer degrees of freedom. (Unfortunately I can’t look at the model output to help me understand as I can’t find it in the submission.)

>>Author’s response: This was a typo. We have revised this result.<<

-There are a few times throughout the paper when results between experiments are compared numerically, but not statistically (e.g., pg. 26, comparing the size of the lexical boost effect in Exps 1a vs 1b, etc.). Please make those comparisons statistically explicit and report the significance/non-significance of the cross-experiment comparison.

>>Author’s response: Based on this comment, we have added two sections of cross-experiment analyses (Pages 25-26 and Page 35) that compare the structural priming effects, lexical boost effects, and cognitive load effects between the structural priming experiments and the sentence structure memory experiments. In both analyses, there was a significant two-way interaction between prime condition and experiment, but the three-way interactions between prime condition, head noun condition, and experiment were not significant. We also calculated Bayes factors for these interactions to further examine to what extent these effects supported the alternative hypotheses.<<
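For readers who want to reproduce this kind of quantity, one common way to approximate a Bayes factor for a nested comparison of two (g)lmer fits is the BIC approximation discussed by Raftery (1995; reference 65). The snippet below is an assumed, illustrative method, not necessarily the authors' exact computation:

# BIC approximation to the Bayes factor for nested models:
# fit_alt contains the effect of interest, fit_null omits it (hypothetical fits)
bf10_from_bic <- function(fit_alt, fit_null) {
  exp((BIC(fit_null) - BIC(fit_alt)) / 2)   # BF10 > 1 favours the alternative
}

# Example call with hypothetical model objects:
# bf10_from_bic(m_with_interaction, m_without_interaction)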

#6 I wonder how difficult even the most difficult math problems were. Is carrying in addition really such a challenging task? Subjects are only spending 3 seconds doing the math problem in the hardest case. Please talk a little about why you think this might be enough of a cognitive load to disrupt processing.

>>Author’s response: In the design of the experiments we needed to find a balance between having an effortful task and having a relatively short task, given the transiency of the lexical boost effects; that is, the difficult secondary task had to be brief but effortful. Solving a carrying problem takes around 1900 ms longer than solving an easy problem, which suggests that a considerable amount of cognitive resources is taxed by the carrying problems. Additionally, because a similar manipulation of cognitive load significantly affected the preparation time of sentence production (Ferreira & Swets, 2002), we believe it is possible that the cognitive load manipulation in the current study makes a difference in the choice of syntactic structures.<<

#7 In only Exp 2b is there the expected context effect, whereby production of the less-common structure increased over the course of the experiment; in the other experiments, this effect goes in the opposite direction (the less-common structure decreases over the experiment). This is surprising, given that it’s the opposite direction as would be expected. The authors write that the increase effect (the expected direction) is “possibly indicating a practice effect.” Please discuss this further. Why would there be a practice effect in Exp 2b but not the other experiments with (approximately) the same task? Why would the other experiments show the opposite direction as previously-attested context effects?

>>Author’s response: The reversed cumulative effect on the production of the less preferred sentence structure (i.e., s-genitive) might be attributed to a surprisal effect. When the participants first perceived an s-genitive prime, they would be more inclined to use this structure in the ensuing task because the unexpected first encounter produces a strong prediction error. As the experiment progressed, the participants might then regress to their preferred structure to alleviate the processing load in sentence production, resulting in a decreasing pattern of s-genitive production.

We argue that such a reversed cumulative effect would not occur in the sentence structure memory experiments. Instead, a practice effect might manifest itself in these experiments. The production tasks in the sentence structure memory experiments were driven by an additional goal to accurately repeat the previously encoded sentence structure. We would expect the accuracy of structure repetition to increase gradually as the participants became familiar with the task, resulting in a higher chance of producing the less frequent structure as the experiment progressed. We added a brief discussion of these effects on Page 37 (for the practice effect) and Page 44 (for the reversed cumulative effects). <<

Textual/editing comments:

#8 The references are not all standardized – some use the journal abbreviation (J Mem Lang) and others the full name (Journal of experimental psychology: Learning, memory, and cognition).

>>Author’s response: We have changed all journal names into the abbreviated form as required by the journal's submission format.<<

-pg. 14: references example stimuli sentences before the sentences are produced in the text. This makes that entire paragraph difficult to follow; I suggest moving the sample sentences before the paragraph describing the stimuli conditions (i.e., at line 330).

>>Author’s response: Fixed. Thank you for pointing that out.<<

#9 pg. 17-18, lines 419-427: There’s a strange paragraph, maybe it’s supposed to be the figure caption for Figure 2? It’s presented in the middle of the text (with no accompanying figure, which is only at the end of the text body).

>>Author’s response: Yes, it is the figure caption for Figure 2; the journal's submission guidelines require us to place figure captions in the text immediately after the paragraph in which the figure is first cited.<<

#10 pg. 21, line 501 (same for Exp 1b): The authors write that the model for the supplementary analysis investigating RT of Exp 1a is in the Supplementary Material, but it’s not in Appendix A, B, or C (which are the only attached materials), so please include an explicit pointer to where I can find these results.

>>Author’s response: We apologize for this omission. We have now added these models in Appendix A.<<

#11 Please note in the figure captions what the error bars are: standard error, standard deviation, confidence interval, etc.

>>Author’s response: We have noted in the figure caption that the error bars reflect standard errors calculated for a by-participants analysis.<<

#12 Please make the y-axis for the two figures on the same scale. It makes it easier to compare the results across experiments.

>>Author’s response: Fixed. Thank you for pointing that out.<<

#13 Grammar/wording comments:

-pg. 5, line 102-3: “It has been several decades for the debate about how such structural priming effect comes about.” -> “It has been several decades since the debate began about…” (for example)

-pg. 7, line 164: “tried to answer A similar question”

-pg. 9, line 224: “These findings are coherent with…” consistent with?

-pg. 13, line 310, 318: “figurines” -> figures

-pg. 27, line 651: “we predicted that the secondary task posited a cognitive load” -> imposed a cognitive load?

-pg. 30, line 726: “minuses” -> minus

-pg. 35, line 856: “more often correctly” -> delete “correctly” perhaps?

-pg. 37, line 881: intertwinement -> entwining

-pg. 38, line 915-956: “similar as for the recall of sentence structure” -> “similar TO” ?

-pg. 39, line 932: “whereas the memory of sentence structure is prune to the load manipulation.” -> I am not sure what this sentence intends.

>>Author’s response: We have corrected all the grammar errors.<<

Reviewer #2: • I appreciate the availability of the authors’ data.

#1 Overall, I agree with the authors that it is interesting to further explore cognitive load factors that affect priming. I believe the manuscript presents a nice beginning to this. I am, however, not convinced by Experiment 1a in this manuscript. The authors employ a confederate, which I appreciate. I think it is prudent to consider social factors. Perhaps some participants were high self-monitors and simply took more time answering questions to not be judged poorly by their experimental partner. If this were a factor, it may be that they were paying less attention to their partner’s descriptions in anticipation of their turns.

>>Author’s response: This is an interesting point. We argue below, however, that even in the absence of a physical interlocutor, the participants' utterances still had communicative value. Additionally, a reanalysis did not find evidence for differences in secondary task difficulty. Specifically, in Experiments 2a and 2b the participants were instructed that the utterances they would listen to came from a “participant” in the earlier test (who was in fact a confederate) and that their recording would be used in the upcoming tests of further participants. So we would argue that the production tasks in Experiments 2a and 2b also carried communicative purpose. The main difference in the social contexts of Experiments 1a-b and 2a-b would be the physical presence of an interlocutor. To examine whether the presence of an interlocutor influenced participants’ processing time in problem solving, we conducted a between-experiment comparison of the processing time in filler arithmetic problem solving tasks. The materials in these filler problem solving tasks were identical in all four experiments. This enabled us to conduct a pairwise comparison of by-item mean processing time between experiments. We only examined the processing time of the hard problems (with or without carrying) because solving an easy problem requires few attentional resources. If the participants took more time answering the questions when collaborating with an interlocutor, the processing time of a hard problem in the dialogue experiments should be longer than that in the monologue experiments. To test this hypothesis, we aggregated the processing time across items and compared the mean processing time between Experiment 1a and 2a as well as between 1b and 2b. The mean processing time in Experiment 1a was 3095 ms and that in Experiment 2a was 3097 ms. The standardized difference was negligible (Cohen's d for paired samples = -0.006). The difference in processing time between Experiment 1b (3102 ms) and 2b (3103 ms) was also negligible (Cohen's d for paired samples = -0.002). This indicates that the participants in the dialogue experiments did not take more time than those in the monologue experiments to solve the same filler task, suggesting no effect of the presence of an interlocutor on secondary task processing.<<
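As a sketch of the by-item comparison described here, a paired-samples Cohen's d over by-item means can be computed as below. The numbers are invented placeholders for illustration, not the actual by-item means:

# Hypothetical by-item mean processing times (ms) for the hard filler problems
rt_dialogue  <- c(3050, 3120, 3080, 3110, 2990, 3150)   # dialogue experiment
rt_monologue <- c(3045, 3125, 3090, 3100, 3000, 3140)   # same items, monologue experiment

item_diffs <- rt_dialogue - rt_monologue
d_paired <- mean(item_diffs) / sd(item_diffs)   # mean difference divided by SD of the differences
d_paired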

• Regarding the second series of experiments – I found it interesting that the authors decided to make two major changes – 1) the difficulty of the problems and 2) the dismissal of the confederate. I agree that the arithmetic problems, especially the “easy” ones, likely presented minimal, if any cognitive load on the participants. However, making two changes at once leads to questionable conclusions – how can you be sure which manipulation led to the change in significance?

>>Author’s response: We have argued in the response to comment #1 that the physical presence of an interlocutor had little impact on secondary task processing, and hence made little difference to the change in cognitive load in Experiments 2a-b compared to Experiments 1a-b. We believe the main factor that drove the load increase was the more difficult secondary task.<<

• I think this is a useful addition to the literature. However, I would recommend running a third study with a confederate and the more difficult problems. It would round off these series of experiments nicely and address the concerns I (and the authors even noted) have about the manuscript.

>>Author’s response: Thank you for pointing this out. However, because of the limited effect of this manipulation on the problem-solving task, we have decided not to follow this suggestion.<<

Attachment

Submitted filename: Response to reviewers _Zhang et al..docx

Decision Letter 1

Michael P Kaschak

8 Jul 2020

PONE-D-20-04033R1

The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production

PLOS ONE

Dear Dr. Zhang,

I am writing with regard to your manuscript, "The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production" (PONE-D-20-04033R1). I have received a set of comments from one of the original reviewers of your manuscript, and I have also reviewed your revision. The reviewer and I had very similar responses to your work:  you are presenting a nice set of experiments that contribute to our understanding of structural priming, but there are still issues with your statistical analyses and how they are presented. 

The reviewer raises three specific points about the analyses. 

1 -- Your procedure for dropping random effects from your models is not fully specified. For example, the reviewer notes that your strategy is to drop the most complex random effects (e.g., start with 3-way interaction, then the 2-way interactions, and so on), but it is not clear how you decide which effect to drop when there are multiple effects at the same level of complexity (e.g., multiple 2-way interactions). 

2 -- There appears to be an inconsistency between your stated strategy for simplifying your models (drop random correlations, then drop random slopes) and what appears in your code and text.

3 -- There also appears to be an inconsistency between the analyses specified in your code, the analyses described in the text, and the analyses presented in your Appendix. The confusion arises because you both a) present code for an analysis that includes all of the expected fixed effects, and then present the results of the analysis in the Appendix (i.e., a "full" analysis of your design), and b) present code suggesting that you are testing for specific effects using a model-comparison procedure (compare a model that has the target effect, and a model that does not) and generally focus on these specific effects in the text (as evidenced by the presentation of the X^2 statistic in the text, rather than the Z from the model results in the Appendix). The results in the text and in the Appendix are in accord, but it is unclear why you chose this particular way to handle the effects -- it would seem more reasonable to either stick with the results in the Appendix, or do the model comparison approach for all of your predictors (and present the X^2 values in your tables). I also think it would be helpful to have the full models reported in the text (i.e., put the tables in the text, rather than in the Appendix). 
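To make the model-comparison procedure in point (b) concrete: it amounts to a likelihood-ratio test between a model that contains the target effect and one that omits it. A minimal sketch with hypothetical names (not the authors' code):

library(lme4)

m_full <- glmer(s_genitive ~ prime * difficulty + (1 | Subject) + (1 | Item),
                data = d, family = binomial)
m_reduced <- update(m_full, . ~ . - prime:difficulty)   # remove the effect being tested

anova(m_reduced, m_full)   # likelihood-ratio test: reports the Chisq, Df, and p value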

Beyond these main points, the reviewer also provides a set of minor comments for you to consider and address.  

Please submit your revised manuscript by Aug 22 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Mike Kaschak

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper is still an interesting topic and contribution to the literature. However, there remain several inconsistencies or errors with the analyses which have the potential to affect the results. I feel the authors need to fix the analyses before I can have confidence that the results are in fact as they have been presented (and thus of course whether the discussion/interpretation follows).

MAJOR COMMENTS:

-There still is confusion about the running and reporting of the lmer models. Due to the discrepancies between what is reported in the paper for the overall analysis strategy, what is reported for each individual experiment, and the R code, in addition to the fact that the reported results often seem at odds with the means and sds shown in the tables/figures (as even the authors themselves note at times), I believe the authors need to very carefully check their analyses, and this must be resolved before I can have confidence in them. I detail several (potentially related) concerns below.

The authors write (pg. 22, line 112) "If the maximal model could not converge or showed singularity, we first dropped the random correlation terms, and then dropped one random factor at a time, starting from the interaction terms, until the model converged and no warning of singular fit was reported." They write something similar in their Response letter.

a) How did you decide which order to drop terms if a model did not converge? A common strategy is to iteratively drop the random effect which accounts for the least amount of variance, but it does not sound like this was what was done. The authors say their strategy was to start by dropping the most complex interaction term, but how did you decide the order of dropping terms at the same complexity level (i.e., the order of dropping among the 3 2-way interactions, or among the 3 main effects)? It looks like terms were dropped based on the order that they were written in the model starting from the bottom-up; in other words, arbitrarily within a complexity level. That is not a good strategy, as dropping random terms can affect significance and thus should not be done arbitrarily. Dropping a random effect which accounts for a substantial amount of variance can have a big effect on the observed significance of fixed effects, and while this is necessary sometimes for convergence/singularity reasons, it needs to be done systematically.

b) The authors write that they first dropped random correlations, and then started dropping random slopes terms. However, most of the models (the main model for Exp 1b, 2a, and 2b) - both in the R code and as written in the paper text - contradict this stated strategy: Those models RETAIN (e.g.) the by-subjects random correlations, but DROP almost all of the by-subjects random effects. (But Exp 1a follows this stated strategy, so they differ between each other.) That is, this model did NOT first drop the by-subject random correlations and then start dropping random effects. It’s really important to both (1) have the description in the paper body match your actual model, and (2) to have a systematic and consistent strategy for pruning a model’s terms, as this can have a major effect on the model’s output.

c) In both the R code and the text body, only some of the factors appear to be tested. In both Exp 1a and 1b, there are models testing only these fixed effects:

-Prime Structure

-Trial Number

-Prime Structure x Head Noun

-Prime Structure x Task Difficulty

-Prime Structure x Head Noun x Task Difficulty

Where are the models testing the other fixed effects? Did you test, for example, for the main effect of Head Noun or Task Difficulty? If you didn’t test all factors, you should say that explicitly in the paper and justify why not. Even if some factors are not theoretically interesting, it is very non-standard to not even test for/report those factors if they were entered into your analysis.

The cross-experiment analysis of 1a vs. 1b similarly omits testing of a number of factors.

Strangely, in Exp 2a, the authors suddenly do test for the main effect of Problem Difficulty (but still not Head Noun), and Problem Difficulty x Lexical Overlap [sic].

But then testing for those two fixed effects disappears again in Exp 2b, which only tests for the 5 fixed effects I noted above.

So again we have a consistency problem.

MINOR COMMENTS:

-Table 1: I understand, I had misunderstood what the numbers in Table 1 represented. The authors respond that they changed the title of the table to clarify this; I admit the title looks identical to the previous one except that one sentence was moved from the top to the bottom of the table, but I leave this to the authors to figure out.

-There are some drastic changes between different versions of lme4, so I suggest you report the version of the package that you used.
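For example, one simple way to record this in the analysis script (a generic R sketch, not the authors' code):

    packageVersion("lme4")   # prints the installed lme4 version
    sessionInfo()            # optionally, records the full R session details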

-pg 29, line 292-293: The authors write "structural priming effect may persist over at least one intervening task, irrespective of what type of interference is employed." I do not think this is a fair statement; the authors have tested one type of secondary task (addition math problems); they don’t have evidence how any other type of task will or will not interfere.

-The authors reference a condition called "Lexical Overlap" and "Overlap Condition" (pg. 36, line 439 and continuing on that page). Is this supposed to be “Head Noun Condition” or is this a new contrast they have introduced?

-The authors write (pg. 29, lines 281-283) that the "Inverse of Bayes factor" is 0.023 or 0.020. Do they really mean “inverse”? If that’s accurate, that means the Bayes factor is around 50 (1/0.020), which is very strong evidence in favor of H1 (cf Jeffreys 1961).

-The authors provide a short response to my previous question about whether the math problems provide enough of a cognitive load to disrupt processing in their Response letter. However, I didn’t see this in the manuscript itself. While I appreciate the personalized response, it would be useful if other readers could see this discussion as well.

MINOR WORDING EDITS:

-pg. 29, line 291-292: “suggest that THE structural priming effect may persist”

-pg. 109, line 326-327: “One unexpected result in the priming experiments was a negative correlation between the critical trial number and the likelihood of s-genitive production.” --> “priming experiment” [singular], as you only found this negative correlation in Exp 1a, not 1b.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0240909. doi: 10.1371/journal.pone.0240909.r004

Author response to Decision Letter 1


28 Jul 2020

Reviewer #1: This paper is still an interesting topic and contribution to the literature. However, there remain several inconsistencies or errors with the analyses which have the potential to affect the results. I feel the authors need to fix the analyses before I can have confidence that the results are in fact as they have been presented (and thus of course whether the discussion/interpretation follows).

MAJOR COMMENTS:

#1 There still is confusion about the running and reporting of the lmer models. Due to the discrepancies between what is reported in the paper for the overall analysis strategy, what is reported for each individual experiment, and the R code, in addition to the fact that the reported results often seem at odds with the means and sds shown in the tables/figures (as even the authors themselves note at times), I believe the authors need to very carefully check their analyses, and this must be resolved before I can have confidence in them. I detail several (potentially related) concerns below.

The authors write (pg. 22, line 112) "If the maximal model could not converge or showed singularity, we first dropped the random correlation terms, and then dropped one random factor at a time, starting from the interaction terms, until the model converged and no warning of singular fit was reported." They write something similar in their Response letter.

a) How did you decide which order to drop terms if a model did not converge? A common strategy is to iteratively drop the random effect which accounts for the least amount of variance, but it does not sound like this was what was done. The authors say their strategy was to start by dropping the most complex interaction term, but how did you decide the order of dropping terms at the same complexity level (i.e., the order of dropping among the 3 2-way interactions, or among the 3 main effects)? It looks like terms were dropped based on the order that they were written in the model starting from the bottom-up; in other words, arbitrarily within a complexity level. That is not a good strategy, as dropping random terms can affect significance and thus should not be done arbitrarily. Dropping a random effect which accounts for a substantial amount of variance can have a big effect on the observed significance of fixed effects, and while this is necessary sometimes for convergence/singularity reasons, it needs to be done systematically.

<<Authors’ response: We did adopt a bottom-up strategy for dropping terms from the random-effects structure in the previous model selections. We employed this method to avoid the potential complication that arises when a lower-level random effect accounts for less variance than a higher-level random effect.

In the newest version of the manuscript, we adopted your suggested strategy. In the LME model analysis, the model selection is now based on both the complexity of the terms and the variance of the random effects. When the maximal model could not converge or showed singularity, we first dropped the random correlation terms (once and for all, see the response to Comment #1b), and then dropped one random factor at a time, starting from the most complex interaction terms. When there were multiple terms of the same complexity, we compared the variances of the random effects in the last model and dropped the term with the least variance. We repeated this selection process until we reached the first model that converged and showed no singularity. We have revised our description of the model selection method in the manuscript accordingly (p22).

We have updated the information about the final model and the fixed effects of the model in the manuscript. We have also updated our model selection method in the OSF repository (link: https://osf.io/6utkf/). There you could also find two additional scripts that tracked the process of model selection.>>

b) The authors write that they first dropped random correlations, and then started dropping random slopes terms. However, most of the models (the main model for Exp 1b, 2a, and 2b) - both in the R code and as written in the paper text - contradict this stated strategy: Those models RETAIN (e.g.) the by-subjects random correlations, but DROP almost all of the by-subjects random effects. (But Exp 1a follows this stated strategy, so they differ between each other.) That is, this model did NOT first drop the by-subject random correlations and then start dropping random effects. It’s really important to both (1) have the description in the paper body match your actual model, and (2) to have a systematic and consistent strategy for pruning a model’s terms, as this can have a major effect on the model’s output.

<<Authors’ response: The exclusion of the random correlation was not symmetrical for the by-subject and the by-item random effects because we fitted models with and without random correlations in each iteration of the term dropping. We always started with a model that included the random correlations. If that model did not converge, we fitted a model that dropped the random correlation for subjects and a model that dropped the random correlation for items. If both models converged, we selected the model with the better goodness of fit. If only one of the models converged, that model was the final model. If neither of the models converged, we fitted a model with no random correlations. We did this iteratively each time a term was dropped from the random-effects structure. We are sorry we did not make this clear in the manuscript.

In the newest version of the manuscript, we aligned the method of model selection with the description in the Results section. More specifically, we now started the data analysis with the maximal model; if that model could not converge, we fitted another model with the same maximal random-effects structure but with the random correlations dropped. If the reduced model could not converge, we continued with the model selection strategy described in the response to Comment #1a and never considered the random correlations again. We hope the model selection method is now consistent with our description in the manuscript. >>
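A minimal sketch of this simplification procedure in R/lme4 is given below. The data frame d, the predictors (response, prime, noun, load), and the grouping factors (subject, item) are hypothetical placeholders, the predictors are assumed to be numeric ±0.5 contrast codes (so the double-bar syntax fully removes the random correlations), and a reasonably recent lme4 version is assumed; this is an illustration of the described strategy, not the authors' actual script.

    library(lme4)

    # Step 0: the maximal model, including all random slopes and correlations.
    m_max <- glmer(response ~ prime * noun * load +
                     (1 + prime * noun * load | subject) +
                     (1 + prime * noun * load | item),
                   data = d, family = binomial)

    # Step 1: if the maximal model fails to converge or is singular, refit it
    # without the random correlations (double-bar syntax).
    m_nocor <- glmer(response ~ prime * noun * load +
                       (1 + prime * noun * load || subject) +
                       (1 + prime * noun * load || item),
                     data = d, family = binomial)

    # Step 2: if problems remain, inspect the random-effect variances and drop
    # one slope at a time, starting from the most complex interaction; among
    # terms of equal complexity, drop the one with the smallest variance.
    isSingular(m_nocor)   # TRUE if the fit is singular
    VarCorr(m_nocor)      # per-term random-effect variances

    # One such reduction step, here assuming the three-way slope for subjects
    # had the smallest variance among the most complex terms:
    m_step <- glmer(response ~ prime * noun * load +
                      (1 + prime + noun + load +
                         prime:noun + prime:load + noun:load || subject) +
                      (1 + prime * noun * load || item),
                    data = d, family = binomial)

The reduction is repeated, consulting VarCorr() at each step, until the first model that converges without a singular fit is reached.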

c) In both the R code and the text body, only some of the factors appear to be tested. In both Exp 1a and 1b, there are models testing only these fixed effects:

-Prime Structure

-Trial Number

-Prime Structure x Head Noun

-Prime Structure x Task Difficulty

-Prime Structure x Head Noun x Task Difficulty

Where are the models testing the other fixed effects? Did you test, for example, for the main effect of Head Noun or Task Difficulty? If you didn’t test all factors, you should say that explicitly in the paper and justify why not. Even if some factors are not theoretically interesting, it is very non-standard to not even test for/report those factors if they were entered into your analysis.

The cross-experiment analysis of 1a vs. 1b similarly omits testing of a number of factors.

Strangely, in Exp 2a, the authors suddenly do test for the main effect of Problem Difficulty (but still not Head Noun), and Problem Difficulty x Lexical Overlap [sic].

But then testing for those two fixed effects disappears again in Exp 2b, which only tests for the 5 fixed effects I noted above.

So again we have a consistency problem.

<<Authors’ response: In the newest version of the manuscript, we only report the fixed effects of the LME models (estimates, standard errors, and p-values), not the results of model comparisons. The tables of the fixed effects in the LME models for each experiment were moved from Appendix A to the Results sections.

We further modified the reporting of the results so that we first report the theoretically interesting effects that were significant, followed by the theoretically interesting effects that were not significant, then the unexpectedly significant effects, and finally the other non-significant effects. This way, we hope the report of the model results is more consistent.

Due to the model update and the change in the reporting of the results, the three-way interaction between prime condition, head noun condition, and problem difficulty was no longer significant in Experiment 2a, while this interaction became significant in Experiment 2b. Accordingly, we deleted the section about the subset analysis in Experiment 2a and added a subset analysis in Experiment 2b (p43-p44). The change in the pattern of these interactions also led to some modification of the discussion about the lexical specificity of the cognitive load effect on structural persistence (p53).>>

MINOR COMMENTS:

#2 Table 1: I understand, I had misunderstood what the numbers in Table 1 represented. The authors respond that they changed the title of the table to clarify this; I admit the title looks identical to the previous one except that one sentence was moved from the top to the bottom of the table, but I leave this to the authors to figure out.

<<Authors’ response: We made some further changes in order to clarify what each column in the table represents. First, in the first row of the header, we showed the names of the three independent variables. Second, we changed the wording of the table title to …the proportion of s-genitive responses out of all s-genitive and of-genitive responses... Third, we added a column to the right of the table that reports the structure repetition effect in each head noun condition (i.e., the proportion of s-genitives in the s-genitive condition minus that in the of-genitive condition). >>

#3 There are some drastic changes between different versions of lme4, so I suggest you report the version of the package that you used.

<<Authors’ response: We have reported the version of lme4 used in the analysis (p21).>>

#4 pg 29, line 292-293: The authors write "structural priming effect may persist over at least one intervening task, irrespective of what type of interference is employed." I do not think this is a fair statement; the authors have tested one type of secondary task (addition math problems); they don’t have evidence how any other type of task will or will not interfere.

<< Authors’ response: We have deleted this statement.>>

#5 The authors reference a condition called "Lexical Overlap" and "Overlap Condition" (pg. 36, line 439 and continuing on that page). Is this supposed to be “Head Noun Condition” or is this a new contrast they have introduced?

<<Authors’ response: We have unified the terms used in the manuscript.>>

#6 The authors write (pg. 29, lines 281-283) that the "Inverse of Bayes factor" is 0.023 or 0.020. Do they really mean “inverse”? If that’s accurate, that means the Bayes factor is around 50 (1/0.020), which is very strong evidence in favor of H1 (cf Jeffreys 1961).

<< Authors’ response: Indeed, the inverse of the Bayes factor (BF10) here refers to 1 divided by the Bayes factor (BF01). It has been argued that the larger the inverse of the Bayes factor, the stronger the evidence in support of the alternative hypothesis (Raftery, 1995). That is why we followed the practice in the previous literature and reported BF10 in the cross-experiment analyses. However, given that we predicted no difference in the magnitudes of the interactions between the two experiments, it is reasonable to report only the Bayes factor (BF01) in the Bayesian analyses. So in the newest version of the manuscript, we decided to report the Bayes factor (BF01) instead of its inverse (BF10).

We have made it clear that the Bayes factors reported in the manuscript are exclusively BF01, and we interpret BF01 only in terms of the extent to which it supports or opposes the null hypothesis. This way, we hope to make the reporting of Bayes factors more consistent.>>
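As a sketch of how the two quantities relate under the BIC-based approximation referred to in the manuscript, with m0 and m1 as hypothetical nested model fits without and with the effect of interest:

    # Sketch only: m0 and m1 are hypothetical nested model fits.
    bf01 <- exp((BIC(m1) - BIC(m0)) / 2)   # evidence for H0 relative to H1
    bf10 <- 1 / bf01                       # its inverse: evidence for H1 relative to H0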

#7 The authors provide a short response to my previous question about whether the math problems provide enough of a cognitive load to disrupt processing in their Response letter. However, I didn’t see this in the manuscript itself. While I appreciate the personalized response, it would be useful if other readers could see this discussion as well.

<< Authors’ response: We have added text to the manuscript that justifies the design of the secondary task (p13, p35, and p47).>>

MINOR WORDING EDITS:

#8 pg. 29, line 291-292: “suggest that THE structural priming effect may persist”

#9 pg. 109, line 326-327: “One unexpected result in the priming experiments was a negative correlation between the critical trial number and the likelihood of s-genitive production.” --> “priming experiment” [singular], as you only found this negative correlation in Exp 1a, not 1b.

<<Author’s response: We have corrected all the wording errors. Thank you for pointing them out.>>

Attachment

Submitted filename: Response to Reviewers for Second Revision _Zhang et al..docx

Decision Letter 2

Michael P Kaschak

22 Sep 2020

PONE-D-20-04033R2

The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production

PLOS ONE

Dear Dr. Zhang,

I am writing with regard to your manuscript, "The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production" (PONE-D-20-04033R2). I have received comments from the reviewer who handled your last submission of the manuscript, and I have reviewed your manuscript myself. The reviewer and I find that you have done a good job of handling the majority of the concerns raised in the last round of reviews, leaving only a few relatively straightforward concerns to address.

The main point to address comes from the reviewer, who notes that there appears to be an inconsistency between the sign of the effects listed in your tables and your stated procedure for defining a reference level for the comparisons. Please double check this to make sure that the results are presented correctly. Beyond this, there are a few other smaller concerns to address. 

Please submit your revised manuscript by Nov 06 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Mike Kaschak

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have largely addressed my statistical comments on the previous version of the manuscript. There remain a few concerns but most of them are minor.

STATISTICS/ANALYSIS COMMENTS:

1. Table 2 lists the fixed effects estimates for Exp 1a with the caption, “Prime condition (S-genitive as the baseline level), head noun condition (Same Head Noun as the baseline level), problem difficulty (Easy Problem as the baseline level) were in mean-centered form.” But the estimates and z-scores listed in the table all seem to have the wrong sign. If S-genitive is coded as the baseline (reference) level, then the estimate shows the change from S-genitive to of-genitive prime. Subjects produced more S-genitives following S-genitive primes than of-genitive primes (see Table 1); that means the change from S-gen prime to of-gen prime is negative. But the estimate listed in Table 2 for Prime Condition is positive. There is a similar problem for Head Noun Condition – more S-genitives were produced in the Same than the Different Head Noun Condition, and Same was coded as the baseline level. That means the estimate for Head Noun Condition should be negative, but it is positive. This backwards-sign problem appears to be the case for all of the models reported in the paper (the four experiments plus the cross-experiment analyses). (Though perhaps not for the effect of critical trial number, which is continuous rather than categorical.)

This speaks to some underlying error that I’m having trouble diagnosing. I don’t know if the error comes about because of (a) type-os in the text of inverting all of the signs, (b) miscoding your factors in the code and analysis, or (c) me misunderstanding what you did (in which case it should be clearer because I am trying really hard to figure it out). But whatever is causing this problem, the output numbers don’t make sense.

2. The authors write, “Based on the standard interpretation of Bayes factors as evidence for null hypotheses [65], BF01 that ranges from 1 to 3 can be taken as weak evidence for the null hypothesis. The higher a BF01, the more evidence in support of the null hypothesis (3-20: positive evidence; 20-150: strong evidence; > 150: very strong evidence).”

I’ve always seen Bayes Factor for experimental hypothesis testing written as strength of evidence in favor of the Alternative (H1), rather than evidence in favor of the Null (H0). For example, as noted in [Kass, R.E. and Raftery, A. (1995) Bayes Factors, Journal of the American Statistical Association, 90: 773-795.], “When comparing results with standard likelihood ratio tests, it is convenient to instead put the null hypothesis in the denominator and thus use B10 as the Bayes factor.” Additionally, Jeffreys (1961), where Bayes Factor was originally proposed, lists BF values as strength of evidence for H1, so numbers >1 show evidence in favor of H1 and numbers <1 show evidence in favor of H0. Anyway, you do write explicitly that you’re using BF01 so I suppose that’s alright, but I suspect it will be confusing to readers who see a decimal number and yet the interpretation is that the evidence favors H1 (as I was on the previous version of the manuscript).

3. There is no report of all of the raw condition means in the paper, which makes interpreting some of your results difficult or impossible. For example, in Exp 2a, there is a main effect of Problem Difficulty, but I don’t know how that is actually manifest in the data. The text lists the p-value but no condition means. Table 5 lists the estimate which ostensibly tells me the direction and magnitude of the effect in standardized terms, (though see earlier point regarding sign) but no raw values. Table 1 lists raw values but only for Prime Condition x Head Noun Condition, collapsing across Problem Difficulty. The figures only show Head Noun Condition x Problem Difficulty because the DV is the Prime Condition difference score. So there’s a main effect of Problem Difficulty and I have no idea what that actually looks like in terms of raw condition means.

MINOR COMMENTS:

4. The code to normalize the key press RT and trial number variables is missing from the model selection code so it doesn’t run as-is. (You can leave it if you want but just so you know that I had to modify your code to run it.)

5. There are quite a number of type-os, extra or missing words, and grammatical errors throughout the text. I list a few here but this is non-exhaustive (just the ones I happened to notice) and I recommend careful proof-reading.

-pg. 22: “This is because that by using contrastive coding, the fixed effects of the model are informative about the main effects and the interactions” --> remove “that”

-pg. 29: “Similar to Experiment 1a, there was also a main effect of the head noun condition (pz < .001), which might also BE due to that the head noun overlap effect…” --> add “be”, remove “that”

-pg. 32: “In addition, the theoretically interestING interactions that involved the contrast between the two experiments were examined by estimating Bayes factors (BF01) using Bayesian Information Criteria.” --> add “ing”

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0240909. doi: 10.1371/journal.pone.0240909.r006

Author response to Decision Letter 2


1 Oct 2020

Reviewer #1: The authors have largely addressed my statistical comments on the previous version of the manuscript. There remain a few concerns but most of them are minor.

STATISTICS/ANALYSIS COMMENTS:

1. Table 2 lists the fixed effects estimates for Exp 1a with the caption, “Prime condition (S-genitive as the baseline level), head noun condition (Same Head Noun as the baseline level), problem difficulty (Easy Problem as the baseline level) were in mean-centered form.” But the estimates and z-scores listed in the table all seem to have the wrong sign. If S-genitive is coded as the baseline (reference) level, then the estimate shows the change from S-genitive to of-genitive prime. Subjects produced more S-genitives following S-genitive primes than of-genitive primes (see Table 1); that means the change from S-gen prime to of-gen prime is negative. But the estimate listed in Table 2 for Prime Condition is positive. There is a similar problem for Head Noun Condition – more S-genitives were produced in the Same than the Different Head Noun Condition, and Same was coded as the baseline level. That means the estimate for Head Noun Condition should be negative, but it is positive. This backwards-sign problem appears to be the case for all of the models reported in the paper (the four experiments plus the cross-experiment analyses). (Though perhaps not for the effect of critical trial number, which is continuous rather than categorical.)

This speaks to some underlying error that I’m having trouble diagnosing. I don’t know if the error comes about because of (a) type-os in the text of inverting all of the signs, (b) miscoding your factors in the code and analysis, or (c) me misunderstanding what you did (in which case it should be clearer because I am trying really hard to figure it out). But whatever is causing this problem, the output numbers don’t make sense.

<< Author’s response: Thank you so much for pointing out this problem. The signs in Table 2 were not wrong; rather, we made typos in the table captions. The levels that we coded as -0.5 (Of-genitive, Different Head Noun, and Difficult Problem) were supposed to be named as the reference levels. However, we mistakenly reported the levels coded as 0.5 (S-genitive, Same Head Noun, and Easy Problem) as the reference levels. We are very sorry for this unfortunate mistake. We have now corrected the naming of the reference levels in Tables 2-7.>>
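For clarity, a minimal sketch of this ±0.5 coding with a hypothetical data frame d and hypothetical column names; under this coding the level assigned -0.5 acts as the reference, so a positive fixed-effect estimate indicates more s-genitive responses at the level coded +0.5:

    # Sketch only: hypothetical data frame and column names.
    d$prime_c <- ifelse(d$prime_condition     == "s-genitive", 0.5, -0.5)  # of-genitive = reference
    d$noun_c  <- ifelse(d$head_noun_condition == "same",       0.5, -0.5)  # Different Head Noun = reference
    d$load_c  <- ifelse(d$problem_difficulty  == "easy",       0.5, -0.5)  # Difficult Problem = reference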

2. The authors write, “Based on the standard interpretation of Bayes factors as evidence for null hypotheses [65], BF01 that ranges from 1 to 3 can be taken as weak evidence for the null hypothesis. The higher a BF01, the more evidence in support of the null hypothesis (3-20: positive evidence; 20-150: strong evidence; > 150: very strong evidence).”

I’ve always seen Bayes Factor for experimental hypothesis testing written as strength of evidence in favor of the Alternative (H1), rather than evidence in favor of the Null (H0). For example, as noted in [Kass, R.E. and Raftery, A. (1995) Bayes Factors, Journal of the American Statistical Association, 90: 773-795.], “When comparing results with standard likelihood ratio tests, it is convenient to instead put the null hypothesis in the denominator and thus use B10 as the Bayes factor.” Additionally, Jeffreys (1961), where Bayes Factor was originally proposed, lists BF values as strength of evidence for H1, so numbers >1 show evidence in favor of H1 and numbers <1 show evidence in favor of H0. Anyway, you do write explicitly that you’re using BF01 so I suppose that’s alright, but I suspect it will be confusing to readers who see a decimal number and yet the interpretation is that the evidence favors H1 (as I was on the previous version of the manuscript).

<< Author’s response: Thanks for the suggestion. We have modified the reporting of Bayes factors so that it now focuses on BF10.>>

3. There is no report of all of the raw condition means in the paper, which makes interpreting some of your results difficult or impossible. For example, in Exp 2a, there is a main effect of Problem Difficulty, but I don’t know how that is actually manifest in the data. The text lists the p-value but no condition means. Table 5 lists the estimate which ostensibly tells me the direction and magnitude of the effect in standardized terms, (though see earlier point regarding sign) but no raw values. Table 1 lists raw values but only for Prime Condition x Head Noun Condition, collapsing across Problem Difficulty. The figures only show Head Noun Condition x Problem Difficulty because the DV is the Prime Condition difference score. So there’s a main effect of Problem Difficulty and I have no idea what that actually looks like in terms of raw condition means.

<< Author’s response: We have added all of the raw condition means to the Results sections.>>

MINOR COMMENTS:

4. The code to normalize the key press RT and trial number variables is missing from the model selection code so it doesn’t run as-is. (You can leave it if you want but just so you know that I had to modify your code to run it.)

<< Author’s response: We have added the normalization of the key-press RT and trial-number variables to the model selection scripts.>>
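For instance, a generic sketch of such normalization with hypothetical column names (not the authors' code):

    # z-transform the key-press RT and trial-number predictors (mean 0, SD 1).
    d$rt_z    <- as.numeric(scale(d$keypress_rt))
    d$trial_z <- as.numeric(scale(d$trial_number))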

5. There are quite a number of type-os, extra or missing words, and grammatical errors throughout the text. I list a few here but this is non-exhaustive (just the ones I happened to notice) and I recommend careful proof-reading.

-pg. 22: “This is because that by using contrastive coding, the fixed effects of the model are informative about the main effects and the interactions” --> remove “that”

-pg. 29: “Similar to Experiment 1a, there was also a main effect of the head noun condition (pz < .001), which might also BE due to that the head noun overlap effect…” --> add “be”, remove “that”

-pg. 32: “In addition, the theoretically interestING interactions that involved the contrast between the two experiments were examined by estimating Bayes factors (BF01) using Bayesian Information Criteria.” --> add “ing”

<< Author’s response: We have corrected all the grammatical errors mentioned above, as well as other errors we found in the text. >>

Decision Letter 3

Michael P Kaschak

6 Oct 2020

The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production

PONE-D-20-04033R3

Dear Dr. Zhang,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Michael Kaschak

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Michael P Kaschak

12 Oct 2020

PONE-D-20-04033R3

The role of explicit memory in syntactic persistence: effects of lexical cueing and load on sentence memory and sentence production

Dear Dr. Zhang:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Michael P. Kaschak

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix. Summary of fixed effects in LME models in the subset analyses of Experiment 1a-b and 2a-b.

    (DOCX)

    S2 Appendix. Primes and targets used in each experiment.

    The description given in the first line depicts the content of the target picture. The possessor of the colored object and the object that is owned are mentioned. In the following lines, the s-genitive (a) and the of-genitive (b) primes are given in Dutch. In each prime sentence, the noun in the Same Head Noun condition is mentioned before the slash and the noun in the Different Head Noun condition after the slash.

    (DOCX)

    S3 Appendix. Arithmetic problems used in each experiment.

    The appendix displays each problem with the addends ordered such that the first addend is always larger than the second. In the reported experiments, the order of the addends was counterbalanced.

    (DOCX)

    Attachment

    Submitted filename: Response to reviewers _Zhang et al..docx

    Attachment

    Submitted filename: Response to Reviewers for Second Revision _Zhang et al..docx

    Data Availability Statement

    All the relevant data and scripts are available from the Open Science Framework (DOI https://doi.org/10.17605/OSF.IO/6UTKF).

