PLOS ONE. 2022 Mar 25;17(3):e0264265. doi: 10.1371/journal.pone.0264265

Mind the gender gap: COVID-19 lockdown effects on gender differences in preprint submissions

Iñaki Ucar 1,☯,*, Margarita Torre 2, Antonio Elías 3
Editor: Ali B Mahmoud 4
PMCID: PMC8956178  PMID: 35333874

Abstract

The gender gap is a well-known problem in academia and, despite its gradual narrowing, recent estimations indicate that it will persist for decades. Short-term descriptive studies suggest that this gap may have actually worsened during the months of confinement following the start of the COVID-19 pandemic in 2020. In this work, we evaluate the impact of the COVID-19 lockdown on female and male academics’ research productivity using preprint drop-off data. We examine a total of 307,902 unique research articles deposited in 5 major preprint repositories during the period between January and May each year from 2017 to 2020. We find that the proportion of female authors in online repositories steadily increased over time; however, the trend reversed during the confinement and gender parity worsened in two respects. First, the proportion of male authors in preprints increased significantly during lockdown. Second, the proportion of male authors in COVID-19-related articles was significantly higher than that of women. Overall, our results imply that the gender gap in academia suffered an approximately 1-year setback during the strict lockdown months of 2020, and COVID-related research areas suffered an additional 1.5-year setback.

Introduction

The under-representation of women in scientific publications is well established in the literature. Despite their growing presence in all research areas, women continue to publish less than men [1–3], even in fields where they are not a minority [4]. In addition, women are much less likely to be included as first or last authors in an article [3, 5]. According to recent estimates by Holman and his colleagues [6], a significant gender gap will persist for decades, especially in the areas of computer science, physics, and mathematics. As a result, women are less likely than men to be granted tenure or promoted [1, 7, 8].

How has the gender gap in scientific production changed during the global lockdown? And how might these changes affect the career paths of men and women in the short and medium term? Here, we build on recent research to argue that the move towards gender parity could slow down as a result of COVID-19. On the one hand, surveys examining time allocation and research production during the pandemic suggest that research productivity has decreased more for women, particularly those with young children [9–11]. On the other hand, research funds are being redirected to support coronavirus-related studies, even at the cost of other less cutting-edge topics. Those with the ability to initiate COVID-related research projects will be more likely to benefit from these new lines of funding. If the majority of scholars conducting such novel research are men, then they will have gained a further advantage in the coming years. All in all, lower levels of scientific production during confinement could be detrimental to women, particularly those in early career stages [12], who may see diminished promotion possibilities or even risk losing their jobs.

Assessing the magnitude and scope of the new gap in scientific production is, therefore, crucial to designing and implementing effective actions that will prevent a backslide in academic gender equity. With this aim, we model the evolution of the gender gap in preprint submissions from January 2017 to May 2020 to measure the impact of COVID-19 on male and female scientific productivity. Specifically, we examine a total of 307,902 unique research articles deposited in 5 major repositories: arXiv, medRxiv, bioRxiv, PsyArXiv, and SocArXiv. A preprint is a full draft article that is shared publicly before it has been peer reviewed. Preprints offer strong benefits, such as the possibility of receiving feedback and increased visibility, which often results in a higher number of citations. In addition, during the COVID-19 pandemic, they have enabled researchers to share data and essential findings at unprecedented speeds. We aim to quantify the effect of confinement on the likelihood of men and women to publish a preprint and, more specifically, to publish COVID-related research.

Our research contributes to prior studies in several ways. First, unlike previous research focusing on a very limited period [13–15], we examine preprint submission trends from 2017 to 2020. By expanding the observation window, we are able to discern how much of the observed change in 2020 corresponds to the effect of lockdown and social distancing and how much is due to gender differences in the circulation of scientific knowledge over time [16–18]. Second, unlike previous studies, we provide a systematic analysis of academic fields. Some scholars have examined very broad areas of knowledge, such as mathematics, physics, or economics [13–15], while others have focused on journals on a very specific topic, such as medicine [19, 20] or biological sciences [21]. Our study seeks to fill this gap by covering a total of 10 academic fields and 250 sub-fields. Thus, we provide a comprehensive view of the pandemic’s impact on scientific productivity. Also, as we will see, this level of granularity is crucial to avoid incurring Simpson’s paradox, where group trends disappear or reverse when data is aggregated. Third, our model discriminates between COVID-19-related research and general research. While the coronavirus has brought many challenges to academic research, it has also created opportunities for research moving forward. Therefore, it is key to look at who is taking advantage of these new opportunities.

Finally, we pay detailed attention to authorship order. Given the growing tendency across scientific disciplines to write multi-author papers, the sequence of names is becoming a major topic in recruitment processes, promotion, and tenure [3, 22, 23]. Thus, we not only examine whether gendered patterns of authorship vary after lockdown among all authors but also among solo authors, first authors, and last authors. It is common practice across many areas that the first author contributes most to the work and receives the most credit. Therefore, given the asymmetric share of domestic responsibilities women have assumed during confinement [24–26], we expect a decrease in females listed as first authors and as solo authors, the two positions requiring the most intensive research work. This decline might be particularly noticeable in areas of knowledge where women have recently joined, since young female academics are more likely to have kids and experience the caregiver burden. As for the other authorship positions, the pre-lockdown expectations are not so clear. In some disciplines, the last author position is reserved for the senior author (first-last-author-emphasis norm, or FLAE), while in other fields the author sequence reflects their relative contributions to the manuscript (sequence-determines-credit approach, or SDC) [23]. Consequently, variations in gender composition after lockdown cannot be easily anticipated.

Data and methods

The complete dataset is publicly available on Zenodo [27], where the software repository containing the replication scripts is linked as supplementary materials.

Data collection and integration

Using their APIs, we collected public data from the top 5 preprint servers in terms of submissions with availability since January 2017 (namely, arXiv, bioRxiv, medRxiv, PsyArXiv, and SocArXiv). We used R [28], and, specifically, the following packages: aRxiv [29], medrxivr [30] and osfr [31]. We downloaded all submission records from January to May for the years 2017, 2018, 2019, and 2020. The arXiv API returns the date of the last update for preprints with more than one submission. For the other repositories, the date is defined as the first posting date. Specifically, we collected information on full author names, order of appearance, publishing date, categories, subcategories, article title, abstract, and keywords. We restricted the analysis to the March-May period, when the lockdown was considerably uniform among countries. When contagions dropped during June and July, internal and external border restrictions were relaxed. However, the restrictions were lifted unevenly and the confinement situation became more heterogeneous across countries and continents, potentially skewing the data. In addition, focusing on the short-term period allows us to capture gender differences when researchers are put under time pressure and must carry out their work in challenging circumstances.

Preprint categorization

Different preprint servers require different approaches to preprint categorization. On the one hand, the arXiv repository consists of 8 categories (such as Computer Science, Mathematics, or Physics) and a reduced set of subcategories for large subject areas within those main categories. bioRxiv and medRxiv follow a similar approach for the Biology and Health Sciences categories, respectively. On the other hand, PsyArXiv and SocArXiv (Psychology and Social Sciences) allow the authors to freely tag submissions with a (potentially unlimited) number of areas, sub-areas, and even more specific fields of study. As a result, compared to the rest of the repositories, PsyArXiv and SocArXiv contain a large number of subcategories for a relatively small number of preprints, as shown in S1 Table.

Thus, to better balance our categorization, we pre-processed PsyArXiv and SocArXiv data to consider just the preprint’s main subcategory. To identify this main subcategory, we first sorted all the unique tags in descending order based on the number of papers using them. The main subcategory for each preprint was then defined as the first appearance in the previous list, and the rest of the tags were removed. We also discarded those subcategories with fewer than 100 preprints, and manually recoded some SocArXiv subcategories that were still too specific (see S2 Table). Finally, arXiv categories “Quantitative Biology” and “Quantitative Finance” were recoded and merged into Biology and Economics, respectively, and the Psychology subcategory contained in SocArXiv was merged into the Psychology category.
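The tag-reduction step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' replication code (which is written in R): the function name `main_subcategories`, the input layout (a dict from preprint id to a list of tags), the alphabetical tie-breaking, and the choice to apply the 100-preprint threshold after the main subcategory has been assigned are all hypothetical.

```python
from collections import Counter


def main_subcategories(preprint_tags, min_preprints=100):
    """Map each preprint id to its main subcategory, or None if discarded.

    preprint_tags: dict of preprint id -> list of free-form tags (hypothetical layout).
    """
    # Count how many preprints use each unique tag.
    usage = Counter(tag for tags in preprint_tags.values() for tag in set(tags))
    # Rank tags by descending usage; ties are broken alphabetically (an assumption).
    ranking = sorted(usage, key=lambda tag: (-usage[tag], tag))
    rank = {tag: position for position, tag in enumerate(ranking)}
    # The main subcategory is the preprint's tag that appears first in the ranking.
    main = {
        pid: (min(tags, key=rank.__getitem__) if tags else None)
        for pid, tags in preprint_tags.items()
    }
    # Drop subcategories that end up with fewer than min_preprints preprints.
    sizes = Counter(tag for tag in main.values() if tag is not None)
    return {
        pid: (tag if tag is not None and sizes[tag] >= min_preprints else None)
        for pid, tag in main.items()
    }
```

With a toy input such as `{"p1": ["Social Psychology", "Emotion"], "p2": ["Emotion"]}` and `min_preprints=1`, both preprints would be assigned to "Emotion", the more widely used tag.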

The summary after this pre-processing is shown in Table 1. While public repositories have gained popularity in all fields of knowledge, the table reveals that significant differences among disciplines persist. Physics, Computer Science, and Mathematics are the three areas with the highest number of submissions. At the other extreme are Economics, Social Science, and Psychology. These figures are consistent with previous research showing that journals from STEM disciplines have clearer policies regarding preprinting than journals from the Social Sciences and Humanities, which could affect authors’ decisions to submit their work to open-access repositories [32].

Table 1. Number of subcategories and preprints per category after pre-processing.

In the arXiv repository, preprints are sometimes cross-tagged in several categories. As a result, the number of unique preprints is 307,902.

| Category | # subcategories | # preprints |
|---|---|---|
| Biology | 34 | 44454 |
| Computer Science | 40 | 79653 |
| Economics | 12 | 2869 |
| Elec. Eng. Systems Science | 4 | 10686 |
| Health Sciences | 52 | 6100 |
| Mathematics | 32 | 78187 |
| Physics | 51 | 116983 |
| Psychology | 14 | 4120 |
| Social Sciences | 5 | 1938 |
| Statistics | 6 | 22040 |
| Total | 250 | 367030 |

Next, a data quality assessment was conducted to detect and remove possible inconsistencies. First, we processed full names to remove stop words, places, and institutions, so that, for example, an author’s institution did not register as an additional author. Second, we removed articles with inconsistent dates. Finally, because in some repositories supplementary materials appeared as an additional posting, we removed them, as well as preprints that were marked as withdrawn from the repository.

Inferring gender from authors’ given names

We used the genderize.io database to assign gender to authors based on their first names. This is one of the most effective gender prediction tools [33] and has been widely used in the literature ([6], among others). One of the most important advantages of using name-to-gender inference services is that, compared to standard approaches for name-to-gender inference based on administrative data (census data, administration records, or country-specific birth lists), they allow for a robust prediction for names from countries all over the world (see [34] for an evaluation of different web services). Our choice, Genderize, is a database of name–gender associations assembled from all over the web (> 114M given names for approximately 80 countries as of January 2021), and thus is a good option for analyses on data outside of a national context. Gender data was collected via Genderize’s API using the genderizeR R package [35] for a total of 1,235,037 unique authors.

In addition to predicting gender for a given name, Genderize returns additional information to quantify the precision of such predictions, namely, count and probability. The count shows how many instances in the database associate a given first name with the predicted gender, and the probability corresponds to the proportion, or frequency, of such associations. Unfortunately, gender cannot be predicted when the authors’ given names were written as initials or were absent from the Genderize database, and these instances were reported as missing cases. In total, the Genderize API reported 28% missing cases. Additionally, following Holman and colleagues [6], we only consider gender identifications with a probability greater than or equal to 0.95 and a count of at least 10 appearances in the Genderize database. This simple procedure preserved 80% of cases. After filtering out missing cases, the proportion of preprints included in the analyses was equivalent for all the years (see Table 2), indicating that this procedure did not introduce any under- or over-representation bias.
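The acceptance rule above can be written as a small filter. The sketch below is illustrative only: the function name `accept_gender` is hypothetical, and the dictionary keys (`gender`, `probability`, `count`) are assumed to mirror the fields named in the text, so they should be checked against the actual API response before use.

```python
def accept_gender(prediction, min_probability=0.95, min_count=10):
    """Return the predicted gender if it passes both thresholds, else None.

    prediction: dict with the assumed keys 'gender', 'probability' and 'count'.
    A missing or null gender (initials, unknown names) is a missing case.
    """
    gender = prediction.get("gender")
    if gender is None:
        return None
    confident = prediction.get("probability", 0.0) >= min_probability
    frequent = prediction.get("count", 0) >= min_count
    return gender if confident and frequent else None
```

Predictions that fail either threshold are treated here exactly like names that could not be resolved, which is one possible reading of how the filtered cases enter the later missing-value handling.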

Table 2. Preprints considered per year and model.

Proportion (p) and number (N) of preprints included in the analyses for each year and model after filtering out missing cases.

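| Model | 2017 p | 2017 N | 2018 p | 2018 N | 2019 p | 2019 N | 2020 p | 2020 N |
|---|---|---|---|---|---|---|---|---|
| all authors | 0.76 | 39843 | 0.78 | 49623 | 0.80 | 60939 | 0.79 | 91030 |
| first author | 0.66 | 13941 | 0.65 | 19002 | 0.65 | 24857 | 0.64 | 37993 |
| last author | 0.74 | 15379 | 0.73 | 21252 | 0.73 | 27765 | 0.73 | 43014 |
| single author | 0.78 | 6605 | 0.77 | 7010 | 0.77 | 7328 | 0.76 | 9965 |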

Measuring the effect of lockdown

We are interested in estimating the gender gap in preprint submissions and measuring how much of this gap can be attributed to the global lockdown. Previous research examining gender differences in publications has mostly used generalized linear models (GLMs) to estimate the gender proportion and its rate of change ([6, 36], among others). However, this approach takes the averages of the individual-level variables, discarding valuable within-group information that may reveal opposing trends. A potential alternative would be to disaggregate and introduce categories and subcategories as fixed effects in the GLM design, but this would violate the assumption of independence of the observations, thus biasing the results. To account for this drawback, we employ a hierarchical (or multilevel) GLM, which explicitly models the nested nature of this data.
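A toy calculation may help to see why pooling across categories can hide or reverse within-group trends. The counts below are invented purely for illustration and are not taken from the dataset; the category labels "A" and "B" and the helper `male_share` are hypothetical.

```python
# Hypothetical (males, total authors) counts, invented for illustration only.
COUNTS = {
    ("A", "general"): (900, 1000),  # strongly male-dominated category
    ("A", "covid"): (19, 20),       # few COVID papers in category A
    ("B", "general"): (600, 1000),  # more balanced category
    ("B", "covid"): (130, 200),     # most COVID papers fall in category B
}


def male_share(kind, category=None):
    """Share of male authors for one paper kind, pooled or within one category."""
    rows = [
        counts
        for (cat, paper_kind), counts in COUNTS.items()
        if paper_kind == kind and category in (None, cat)
    ]
    males = sum(m for m, _ in rows)
    total = sum(t for _, t in rows)
    return males / total


for category in ("A", "B"):
    print(category, male_share("general", category), male_share("covid", category))
print("pooled", male_share("general"), male_share("covid"))
```

Within each category the male share is higher for COVID papers (19/20 versus 900/1000 in A, 130/200 versus 600/1000 in B), yet the pooled male share is lower for COVID papers (149/220 versus 1500/2000) because those papers concentrate in the more balanced category. A model that ignores the grouping would therefore estimate an effect with the wrong sign.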

More concretely, we define a fractional hierarchical GLM model that captures the proportion of males as a function of time and measures the effect of the lockdown and type of research (directly related to COVID-19 versus not directly related), with a random intercept per category and subcategory. We consider the quasi-binomial family to describe the error distribution (to account for overdispersion) with the logit link function:

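$$
\begin{aligned}
\operatorname{logit}(p_{\text{male},i}) &= \beta_0 + \alpha_{0,k[j[i]]} + \beta_1 \cdot \text{year} + \beta_2 \cdot \text{lockdown} + \beta_3 \cdot \text{COVIDpaper} + \epsilon_i \\
\alpha_{0,k} &= \alpha_{1,j[i]} \cdot \text{category} + \eta_k \\
\alpha_{1,j} &= \alpha_{2,i} \cdot \text{subcategory} + \gamma_j
\end{aligned}
\tag{1}
$$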

where i, j, and k index observations, subcategories, and categories, respectively; the terms ϵi, γj, and ηk are normal errors at the individual and cluster levels; the response variable pmale is the proportion of males; ‘year’ is a continuous variable that takes a value of 0 at the start of our time window (i.e., 2017 → 0, 2018 → 1 and so on); ‘lockdown’ is a binary factor that is equal to 1 during the lockdown period, from March to May 2020; ‘COVID paper’ is a binary factor that is equal to 1 for preprints directly related to COVID-19, defined as those preprints containing “coronavirus”, “sars-cov-2,” or “covid-19” in their title (restricted to 2020 and with case-insensitive matching); and where we consider a random intercept that varies across categories and subcategories inside categories.
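The three fixed-effect covariates defined above can be encoded as in the following sketch. The function name `encode_covariates` and its signature are hypothetical; only the rules themselves (year offset from 2017, lockdown from March to May 2020, case-insensitive title matching restricted to 2020) come from the definitions in the text.

```python
import re

# Terms taken from the definition above; matching is case-insensitive.
COVID_PATTERN = re.compile(r"coronavirus|sars-cov-2|covid-19", re.IGNORECASE)


def encode_covariates(title, year, month):
    """Return the 'year', 'lockdown' and 'covid_paper' covariates for one preprint."""
    return {
        # Continuous time variable: 2017 -> 0, 2018 -> 1, and so on.
        "year": year - 2017,
        # Binary factor: 1 only for March to May 2020.
        "lockdown": int(year == 2020 and 3 <= month <= 5),
        # Binary factor: 1 only for 2020 preprints whose title matches.
        "covid_paper": int(year == 2020 and COVID_PATTERN.search(title) is not None),
    }
```

For example, a title containing "Coronavirus" posted in 2018 would not be flagged, because the definition restricts the factor to 2020.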

Within this framework, we consider four distinct models. Namely, we estimate the monthly proportion of males for (1) all authors, (2) first authors, (3) last authors, and (4) single authors. In all cases, pmale is computed as the total number of males over the total number of males and females identified per month, excluding missing values. In the case of all authors, preprints with missing gender rates greater than 25% are not considered, and subcategories with fewer than 30 authors per month are dropped too. In the case of first and last authors, preprints with an alphabetically-ordered list of authors are discarded, as alphabetical sequence is frequently used to acknowledge similar contributions or to avoid disharmony among collaborating groups [23]. In the case of single authors, only preprints with one author are considered.
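The per-preprint filters and the response variable can be sketched with three small helpers. All names here are hypothetical, and two details are assumptions rather than facts from the text: the alphabetical check compares surnames case-insensitively, and single-author lists are never treated as alphabetically ordered.

```python
def is_alphabetical(surnames):
    """True when a multi-author list is sorted by surname (case-insensitive)."""
    keys = [name.casefold() for name in surnames]
    return len(keys) > 1 and keys == sorted(keys)


def missing_rate(genders):
    """Fraction of authors whose gender could not be identified (None)."""
    return sum(g is None for g in genders) / len(genders)


def proportion_male(genders):
    """Males over males plus females, ignoring missing values."""
    males = sum(g == "male" for g in genders)
    females = sum(g == "female" for g in genders)
    if males + females == 0:
        return None  # no identified authors in this group
    return males / (males + females)
```

Under this reading, a preprint would enter the all-authors model only if `missing_rate(genders) <= 0.25`, and the first- and last-author models only if `is_alphabetical(surnames)` is false. The monthly response would then be `proportion_male` applied to all retained authors of a subcategory in that month.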

Results and discussion

Gender trends in preprints submission over time

Global numbers in Fig 1 show that the trend of preprint submissions has accelerated notably over the previous three years, and especially during lockdown. This effect is particularly pronounced in fields where COVID-related production is more likely—such as biology, health sciences (vaccines, epidemiology, etc.), and mathematics (epidemiological models)—but is also clearly evident in computer science, economics, engineering, physics, and psychology. Also, the time trend in the social sciences and economics is less constant than in the other areas. This is largely due to the lower number of submissions registered in these fields. As we have seen in Table 1, both social sciences and economics rank at the bottom in terms of the number of preprints received, with N = 1938 and N = 2869, respectively, far behind other areas such as physics (N = 116983), computer science (N = 79653), and mathematics (N = 78187). In spite of these variations, results suggest that research in all areas has been very prolific during 2020, particularly during the lockdown months.

Fig 1. Number of submissions per month.

The first facet (all) shows the global number of preprint submissions per month in all repositories during the period considered. Subsequent facets break down these numbers per category with varying scales for the vertical axis.

Next, Fig 2 displays the male proportion by category. In general terms, we do observe a slowly declining global trend. This is mainly driven by categories that are already more feminized than the average, such as biology, health sciences, and psychology. As for the rest of the categories, the gender gap has remained rather stable during the period considered, in consonance with previous findings [6]. The social sciences are the only exception to the general trend, showing a slight upswing in the proportion of males. As discussed above, this oddity might be related to a less frequent and irregular use of online servers in this particular field of study.

Fig 2. Proportion of male authors per month.

The first facet (all) shows the global proportion of males per month submitting to all repositories during the period considered. Subsequent facets break down these numbers per category.

Consistent with previous studies [14, 15], Fig 2 reveals a slowing down of the feminization process during the pandemic. This slowdown can be separately observed in biology and psychology, and the gender gap has even started to grow in health sciences and economics. However, it is difficult to conclude with certainty whether such an effect exists and, even if it does, whether it can be ultimately attributable to the lockdown period, as other authors have suggested from similar descriptive analyses. To better account for these trends, we next explore this issue in a hierarchical modelling framework.

Explaining the gender gap

We estimate the monthly proportion of males for (1) all authors, (2) first authors, (3) last authors, and (4) single authors, as described in the Methods section. Our model separates the temporal trend from the effect of the lockdown, controlling for COVID-related work, and uses random intercepts to take into account the hierarchical nature of the data, which is nested in categories and subcategories. Table 3 shows the coefficient estimates and main summary statistics for all the models considered.

Table 3. Regression analysis results.

Table of coefficients and summary statistics for the models considered.

Dependent variable: proportion of males. Standard errors in parentheses.

| | (0) all authors | (1) all authors | (2) first author | (3) last author | (4) single author |
|---|---|---|---|---|---|
| Model type | fractional GLM | fractional GLMM | fractional GLMM | fractional GLMM | fractional GLMM |
| year | −0.104*** (0.003) | −0.049*** (0.003) | −0.044*** (0.009) | −0.059*** (0.009) | −0.029 (0.018) |
| lockdown | 0.072*** (0.007) | 0.031*** (0.007) | 0.035* (0.020) | 0.008 (0.021) | 0.136*** (0.052) |
| COVID paper | −0.614*** (0.019) | 0.076*** (0.022) | 0.399*** (0.064) | 0.129** (0.065) | 0.715** (0.298) |
| (Intercept) | 1.826*** (0.006) | 1.595*** (0.199) | 1.520*** (0.195) | 1.819*** (0.173) | 2.480*** (0.169) |
| Observations | 3,368 | 3,368 | 3,996 | 4,027 | 3,517 |
| N (subcategory) | | 192 | 201 | 201 | 200 |
| N (category) | | 10 | 10 | 10 | 10 |
| sd(subcategory) | | 0.29 | 0.32 | 0.30 | 0.45 |
| sd(category) | | 0.62 | 0.60 | 0.53 | 0.47 |
| Log Likelihood | −39,496.310 | −11,327.900 | −7,182.127 | −6,955.634 | −3,788.911 |
| Akaike Inf. Crit. | 79,000.620 | 22,667.800 | 14,376.250 | 13,923.270 | 7,589.823 |
| Bayesian Inf. Crit. | | 22,704.540 | 14,414.010 | 13,961.070 | 7,626.815 |
| Pseudo-R2 | 0.04 | 0.83 | 0.24 | 0.18 | 0.11 |

Note: *p<0.1; **p<0.05; ***p<0.01

The use of a GLMM is justified by comparing model (1) with model (0), which is a GLM for the same data that does not account for this hierarchical structure. There are three reasons that lead us to opt for the GLMM. First, the GLMM achieves a much better fit and predictive power (83% of variance explained). Second, the model intercept correctly captures the overall average proportion of males at the beginning of our time window (∼83% of males as of January 2017, as can be seen in Fig 2). Finally, once we include categories and subcategories as random intercepts, we observe that the sign of the ‘COVIDpaper’ coefficient changes. This result reveals that disaggregation is necessary to avoid incurring Simpson’s paradox. Moreover, these random intercepts also correctly capture well-known differences among categories and subcategories. For example, we observe that STEM categories have a wider gender gap than the average, while non-STEM ones are much more feminized (see S1 Fig). Similarly, subcategories such as “high-energy” and “quantum physics” are more masculinized than the average for all physics, while astrophysics-related research as well as “bio-medical physics” are more feminized (see S2 Fig).
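The link between the intercept and the baseline proportion of males follows from inverting the logit link. The short sketch below applies the inverse logit to the intercepts reported in Table 3. It assumes that the fixed intercept can be read as the baseline for an average category and subcategory (random intercepts at zero, with year, lockdown, and COVID paper all equal to 0); the helper name `inv_logit` is ours.

```python
import math


def inv_logit(x):
    """Inverse of the logit link: maps a linear predictor to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


# Intercepts from Table 3.
baseline_glmm = inv_logit(1.595)  # model (1), about 0.83
baseline_glm = inv_logit(1.826)   # model (0), about 0.86
print(round(baseline_glmm, 2), round(baseline_glm, 2))
```

The GLMM intercept maps to roughly 83% of males, in line with the value quoted above, whereas the GLM intercept maps to roughly 86%.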

To facilitate model comparison, Fig 3 displays the relative effect size with 95% confidence intervals (CI) for the three fixed effects in models (1–4). Results can be summarized in three key points. First, findings confirm a slow but significant decreasing trend in the overall proportion of male authors (OR 0.95, 95% CI 0.95–0.96); this holds for first (OR 0.96, 95% CI 0.94–0.97) as well as last authors (OR 0.94, 95% CI 0.93–0.96). The reduced number of single authors, however, does not provide enough statistical power to measure such a small effect (should it exist), but the point estimate is consistent with the estimates for the rest of the models (OR 0.97, 95% CI 0.94–1.01).
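The odds ratios quoted in this section are the exponentiated logit-scale coefficients of Table 3. The sketch below reproduces them under the assumption of a normal (Wald) interval, estimate ± 1.96 standard errors; the text does not state which interval method was used, so this is an approximation, and the helper name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(estimate, std_error, z=1.96):
    """Return (OR, lower, upper) from a logit-scale estimate and its standard error."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


# Model (1), all authors: coefficients and standard errors from Table 3.
for name, est, se in [
    ("year", -0.049, 0.003),
    ("lockdown", 0.031, 0.007),
    ("COVID paper", 0.076, 0.022),
]:
    print(name, [round(value, 2) for value in odds_ratio(est, se)])
```

For model (1), this rounds to 0.95 (0.95–0.96) for the yearly trend, 1.03 (1.02–1.05) for lockdown, and 1.08 (1.03–1.13) for COVID papers, which agrees with the figures reported in the text.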

Fig 3. Relative effect sizes.

Odds ratio with 95% CIs for the fixed effects in all the models considered, from top to bottom: (1) all authors, (2) first authors, (3) last authors, and (4) single authors.

Second, we find that for all authors, there is a measurable lockdown effect that is slightly smaller than the yearly effect but with the opposite sign (OR 1.03, 95% CI 1.02–1.05): the lockdown has partially reversed the yearly decrease in the proportion of male authors that would be expected in 2020 given the trend from previous years. A very similar effect for first authors could exist, although the variability is higher and thus there is not enough statistical power to reach a stronger conclusion (OR 1.04, 95% CI 1.00–1.08). Regarding single authors, a statistically significant and potentially much stronger effect can be observed (OR 1.15, 95% CI 1.03–1.27). This can possibly be explained by the greater effort and time required to produce single-authored papers, which could negatively affect women, especially those with children [24–26]. Finally, no effect is found for last authors (OR 1.01, 95% CI 0.97–1.05). This may be due to the fact that this co-authorship position has different meanings across disciplines. While in disciplines following the first-last-author-emphasis norm, last positions tend to be reserved for senior researchers with consolidated careers (e.g., senior female authors), the last position in the sequence-determines-credit approach is more likely to correspond to less-contributing authors (e.g., young women with child-rearing responsibilities). It seems reasonable to think that the latter may have had more trouble juggling child care and research than the former. These factors might result in a lower average value, and otherwise larger variability, for last author effects.

Third, we find an additional masculinization effect in COVID-related preprints that is equal to or larger than the yearly feminization effect for all the models considered (all authors: OR 1.08, 95% CI 1.03–1.13; first author: OR 1.49, 95% CI 1.31–1.69; last author: OR 1.14, 95% CI 1.00–1.29; single author: OR 2.04, 95% CI 1.14–3.67). In other words, overall, female authors’ production has been penalized during lockdown compared to their male peers, but especially in those disciplines where increased productivity is directly linked to COVID-19 research. While the COVID-19 pandemic has resulted in unprecedented research opportunities worldwide, women have not benefited as much as men have. This is particularly noticeable among single authors and first authors, the two most time-consuming positions. As for the last authorship position, the relative effect size is lower but still significant.

Conclusion

In this paper, we examine how the global lockdown has affected the gender productivity gap in academia. More specifically, we model the evolution of the gender gap in preprint submissions between 2017 and 2020. Our findings show that the progress towards gender parity that has been observed over the past few years partially reversed during the lockdown period. The pandemic confinements penalized women in two ways. Not only were they less likely to complete research during the pandemic, but they were also less likely to produce COVID-related research than men, despite the increasing research opportunities that COVID-19 provided in many fields. Overall, results indicate that the gender gap in academia suffered an approximately 1-year setback during the strict lockdown months of 2020, and COVID-related research areas (which incidentally have a better male-female balance) suffered an additional 1.5-year setback.

The results of our research are relevant from an empirical and substantive point of view. From the empirical perspective, our analysis indicates that disaggregated data at the sub-field level is key to untangling within-group trends that are otherwise concealed in global averages due to the profound gender gap differences that still exist across disciplines and sub-disciplines. In this regard, generalized linear mixed models are a natural choice to model the hierarchical structure of such data. From the substantive perspective, results show that COVID-19 lockdowns exacerbated gender inequalities such that their effects will be felt in the years to come.

Current differences in productivity levels might result in higher rates of gender inequality in the next few years. Negative effects in the short and medium term might be twofold. On the one hand, we expect a reallocation of research money at the expense of research areas funded prior to the pandemic, which can lead to an unequal distribution of resources. On the other hand, lower productivity levels will result in fewer citations, fewer research grants, and lower likelihood of promotion among women. While our work only examines the impact of COVID-19 lockdowns in academia, this prognosis might be valid for all women in high-skilled occupations where promotion tracks and human capital accumulation are crucial during early career years, to the point that early productivity declines might lead to job loss [37, 38]. Therefore, the implementation of gender equity actions is necessary in order to ensure that the COVID-19-related penalty does not translate into inequality in future recruitment and promotion processes.

Supporting information

S1 Fig. Random intercepts for categories for all authors (1).

The model captures the known trends for all the major categories. Economics, engineering, computer science, mathematics, physics, and, to a lesser extent, statistics are categories with a proportion of males over the global average. By contrast, social sciences, biology, health sciences, and, especially, psychology are more balanced than the average.

(PDF)

S2 Fig. Random intercepts for subcategories for all authors (1).

Within each category, specific subcategories develop their own trends. For example, we observe that high-energy and quantum physics are more masculinized than the average for all physics, while astrophysics-related research and bio-medical physics are more feminized.

(PDF)

S1 Table. Initial composition of the dataset.

Raw number of subcategories and preprints per category prior to any cleaning or pre-processing step.

(PDF)

S2 Table. Manual adjustments for SocArXiv.

As detailed in Methods, the large number of tags added to PsyArXiv and SocArXiv documents required a separate methodology for preprint categorization. As a final step of such methodology, SocArXiv also required manual recoding of the subcategories listed here.

(PDF)

Data Availability

The complete dataset is archived on Zenodo (https://doi.org/10.5281/zenodo.5142676). R code used to analyse the data is archived at https://github.com/CONCIERGE-CM-UC3M/COVID19-gender-gap.

Funding Statement

This work has been supported by the Madrid Government (Comunidad de Madrid) under the Multiannual Agreement with UC3M in the line of "Fostering Young Doctors Research" (CONCIERGE-CM-UC3M), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation). M.T. and I.U. also acknowledge support from the Spanish Ministry of Science and Innovation through the research grant RTI2018-098182-A-I00.

References

  • 1. Huang J, Gates AJ, Sinatra R, Barabási AL. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proceedings of the National Academy of Sciences. 2020;117(9):4609–4616. doi: 10.1073/pnas.1914221117
  • 2. Larivière V, Ni C, Gingras Y, Cronin B, Sugimoto CR. Bibliometrics: Global gender disparities in science. Nature News. 2013;504(7479):211. doi: 10.1038/504211a
  • 3. West JD, Jacquet J, King MM, Correll SJ, Bergstrom CT. The Role of Gender in Scholarly Authorship. PLOS ONE. 2013;8(7):e66212. doi: 10.1371/journal.pone.0066212
  • 4. Teele DL, Thelen K. Gender in the Journals: Publication Patterns in Political Science. PS: Political Science & Politics. 2017;50(2):433–447. doi: 10.1017/s1049096516002985
  • 5. Bendels MHK, Müller R, Brueggmann D, Groneberg DA. Gender disparities in high-quality research revealed by Nature Index journals. PLOS ONE. 2018;13(1):e0189136. doi: 10.1371/journal.pone.0189136
  • 6. Holman L, Stuart-Fox D, Hauser CE. The gender gap in science: How long until women are equally represented? PLOS Biology. 2018;16(4):e2004956. doi: 10.1371/journal.pbio.2004956
  • 7. Catalyst. Quick Take: Women in Academia; 2020. Available from: https://www.catalyst.org/research/women-in-academia/.
  • 8. Hechtman LA, Moore NP, Schulkey CE, Miklos AC, Calcagno AM, Aragon R, et al. NIH funding longevity by gender. Proceedings of the National Academy of Sciences. 2018;115(31):7943–7948. doi: 10.1073/pnas.1800615115
  • 9. Myers KR, Tham WY, Yin Y, Cohodes N, Thursby JG, Thursby MC, et al. Unequal effects of the COVID-19 pandemic on scientists. Nature human behaviour. 2020;4(9):880–883. doi: 10.1038/s41562-020-0921-y
  • 10. Barber BM, Jiang W, Morse A, Puri M, Tookes H, Werner IM. What Explains Differences in Finance Research Productivity During the Pandemic? The Journal of Finance. 2021. doi: 10.1111/jofi.13028
  • 11. Staniscuaski F, Kmetzsch L, Soletti RC, Reichert F, Zandonà E, Ludwig ZM, et al. Gender, race and parenthood impact academic productivity during the COVID-19 pandemic: from survey to action. Frontiers in psychology. 2021;12. doi: 10.3389/fpsyg.2021.663252
  • 12. Harrop C, Bal V, Carpenter K, Halladay A. A lost generation? The impact of the COVID-19 pandemic on early career ASD researchers. Autism Research. 2021. doi: 10.1002/aur.2503
  • 13. Cooke SJ, Cramp RL, Madliger CL, Bergman JN, Reeve C, Rummer JL, et al. Conservation physiology and the COVID-19 pandemic. Conservation Physiology. 2021;9(1):coaa139. doi: 10.1093/conphys/coaa139
  • 14. Viglione G. Are women publishing less during the pandemic? Here’s what the data say. Nature. 2020;581:365–366. doi: 10.1038/d41586-020-01294-9
  • 15. Matthews D. Pandemic lockdown holding back female academics, data show. Times Higher Education. 2020.
  • 16. Zhu Y. Who support open access publishing? Gender, discipline, seniority and other factors associated with academics’ OA practice. Scientometrics. 2017;111(2):557–579. doi: 10.1007/s11192-017-2316-z
  • 17. Terrell J, Kofink A, Middleton J, Rainear C, Murphy-Hill E, Parnin C, et al. Gender differences and bias in open source: pull request acceptance of women versus men. PeerJ Computer Science. 2017;3:e111. doi: 10.7717/peerj-cs.111
  • 18. Ruggieri R, Pecoraro F, Luzi D. An intersectional approach to analyse gender productivity and open access: a bibliometric analysis of the Italian National Research Council. Scientometrics. 2021;126(2):1647–1673. doi: 10.1007/s11192-020-03802-0
  • 19. Andersen JP, Nielsen MW, Simone NL, Lewiss RE, Jagsi R. Meta-Research: COVID-19 medical papers have fewer women first authors than expected. elife. 2020;9:e58807. doi: 10.7554/eLife.58807
  • 20. Gayet-Ageron A, Messaoud KB, Richards M, Schroter S. Female authorship of covid-19 research in manuscripts submitted to 11 biomedical journals: cross sectional study. BMJ. 2021;375. doi: 10.1136/bmj.n2288
  • 21. Ribarovska AK, Hutchinson MR, Pittman QJ, Pariante C, Spencer SJ. Gender inequality in publishing during the COVID-19 pandemic. Brain, behavior, and immunity. 2021;91:1. doi: 10.1016/j.bbi.2020.11.022
  • 22. Macaluso B, Larivière V, Sugimoto T, Sugimoto CR. Is science built on the shoulders of women? A study of gender differences in contributorship. Academic Medicine. 2016;91(8):1136–1142. doi: 10.1097/ACM.0000000000001261
  • 23. Tscharntke T, Hochberg ME, Rand TA, Resh VH, Krauss J. Author Sequence and Credit for Contributions in Multiauthored Publications. PLOS Biology. 2007;5(1):1–2. doi: 10.1371/journal.pbio.0050018
  • 24. Ruppanner L, Tan X, Scarborough W, Landivar LC, Collins C. Shifting Inequalities? Parents’ Sleep, Anxiety, and Calm during the COVID-19 Pandemic in Australia and the United States. Men and Masculinities. 2021;24(1):181–188. doi: 10.1177/1097184X21990737
  • 25. Collins C, Landivar LC, Ruppanner L, Scarborough WJ. COVID-19 and the Gender Gap in Work Hours. Gender Work Organ. 2020;n/a(n/a). doi: 10.1111/gwao.12506
  • 26. Landivar LC, Ruppanner L, Scarborough WJ, Collins C. Early Signs Indicate That COVID-19 Is Exacerbating Gender Inequality in the Labor Force. Socius. 2020;6:2378023120947997. doi: 10.1177/2378023120947997
  • 27. Ucar I, Torre M, Fernández AE. CONCIERGE-CM-UC3M/COVID19-gender-gap; 2021. Available from: https://doi.org/10.5281/zenodo.5142676.
  • 28. R Core Team. R: A Language and Environment for Statistical Computing; 2019. Available from: https://www.R-project.org/.
  • 29. Ram K, Broman K. aRxiv: Interface to the arXiv API; 2019.
  • 30. McGuinness LA, Schmidt L. medrxivr: Accessing medRxiv data in R; 2020. Available from: https://github.com/mcguinlu/medrxivr.
  • 31. Wolen AR, Hartgerink CHJ, Hafen R, Richards BG, Soderberg CK, York TP. osfr: An R Interface to the Open Science Framework. Journal of Open Source Software. 2020;5(46):2071. doi: 10.21105/joss.02071
  • 32. Klebel T, Reichmann S, Polka J, McDowell G, Penfold N, Hindle S, et al. Peer review and preprint policies are unclear at most major journals. PLOS ONE. 2020;15(10):1–19. doi: 10.1371/journal.pone.0239518
  • 33. Wais K. Gender Prediction Methods Based on First Names with genderizeR. R J. 2016;8(1):17. doi: 10.32614/RJ-2016-002
  • 34. Santamaría L, Mihaljević H. Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science. 2018;4:e156. doi: 10.7717/peerj-cs.156
  • 35. Wais K. Gender Prediction Methods Based on First Names with genderizeR. The R Journal. 2016;8(1):17–37. doi: 10.32614/RJ-2016-002
  • 36. van den Besselaar PAA, Sandstrom U. Vicious circles of gender bias, lower positions and lower impact: gender differences in scholarly productivity and impact. PLoS ONE. 2017;12(8):e0183301. doi: 10.1371/journal.pone.0183301
  • 37. Antecol H, Bedard K, Stearns J. Equal but Inequitable: Who Benefits from Gender-Neutral Tenure Clock Stopping Policies? American Economic Review. 2018;108(9):2420–41.
  • 38. Rosen S. Contracts and the Market for Executives. National Bureau of Economic Research; 1990. 3542. Available from: http://www.nber.org/papers/w3542.

Decision Letter 0

Alireza Abbasi

22 Sep 2021

PONE-D-21-25133

Mind the gender gap: COVID-19 lockdown effects on gender differences in preprint submissions

PLOS ONE

Dear Dr. Ucar,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 06 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Alireza Abbasi

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Additional Editor Comments (if provided):

Below are two reviews of your submission. Each reviewer raises important issues and concerns, but each also sees potential. We invite you to resubmit a revised version that carefully addresses the issues raised. Good luck with the revision.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper analyses preprint submission to five major preprint services from 2017 to the early stages of lockdown in 2020 to find out if the pandemic had any impact on submission by female authors. The paper is timely and has interesting findings. The data set is large and seems to have been processed properly to deal with known issues in bibliographic data and gender identification. The authors have made the data available which is commendable.

I have a few comments and questions. My first question is about the dates chosen. I think May was the early stages of lockdown in 2020 and not sure why a later date wasn't chosen to allow enough time for the pandemic to have its impact on scholarly work.

Some gender bibliometric studies (see works by Mike Thelwall, for instance) have used data such as US census or social security data for gender identification, arguing that other data are based on the social web and are therefore unreliable and usually lack transparency. Does this issue apply to the Genderize service, or is it a transparent and reliable source?

The paper hasn't covered the related literature properly. While I didn't expect to see coverage of all gender bibliometric studies, as there are many of them, the relevant papers on a similar topic (impact of the pandemic on scientific productivity) should have been consulted and used in the paper where useful. An easy way to find most of such studies is to look at the papers that have cited the key paper by Viglione.

https://scholar.google.com/scholar?cites=15300145317525011924&as_sdt=2005&sciodt=0,5&hl=en

Finally, I think at least the Zenodo record (or a link) should be mentioned in reference 20; otherwise readers won't be able to locate the source. Currently, it is just author names and the title.

Reviewer #2: The manuscript looks at the disparity in productivity between male and female researchers by analysing a sample of nearly 500,000 preprints deposited between 2017 and 2020.

The language of the paper is very speculative in places. For example, predicting that the gender gap is going to persist for decades. How do we know that it is going to persist for decades? What is our indication and how many decades are we talking about? Could it be for the next 8-10 decades? Could this vanish over the next few decades? This is currently very vague and rather speculative. There are other examples of speculative arguments in the introduction too, and I am not very comfortable with them, because I can neither confirm nor challenge those statements.

Language of the paper has at certain places been made unnecessarily complicated. In the sample, there were perhaps more 2020 papers with male authors than female authors. It is unclear why this has to be presented in a probabilistic language: “men were slightly more likely than women to submit preprints during lockdown”. Also being “slightly more likely” is not consistent with the sentence before claiming that the gap has widened during 2020. It is important that authors look at these findings with neutrality and not predisposed with the idea that the gender disparity has to have worsened during pandemic.

“men were significantly more likely than women to submit COVID-related research” – How is this related to overall productivity of male and female researchers? Why Covid topic has been singled out as a measure of productivity?

The reviewer also notes that more than 3 million articles are overall published each year, whereas the sample used in this study uses 500,000 pre-print items distributed over four years. While there is no prohibitive issue with sampling from pre-prints in general, one should note that they are not necessarily representative of the overall research production. The issue especially becomes important when the difference found between male and female is slight and can change after considering a bigger picture (i.e., the full amount of research produced) or published papers. Also, we cannot ignore the fact that these are pre-prints after all, and it is not clear what portion of them translated to official publications. This is especially a concern for covid-related publications in 2020 where an avalanche of papers were deposited in mass during first months of pandemic and many of them never got accepted due to insufficient quality/rigor.

For what portion of articles, the gender for all authors (or at least first author) could be determined at the specified threshold of confidence (0.95) and criteria (full names and not initials)?

p. 4 “However, this approach takes the averages of the individual-level variables, discarding valuable within-group information that may reveal opposing trends.” Please clarify. What does this mean? Are we talking about the interaction between contributing variables but in a rather complicated language?

If the function in Eq 1 is a logit model, then why is the distribution of the error normal? That is not consistent with a logit model specification.

Why is p the proportion of males? The data is coming from individual articles. So what one would expect is for example p=1 if the paper has a male author (or a male first author, depending on what analysis you’re running). This is not aggregate model, so what is the reason for using the term proportion?

The most confusing part of this equation is the alpha coefficients, which have not even been elaborated on. Why such a confusing specification? What is the reason for the additional error terms for alphas? Those are going to confound with the main error term and the reviewer cannot even see the justification or interpretation of those.

Also, this analysis (the logit analysis) is less problematic when we consider first authorships because a paper either has a male first author or a female first author (which is binary and consistent with a logit model specification). Therefore, if the coefficient for “year”, for example, is negative, then one can conclude that the proportion of male authors has been decreasing over time (although I am still not convinced why this has to be inferred in such an indirect, complicated way as opposed to just reporting the proportion from the sample). But when it comes to the “all author” analysis, I am not sure how the model treats this. Is this where the issue of “proportion” comes into play? Do you count the proportion of male authors on the paper? Are you only using the papers for which the gender of all authors could be determined with 0.95 confidence?

Again, considering that the issue of lockdown was not a thing until 2020, I do not see the point of doing such joint/multivariate analysis, whereas, the proportion of males/females that published in 2020 could have just been reported independently. I apologise if this comment might sound too direct, but could it be that authors had to force themselves to use a more sophisticated statistical approach because simple reporting of proportions/stats could have come across rudimentary? Again, I apologise and I am sure that authors understand that the nature of a review sometimes entails direct questions of this nature. There is just limited interaction between these variables especially for 2017-2019 and that makes the use of such joint model very questionable to the reviewer. So could you justify the use of this method (and not choosing to just simply report the split of authorship in 2020 pre-prints)? Does it have to be inferred indirectly from a multivariate probabilistic model?

In the absence of captions for Supplementary figures, the reviewer unfortunately has no idea how to interpret them.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Hamid R. Jamali

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Mar 25;17(3):e0264265. doi: 10.1371/journal.pone.0264265.r002

Author response to Decision Letter 0


28 Oct 2021

We would like to thank the reviewers for their time and valuable comments, which have been a great help in improving the paper. In the following, we first describe the main changes that we have introduced in the revised version of the manuscript, and then provide a response to each reviewer's comments, explaining how these comments have been addressed in the new version of the paper. For convenience, along with the revised version of the manuscript, we submit an alternate version with track changes enabled that highlights changes in blue and red, denoting whether we added or removed content.

# Summary of changes

--------------------

- First, we have endeavored to emphasize both the relevance and the contribution of our paper. Also, the introduction of the revised manuscript now offers a more comprehensive view of previous research on this topic.

- Second, we explain in more detail the issues concerning our research design, particularly those concerning the observation window and the gender classification process.

- Finally, we have realized that the empirical portion of the paper was not presented with the precision required and raised some questions about the methods we are using. Consequently, we have better explained our analytical decisions. Also, we have redrafted the article aiming to make the results clear and intelligible to these and other readers.

By expanding the frame, presenting the main findings more simply and clearly, and more judiciously assessing the theoretical contribution of the results, we believe that we have clarified the paper’s contribution to our evolving understanding of men's and women's productivity during lockdown. We hope that the revised version meets the high standards and expectations of PLOS ONE.

Below we respond to all the concerns in detail.

# Responses to reviewer 1

-------------------------

****************************************

> The paper analyses preprint submission to five major preprint services from 2017 to the early stages of lockdown in 2020 to find out if the pandemic had any impact on submission by female authors. The paper is timely and has interesting findings. The data set is large and seems to have been processed properly to deal with known issues in bibliographic data and gender identification. The authors have made the data available which is commendable.

****************************************

We thank the reviewer for their careful reading of the manuscript and their encouraging comments.

****************************************

> I have a few comments and questions. My first question is about the dates chosen. I think May was the early stages of lockdown in 2020 and not sure why a later date wasn't chosen to allow enough time for the pandemic to have its impact on scholarly work.

****************************************

We definitely agree with Reviewer 1 that the observation window is a crucial aspect of the research design. Expanding the observation period would be feasible. However, after giving this issue a great deal of thought, we concluded that keeping the short-term analysis has important advantages for our aim. From a substantive perspective, it allows us to capture gender differences at work in the context of time pressures and challenging circumstances (one of the most important sources of gender occupational segregation).

From an empirical perspective, while the lockdown was fairly uniform across countries from March to May 2020, the restrictions were lifted unevenly from June onward. Contagions dropped during June and July to levels that permitted internal and external border restrictions to be relaxed. However, the de-escalation process was not uniform, and the situation across countries and continents became more heterogeneous. Since our data do not allow us to control for country, it is crucial to restrict the observation window to the most homogeneous period.

In the revised version of the manuscript, we have further elaborated on these decisions in the introduction section.

****************************************

> Some gender bibliometric studies (see works by Mike Thelwall, for instance) have used data such as US census or social security data for gender identification, arguing that other data are based on the social web and are therefore unreliable and usually lack transparency. Does this issue apply to the Genderize service, or is it a transparent and reliable source?

****************************************

Indeed, name-to-gender inference based on querying large name repositories (e.g., censuses, administration records, or country-specific birth lists) is particularly suitable when working at the national level, or with a limited number of countries. For analyses outside of a national context, however, name-to-gender inference services are a better choice, as they can usually handle a greater degree of name diversity. In the particular case of the "Genderize" service (see [1] below for an evaluation of different web services), the underlying data is collected from social networks across 79 countries and 89 languages. Although the service does not geo-localize names automatically, it does accept two optional parameters, location and language, for more qualified guesses.

To further ensure accuracy in gender estimation, we follow the same classification process as Holman et al. (2018) [2], published in PLOS Biology. When predicting the gender for a given name, the "Genderize" service returns additional information to quantify the precision of the prediction, namely "count" and "probability". The count shows how many instances in the database associate a given first name with the predicted gender, and the probability corresponds to the proportion of such associations. Following [2], we only consider the drafts with a probability higher than or equal to 0.95. Also, we excluded names that were not found more than 10 times in the "Genderize" database.

We have added these details to the methods section in the revised version of the manuscript. Thank you for the comment.
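
The two acceptance criteria described above can be sketched as a simple filter. This is only an illustration in Python, not the authors' R code: the `Prediction` container, the function names, and the example values are all hypothetical, and the comparison operators (probability at least 0.95, count strictly greater than 10) are one reading of the wording in the response.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Thresholds quoted in the response; the operators are our reading of the text.
MIN_PROBABILITY = 0.95  # keep predictions with probability >= 0.95
MIN_COUNT = 10          # drop names not found more than 10 times


@dataclass
class Prediction:
    """Hypothetical container for one name-to-gender prediction."""
    name: str
    gender: Optional[str]  # "male", "female", or None when nothing is returned
    probability: float     # share of database entries matching `gender`
    count: int             # number of database entries for this name


def accept(pred: Prediction) -> bool:
    """Return True only if the prediction passes both thresholds."""
    return (
        pred.gender is not None
        and pred.probability >= MIN_PROBABILITY
        and pred.count > MIN_COUNT
    )


def classify(preds: Iterable[Prediction]) -> Dict[str, str]:
    """Map each accepted name to its predicted gender; drop the rest."""
    return {p.name: p.gender for p in preds if accept(p)}


# Made-up values for illustration; these are not real service outputs.
examples = [
    Prediction("maria", "female", 0.99, 5000),   # passes both thresholds
    Prediction("andrea", "female", 0.79, 3000),  # probability too low
    Prediction("xyzzy", "male", 1.00, 3),        # too few occurrences
    Prediction("q", None, 0.0, 0),               # no prediction at all
]
kept = classify(examples)
```

Names that fail either criterion are simply left unclassified here; how unclassified authors are then treated in the analysis is a separate design decision that this sketch does not cover.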

****************************************

> The paper hasn't covered the related literature properly. While I didn't expect to see coverage of all gender bibliometric studies, as there are many of them, the relevant papers on a similar topic (impact of the pandemic on scientific productivity) should have been consulted and used in the paper where useful. An easy way to find most of such studies is to look at the papers that have cited the key paper by Viglione.

https://scholar.google.com/scholar?cites=15300145317525011924&as_sdt=2005&sciodt=0,5&hl=en

****************************************

In the previous draft, we cited Viglione's paper but did not explore articles that cited it. Following the reviewer's suggestion, we have endeavored to make a more comprehensive review of the literature on COVID-19 and scientific productivity. As a result, the following references have been added: [3-10] listed below.

****************************************

> Finally, I think at least the Zenodo record (or a link) should be mentioned in reference 20; otherwise readers won't be able to locate the source. Currently, it is just author names and the title.

****************************************

Thank you for pointing this out. A wrong BibTeX declaration on our side led to the URL not appearing. This is fixed in the revised version of the manuscript (see reference 27).

# Responses to reviewer 2

-------------------------

We appreciate the reviewer's engagement with this research and the many specific and constructive suggestions for improvement.

****************************************

> The manuscript looks at the disparity in productivity between male and female researchers by analysing a sample of nearly 500,000 preprints deposited between 2017 and 2020.

The language of the paper is very speculative in places. For example, predicting that the gender gap is going to persist for decades. How do we know that it is going to persist for decades? What is our indication and how many decades are we talking about? Could it be for the next 8-10 decades? Could this vanish over the next few decades? This is currently very vague and rather speculative. There are other examples of speculative arguments in the introduction too, and I am not very comfortable with them, because I can neither confirm nor challenge those statements.

****************************************

After reading the reviewer's comment, we realize that the introductory section of the manuscript has led to some confusion about the goal and contribution of our research. Our paper does not aim to estimate how long the gender gap of publications will last. The statement mentioned by the reviewer corresponds to a paper written by Holman et al. (2018), published in PLOS Biology [2]: "The gender gap in science: how long until women are equally represented?". In that paper, the authors affirm that "Topics such as physics, computer science, mathematics, surgery, and chemistry had the fewest women authors, while health-related disciplines like nursing, midwifery, and palliative care had the most. Of the gender-biased disciplines, almost all are moving towards parity, though some are predicted to take decades or even centuries to reach it" (section "The changing nature of the gender gap", first paragraph).

Here, we build on Holman et al. (2018) and other recent research (Viglione 2020, Myers et al. 2020, among others) to argue that the move towards gender parity could slow down as a result of COVID-19. We model the evolution of the gender gap in preprint submissions to measure the impact of COVID-19 on male and female scientific productivity during strict confinement. Overall, our results show that the gender gap in academia suffered an approximately 1-year setback during the strict lockdown months of 2020, and COVID-related research areas suffered an additional 1.5-year setback.

In the new version of the manuscript, we have revised the introductory section to clarify what corresponds to our article and what corresponds to previous research. Thank you for the remark.

****************************************

> Language of the paper has at certain places been made unnecessarily complicated. In the sample, there were perhaps more 2020 papers with male authors than female authors. It is unclear why this has to be presented in a probabilistic language: "men were slightly more likely than women to submit preprints during lockdown". Also being "slightly more likely" is not consistent with the sentence before claiming that the gap has widened during 2020. It is important that authors look at these findings with neutrality and not predisposed with the idea that the gender disparity has to have worsened during pandemic.

****************************************

Thank you for the observation. Following the reviewer's suggestion, we have redrafted the whole paper using more direct language, endeavoring to communicate our meaning in a clear and concise manner. We hope these revisions help the reader.

We would also like to stress that neither the findings nor their interpretation reflect our personal views or preferences.

****************************************

> "men were significantly more likely than women to submit COVID-related research" – How is this related to overall productivity of male and female researchers? Why Covid topic has been singled out as a measure of productivity?

****************************************

While the coronavirus has created many challenges to conducting academic research, it has also created new opportunities (sometimes at the cost of less cutting-edge topics). As argued in the introduction, those (men or women) with the ability to launch COVID-related research projects will be more likely to benefit from these new lines of funding. Therefore, it is crucial to assess who is taking advantage of these new opportunities and whether there is a gender disparity among such researchers.

In our analysis, 'COVID paper' is defined as a binary factor that equals 1 for preprints directly related to COVID-19, defined as those preprints containing "coronavirus," "sars-cov-2," or "covid-19" in their title (restricted to 2020 and with case-insensitive matching). Please see the methods section for a more detailed description.

Therefore, we don't use COVID topics as a measure of productivity but as a key control in the analysis. In fact, a major contribution of our research is the finding that COVID-related research during lockdown was significantly more masculinized than other kinds of research.
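
As a minimal sketch of the 'COVID paper' factor described above (not the code used for the manuscript), the flag could be computed as follows. The function name `is_covid_paper` and the constant `COVID_PATTERN` are hypothetical; only the three keywords, the case-insensitive matching on titles, and the restriction to 2020 come from the description.

```python
import re

# Keywords quoted in the response; matching is case-insensitive and title-only.
COVID_PATTERN = re.compile(r"coronavirus|sars-cov-2|covid-19", re.IGNORECASE)


def is_covid_paper(title: str, year: int) -> int:
    """Return 1 for a 2020 preprint whose title contains a keyword, else 0."""
    return int(year == 2020 and COVID_PATTERN.search(title) is not None)
```

Restricting the flag to 2020 means that an earlier preprint about other coronaviruses is not counted as COVID-related.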

****************************************

> The reviewer also notes that more than 3 million articles are overall published each year, whereas the sample used in this study uses 500,000 pre-print items distributed over four years. While there is no prohibitive issue with sampling from pre-prints in general, one should note that they are not necessarily representative of the overall research production. The issue especially becomes important when the difference found between male and female is slight and can change after considering a bigger picture (i.e., the full amount of research produced) or published papers. Also, we cannot ignore the fact that these are pre-prints after all, and it is not clear what portion of them translated to official publications. This is especially a concern for covid-related publications in 2020 where an avalanche of papers were deposited in mass during first months of pandemic and many of them never got accepted due to insufficient quality/rigor.

****************************************

We fully agree with the reviewer's observation that preprints and publications are not equivalent, and we would never make this claim. We contend, however, that an analysis of preprints is itself of interest. On the one hand, preprints are a productivity proxy increasingly exploited in the literature (see, for example, Viglione [11] or Matthews [12], among others). Their relevance is increasing, both in substantive and numeric terms. As we can see in Figure 1 in the manuscript, there is a growing body of scientists uploading their research papers to public repositories. Not surprisingly, journals in many fields are developing clearer policies regarding preprinting [13]. On the other hand, preprints are useful for assessing men’s and women’s ability to carry out research under challenging circumstances and time pressures, regardless of whether this work later translates into publications.

We also agree with the reviewer that the period under study is a changing scenario. Consequently, our analyses examine both the pre- and post-pandemic periods. Indeed, looking at the data from both of these periods is a major contribution to previous research.

****************************************

> For what portion of articles, the gender for all authors (or at least first author) could be determined at the specified threshold of confidence (0.95) and criteria (full names and not initials)?

****************************************

Thank you for the question, which we have addressed by adding a new table (see Table 2 in the revised manuscript) to the expanded section "Inferring gender from authors' given names". This new table displays the proportion and number of preprints that entered the modelling stage after we filtered out cases with missing gender identification. As can be seen, the proportion of preprints per year is quite high and, more importantly, it is approximately constant over time, which means that the filtering process did not introduce any under- or over-representation bias.

****************************************

> p. 4 "However, this approach takes the averages of the individual-level variables, discarding valuable within-group information that may reveal opposing trends." Please clarify. What does this mean? Are we talking about the interaction between contributing variables but in a rather complicated language?

****************************************

Please note that this paragraph does not refer to the interaction of contributing variables but to the necessity of using random intercepts in our analysis. More concretely, we discuss the difference between conventional Generalized Linear Models (GLMs) and hierarchical models, also called "multilevel" models. In the typical example using students' grades, students cluster within schools; similarly, here authorships cluster within categories and subcategories. In a GLM setting, only group averages (e.g., category averages) are considered. This conceals the trends within each group and may therefore yield biased estimates. Consequently, we use a hierarchical model that separates and uncovers both individual and aggregated effects via these random intercepts.

We are aware that these kinds of models are referred to differently across various disciplines (e.g., mixed models, random effects models, hierarchical models, multilevel models...), which can lead to some confusion. Therefore, in our methods section we have chosen to use standard terms in hierarchical modelling theory. More specifically, we follow the well-established guide by Gelman and Hill [14]. Nevertheless, acknowledging that "multilevel modelling" is also a very common term (Gelman and Hill also mention this term in their book), we have added this clarification to our manuscript as well.

****************************************

> If the function in Eq 1 is a logit model, then why the distribution of error is normal? That is not consistent with a logit model specification.

****************************************

Please note that we do *not* use a logit model, and we do not make such a claim. Eq. 1 shows a logit function only because this is the link function required for considering this kind of response and error. As specified in the methods section, the model has a "fractional response" (the response is a proportion between 0 and 1, and the total counts are introduced as weights). In particular, it uses a quasi-binomial family (to account for overdispersion), which means that the error distribution of the weighted response (the proportion of males multiplied by the total number of cases, i.e., the number of males) is considered to be binomial with some overdispersion. This also means that, under these assumptions, the errors of the transformed response (logit of the proportion of males, as Eq. 1 shows) do follow a normal distribution. Therefore, our specification is consistent with fractional models.

For the sake of clarity, the revised version of the manuscript explains that we consider the quasi-binomial family as a description of the error distribution. Thank you for the observation.
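
To make the quasi-binomial idea concrete, the sketch below shows the logit link and one common (Pearson-based) estimator of the dispersion parameter for a fractional response with counts as weights. It is an illustration under assumptions, not the fitting code behind Eq. 1: the function names and the argument layout are hypothetical, and the fitted proportions are taken as given.

```python
import math


def logit(p):
    """Link function: maps a proportion in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse link: maps a linear predictor back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pearson_dispersion(males, totals, fitted, n_params):
    """Pearson estimate of the quasi-binomial dispersion parameter.

    males[i]  : observed number of male authors in cell i
    totals[i] : total number of authors in cell i (the weight)
    fitted[i] : fitted proportion of males in cell i
    n_params  : number of estimated model parameters
    A value near 1 is what a plain binomial would give; larger values
    indicate overdispersion.
    """
    chi2 = sum(
        (y - n * p) ** 2 / (n * p * (1.0 - p))
        for y, n, p in zip(males, totals, fitted)
    )
    return chi2 / (len(males) - n_params)
```

A dispersion estimate well above 1 is the usual reason for preferring a quasi-binomial family over a plain binomial one.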

****************************************

> Why is p the proportion of males? The data is coming from individual articles. So what one would expect is for example p=1 if the paper has a male author (or a male first author, depending on what analysis you’re running). This is not aggregate model, so what is the reason for using the term proportion?

****************************************

As noted by Reviewer 2, "we estimate the monthly proportion of males" by category and subcategory for (1) "all authors", (2) "first authors", (3) "last authors", and (4) "single authors". We did this because there are short-term effects in which we are not interested and that would affect the estimation: e.g., there are fewer submissions during weekends and other holidays, and some distinctive patterns may arise even for weekday submissions. Therefore, monthly aggregates effectively mitigate these short-term artifacts.
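
The aggregation step can be sketched as follows. This is a minimal illustration, assuming author-level records with a hypothetical tuple layout; the function name `monthly_male_proportion` is likewise an assumption and not taken from the manuscript's code.

```python
from collections import defaultdict


def monthly_male_proportion(records):
    """Aggregate author-level records into monthly proportions of males.

    Each record is (category, subcategory, year, month, gender), where
    gender is "male" or "female". Returns a dict mapping the group key to
    (proportion_of_males, total_authors); the total can serve as a weight.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [males, total]
    for category, subcategory, year, month, gender in records:
        key = (category, subcategory, year, month)
        counts[key][0] += gender == "male"
        counts[key][1] += 1
    return {key: (males / total, total) for key, (males, total) in counts.items()}
```

The same function would apply to first, last, or single authors by passing only the corresponding subset of records.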

****************************************

> The most confusing part of this equation is the alpha coefficients, which have not even been elaborated on. Why such a confusing specification? What is the reason for the additional error terms for alphas? Those are going to confound with the main error term and the reviewer cannot even see the justification or interpretation of those.

****************************************

Here again, we realize that the confusion may come from the fact that the model used has different names across different fields. As discussed in previous comments, Eq. 1 defines the nested structure of the random intercepts with the $\alpha_{0, k[j[i]]}$ coefficient, where individual observations ($i$) are nested into subcategories ($j$), and those are nested into categories ($k$). Each level in this hierarchical structure has its own error term that must be specified and estimated separately.

Aiming to avoid misunderstandings, we have added the "multilevel" term to the manuscript. Also, we would like to reiterate that we follow the well-established guide by Gelman and Hill in hierarchical/multilevel modelling [14], both in terms of the language and notation used.
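
The nested index notation $k[j[i]]$ can be read as two successive lookups, as in the sketch below. The lookup tables, the subcategory labels, and the function `random_intercept` are hypothetical and purely illustrative; the additive split into a category-level term and a subcategory-level deviation is one common way to write nested random intercepts, not necessarily the exact parametrization of Eq. 1.

```python
# Hypothetical lookup tables: each observation belongs to one subcategory,
# and each subcategory belongs to exactly one category.
subcategory_of = {0: "cs.AI", 1: "cs.AI", 2: "math.ST", 3: "q-bio.GN"}  # j[i]
category_of = {"cs.AI": "cs", "math.ST": "math", "q-bio.GN": "q-bio"}   # k[j]


def random_intercept(i, category_effect, subcategory_effect):
    """Intercept for observation i under the nesting i -> j[i] -> k[j[i]].

    category_effect    : dict with one term per category
    subcategory_effect : dict with one deviation per subcategory
    """
    sub = subcategory_of[i]
    return category_effect[category_of[sub]] + subcategory_effect[sub]
```

Observations 0 and 1 share both lookups and therefore the same intercept, which is how the model ties together authorships from the same subcategory.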

****************************************

> Also this analysis (the logit analysis) is less problematic when we consider first authorships because a paper either has a male first author or a female first author (which is binary and consistent with a logit model specification). Therefore, if the coefficient for "year" for example is negative then one can conclude that the proportion of male authors have been decreasing over time (Although I am still not convinced why this has to be inferred from such an indirect complicated way as opposed to just reporting the proportion from the sample). But when it comes to "all author" analysis, then I am not sure how the models treats this. Is this where the issue of "proportion" comes to play? You count the proportion of male authors on the paper? Are you only using the papers for which the gender of all authors could be determined with 0.95 confidence?

****************************************

Thank you for the remark. This observation relates to the points discussed in our previous comments, which we hope have clarified the main issues. Specifically, (1) our model is not logit but fractional, where the response is a proportion and not 0-1; and (2) the response is a proportion "for all cases", so we calculate (a) the monthly proportion of all authors that are males, (b) the monthly proportion of first authors that are males, (c) the monthly proportion of last authors that are males, and (d) the monthly proportion of single authors that are males. We hope this helps clarify the kind of model and response we use.

Regarding the "all authors" analysis, we refer to the section "Inferring gender from authors' given names", where we specify: "Following Holman et al., [2], we only consider the drafts with a probability higher or equal to $0.95$. Additionally, we excluded names that were not found more than $10$ times in the "genderize" database." Further on, in the "Measuring the effect of lockdown" section, we specify: "In the case of "all authors", preprints with missing gender rates greater than 25% are not considered, and subcategories with fewer than 30 authors per month are dropped too."
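
The thresholds quoted above can be summarized in a short sketch. The function names are hypothetical, and treating a preprint with an empty author list as excluded is an assumption made here for completeness; the numeric thresholds (0.95, 10, 25%, and 30) are the ones stated in the quoted passages.

```python
def keep_name(probability, count):
    """A name is assigned a gender only if the inferred probability is at
    least 0.95 and the name was found more than 10 times."""
    return probability >= 0.95 and count > 10


def keep_preprint(author_genders):
    """'All authors' rule: drop preprints whose missing-gender rate exceeds
    25%. `author_genders` holds "male", "female", or None per author."""
    if not author_genders:  # assumption: no authors means nothing to model
        return False
    missing = sum(g is None for g in author_genders)
    return missing / len(author_genders) <= 0.25


def keep_subcategory_month(n_authors):
    """Subcategories with fewer than 30 authors in a month are dropped."""
    return n_authors >= 30
```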

****************************************

> Again, considering that the issue of lockdown was not a thing until 2020, I do not see the point of doing such joint/multivariate analysis, whereas, the proportion of males/females that published in 2020 could have just been reported independently. I apologise if this comment might sound too direct, but could it be that authors had to force themselves to use a more sophisticated statistical approach because simple reporting of proportions/stats could have come across rudimentary? Again, I apologise and I am sure that authors understand that the nature of a review sometimes entails direct questions of this nature. There is just limited interaction between these variables especially for 2017-2019 and that makes the use of such joint model very questionable to the reviewer. So could you justify the use of this method (and not choosing to just simply report the split of authorship in 2020 pre-prints)? Does it have to be inferred indirectly from a multivariate probabilistic model?

****************************************

Comments are always welcome. We understand the nature of the review and appreciate any suggestions for improvement.

In our view, the fact that lockdown was not an issue until 2020 is the main reason why we do need multivariate probabilistic modelling. Previous work already reported descriptive values about the proportion of males/females in 2020, but this provides a very limited picture for several reasons.

First, Figure 2 shows an overall trend whereby we expect to see more women every year participating in preprint submissions. Yet, a reversal of this pattern is observed during confinement. If we only reported the proportion of male/female authors during the lockdown months, how much of the gender gap could be attributed to the lockdown effect? How much of it could be simply explained by the temporal trend? (i.e., what we would have expected without a pandemic). By expanding the observation window, we are able to discern how much of the observed change in 2020 corresponds to the effect of lockdown and social distancing and how much is due to gender differences in the circulation of scientific knowledge over time [15-17].

Second, the gender trend in preprint submissions is not homogeneous: different fields/sub-fields have historically attracted more/less women, and thus they present distinctive patterns.

Finally, preprint submissions have experienced an unprecedented growth, in part due to the emergence of COVID-related research (see Figure 1). Does COVID-related research display different gender patterns than "general" research?

Our model allows us to answer all these questions and provides a much better understanding of the impact of COVID-19 on the gender gap compared to previous work reporting descriptive data. A detailed discussion of the empirical and substantive contribution of our research can be found both in the introduction and conclusion of the manuscript.
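
One way to read the separation between the temporal trend and the lockdown effect is to express the lockdown coefficient in units of the yearly trend. The sketch below illustrates that arithmetic only; it is not necessarily the computation used in the manuscript, and the function name and arguments are hypothetical.

```python
def setback_in_years(lockdown_effect, yearly_trend):
    """Express a lockdown coefficient as years of the pre-existing trend.

    Both arguments are on the same (logit) scale: `yearly_trend` is the
    per-year change in the log-odds of a male author (negative when the
    share of women grows), and `lockdown_effect` is the shift observed
    during lockdown.
    """
    if yearly_trend == 0:
        raise ValueError("no trend to compare against")
    return lockdown_effect / abs(yearly_trend)
```

Under this reading, a lockdown shift equal in size to one year of the trend corresponds to a setback of one year.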

****************************************

> In the absence of captions for Supplementary figures, the reviewer unfortunately has no idea how to interpret them.

****************************************

Please note that PLOS ONE requires supplementary tables and figures to be uploaded separately, while their captions can be found in the manuscript in a section called "Supporting information" that is located between our conclusions and references. We understand the confusion this may have caused, but this is how we are required to format the materials according to the journal guidelines.

# References

------------

[1] Santamaría L, Mihaljević H. Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science. 2018;4:e156.

[2] Holman L, Stuart-Fox D, Hauser CE. The gender gap in science: How long until women are equally represented? PLOS Biology. 2018 Apr;16(4):e2004956. Available from: https://doi.org/10.1371/journal.pbio.2004956.

[3] Myers KR, Tham WY, Yin Y, Cohodes N, Thursby JG, Thursby MC, et al. Unequal effects of the COVID-19 pandemic on scientists. Nature human behaviour. 2020;4(9):880–883.

[4] Andersen JP, Nielsen MW, Simone NL, Lewiss RE, Jagsi R. Meta-Research: COVID-19 medical papers have fewer women first authors than expected. elife. 2020;9:e58807.

[5] Barber BM, Jiang W, Morse A, Puri M, Tookes H, Werner IM. What Explains Differences in Finance Research Productivity During the Pandemic? The Journal of Finance. 2021.

[6] Staniscuaski F, Kmetzsch L, Soletti RC, Reichert F, Zandonà E, Ludwig ZM, et al. Gender, race and parenthood impact academic productivity during the COVID-19 pandemic: from survey to action. Frontiers in psychology. 2021;12.

[7] Gayet-Ageron A, Messaoud KB, Richards M, Schroter S. Female authorship of covid-19 research in manuscripts submitted to 11 biomedical journals: cross sectional study. BMJ. 2021;375.

[8] Ribarovska AK, Hutchinson MR, Pittman QJ, Pariante C, Spencer SJ. Gender inequality in publishing during the COVID-19 pandemic. Brain, behavior, and immunity. 2021;91:1.

[9] Harrop C, Bal V, Carpenter K, Halladay A. A lost generation? The impact of the COVID-19 pandemic on early career ASD researchers. Autism Research. 2021.

[10] Oleschuk M. Gender equity considerations for tenure and promotion during COVID-19. Canadian review of sociology. 2020.

[11] Viglione G. Are women publishing less during the pandemic? Here’s what the data say. Nature. 2020;581:365–366. Available from: https://www.nature.com/articles/d41586-020-01294-9.

[12] Matthews D. Pandemic lockdown holding back female academics, data show. Times Higher Education. 2020. Available from: https://www.timeshighereducation.com/news/pandemic-lockdown-holding-back-female-academics-data-show.

[13] Klebel T, Reichmann S, Polka J, McDowell G, Penfold N, Hindle S, et al. Peer review and preprint policies are unclear at most major journals. PLOS ONE. 2020 Oct;15(10):1–19. Available from: https://doi.org/10.1371/journal.pone.0239518.

[14] Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Analytical Methods for Social Research. Cambridge University Press; 2006.

[15] Zhu Y. Who support open access publishing? Gender, discipline, seniority and other factors associated with academics’ OA practice. Scientometrics. 2017;111(2):557–579. Available from: https://doi.org/10.1007/s11192-017-2316-z.

[16] Terrell J, Kofink A, Middleton J, Rainear C, Murphy-Hill E, Parnin C, et al. Gender differences and bias in open source: pull request acceptance of women versus men. PeerJ Computer Science. 2017;3:e111.

[17] Ruggieri R, Pecoraro F, Luzi D. An intersectional approach to analyse gender productivity and open access: a bibliometric analysis of the Italian National Research Council. Scientometrics. 2021;126(2):1647–1673.

Attachment

Submitted filename: response.pdf

Decision Letter 1

Alireza Abbasi

2 Feb 2022

PONE-D-21-25133R1

Mind the gender gap: COVID-19 lockdown effects on gender differences in preprint submissions

PLOS ONE

Dear Dr. Ucar,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 19 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Alireza Abbasi

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have provided detailed responses to reviewers' comments and have made appropriate amendments. The only thing that was not clear to me in the revised format was whether the change in the number of preprints (changes in data presented in table 1) resulted in any change in the results of the regression analysis. I see they are the same as before while the number of preprints has changed. I just want the authors to ensure that there hasn't been an oversight in relation to this and the results are not erroneous.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Hamid R. Jamali

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Mar 25;17(3):e0264265. doi: 10.1371/journal.pone.0264265.r004

Author response to Decision Letter 1


5 Feb 2022

We would like to thank the reviewers again for their time and valuable comments, which have been a great help in improving the paper. According to the feedback from revision 1, we are required to address one last comment:

****************************************

> Reviewer 1: The authors have provided detailed responses to reviewers' comments and have made appropriate amendments. The only thing that was not clear to me in the revised format was whether the change in the number of preprints (changes in data presented in table 1) resulted in any change in the results of the regression analysis. I see they are the same as before while the number of preprints has changed. I just want the authors to ensure that there hasn't been an oversight in relation to this and the results are not erroneous.

****************************************

We confirm that the results of the regression analysis are correct and did not change. The discrepancy stems from a mistake in the counts reported in the first version of Table 1; the models were always fitted with the correct data, and the results have been checked several times. Accordingly, we submit this new revision with no changes.

Attachment

Submitted filename: response.pdf

Decision Letter 2

Ali B Mahmoud

8 Feb 2022

Mind the gender gap: COVID-19 lockdown effects on gender differences in preprint submissions

PONE-D-21-25133R2

Dear Dr. Ucar,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ali B. Mahmoud, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Happy with the response of the authors about the accuracy of their analysis. The manuscript is ok for publishing.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Hamid R. Jamali

Acceptance letter

Ali B Mahmoud

14 Feb 2022

PONE-D-21-25133R2

Mind the gender gap: COVID-19 lockdown effects on gender differences in preprint submissions

Dear Dr. Ucar:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ali B. Mahmoud

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Random intercepts for categories for all authors (1).

    The model captures the known trends for all the major categories. Economics, engineering, computer science, mathematics, physics, and, to a lesser extent, statistics are categories with a proportion of males above the global average. By contrast, social sciences, biology, health sciences, and, especially, psychology are more balanced than the average.

    (PDF)

    S2 Fig. Random intercepts for subcategories for all authors (1).

    Within each category, specific subcategories develop their own trends. For example, we observe that high-energy and quantum physics are more masculinized than the average for all physics, while astrophysics-related research and bio-medical physics are more feminized.

    (PDF)

    S1 Table. Initial composition of the dataset.

    Raw number of subcategories and preprints per category prior to any cleaning or pre-processing step.

    (PDF)

    S2 Table. Manual adjustments for SocArXiv.

    As detailed in Methods, the large number of tags added to PsyArXiv and SocArXiv documents required a separate methodology for preprint categorization. As a final step of such methodology, SocArXiv also required manual recoding of the subcategories listed here.

    (PDF)

    Attachment

    Submitted filename: response.pdf

    Data Availability Statement

    The complete dataset is archived on Zenodo (https://doi.org/10.5281/zenodo.5142676). R code used to analyse the data is archived at https://github.com/CONCIERGE-CM-UC3M/COVID19-gender-gap.


    Articles from PLoS ONE are provided here courtesy of PLOS
