Author manuscript; available in PMC: 2024 Aug 1.
Published in final edited form as: Pain. 2023 Mar 6;164(8):1775–1782. doi: 10.1097/j.pain.0000000000002874

Standing on the shoulders of bias: lack of transparency and reporting of critical rigor characteristics in pain research

Ashley N Plumb 1, Joseph B Lesnak 1, Giovanni Berardi 1, Kazuhiro Hayashi 1, Adam J Janowski 1, Angela F Smith 1, Dana Bailey 1, Cassie Kerkman 1, Zoe Kienenberger 1, Ben Martin 1, Ethan Patterson 1, Hannah Van Roekel 1, Carol G T Vance 1, Kathleen A Sluka 1
PMCID: PMC10356741  NIHMSID: NIHMS1865980  PMID: 36877823

1. Introduction

Scientific advancement and validity in biomedical research require studies with rigorous design and transparency in publishing of the methods and data. According to the principles and guidelines for reporting preclinical research set by the National Institutes of Health (NIH) in 2014, minimum relevant standards for reporting in a study design include but are not limited to: requiring statistics to be reported, stating randomization and blinding, and completing a power analysis or stating how sample size was determined [1]. These crucial experimental design elements enhance the ability to review, analyze, and reproduce research findings and are ethically and economically necessary. For over two decades, human clinical trials have improved their rigor standards by adopting standard reporting elements such as the Consolidated Standards of Reporting Trials (CONSORT) statement [35], which is often required in biomedical journals, including PAIN. In contrast, transparency of reporting in preclinical data is lacking, with a 2009 report showing little improvement in journals that adopted guidelines similar to those required for human clinical trials [5; 29]. Preclinical pain research is not exempt from these findings, with analyses from 2008 and 2013 showing that only 11–28% of studies reported the use of randomization while 31–37% of studies reported adequate blinding [15; 38]. Although there have been multiple calls for increasing transparency in pain research, it is unknown whether preclinical pain research has increased transparency in reporting in the last decade [3; 15; 38].

The NIH established guidelines to include both sexes within clinical and preclinical research in 1993 and 2014, respectively [13]. Considering Sex as a Biological Variable (SABV) was recently initiated by NIH to improve rigor and reproducibility of research [4]. While preclinical research historically includes male rodents as the subject of choice, epidemiological studies of humans exemplify the disproportionate prevalence of pain conditions by sex, with females affected more frequently than males [17]. In contrast to other rigor characteristics in clinical studies, analysis by sex in NIH-funded randomized controlled trials is reported in less than a third of studies in 14 leading U.S. medical journals [21]. Furthermore, a recent report showed that roughly 50% of preclinical studies still use only males [34], and it is unclear whether those that use both sexes disaggregate data. Excluding females or omitting disaggregated data can negatively impact women’s health, as exemplified by higher rates of adverse drug reactions and lower effectiveness of many drugs in females [19]. Adherence to policies on inclusion, reporting, and analysis of sex is necessary for replication, translation, and facilitating future meta-analysis by sex.

Thus, improving rigor and transparency in biomedical research will require improved reporting of methodological factors that affect outcomes, and inclusion and reporting by sex. The purpose of the current manuscript was to examine reporting of rigor in clinical and preclinical pain research. We completed a systematic survey of research articles in the journal PAIN between 2012–2021 to examine the quality of methodological reporting and inclusion of sex in both clinical and preclinical studies.

2. Materials and methods

To investigate reporting of rigor in pain literature over the past 10 years, a group of 12 trained reviewers systematically analyzed primary research articles from the journal PAIN from 2012 to 2021 using a survey that was captured in Research Electronic Data Capture System (REDCap). Each reviewer underwent training, which included reviewing a specific month from the journal PAIN to ensure reliability in assessment. Prior to the training session, reviewers were provided definitions of each item on the survey. The results from the training session were compared with those of two senior reviewers (ANP, JBL) for accuracy and differences. Training was repeated until each reviewer demonstrated sufficient reliability with article assessment. After sufficient training, reviewers were then assigned articles to inspect and analyze [23]. An average of 184 articles were assigned to each individual by month and year of publication. The survey was designed to identify each article by the title of the article and DOI number. Publication year of the article was recorded to determine trends for rigor characteristics over time. The article was then searched to determine included species (human, rat, mouse, cell, or other). Only original research articles were included; all topical reviews, systematic reviews and meta-analyses, narrative reviews, and clinical notes were excluded from our analysis.

2.1. Condition studied/model utilized coding

Pain condition of each article that utilized humans was coded as: fibromyalgia, postsurgical, migraine/headache, visceral/pelvic, nonspecific chronic pain, healthy/experimental, musculoskeletal, neuropathic, or other. Musculoskeletal conditions were further categorized as lower back pain, osteoarthritic, orofacial, neck pain, rheumatoid arthritis, and general musculoskeletal pain. Pain model of each article for animals was coded as: neuropathic, inflammatory, healthy, migraine, muscle, visceral, joint, postsurgical, cancer, or other.

2.2. Rigor coding

The methods section of each article was assessed for the following rigor characteristics: randomization, blinding, sample size calculation, and statistical plan. Each of these could be categorized as yes, no, not applicable, or unsure. In articles that included multiple experiments, reporting of randomization or blinding in just one of the experiments was categorized as “yes”; therefore, coding was biased in favor of inclusion. Not applicable was chosen for studies that could not include randomization or blinding (e.g., no treatment group; survey-based study). Articles were characterized as “yes” if they completed an a priori sample size calculation or a post-hoc power analysis to determine whether the study was statistically powered. Reviewers marked unsure for any article that was difficult to determine, and senior team members (ANP, JBL) were assigned to review and code these articles.
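The article-level rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic, not the survey instrument itself: the function name `code_article`, the string labels, and the choice to let "unsure" take precedence over "no" (so that the article is escalated to a senior reviewer) are assumptions made for this sketch.

```python
# Hypothetical helper: the names and the precedence of "unsure" over "no"
# are assumptions for illustration, not part of the published survey.
VALID_CODES = {"yes", "no", "not applicable", "unsure"}


def code_article(experiment_codes):
    """Collapse per-experiment codes into a single article-level code."""
    codes = list(experiment_codes)
    unknown = set(codes) - VALID_CODES
    if unknown:
        raise ValueError(f"unrecognized codes: {sorted(unknown)}")
    # One reporting experiment is enough: coding is biased toward inclusion.
    if "yes" in codes:
        return "yes"
    # Assumed: anything unclear is flagged for senior review.
    if "unsure" in codes:
        return "unsure"
    if "no" in codes:
        return "no"
    # Only "not applicable" entries (or none at all) remain.
    return "not applicable"
```

For example, an article with three experiments coded "no", "yes", and "no" would be coded "yes" at the article level, which is why the resulting percentages should be read as upper bounds on reporting.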

2.3. Sex specific coding

The sex of subjects was categorized as both, male only, female only, or unreported. If the study used both sexes in just one of the experiments, coding was biased in favor of inclusion. Articles that included both sexes were further categorized as “reporting data disaggregated” and/or “analyzing for sex differences.” Studies were not considered as “analyzing for sex differences” if researchers only completed a statistical adjustment of the data by sex (i.e., sex as a covariate). Articles utilizing both sexes were also coded as “sex differences being the primary aim” if the main goal of the project was to explore sex differences. If only one sex was reported in the study, it was coded as “justified” if there was a written justification or if the disease was a sex specific disease. Justifications, when provided, were divided into sex specific studies (e.g., prostate cancer), studies with higher prevalence of disease or disorder in a specific sex (e.g., breast cancer), or inadequate justification (e.g., not including females due to hormone fluctuations).

2.4. Data analysis

The data from this study were qualitative in nature, with nominal data described as percentages. Simple linear regressions were performed to examine trends over time, with the significance level set at 0.05. For calculation of the percentage that reported randomization or blinding, only articles that should have used randomization or blinding were included in the analysis.
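As a rough illustration of this analysis, the sketch below computes a yearly percentage using only applicable articles in the denominator and then fits an ordinary least-squares line to percentage versus year. The function names, the string labels, and the hard-coded critical value are assumptions for illustration; the software the authors actually used is not stated. The critical value 2.306 is the two-sided 5% point of the t distribution with 8 degrees of freedom, so it applies only when exactly 10 yearly points are fitted.

```python
import math

# Two-sided critical t for alpha = 0.05 and df = 8 (10 yearly points).
# Assumed constant for this sketch; use a t table for other sample sizes.
T_CRIT_DF8 = 2.306


def percent_reporting(codes):
    """Percent coded 'yes' among applicable articles ('yes' or 'no' only)."""
    applicable = [c for c in codes if c in ("yes", "no")]
    if not applicable:
        return None  # nothing in this year could have reported the item
    return 100.0 * sum(c == "yes" for c in applicable) / len(applicable)


def linear_trend(years, percents):
    """Return (slope, r_squared, t_statistic) for a simple linear regression."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(percents) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in percents)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, percents))
    slope = sxy / sxx
    if syy == 0:
        return slope, 0.0, 0.0  # flat series: no variance to explain
    r_squared = (sxy * sxy) / (sxx * syy)
    if r_squared >= 1.0:
        return slope, 1.0, math.copysign(math.inf, slope)
    t_stat = math.copysign(math.sqrt(r_squared * (n - 2) / (1.0 - r_squared)), slope)
    return slope, r_squared, t_stat


if __name__ == "__main__":
    years = list(range(2012, 2022))
    # Invented percentages, purely to show the call pattern.
    percents = [20, 24, 23, 30, 33, 35, 41, 44, 46, 53]
    slope, r2, t = linear_trend(years, percents)
    print(f"slope={slope:.2f}, r2={r2:.2f}, significant={abs(t) > T_CRIT_DF8}")
```

Here the slope is expressed in percentage points per year, which matches how the slopes in the Results are reported.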

3. Results

3.1. Species demographics and pain condition/model

Reporting of rigor characteristics was examined in 2,214 original scientific articles from the journal PAIN; this count includes articles using species other than humans, mice, and rats. The main species reported in these articles were humans (66.2%), with a majority of the remaining articles including mice (17.5%) or rats (19.2%) and very few articles using cell lines (1.9%) (Fig. 1a). Other species used in studies (0.68%) included cats, pigs, guinea pigs, squirrel monkeys, and dogs but were excluded from further analysis because of their low percentage. Because articles may contain more than one species, only 15 articles were completely excluded, making a total of 2,199 articles further examined for rigor characteristics.

Figure 1.

Pie graphs showing the percentage of species choice in all primary research published in Pain in the last 10 years (A). Percentage of condition/model studied in articles using humans (B), mice (C), and rats (D) as subjects in the journal PAIN in the last 10 years.

Next, we examined the pain condition or model studied in human and rodent (mouse and rat) research articles. In articles that studied humans, general nonspecific chronic pain (22.4%), musculoskeletal pain (16.9%), and healthy individuals given experimental pain (22.3%) comprised the majority of the articles (Fig. 1b). Musculoskeletal pain conditions included lower back pain (7.9%), osteoarthritic (4.1%), orofacial (2.1%), neck pain (1.4%), rheumatoid arthritis (0.6%), and general musculoskeletal pain (0.6%). A smaller portion of studies included individuals with neuropathic pain (12.0%), fibromyalgia (4.8%), postoperative pain (4.4%), migraine (3.9%), and visceral/pelvic pain (3.3%). Conditions that accounted for less than 1% of articles were grouped as other (15.6%). The most prevalent pain model studied in mice was neuropathic (35.6%) followed by inflammatory (19.9%), healthy/no pain (11.1%), cancer (4.0%), visceral (3.6%), migraine (2.9%), joint (2.2%), post-surgical (2.0%), and muscle (1.6%) (Fig. 1c). Similarly, the most prevalent model studied in rats was neuropathic pain (44.9%) followed by inflammatory (20.9%), healthy (14.9%), other (12.6%), migraine (5.7%), joint (3.8%), post-surgical (3.3%), cancer (3.1%), visceral (3.1%), and muscle (1.4%) (Fig. 1d). Thus, the percentage of articles studying specific conditions differs between human and animal research. These data are similar to a previous bibliometric analysis showing that 42% of preclinical pain studies published in the journal PAIN from 2016–2020 used neuropathic pain models [39]; however, that study did not examine human literature and therefore was unable to directly compare clinical conditions to preclinical animal models.

3.2. Reporting of rigor characteristics in clinical data is high with improvements in preclinical data

To assess the quality of reporting of experimental design, we evaluated three key rigor characteristics that are necessary bias-reducing measures [25; 45]: randomization, blinding, and sample size calculation/power analysis. The percentage of articles reporting randomization and blinding in human data varied between 2012 and 2021 (Fig. 2a), with a drop in 2016 followed by a steady increase in the use of these rigor characteristics from 2016 to 2021. However, when all primary clinical studies in the journal PAIN in the past 10 years were considered, there were no significant trends over time in the number of studies reporting blinding (slope 0.94, r² = 0.03, p=0.61) or randomization (slope 0.41, r² = 0.01, p=0.79). On the other hand, articles utilizing humans significantly increased the reporting of the use of a power analysis (slope 2.62, r² = 0.65, p<0.01) (Fig. 2a).

Figure 2.

Line graphs showing the percent of articles reporting randomization, blinding, power analysis, and statistical analysis plan separated by articles reporting human (A), mouse (B), and rat (C) studies.

In contrast to articles with human subjects, articles reporting data from rodents showed a significant steady increase from 2012–2021 in reporting of all rigor characteristics examined: randomization (mice: slope 4.0, r² = 0.83, p<0.01; rats: slope 4.57, r² = 0.81, p<0.01), blinding (mice: slope 3.20, r² = 0.53, p<0.05; rats: slope 2.57, r² = 0.68, p<0.01), and power analysis (mice: slope 2.63, r² = 0.73, p<0.01; rats: slope 3.48, r² = 0.83, p<0.01) (Fig. 2b, 2c). In the year 2021, studies that included mice reported blinding in 87% of articles, randomization in 53% of articles, and a power analysis in 28% of articles (Fig. 2b), while studies that included rats reported blinding in 67% of articles, randomization in 60% of articles, and a power analysis in 36% of articles (Fig. 2c).

3.3. Sex bias is low in clinical data but remains in preclinical data

The use of both sexes in human data remained relatively steady over the past 10 years, with the lowest year (2012) reporting 82% and the highest years (2018, 2021) reporting 90%. Only a few articles did not report the sex of the subjects over the past 10 years, with the lowest years (2014, 2018) at 1% and the highest (2012, 2017, 2021) at 4% (Fig. 3a). Statistically, there was no significant change in the use of both sexes (slope 0.38 ± 0.68, r² = 0.17, p=0.24), females only (slope −0.15 ± 1.39, r² = 0.03, p=0.62), or unreported sex (slope −0.03 ± 0.64, r² = 0.01, p=0.81). There was a significant but small decrease in the use of males only (slope −0.52 ± 0.26, r² = 0.73, p<0.01).

Figure 3.

Stacked bar graph showing the percent of articles reporting whether subjects were male, female, both, or of unreported sex in human (A), mouse (B), and rat (C) studies. The number of articles in each year is given in parentheses on the x-axis legend.

Articles using mice and rats have significantly increased use and reporting of both sexes over the past 10 years (mouse: slope 3.92 ± 2.27, r² = 0.67, p<0.05; rat: slope 3.31 ± 1.79, r² = 0.69, p<0.05) (Fig. 3b, 3c), but only 49% of mouse and 30% of rat articles in 2021 reported the use of both sexes. The number of rodent studies using male only subjects also significantly decreased (rat: slope −2.57 ± 1.37, r² = 0.7, p<0.05; mouse: slope −3.99 ± 1.92, r² = 0.64, p<0.05), but the number of studies that were female only remained the same (rat: slope −2.29 ± 0.69, r² = 0.1, p=0.36; mouse: slope −0.66 ± 1.51, r² = 0.11, p=0.34). The number of studies that did not report the sex of the rodent significantly decreased for rats (slope −0.46 ± 0.37, r² = 0.49, p<0.05) but remained unchanged for mice (slope −0.59 ± 1.00, r² = 0.18, p=0.21).

3.4. Most clinical and preclinical data including both sexes are not disaggregated by sex or analyzed for sex differences

Although most of the articles that include human subjects report both sexes (Fig. 3a), the percent of studies that disaggregated or analyzed their data by sex remained relatively low, with only 7–25% of articles disaggregating their data by sex and 19–37% analyzing for sex differences over the past decade (Fig. 4a). The percent of studies that analyzed for sex differences in human subjects has remained unchanged over the past decade (slope 0.60 ± 1.58, r² = 0.09, p=0.41). Data disaggregation by sex had a small but non-significant trend toward increasing over the past decade (slope 0.97 ± 1.04, r² = 0.36, p=0.06). There was no significant change over the past 10 years in the percent of studies in which one of the primary aims was examining sex differences (slope 0.14 ± 0.36, r² = 0.09, p=0.40).

Figure 4.

Line graphs showing the percent of articles reporting data analyzed and/or disaggregated by sex and whether the study was a sex difference study in human (A) and rodent (B) data. The number of articles in each year is given in parentheses on the x-axis legend. Disaggregating data by sex is reporting of male and female data separately in an article. A sex difference study was defined as a study where the main goal was examining sex differences. Note that the number of articles in rodent research was low before 2019, making percentage interpretation difficult.

Because so few rodent studies used both sexes, studies reporting both sexes for mice and rats were combined to determine whether there was a trend in preclinical data over the past decade. For preclinical studies, 17–54% of articles disaggregated by sex and 33–60% of articles analyzed by sex. The percent of studies in each category varied widely over the last decade, but there was no significant change in analysis for sex differences (slope 1.21 ± 2.40, r² = 0.14, p=0.28), disaggregation of data by sex (slope −0.19 ± 3.23, r² = 0.0, p=0.88), or percent of studies in which sex differences was one of the primary aims of the study (slope −0.98 ± 4.10, r² = 0.04, p=0.60) (Fig. 4b).

3.5. Justification for articles reporting use of only one sex is lacking

For articles that utilized males or females only (Fig. 5), data were collected and aggregated over the past 10 years on whether the authors provided a justification for using a single sex. In humans, of the 162 articles reporting utilization of a single sex, 53% of articles did not report a justification, 28% studied sex specific conditions, 15% studied conditions with a higher sex specific prevalence, and 4% reported inadequate justifications (Fig. 6a). Of the 240 articles utilizing a single sex in mice, 91% of articles did not report a justification, 3% studied sex specific conditions, 4% studied conditions with a higher sex specific prevalence, and 2% reported inadequate justifications (Fig. 6b). Of the 349 articles utilizing a single sex in rats, 94% of articles did not report a justification, 2% studied sex specific conditions, 1% studied conditions with a higher sex specific prevalence, and 3% reported inadequate justifications (Fig. 6c). Justifications that were considered inadequate are included in Table 1.

Figure 5.

Stacked bar graphs showing the percent of males or females in studies that reported only one sex in human (A), mouse (B), and rat (C) studies. The number of articles in each year is given in parentheses on the x-axis legend.

Figure 6.

Pie charts showing the percent of studies using a single sex that justified this choice as studying a sex specific condition or a population with a higher prevalence of the pain condition in a specific sex, gave an inadequate justification, or gave no justification at all in human (A), mouse (B), and rat (C) studies.

Table 1.

Examples of inadequate justification for the use of single sex in the last 10 years in the journal PAIN.

Examples of inadequate justifications in human research
“Given their differential response to opioids.”
“…in order to avoid the confounding effects of sex, as sex may affect the manner with which subjects respond to pain and to stress.”
“…was necessary to avoid sex as a confounding factor in the genetic component of the development study without waiting for a sufficient number of female patients.”
“Females were excluded to minimize the potential effects of menstruation-related hormone fluctuations on pain responses and facilitate comparison with previous positron emission tomographic studies that were performed in men.”
“Women were excluded as the experimental drug study lasts about 4–6 weeks, and there is clear evidence for the potentially confounding effects of menstruation related hormone fluctuations on pain in that month-long test period.”
“Female subjects were not enrolled because of potential variability in pain thresholds due to hormonal influences.”
Examples of inadequate justifications in rodent research
“In this first report, only female mice were used because the ventral portion of the lumbar column is easier to access in females.”
“Sexual dimorphism in microglial P2X4 pain signaling has been proposed to be present in rodents, with P2X4-driven brain-derived neurotrophic factor release from spinal microglia being restricted to males. Our neuropathic pain studies exclusively used female mice, thereby pointing to a key role for P2X4 in both male and female rodents in the maintenance of neuropathic pain, albeit potentially through different downstream signaling mechanisms.”
“Only male animals were used to avoid the well-known interaction of CGRP with cycling estrogen, previously reported in the cranial-window model.”
“Female mice were used because we routinely used female mice for our study of oral cancer pain”
“Males were used to reduce variability, and thus the number of animals required, and maintain consistency with previous work characterizing the monosodium iodoacetate (MIA) model in SD rats.”
“In these initial proof of concept studies, only a single sex of animals was used; however, understanding the impact of sex will be an important follow-on study should the hypotheses being tested here be substantiated. We have previously established that affective biases in the ABT are present in both male and female rats, but the lack of a sex-related comparisons is a limitation of the study design.”

4. Discussion

It has become increasingly clear that science is facing a reproducibility crisis, with part of the issue being attributable to a lack of transparency in experimental and analytical details. Transparency and reporting of rigor characteristics allow other researchers to properly assess, interpret, and reproduce the results of a study. Our analysis identified that reporting of randomization and blinding is relatively high and stable in clinical research, with an increase in the use of power analysis over the past 10 years. Although lower than in clinical data, reporting of randomization, blinding, and power analysis in preclinical data has increased over the past 10 years. Our analysis also found that reporting of both sexes has remained high in clinical research; however, disaggregating data by sex or completing an analysis for sex differences has remained relatively low. In preclinical research, the reporting and use of both sexes is increasing but remains relatively low. Lastly, there is a lack of justification for the utilization of a single sex in both human and animal studies. Our data suggest that the reporting of rigor characteristics in both clinical and preclinical research is improving but is still not at an acceptable level.

Randomization and blinding have been a standard for human clinical trials for over 20 years, with guidelines for preclinical data being developed approximately 10 years ago [7; 28]. Randomization is necessary to reduce bias when assigning subjects to treatment groups, whereas blinding is essential to reduce or minimize bias during data collection. Evidence suggests that biomedical research studies that do not report randomization or blinding lead to overestimated treatment effects [14; 25; 32; 45]. Clinical and preclinical pain research is not exempt from these findings, with systematic reviews showing greater differences between experimental and control groups in studies not reporting measures to reduce bias [10; 15; 16]. Further, an analysis in 2008 of rigor in preclinical studies in the journal PAIN found that, out of a sample of 14 articles, 37% described blinding and 28% described randomization [38]. Our data show improvements in preclinical data relative to this report from 2008, but there are still major reporting deficiencies. This issue is not unique to pain science, with other journals showing a lack of reporting of rigor characteristics. For example, early systematic reviews of biomedical research show reporting of blinding in around 20%, randomization in 10%, and power analysis/sample size calculation in 0–7% of articles [5; 29]. Furthermore, recent systematic reviews conclude that a lack of high-quality research results in downgrading of clinical recommendations, resulting in low or insufficient evidence [11; 36]. Calls for change to improve reporting of rigor often suggest that journals need to implement checklists for manuscripts such as CONSORT for clinical and ARRIVE (Animal Research: Reporting of In Vivo Experiments) for preclinical research. However, medical journals that adopted the CONSORT checklist still lack completeness of reporting [43]. Similarly, an increase in reporting of rigor characteristics was seen in Nature journals when they implemented a reporting checklist; however, articles still did not reach 80% compliance, suggesting that other efforts are needed to improve quality and transparency in reporting [2].

This study also assessed whether articles reported the use of a power analysis or completed a sample size estimation. Power analysis is imperative in determining the sample size required to detect an effect at a predetermined level of significance. Sample size calculation in clinical and preclinical research is an ethical obligation and is often required by funding agencies. Our data show that a low percentage of human (30%), mouse (9%), and rat (12%) studies reported a sample size calculation. This is an improvement from a sample collected in 2008 from the journal PAIN in which 0 out of the 14 preclinical articles reported using a power calculation [38]. Again, these findings are not unique to pain research, with reports in preclinical stroke research from 2007 showing that 3% of studies in their sample reported power calculations [40]. Therefore, despite the importance and requirement of this methodology in grant applications, it is not disseminated in research articles.
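To make the idea of an a priori sample size calculation concrete, one common textbook approximation for comparing the means of two equal-sized groups is n = 2(z_{1−α/2} + z_{1−β})² / d² per group, where d is the standardized effect size, α the significance level, and 1−β the desired power. The sketch below implements that normal approximation with the Python standard library. It is offered only as an illustration under these assumptions: the function name `n_per_group` is hypothetical, the surveyed articles may have used other methods, and an exact t-based calculation gives slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    rounded up to the next whole subject.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # A medium standardized effect (d = 0.5) at alpha = 0.05 and 80% power.
    print(n_per_group(0.5))  # prints 63
```

Because the required sample size scales with 1/d², halving the expected effect size roughly quadruples the number of subjects needed, which is why the assumed effect size should be reported alongside the calculation.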

We also explored the integration of SABV over the past 10 years, as it is necessary for informing subsequent clinical and preclinical studies. Funding agencies around the world have adopted policies on SABV to increase reporting and use of both sexes in clinical and preclinical research. For example, the NIH, which is the largest funding agency in the United States, implemented a policy in 1986 for scientists to examine sex and gender differences in clinical trials, which was followed by the Revitalization Act of 1993, which made inclusion of both sexes law. Next, the NIH implemented an initiative to include SABV in preclinical research in 2014 [24; 26; 27; 30]. Females have a higher prevalence of chronic pain, and generally experience pain of greater intensity, frequency, and duration [33; 44]. Despite this fact, they have long been neglected in both general and pain biomedical research, especially in preclinical animal studies [6]. For example, neuroscience research journals from 2010–2014 showed that under 25% of articles considered SABV in their animal research [46]. Furthermore, in the journal PAIN in 2019, around 50% of preclinical research articles examined only males [34]. Similarly, our data demonstrate that under 50% of articles in 2021 reported the use of both sexes in preclinical data [34]. Although federal agencies may require grant applications to include SABV, this requirement is nonbinding and allows authors to collect and publish data using a single sex. It is understood that some single sex studies are valid based on studying sex specific effects or a priori knowledge of sex differences; however, these single sex studies need to be adequately justified. The current study showed that over 50% of articles using humans and 90% of articles using rodents did not justify the use of a single sex. Ultimately, this overreliance on male subjects in research has caused pharmaceuticals to be developed based on physiological targets specific to males, leading to reduced efficacy and greater side effects in females [18; 20].

In clinical studies, and the minimal preclinical studies that utilize both sexes, our data showed a lack of data disaggregation and analysis between the sexes. This is similar to data collected from 2019 which showed that approximately 40% of articles utilizing both sexes did not include an analysis for sex differences [34]. Similarly, recent reports show that of the 4,420 registered clinical studies for COVID-19, only 4% include sex in their planned statistical analysis [9]. Even if the primary aim of a study does not include analysis for sex differences, or the study is not powered to investigate sex differences, data should be disaggregated by sex to allow future meta-analyses to examine sex differences. Not disaggregating data by sex or running statistical analysis for sex differences could limit the uncovering of various sex differences, such as in pain mechanisms or drug efficacy and safety. Numerous studies have shown significant sex differences underlying peripheral and central nervous system mechanisms, particularly in pain research [34; 41]. Reviews of sex differences in pain discuss both sex specific mechanisms in preclinical research and sex specific treatment effects in humans [41]. For example, after induction of an activity-induced pain model, serotonin transporter expression is upregulated in female but not male mice in the rostral ventromedial medulla [31]. In the periphery, toll-like receptor 4 activation in nociceptors is necessary for the development of hypersensitivity from nerve injury in female but not male mice [42]. Since preclinical research has historically been conducted in males, it is crucial to re-examine the literature to confirm pain mechanisms in females and create a foundation to examine sex differences in pain processing. This evidence-derived research will be the first step toward creating sex-based, customized treatment plans to effectively treat and prevent chronic pain in both sexes. There have been repeated calls for sex-based analysis and disaggregation of data by sex, which suggests other methods are needed to improve its implementation [8; 12; 13; 22; 37].

In summary, our data showed that reporting of necessary rigor characteristics in the journal PAIN is higher than in previous reports in biomedical research; however, there is still room for improvement. The journal PAIN includes instructions to authors for preclinical research under the heading Animals/General, which include reporting of age, sex, species, source of animals/cells, and methods for randomization and blinding. In addition, the journal “strongly recommends” the use of both male and female animals in experiments. However, it is clear from this study that rigor characteristics and inclusion and analysis of both sexes are not consistently being utilized or reported. As discussed by the Preclinical Pain Research Consortium for Investigating Safety and Efficacy Working group, it is necessary to push for continued inclusion of these characteristics by journals, editors, peer-reviewers, researchers, and the general scientific community, with ongoing assessment of the literature [3]. We hope these efforts will lead to an increasing trend in reporting of rigor characteristics, the use of both male and female subjects, the disaggregation of data by sex, and statistical analysis of sex differences.

Acknowledgements

Funding:

National Institutes of Health AR073187.

NIH NS045549 (ANP), NIH GM144636-01 and NIH NS007421-23 (AFS), Foundation for Physical Therapy Research Promotion of Doctoral Studies (PODS) I (JBL, AJ), NIH GM067795 (JBL), NIH NS112873-03S3 (GB), Japan Society for the Promotion of Science (KH).

National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR002537. Its contents are the authors’ sole responsibility and do not necessarily represent official National Institutes of Health views.

Footnotes

Conflicts of Interest:

The authors declare that there are no relevant conflicts of interest.

References

  • [1]. Principles and Guidelines for Reporting Preclinical Research, Vol. 2022: National Institutes of Health (NIH), 2014.
  • [2]. Did a change in Nature journals’ editorial policy for life sciences research improve reporting? BMJ Open Sci 2019;3(1):e000035.
  • [3]. Andrews NA, Latrémolière A, Basbaum AI, Mogil JS, Porreca F, Rice ASC, Woolf CJ, Currie GL, Dworkin RH, Eisenach JC, Evans S, Gewandter JS, Gover TD, Handwerker H, Huang W, Iyengar S, Jensen MP, Kennedy JD, Lee N, Levine J, Lidster K, Machin I, McDermott MP, McMahon SB, Price TJ, Ross SE, Scherrer G, Seal RP, Sena ES, Silva E, Stone L, Svensson CI, Turk DC, Whiteside G. Ensuring transparency and minimization of methodologic bias in preclinical pain research: PPRECISE considerations. Pain 2016;157(4):901–909.
  • [4]. Arnegard ME, Whitten LA, Hunter C, Clayton JA. Sex as a Biological Variable: A 5-Year Progress Report and Call to Action. J Womens Health (Larchmt) 2020;29(6):858–864.
  • [5]. Baker D, Lidster K, Sottomayor A, Amor S. Two years later: journals are not yet enforcing the ARRIVE guidelines on reporting standards for pre-clinical animal studies. PLoS Biol 2014;12(1):e1001756.
  • [6]. Beery AK, Zucker I. Sex bias in neuroscience and biomedical research. Neurosci Biobehav Rev 2011;35(3):565–572.
  • [7]. Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, Pitkin R, Rennie D, Schulz KF, Simel D, Stroup DF. Improving the Quality of Reporting of Randomized Controlled Trials: The CONSORT Statement. JAMA 1996;276(8):637–639.
  • [8]. Beltz AM, Beery AK, Becker JB. Analysis of sex differences in pre-clinical and clinical data sets. Neuropsychopharmacology 2019;44(13):2155–2158.
  • [9]. Brady E, Nielsen MW, Andersen JP, Oertelt-Prigione S. Lack of consideration of sex and gender in COVID-19 clinical studies. Nat Commun 2021;12(1):4015.
  • [10]. Carroll D, Tramèr M, McQuay H, Nye B, Moore A. Randomization is important in studies with pain outcomes: systematic review of transcutaneous electrical nerve stimulation in acute postoperative pain. Br J Anaesth 1996;77(6):798–803.
  • [11]. Castellini G, Bruschettini M, Gianola S, Gluud C, Moja L. Assessing imprecision in Cochrane systematic reviews: a comparison of GRADE and Trial Sequential Analysis. Systematic Reviews 2018;7(1):110.
  • [12]. Clayton JA. Applying the new SABV (sex as a biological variable) policy to research and clinical care. Physiol Behav 2018;187:2–5.
  • [13]. Clayton JA, Collins FS. Policy: NIH to balance sex in cell and animal studies. Nature 2014;509(7500):282–283.
  • [14]. Crossley NA, Sena E, Goehler J, Horn J, van der Worp B, Bath PMW, Macleod M, Dirnagl U. Empirical Evidence of Bias in the Design of Experimental Stroke Studies. Stroke 2008;39(3):929–934.
  • [15]. Currie GL, Delaney A, Bennett MI, Dickenson AH, Egan KJ, Vesterinen HM, Sena ES, Macleod MR, Colvin LA, Fallon MT. Animal models of bone cancer pain: systematic review and meta-analyses. Pain 2013;154(6):917–926.
  • [16]. Ernst E, White AR. Acupuncture for Back Pain: A Meta-Analysis of Randomized Controlled Trials. Archives of Internal Medicine 1998;158(20):2235–2241.
  • [17]. Fillingim RB, King CD, Ribeiro-Dasilva MC, Rahim-Williams B, Riley JL 3rd. Sex, gender, and pain: a review of recent clinical and experimental findings. J Pain 2009;10(5):447–485.
  • [18]. Fish EN. The X-files in immunity: sex-based differences predispose immune responses. Nature Reviews Immunology 2008;8(9):737–744.
  • [19]. Franconi F, Brunelleschi S, Steardo L, Cuomo V. Gender differences in drug responses. Pharmacol Res 2007;55(2):81–95.
  • [20]. Gandhi M, Aweeka F, Greenblatt RM, Blaschke TF. Sex Differences in Pharmacokinetics and Pharmacodynamics. Annual Review of Pharmacology and Toxicology 2004;44(1):499–523.
  • [21]. Geller SE, Adams MG, Carnes M. Adherence to federal guidelines for reporting of sex and race/ethnicity in clinical trials. J Womens Health (Larchmt) 2006;15(10):1123–1131.
  • [22]. Hankivsky O, Springer KW, Hunting G. Beyond sex and gender difference in funding and reporting of health research. Res Integr Peer Rev 2018;3:6.
  • [23]. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42(2):377–381.
  • [24]. National Institutes of Health. NIH guidelines on the inclusion of women and minorities as subjects in clinical research. Fed Regist 1994;59:1408–1413.
  • [25]. Hirst JA, Howick J, Aronson JK, Roberts N, Perera R, Koshiaris C, Heneghan C. The Need for Randomization in Animal Trials: An Overview of Systematic Reviews. PLoS One 2014;9(6):e98856.
  • [26]. Johnson J, Sharman Z, Vissandjée B, Stewart DE. Does a change in health research funding policy related to the integration of sex and gender have an impact? PLoS One 2014;9(6):e99900.
  • [27]. Johnson JL, Beaudet A. Sex and gender reporting in health research: why Canada should be a leader. Can J Public Health 2012;104(1):e80–81.
  • [28]. Kilkenny C, Browne W, Cuthill I, Emerson M, Altman D. Improving Bioscience Research Reporting: The ARRIVE Guidelines for Reporting Animal Research. PLoS Biol 2010;8(6):e1000412.
  • [29]. Kilkenny C, Parsons N, Kadyszewski E, Festing MF, Cuthill IC, Fry D, Hutton J, Altman DG. Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS One 2009;4(11):e7824.
  • [30]. Kirschstein RL. Research on Women’s Health. American Journal of Public Health 1991;81(3):291–293.
  • [31]. Lesnak JB, Inoue S, Lima L, Rasmussen L, Sluka KA. Testosterone protects against the development of widespread muscle pain in mice. Pain 2020;161(12):2898–2908.
  • [32]. Macleod MR, van der Worp HB, Sena ES, Howells DW, Dirnagl U, Donnan GA. Evidence for the Efficacy of NXY-059 in Experimental Focal Cerebral Ischaemia Is Confounded by Study Quality. Stroke 2008;39(10):2824–2829.
  • [33]. Mogil JS. Sex differences in pain and pain inhibition: multiple explanations of a controversial phenomenon. Nature Reviews Neuroscience 2012;13(12):859–866.
  • [34]. Mogil JS. Qualitative sex differences in pain processing: emerging evidence of a biased literature. Nature Reviews Neuroscience 2020;21(7):353–365.
  • [35]. Moher D, Jones A, Lepage L, for the CONSORT Group. Use of the CONSORT Statement and Quality of Reports of Randomized Trials: A Comparative Before-and-After Evaluation. JAMA 2001;285(15):1992–1995.
  • [36]. Pandis N, Fleming PS, Worthington H, Salanti G. The Quality of the Evidence According to GRADE Is Predominantly Low or Very Low in Oral Health Systematic Reviews. PLoS One 2015;10(7):e0131644.
  • [37]. Prager EM. Addressing sex as a biological variable, Vol. 95: Wiley Online Library, 2017. p. 11.
  • [38]. Rice ASC, Cimino-Brown D, Eisenach JC, Kontinen VK, Lacroix-Fralish ML, Machin I, Mogil JS, Stöhr T. Animal models and the prediction of efficacy in clinical trials of analgesic drugs: A critical appraisal and call for uniform reporting standards. Pain 2008;139(2).
  • [39]. Sadler KE, Mogil JS, Stucky CL. Innovations and advances in modelling and measuring pain in animals. Nature Reviews Neuroscience 2022;23(2):70–85.
  • [40]. Sena E, van der Worp HB, Howells D, Macleod M. How can we improve the pre-clinical development of drugs for stroke? Trends in Neurosciences 2007;30(9):433–439.
  • [41]. Sorge RE, Totsch SK. Sex Differences in Pain. J Neurosci Res 2017;95(6):1271–1281.
  • [42]. Szabo-Pardi TA, Barron LR, Lenert ME, Burton MD. Sensory Neuron TLR4 mediates the development of nerve-injury induced mechanical hypersensitivity in female mice. Brain, Behavior, and Immunity 2021;97:42–60.
  • [43]. Turner L, Shamseer L, Altman DG, Schulz KF, Moher D. Does use of the CONSORT Statement impact the completeness of reporting of randomised controlled trials published in medical journals? A Cochrane review. Systematic Reviews 2012;1(1):60.
  • [44]. Unruh AM. Gender variations in clinical pain experience. Pain 1996;65(2–3):123–167.
  • [45]. Vesterinen HM, Sena ES, ffrench-Constant C, Williams A, Chandran S, Macleod MR. Improving the translational hit of experimental treatments in multiple sclerosis. Multiple Sclerosis Journal 2010;16(9):1044–1055.
  • [46]. Will TR, Proaño SB, Thomas AM, Kunz LM, Thompson KC, Ginnari LA, Jones CH, Lucas S-C, Reavis EM, Dorris DM, Meitzen J. Problems and Progress regarding Sex Bias and Omission in Neuroscience Research. eNeuro 2017;4(6):ENEURO.0278-17.2017.
