PLOS ONE. 2023 Feb 2;18(2):e0280809. doi: 10.1371/journal.pone.0280809

A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: The Jonckheere-Terpstra trend test

Sydney E Manning 1, Hung-Chih Ku 2, Douglas F Dluzen 3, Chao Xing 4,*, Zhengyang Zhou 5,*
Editor: Mehdi Rahimi6
PMCID: PMC9894441  PMID: 36730335

Abstract

Identification of novel genetic signals conferring susceptibility to complex human diseases is pivotal to disease diagnosis, prevention, and treatment. Genetic association studies are a powerful tool to discover candidate genetic signals that contribute to diseases, through statistical tests for correlation between the disease status and genetic variations in study samples. In such studies with a case-control design, a standard practice is to perform the Cochran-Armitage (CA) trend test under an additive genetic model, which suffers from power loss when the model assumption is wrong. The Jonckheere-Terpstra (JT) trend test is an alternative method to evaluate association in a nonparametric way. This study compares the power of the JT trend test and the CA trend test in various scenarios, including different sample sizes (200–2000), minor allele frequencies (0.05–0.4), and underlying modes of inheritance (from the dominant genetic model to the recessive genetic model). By simulation and real data analysis, it is shown that in general the JT trend test has higher, similar, and lower power than the CA trend test when the underlying mode of inheritance is dominant, additive, and recessive, respectively; when the sample size is small and the minor allele frequency is low, the JT trend test outperforms the CA trend test across the spectrum of genetic models. In sum, the JT trend test is a valuable alternative to the CA trend test with higher statistical power under certain circumstances, which could lead to better detection of genetic signals for human diseases and finer dissection of their genetic architecture.

Introduction

Over the past fifteen years, genome-wide association studies have significantly expanded the knowledge base for genetic factors in important healthcare outcomes [1]. Such studies have identified numerous genetic signals contributing to various complex human diseases, which is important for the diseases’ diagnosis, prevention, and treatment. One commonly used approach to test association in a case-control genetic study is the Cochran-Armitage (CA) trend test [2, 3] under the assumption of an additive genetic model [4, 5], which attains optimal power when the underlying genetic model is also additive. However, it can suffer from power loss when the true genetic model is nonadditive (see, e.g., [6–9]). Power loss is a critical issue in genetic association studies. On the one hand, conducting a statistical test with reduced power may fail to detect true genetic signals, leading to false negative results. On the other hand, achieving the same level of statistical power requires larger sample sizes, leading to higher study expenses and resource requirements.

To test for associations between the disease status and genetic variation, one alternative to the CA trend test is the Jonckheere-Terpstra (JT) trend test [10, 11], a rank-based nonparametric test. The JT trend test makes no assumptions about the genetic model or the data distribution, and thus has the potential to achieve better statistical power than the parametric CA trend test under certain circumstances. The potentially higher power of the JT trend test may yield novel genetic discoveries for complex human diseases, helping researchers better understand the genetic etiology and eventually aiding the development of effective diagnosis, prevention, and treatment strategies for the diseases. Although the JT trend test offers an alternative to the CA trend test with potential advantages, their comparative performance has not been examined in the genetic literature. In this study, we aim to fill this research gap by comparing the power of the two tests under various conditions via simulations and real data analysis. The knowledge gained in this study can help guide the choice between the CA and JT trend tests when conducting genetic case-control studies in practice.

Methods

Consider a diallelic locus with the major and minor alleles denoted as a and A, respectively; the genotype distribution in a case-control study can be summarized as in Table 1. Specifically, denote by ri and si the numbers of cases and controls, respectively, with genotype Gi, where i ∈ {0, 1, 2} is the number of A alleles a subject carries. Thus G0, G1, and G2 correspond to genotypes aa, Aa, and AA, respectively. Denote by R, S, and ni the marginal sums, R = r0 + r1 + r2, S = s0 + s1 + s2, and ni = ri + si, and by N the total sample size, N = R + S = n0 + n1 + n2. Assume (r0, r1, r2) follows a trinomial distribution with parameters R and (τ0, τ1, τ2), and (s0, s1, s2) follows a trinomial distribution with parameters S and (υ0, υ1, υ2). The null hypothesis of no association between the disease and genotype is then H0: τi = υi, for i ∈ {0, 1, 2}. Equivalently, we can assume the ri’s are drawn from binomial distributions Bin(ni, πi), and the null hypothesis becomes H0: π0 = π1 = π2. Treating G0, G1, and G2 as three ordered categories, a restricted alternative hypothesis for a trend test is H1: π0 ≤ π1 ≤ π2 or π0 ≥ π1 ≥ π2, with at least one strict inequality.

Table 1. Genotype distribution at a diallelic marker in a case-control study.

             Genotype*
Phenotype    aa (G0)    Aa (G1)    AA (G2)    Total
Cases        r0         r1         r2         R
Controls     s0         s1         s2         S
Total        n0         n1         n2         N

* A denotes the minor allele across the paper.

To test H1, the CA trend test assigns a set of scores (x0, x1, x2) to G0, G1, and G2, respectively, with the constraints x0 ≤ x1 ≤ x2 and x0 < x2, and examines whether there is a linear relationship between the πi’s and the xi’s by fitting a linear regression model. The test statistic is

TCA = N[N Σi ri xi − R Σi ni xi]² / {RS[N Σi ni xi² − (Σi ni xi)²]},

where Σi denotes summation over i = 0, 1, 2. Under H0, TCA follows a χ² distribution with 1 degree of freedom (d.f.). The choice of (x0, x1, x2) encodes the assumed genetic model. In practice, the additive model with (x0, x1, x2) = (0, 0.5, 1) is usually assumed because of its robustness. Hereinafter we denote it as TCAAdd.
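As a concrete illustration, the CA statistic can be computed directly from the 2×3 genotype table. The Python sketch below is ours (the function name and interface are not from the paper); it reproduces the formula above using numpy and scipy.

```python
import numpy as np
from scipy.stats import chi2

def cochran_armitage(r, s, x=(0.0, 0.5, 1.0)):
    """Cochran-Armitage trend test on a 2x3 case-control genotype table.

    r, s -- case and control counts (r0, r1, r2) and (s0, s1, s2)
    x    -- genotype scores; (0, 0.5, 1) gives the additive version
    Returns the 1-d.f. chi-square statistic and its p-value.
    """
    r, s, x = (np.asarray(v, dtype=float) for v in (r, s, x))
    n = r + s
    R, S, N = r.sum(), s.sum(), n.sum()
    num = N * (N * (r * x).sum() - R * (n * x).sum()) ** 2
    den = R * S * (N * (n * x ** 2).sum() - (n * x).sum() ** 2)
    t = num / den
    return t, chi2.sf(t, df=1)

# Genotype counts for rs2398162 and hypertension (WTCCC; see the
# Real data analysis section), which give T_CA ≈ 19.97, p ≈ 7.9e-6.
t, p = cochran_armitage((1205, 624, 111), (1608, 1121, 194))
```

Running it on the rs2398162 counts from the real data analysis recovers the reported TCAAdd = 19.97.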

Alternatively, the JT trend test compares the ranks of subjects’ affection status across genotype groups to test H1. Consider the case/control status as an ordinal variable Y, and denote by Yij ∈ {0, 1} the phenotypic value of individual j with genotype Gi. The JT test statistic is TJT = [U − E(U)]² / Var(U), where

U = Σj=1..n0 Σk=1..n1 S(Y0j, Y1k) + Σj=1..n0 Σk=1..n2 S(Y0j, Y2k) + Σj=1..n1 Σk=1..n2 S(Y1j, Y2k),

E(U) = [N² − Σi ni²]/4, and

Var(U) = A/72 + B/[36N(N−1)(N−2)] + C/[8N(N−1)],

in which

A = N(N−1)(2N+5) − Σi ni(ni−1)(2ni+5) − S(S−1)(2S+5) − R(R−1)(2R+5),
B = [Σi ni(ni−1)(ni−2)] × [S(S−1)(S−2) + R(R−1)(R−2)], and
C = [Σi ni(ni−1)] × [S(S−1) + R(R−1)].

The components of U are the Mann-Whitney U counts defined as S(Yj, Yk) = 1 if Yj < Yk; 0.5 if Yj = Yk; and 0 if Yj > Yk. For large N and ni’s not too small, TJT also follows a χ² distribution with 1 d.f. under H0.
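Because the phenotype is binary, U and its tie-corrected variance can be evaluated from the table counts alone: for each ordered genotype pair (i, k) with i < k, the si×rk control-case pairs contribute 1 each, and the tied pairs contribute 1/2 each. A minimal Python sketch of this (our own helper, not code from the paper):

```python
import numpy as np
from scipy.stats import chi2

def jonckheere_terpstra(r, s):
    """JT trend test on a 2x3 genotype table with a binary phenotype
    (controls Y=0, cases Y=1), using the tie-corrected variance above."""
    r, s = np.asarray(r, dtype=float), np.asarray(s, dtype=float)
    n = r + s
    R, S, N = r.sum(), s.sum(), n.sum()
    # U: each (lower-genotype, higher-genotype) pair of subjects scores
    # 1 when Y increases with genotype and 1/2 for a tie in Y.
    U = sum(s[i] * r[k] + 0.5 * (s[i] * s[k] + r[i] * r[k])
            for i in range(3) for k in range(i + 1, 3))
    EU = (N ** 2 - (n ** 2).sum()) / 4.0
    A = (N * (N - 1) * (2 * N + 5)
         - (n * (n - 1) * (2 * n + 5)).sum()
         - S * (S - 1) * (2 * S + 5)
         - R * (R - 1) * (2 * R + 5))
    B = (n * (n - 1) * (n - 2)).sum() * (
        S * (S - 1) * (S - 2) + R * (R - 1) * (R - 2))
    C = (n * (n - 1)).sum() * (S * (S - 1) + R * (R - 1))
    VU = (A / 72.0 + B / (36.0 * N * (N - 1) * (N - 2))
          + C / (8.0 * N * (N - 1)))
    t = (U - EU) ** 2 / VU
    return t, chi2.sf(t, df=1)

# rs2398162 counts from the Real data analysis section: T_JT ≈ 22.82.
t, p = jonckheere_terpstra((1205, 624, 111), (1608, 1121, 194))
```

On the rs2398162 counts from the real data analysis this recovers the reported TJT = 22.82.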

For simplicity, hereinafter we use TJT and TCAAdd to refer to both tests and test statistics.

Simulation

We conduct simulations to compare the performance of TJT and TCAAdd in terms of statistical power under various conditions. Define the penetrance for each genotype as fi = P(affected|Gi), i = 0, 1, 2, and the genotype relative risk as λi = fi/f0; thus λ0 = 1. The dominant, additive, and recessive genetic models are specified by λ1 = λ2, λ1 = (1+λ2)/2, and λ1 = 1, respectively. Note that the genetic model is defined with respect to the minor allele in this study. The model can be reparameterized by λ1 = 1 + λcosθ and λ2 = 1 + λsinθ, where λ ≥ 0 is the distance between point P = (λ1, λ2) and point O = (1, 1), which determines how far the genetic effect is from the null, and θ ∈ [π/4, π/2] is the angle between OP and the horizontal line in a two-dimensional space, which determines the genetic model [12]. Therefore, the null hypothesis can be rewritten as H0: λ = 0. In terms of genetic models, θ = π/4, arctan 2, and π/2 correspond to the dominant, additive, and recessive models, respectively. We perform simulations under the following alternative settings. Assume a disease prevalence K = 0.1 and minor allele frequency (MAF) q ∈ {0.05, 0.1, 0.2, 0.3}. Fix the effect size under the alternative at λ = 1 and vary the genetic model by setting θ′ = θ/π from 1/4 to 1/2, i.e., from the dominant model to the recessive model, in increments of 0.01. Under each genetic model, the penetrances are determined by f0 = K/[(1−q)² + 2λ1q(1−q) + λ2q²] and fi = λif0. The probabilities of the two trinomial distributions for cases and controls are then τi = P(Gi)fi/K and υi = P(Gi)(1−fi)/(1−K), respectively. We consider a balanced design, i.e., R = S, with total sample size N ∈ {200, 500, 1000}. At each setting 10,000 replicates are simulated, and each dataset is analyzed with both TJT and TCAAdd. The empirical power is estimated as the proportion of replicates with p-value ≤ 0.05.
In an additional set of simulations covering a wider range of sample sizes and MAFs, we consider N = 1500 and 2000 as well as q = 0.4 across the genetic models. Because these sample sizes and MAF are large, the effect size under the alternative is set to λ = 0.5 to keep the maximum power below 1 for the sake of comparison.
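The data-generating steps above can be sketched as follows (variable and function names are ours); each replicate draws case and control genotype counts from the two trinomial distributions, assuming Hardy-Weinberg proportions for P(Gi).

```python
import numpy as np

def case_control_probs(q, theta, lam=1.0, K=0.1):
    """Genotype probabilities tau (cases) and upsilon (controls) for
    MAF q, effect size lam, model angle theta, and prevalence K."""
    lam1, lam2 = 1 + lam * np.cos(theta), 1 + lam * np.sin(theta)
    p_g = np.array([(1 - q) ** 2, 2 * q * (1 - q), q ** 2])  # P(G_i), HWE
    f0 = K / (p_g @ np.array([1.0, lam1, lam2]))             # baseline penetrance
    f = f0 * np.array([1.0, lam1, lam2])                     # f_i = lam_i * f0
    return p_g * f / K, p_g * (1 - f) / (1 - K)

rng = np.random.default_rng(0)
tau, ups = case_control_probs(q=0.2, theta=np.pi / 4)  # dominant model
r = rng.multinomial(100, tau)   # genotype counts among R = 100 cases
s = rng.multinomial(100, ups)   # genotype counts among S = 100 controls
```

Setting lam=0 recovers the null, under which tau and upsilon coincide with the population genotype frequencies.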

The simulation results for the main settings are presented in Fig 1, and those for the additional settings in S1 Table. In all situations TJT is more powerful than TCAAdd when the underlying genetic model is dominant. The power advantage of TJT diminishes as the genetic model moves toward the additive model. Except for small sample sizes and low MAFs, the two tests have approximately equivalent power when the underlying model is additive. TJT becomes less powerful than TCAAdd, and the disadvantage grows, as the genetic model continues toward the recessive end. For low MAFs and small sample sizes (e.g., N = 200 and q ≤ 0.1, or N ≤ 1000 and q = 0.05), TJT is more powerful than TCAAdd across the spectrum of genetic models.

Fig 1. Power comparison between the Jonckheere-Terpstra trend test (TJT) and the Cochran−Armitage trend test (TCAAdd).


The black solid line denotes TJT and the red dashed line denotes TCAAdd. Along the x-axis, θ = π/4, arctan 2, and π/2 correspond to dominant (D), additive (A), and recessive (R) genetic models, respectively. The y-axis is the average empirical power at the 0.05 level based on 10,000 replicates each. The disease prevalence equals 0.1. MAF: minor allele frequency.

To verify the above findings, for each simulation setting we construct a table in which each cell equals its expected value under the trinomial distributions for cases and controls, i.e., E(ri) = τiN/2 and E(si) = υiN/2 for i ∈ {0, 1, 2}. Specifically, given the fixed combination of sample size (N), MAF (q), and genetic model (θ) in each setting, the cell probability of each genotype for cases and controls can be calculated, and the expected cell counts follow by multiplying these probabilities by the sample size. Evaluating the theoretical test statistics TJT and TCAAdd on this table of expected counts allows us to compare the relative performance of the two tests via ΔT = (TJT − TCAAdd)/TCAAdd × 100%, so a positive (negative) value of ΔT indicates the JT trend test is more (less) powerful than the CA trend test. The values of ΔT for the dominant, additive, and recessive genetic models across simulation settings are reported in Table 2. Consistent with the simulation results, ΔT > 0 when the genetic model is dominant, |ΔT| < 2% when it is additive, and ΔT < 0 when it is recessive. The only discrepancy is that for low MAFs and small sample sizes, ΔT is theoretically negative under the recessive model, whereas the empirical comparison favors TJT. We suspect this is because the parametric assumptions and asymptotic theory behind TCAAdd do not hold in these circumstances, whereas TJT imposes no assumptions on the data distribution.

Table 2. Percent difference between the Jonckheere-Terpstra trend test statistic (TJT) and the Cochran-Armitage trend test statistic (TCAAdd) based on the expectation of genotype distributions under dominant, additive, and recessive genetic models.

N     Minor Allele Frequency    ΔT = (TJT − TCAAdd)/TCAAdd
                                Dominant    Additive    Recessive
200 0.05 2.15% -1.39% -69.47%
200 0.1 4.97% -1.67% -62.83%
200 0.2 9.92% -1.00% -46.62%
200 0.3 11.58% 0.14% -27.71%
500 0.05 2.46% -1.10% -69.38%
500 0.1 5.28% -1.38% -62.71%
500 0.2 10.25% -0.70% -46.46%
500 0.3 11.91% 0.44% -27.49%
1000 0.05 2.56% -1.00% -69.35%
1000 0.1 5.39% -1.28% -62.68%
1000 0.2 10.36% -0.60% -46.41%
1000 0.3 12.03% 0.54% -27.42%
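The ΔT calculation in Table 2 can be reproduced from the expected counts alone. The sketch below (helper names ours; it assumes the balanced design R = S = N/2 and prevalence K = 0.1 used in the simulations) evaluates both theoretical statistics on the expected table.

```python
import numpy as np

def t_ca(r, s, x=np.array([0.0, 0.5, 1.0])):
    # Cochran-Armitage statistic with additive scores (Methods formula)
    n = r + s
    R, S, N = r.sum(), s.sum(), n.sum()
    num = N * (N * (r * x).sum() - R * (n * x).sum()) ** 2
    return num / (R * S * (N * (n * x ** 2).sum() - (n * x).sum() ** 2))

def t_jt(r, s):
    # Jonckheere-Terpstra statistic from the 2x3 table (Methods formula)
    n = r + s
    R, S, N = r.sum(), s.sum(), n.sum()
    U = sum(s[i] * r[k] + 0.5 * (s[i] * s[k] + r[i] * r[k])
            for i in range(3) for k in range(i + 1, 3))
    EU = (N ** 2 - (n ** 2).sum()) / 4.0
    A = (N * (N - 1) * (2 * N + 5) - (n * (n - 1) * (2 * n + 5)).sum()
         - S * (S - 1) * (2 * S + 5) - R * (R - 1) * (2 * R + 5))
    B = (n * (n - 1) * (n - 2)).sum() * (
        S * (S - 1) * (S - 2) + R * (R - 1) * (R - 2))
    C = (n * (n - 1)).sum() * (S * (S - 1) + R * (R - 1))
    V = A / 72.0 + B / (36.0 * N * (N - 1) * (N - 2)) + C / (8.0 * N * (N - 1))
    return (U - EU) ** 2 / V

def delta_t(q, theta, N, lam=1.0, K=0.1):
    # Percent difference of the theoretical statistics at expected counts
    lam1, lam2 = 1 + lam * np.cos(theta), 1 + lam * np.sin(theta)
    p_g = np.array([(1 - q) ** 2, 2 * q * (1 - q), q ** 2])
    f = K / (p_g @ np.array([1.0, lam1, lam2])) * np.array([1.0, lam1, lam2])
    Er = p_g * f / K * (N / 2)             # E(r_i), with R = N/2
    Es = p_g * (1 - f) / (1 - K) * (N / 2)  # E(s_i), with S = N/2
    return 100.0 * (t_jt(Er, Es) - t_ca(Er, Es)) / t_ca(Er, Es)
```

Under this sketch, θ = π/4 (dominant) yields a positive ΔT, θ = π/2 (recessive) a negative one, and θ = arctan 2 (additive) a small value, matching the pattern of Table 2.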

Real data analysis

We further compared TJT and TCAAdd on real data, which confirmed the simulation results. The first example concerns the association between the variant rs2398162 and hypertension in the Wellcome Trust Case Control Consortium study [13]. There were 1940 cases (r0 = 1205, r1 = 624, r2 = 111) and 2923 controls (s0 = 1608, s1 = 1121, s2 = 194), and it was suggested that the minor allele has a dominant effect. Applying the two trend tests to this dataset gives TJT = 22.82 (p-value = 1.8×10−6) and TCAAdd = 19.97 (p-value = 7.9×10−6). These results are in line with the simulation finding that TJT is more powerful than TCAAdd when the underlying genetic model is dominant.

Additional real data analyses come from case-control studies of falciparum malaria [14], age-related macular degeneration (AMD) [15], and hypertension with additional variants, with the genotype counts all extracted from [9]. In the falciparum malaria study, variant rs10900589 in ATP2B4 was associated with the disease in the Ghanaian samples. The association was also evaluated in the Gambian samples, where it was significant under a recessive model but not under dominant or additive models. In the AMD study, variants rs380390 and rs10131337 in CFH were associated with AMD. We examined the associations of the three variants with the diseases using both tests; cell counts, test statistics, and p-values are reported in Fig 2. For rs10900589, the minor allele acts approximately in a recessive mode (r0/n0 ≈ r1/n1 < r2/n2) and TJT < TCAAdd, consistent with the simulation result that TJT is less powerful than TCAAdd when the underlying genetic model is recessive. For both rs380390 and rs10131337, the minor allele acts approximately in an additive mode (r1/n1 ≈ (r0/n0 + r2/n2)/2) and TJT ≈ TCAAdd, consistent with the simulation result that the two tests have comparable power under an additive model. In the hypertension study, we compared the two tests on three SNPs that showed genome-wide significance; the results are reported in S2 Table. The conclusion holds in this analysis as well: TJT and TCAAdd had similar power when the genetic model was close to additive (rs7961152, rs1937506, rs6997709).

Fig 2. Comparison between the Jonckheere-Terpstra trend test (TJT) and the Cochran−Armitage trend test (TCAAdd) in four real datasets.


A denotes the minor allele; ri′s and si′s are as defined in Table 1.

To assess potential genotyping errors among the variants considered in the real data analysis, we tested Hardy-Weinberg equilibrium (HWE) among the cases and controls separately for each variant [16, 17]. An exact test for HWE was conducted using the R package HardyWeinberg [18], and the exact p-values are reported in S3 Table. The p-values of the HWE tests for all variants were larger than 0.01, with only two between 0.01 and 0.05, suggesting little evidence of genotyping error among the variants. Moreover, we conducted an allelic test to evaluate the associations of the variants. The allelic test assesses genetic association at the allele level by collapsing the genotypes into counts of reference and alternative alleles in cases and controls; however, this approach is not robust to departures from HWE [4]. The test statistics and p-values of the allelic tests are summarized in S3 Table. Of note, the results were close to those of TCAAdd, again suggesting that the HWE assumption was not violated.
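For illustration, an HWE check of this kind can be sketched in a few lines. Note that the paper uses the exact test from the R package HardyWeinberg; the Python stand-in below (our own helper) is the simpler asymptotic chi-square goodness-of-fit version, which behaves similarly for the large counts involved here.

```python
import numpy as np
from scipy.stats import chi2

def hwe_chisq(n_aa, n_Aa, n_AA):
    """Asymptotic 1-d.f. chi-square test for Hardy-Weinberg equilibrium.

    A simple approximation to the exact HWE test used in the paper
    (R package HardyWeinberg); adequate for large genotype counts.
    """
    obs = np.array([n_aa, n_Aa, n_AA], dtype=float)
    n = obs.sum()
    q = (obs[1] + 2 * obs[2]) / (2 * n)                # minor allele frequency
    exp = n * np.array([(1 - q) ** 2, 2 * q * (1 - q), q ** 2])
    stat = ((obs - exp) ** 2 / exp).sum()
    return stat, chi2.sf(stat, df=1)

# Control genotype counts for rs2398162 (Real data analysis section);
# the observed counts sit very close to Hardy-Weinberg proportions.
stat, p = hwe_chisq(1608, 1121, 194)
```

A large p-value here, as with the exact test reported in S3 Table, indicates no evidence of genotyping error at the variant.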

Discussion

Our previous work elucidated the mechanism of the CA trend test: it examines the location shift of genotype scores between the case and control groups [19] by measuring the goodness of fit of a linear regression model relating the proportions of cases in the genotype groups to their respective scores [20]. The preassigned scores reflect assumptions about the genetic model. In contrast, the JT trend test examines the location shift of phenotype ranks among genotype groups in a nonparametric way, without assumptions about the genetic model or data distribution. The power differences between the two tests in the situations shown in this study can be explained by these properties. When the underlying model is dominant, TCAAdd suffers power loss from the wrong model assumption and is therefore inferior to TJT. When the underlying model is recessive, the limited information on location shift hampers TJT more than the wrong model assumption hampers TCAAdd. In the case of low MAFs and small sample sizes, where large-sample theory breaks down, TJT outperforms TCAAdd because it does not impose distributional assumptions, as the latter does.

In sum, in this study we compared the power of TCAAdd and TJT under various situations. By simulation and real data examples, we show that TJT provides a valuable alternative to TCAAdd in the case of small sample sizes and low MAFs; when the genetic mechanism is known to be dominant, or the dominant model is the only one of interest, TJT is also preferred. However, in a study with a moderate to large sample size and an unknown mode of inheritance, the JT trend test is not recommended over the CA trend test under an additive model, which is more robust across a wide range of modes of inheritance.

Supporting information

S1 Table. Power comparison between TJT and TCAAdd for the additional simulation settings (q = 0.4).

(DOCX)

S2 Table. Comparison between the Jonckheere-Terpstra trend test (TJT) and the Cochran-Armitage trend test (TCAAdd) on SNPs that were reported associated with hypertension.

(DOCX)

S3 Table. Exact p-values of the Hardy-Weinberg equilibrium (HWE) tests among cases and controls of the variants and the allelic test statistics in the Real data analysis.

(DOCX)

Acknowledgments

The authors acknowledge the Texas Advanced Computing Center (https://www.tacc.utexas.edu) at The University of Texas at Austin for providing high performance computing resources that have contributed to the research results reported within this paper.

Data Availability

Real data analyzed in this study can be obtained from the cited articles (Liu et al., 2000; Timmann et al., 2012; Loley et al., 2013).

Funding Statement

This work is supported by the National Institute of Environmental Health Sciences grant R03ES034138 to C.X. and Z.Z. D.F.D. and Z.Z. are also supported in part by the National Institute on Minority Health and Health Disparities grant 5U54MD013376-8281. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.

References

  1. Visscher P.M., et al., 10 years of GWAS discovery: Biology, function, and translation. Am J Hum Genet, 2017. 101(1): p. 5–22.
  2. Cochran W.G., Some Methods for Strengthening the Common χ2 Tests. Biometrics, 1954. 10(4): p. 417–451.
  3. Armitage P., Tests for Linear Trends in Proportions and Frequencies. Biometrics, 1955. 11(3): p. 375–386.
  4. Sasieni P.D., From genotypes to genes: doubling the sample size. Biometrics, 1997: p. 1253–1261.
  5. Clarke G.M., et al., Basic statistical analysis in genetic case-control studies. Nature Protocols, 2011. 6(2): p. 121–133.
  6. Gonzalez J.R., et al., Maximizing association statistics over genetic models. Genet Epidemiol, 2008. 32(3): p. 246–254.
  7. Kuo C.L. and Feingold E., What’s the best statistic for a simple test of genetic association in a case-control study? Genet Epidemiol, 2010. 34(3): p. 246–253. doi: 10.1002/gepi.20455
  8. Li Q., et al., Robust tests for single-marker analysis in case-control genetic association studies. Ann Hum Genet, 2009. 73(2): p. 245–252. doi: 10.1111/j.1469-1809.2009.00506.x
  9. Loley C., et al., A unifying framework for robust association testing, estimation, and genetic model selection using the generalized linear model. Eur J Hum Genet, 2013. 21(12): p. 1442–1448.
  10. Jonckheere A.R., A Distribution-Free k-Sample Test Against Ordered Alternatives. Biometrika, 1954. 41(1/2): p. 133–145.
  11. Terpstra T.J., The asymptotic normality and consistency of Kendall’s test against trend, when ties are present in one ranking. Indagationes Mathematicae, 1952. 14(3): p. 327–333.
  12. Zheng G., Joo J., and Yang Y., Pearson’s test, trend test, and MAX are all trend tests with different types of scores. Annals of Human Genetics, 2009. 73(2): p. 133–140.
  13. The Wellcome Trust Case Control Consortium, Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 2007. 447(7145): p. 661–678.
  14. Timmann C., et al., Genome-wide association study indicates two novel resistance loci for severe malaria. Nature, 2012. 489(7416): p. 443–446.
  15. Li Q., et al., MAX-rank: a simple and robust genome-wide scan for case-control association studies. Human Genetics, 2008. 123(6): p. 617–623.
  16. Leal S.M., Detection of genotyping errors and pseudo-SNPs via deviations from Hardy-Weinberg equilibrium. Genetic Epidemiology, 2005. 29(3): p. 204–214.
  17. Hosking L., et al., Detection of genotyping errors by Hardy–Weinberg equilibrium testing. European Journal of Human Genetics, 2004. 12(5): p. 395–399.
  18. Graffelman J., Exploring diallelic genetic markers: the HardyWeinberg package. Journal of Statistical Software, 2015. 64: p. 1–23.
  19. Zhou Z., et al., Differentiating the Cochran-Armitage trend test and Pearson’s χ2 test: Location and dispersion. Annals of Human Genetics, 2017. 81(5): p. 184–189.
  20. Zhou Z., et al., Decomposing Pearson’s χ2 test: A linear regression and its departure from linearity. Annals of Human Genetics, 2018. 82(5): p. 318–324.

Decision Letter 0

Mehdi Rahimi

1 Mar 2022

PONE-D-21-39215: A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: The Jonckheere-Terpstra trend test. PLOS ONE

Dear Dr. Zhou,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Apr 08 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Mehdi Rahimi, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement:

(This work was supported by National Institute on Minority Health and Health Disparities 5U54MD013376-8281 (ZZ). 

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.)

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement. 

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 2 in your text; if accepted, production will need this reference to link the reader to the Table.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I have reviewed the manuscript entitled: "nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: The Jonckheere-Terpstra trend test", with Manuscript Number: PONE-D-21-39215. This manuscript is an interesting topic that could help researchers to tackle any bias results might be happened in the related analysis. But still need more revisions (major/minor) that I have mentioned viewpoints/ comments as below:

# Abstract:

Abstract is too general, while it would be better to be more specific in case of the definition and importance of the problem, and authors should give evidences in which are supported with quantitative data such as sample size etc.!

#Introduction:

In Line 47: it seems that sentence is incomplete ...is not additive (e.g.,(Gonzalez et al., 2008;…

Generally: This section at this format seems to be non-informative, authors must give more evidences of reported studies, try to highlight the weakness including suffering from the power loss etc., then discuss importance and objectives of the method of interest with more details.

# Simulation:

In Lines 113~120: authors must present more supportive data in case of sample size and MAFs to show advantages/ dis-advantages of both methods in analyzing genome-wide association studies.

# Real Data Analysis:

At this section, authors are strongly advised to use more real data set for comparing the efficiency of the two tests. With few cases, we are not able to observe power/failure of each method.

# Tables:

In table 2: Could authors show whether the calculated statistics at different sample size are significantly different? By the way, the table was not referred in the main text.

With the best regards.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Ali Moumeni

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Reviewer-Points-Comments.docx

PLoS One. 2023 Feb 2;18(2):e0280809. doi: 10.1371/journal.pone.0280809.r002

Author response to Decision Letter 0


19 May 2022

Please see the attached "Plos One response letter.docx" in this resubmission. The content is attached here too:

Point-by-point response to comments from Reviewer

Thank you very much for reviewing our paper and for the detailed and helpful review report. We greatly appreciate your time, effort, encouragement, and insight. We have revised our paper to address all issues raised in your report. The following is our point-by-point response to your comments. For convenience, your original comments are copied and our replies follow in blue. The associated revisions in the manuscript are highlighted with tracked changes.

Comments:

I have reviewed the manuscript entitled "A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: The Jonckheere-Terpstra trend test" (Manuscript Number: PONE-D-21-39215). This manuscript addresses an interesting topic that could help researchers tackle biased results that might arise in the related analyses. However, it still needs more revisions (major/minor); I have listed my viewpoints/comments below:

Thank you very much for your concise summary and encouraging comment. We very much appreciate your time and effort.

# Abstract:

The Abstract is too general; it would be better to be more specific about the definition and importance of the problem, and the authors should give evidence supported by quantitative data, such as sample sizes.

Thank you for the comment. In this revision, we have provided more related materials and details in the Abstract.

#Introduction:

In Line 47, it seems that the sentence is incomplete: "...is not additive (e.g.,(Gonzalez et al., 2008;…"

In this sentence, the period is after the long list of citations, so it may appear that the sentence is incomplete:

However, it can suffer from power loss when the true genetic model is not additive (e.g., Gonzalez et al., 2008; Kuo & Feingold, 2010; Li, Zheng, Liang, & Yu, 2009; Loley, König, Hothorn, & Ziegler, 2013).

When reviewing it, however, we did find an extra “(” after “e.g.,”, and we have removed it in this revision (line 55, page 4).

Generally: this section, in its current format, seems uninformative. The authors must give more evidence from reported studies, highlight the weaknesses (including the power loss), and then discuss the importance and objectives of the method of interest in more detail.

We appreciate this suggestion to add more details in Introduction. In this revision, we expanded Introduction by providing more context information and details for the tests and significance of the research question.

# Simulation:

In Lines 113~120: the authors must present more supporting data on sample sizes and MAFs to show the advantages/disadvantages of both methods in analyzing genome-wide association studies.

In the original simulation, we considered sample sizes of N∈(200, 500, 1000) and MAFs of q∈(0.05, 0.1, 0.2, 0.3). To evaluate the relative performance of the CA and JT trend tests over a wider range of data scenarios, we further considered sample sizes of N=1500 and 2000 as well as q=0.4. Because these sample sizes and this MAF are fairly large, we reduced the effect size under the alternative hypothesis to λ=0.5 to keep the power comparison meaningful (otherwise the power of both tests would be close to 1, making it difficult to evaluate their relative performance).

The empirical power of the two tests for the additional simulation settings is summarized in Table S1 in the Supplementary Materials. The results are consistent with the original conclusion: compared to the CA trend test, the JT trend test is more powerful when the underlying genetic model is dominant. The power advantage of T_JT diminishes as the genetic model moves toward the additive model, where the two tests have approximately equivalent power; T_JT then becomes less powerful than T_CA^Add, and the disadvantage grows as the genetic model moves further toward the recessive end.
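For concreteness, both statistics can be computed directly from a 2 x 3 case-control genotype table. The sketch below is illustrative Python, not the code used in the study: the CA trend test uses the conditional (hypergeometric) variance with additive scores (0, 1, 2), and the JT test uses the standard normal approximation with a tie correction (ties dominate here because the outcome is binary).

```python
import math

def ca_trend_test(controls, cases, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k case-control genotype table.
    controls/cases: genotype counts ordered by minor-allele dose.
    Returns (z, two-sided p) under the conditional hypergeometric variance."""
    N = sum(controls) + sum(cases)
    R = sum(cases)                                    # total cases
    c = [x + y for x, y in zip(controls, cases)]      # genotype column totals
    U = sum(w * r for w, r in zip(weights, cases))    # weighted case count
    m1 = sum(w * ci for w, ci in zip(weights, c)) / N
    m2 = sum(w * w * ci for w, ci in zip(weights, c)) / N
    var = R * (N - R) / (N - 1) * (m2 - m1 * m1)
    z = (U - R * m1) / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))

def jt_trend_test(controls, cases):
    """Jonckheere-Terpstra trend test: groups ordered by genotype dose,
    binary outcome (control=0 < case=1); ties counted as 1/2."""
    k = len(cases)
    n = [controls[i] + cases[i] for i in range(k)]
    N = sum(n)
    J = 0.0
    for i in range(k):
        for j in range(i + 1, k):                     # group pairs i < j
            J += (controls[i] * cases[j]
                  + 0.5 * (controls[i] * controls[j] + cases[i] * cases[j]))
    EJ = (N * N - sum(ni * ni for ni in n)) / 4
    # Tie-corrected variance (Hollander & Wolfe); the tie groups are the
    # pooled control and case totals, since the outcome is binary.
    t = [sum(controls), sum(cases)]
    A = (N * (N - 1) * (2 * N + 5)
         - sum(ni * (ni - 1) * (2 * ni + 5) for ni in n)
         - sum(tu * (tu - 1) * (2 * tu + 5) for tu in t))
    B = (sum(ni * (ni - 1) * (ni - 2) for ni in n)
         * sum(tu * (tu - 1) * (tu - 2) for tu in t))
    C = sum(ni * (ni - 1) for ni in n) * sum(tu * (tu - 1) for tu in t)
    var = A / 72 + B / (36 * N * (N - 1) * (N - 2)) + C / (8 * N * (N - 1))
    z = (J - EJ) / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))
```

With a table whose case fraction rises with minor-allele dose, both z statistics come out positive; with identical case fractions in every genotype group, both are zero.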

# Real Data Analysis:

In this section, the authors are strongly advised to use more real data sets to compare the efficiency of the two tests. With so few cases, we are not able to observe the power/failure of each method.

Thank you for the suggestion. In this revision, we compared the two tests on additional studies of SNPs that were reported to be associated with hypertension. The results are reported in Table S2 in the Supplementary Materials. The conclusion still holds in this real data analysis: T_JT and T_CA^Add had similar power when the genetic model tended to be additive (rs7961152, rs1937506, rs6997709), and T_JT was more powerful than T_CA^Add when the genetic model tended to be dominant (rs2398162).

# Tables:

In Table 2: could the authors show whether the calculated statistics at different sample sizes are significantly different? Also, the table is not referred to in the main text.

Thank you for the comment, and we apologize for not referring to Table 2 in the main text in the original submission. This table presents the verification of the simulation findings using expected theoretical tables for each simulation setting (last paragraph of the Simulation section), and we now reference Table 2 in the appropriate place in the manuscript.

We would like to explain that the test statistics (T_JT and T_CA^Add) in Table 2 were calculated based on "expected theoretical tables". Specifically, in each simulation setting, given the fixed combination of sample size, MAF, and genetic model, the cell probabilities of each genotype for cases and controls in the genotype distribution table (Table 1) can be calculated, and therefore the expected cell values can be obtained by multiplying these probabilities by the sample size. We refer to such a table, consisting of the expected cell values, as an "expected" table, which allows us to evaluate the relative performance of the two tests by comparing their theoretical test statistics (T_JT and T_CA^Add) in each simulation setting. The relative difference between the theoretical test statistics, ΔT = (T_JT − T_CA^Add)/T_CA^Add × 100%, is then reported in Table 2 and used to verify the empirical findings observed in simulation. To fully explain this, we have also provided more details and clarifications in this paragraph in this revision. However, since each simulation setting corresponds to a single, non-random ΔT, we cannot assess the statistical significance of ΔT. Instead, readers may refer to Figure 1 for the actual power comparison from simulation, which can serve as a proxy for the relative statistical significance between the two tests.
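The construction of such an expected table can be sketched as follows. This is a hedged illustration with a generic penetrance parameterization (f0, f1, f2), not the paper's λ parameterization; genotype frequencies are assumed to follow HWE in the source population.

```python
def expected_table(n_controls, n_cases, maf, penetrance):
    """Expected genotype counts for controls and cases.
    penetrance = (f0, f1, f2): P(disease | 0, 1, 2 copies of the minor allele).
    Returns (controls, cases), each a list ordered by minor-allele dose."""
    q = maf
    g = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]           # HWE genotype freqs
    prev = sum(gi * fi for gi, fi in zip(g, penetrance))  # disease prevalence K
    # Bayes: P(genotype|case) = g_i f_i / K; P(genotype|control) = g_i (1 - f_i) / (1 - K)
    cases = [n_cases * gi * fi / prev for gi, fi in zip(g, penetrance)]
    controls = [n_controls * gi * (1 - fi) / (1 - prev)
                for gi, fi in zip(g, penetrance)]
    return controls, cases
```

Plugging such expected counts into the two test statistics yields a single, non-random statistic per setting, which is why no significance can be attached to their relative difference.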

Attachment

Submitted filename: Plos One response letter.docx

Decision Letter 1

Mehdi Rahimi

23 May 2022

PONE-D-21-39215R1
A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: the Jonckheere-Terpstra trend test
PLOS ONE

Dear Dr. Zhou,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 07 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Mehdi Rahimi, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments:

Dear Author

The reviewer(s) have recommended major revisions to your manuscript. Therefore, I invite you to respond to the reviewer(s)' comments and revise your manuscript.

With Thanks


PLoS One. 2023 Feb 2;18(2):e0280809. doi: 10.1371/journal.pone.0280809.r004

Author response to Decision Letter 1


26 Aug 2022


Attachment

Submitted filename: Plos One response letter.docx

Decision Letter 2

Mehdi Rahimi

24 Oct 2022

PONE-D-21-39215R2
A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: the Jonckheere-Terpstra trend test
PLOS ONE

Dear Dr. Zhou,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 08 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Mehdi Rahimi, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: No

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: This article concerns a power comparison of the Cochran-Armitage trend test and the non-parametric Jonckheere-Terpstra trend test, under common genetic models (additive, dominant, recessive), different minor allele frequencies and sample sizes, and for bi-allelic genetic variants. The article is concise and well written, and the conclusions are clear. I have some major and minor concerns detailed below.

Major concerns:

The example in the Real data analysis section on page 9 is not clearly a case of dominance; in fact, a co-dominant model seems to fit best, where the heterozygote has the highest risk. Variant rs20541 is thus a poor example for making the case for an advantage of the Jonckheere-Terpstra test. In fact, the clearest case of dominance is rs2398162 in Table S2. If the authors wish to practically illustrate the advantage of the Jonckheere-Terpstra test, then rs2398162 would be a better choice.

On line 192 the authors state "variant rs10900589 in ATP2B4 was associated with the disease in the Ghanaian samples and the association was replicated in the Gambian samples". This statement is obviously FALSE: Figure 2 shows a very significant association for the Ghanaian sample but a clearly non-significant association for the Gambian sample. Please correct the sentence.

In genetic association studies it is common to test the SNPs involved for Hardy-Weinberg equilibrium. If there are significant deviations from HWE, the results of the association tests may be questioned, because disequilibrium is potentially indicative of the presence of genotyping errors (Hosking et al., 2004; Leal, 2005). The HWE testing can be done with many genetic data analysis software packages, such as PLINK (Purcell et al., 2007) or the R package HardyWeinberg (Graffelman, 2015). It is recommended to test cases and controls separately and to report exact Hardy-Weinberg p-values for each group. Exact testing is the preferred approach, for it has the highest power. I suggest the authors include HW test results in the paper, as this can only improve the credibility of their conclusions.
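For reference, the exact HWE test recommended here conditions on the observed allele counts and sums the probabilities of all heterozygote counts that are no more likely than the observed one; this is the approach behind the exact tests in PLINK and the HardyWeinberg package. A minimal Python sketch of that standard procedure, not tied to either tool:

```python
import math

def hwe_exact_test(n_hom_minor, n_het, n_hom_major):
    """Exact test for Hardy-Weinberg equilibrium at a biallelic marker.
    Sums the probabilities of all heterozygote counts no more likely than
    the observed one, conditional on the observed allele counts."""
    n = n_hom_minor + n_het + n_hom_major
    rare = 2 * n_hom_minor + n_het          # minor allele count

    def log_prob(h):                        # log P(het = h | n, rare)
        hom_r = (rare - h) // 2             # rare homozygotes
        hom_c = n - h - hom_r               # common homozygotes
        return (math.lgamma(n + 1) - math.lgamma(hom_r + 1)
                - math.lgamma(h + 1) - math.lgamma(hom_c + 1)
                + h * math.log(2)
                + math.lgamma(rare + 1) + math.lgamma(2 * n - rare + 1)
                - math.lgamma(2 * n + 1))

    obs = log_prob(n_het)
    p = 0.0
    # heterozygote counts share the parity of the minor allele count
    for h in range(rare % 2, min(rare, 2 * n - rare) + 1, 2):
        lp = log_prob(h)
        if lp <= obs + 1e-12:
            p += math.exp(lp)
    return min(p, 1.0)
```

Running the test separately on the case and control genotype counts of each variant gives the per-group exact p-values the reviewer asks for.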

The authors emphasize the power gain of the Jonckheere-Terpstra test under the dominant model. However, the power loss of this test under the recessive model is generally much larger than the gain under the dominant model (see Figure 1, Table 2). In practice one often does not know the correct genetic model. One could thus say that, a priori and overall, the Cochran-Armitage test may be the best choice, at least if the MAF is not low. This conclusion should be added to the evaluation of the tests in the Discussion section.

In genetic association studies with SNPs, it is also common to test for association not only at the genotype level but also at the level of alleles. A standard test for this purpose is the so-called alleles test (Laird & Lange, 2011). For all 9 empirical SNPs (Figure 2, Table S2), the alleles test leads to the same conclusion as the Jonckheere-Terpstra test. This could at least be stated, as it strengthens the conclusions of the "Real data analysis" section.
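For completeness, the alleles test referred to here reduces to a 1-df chi-square on the 2 x 2 table of minor/major allele counts in cases versus controls. A minimal sketch (illustrative only, not the analysis code used for the 9 SNPs):

```python
import math

def alleles_test(controls, cases):
    """Allelic association test: 1-df chi-square comparing minor/major
    allele counts between cases and controls.
    controls/cases: genotype counts ordered by minor-allele dose (0, 1, 2)."""
    def allele_counts(geno):
        # (minor, major) allele counts: each heterozygote carries one of each
        return geno[1] + 2 * geno[2], 2 * geno[0] + geno[1]
    a1, a0 = allele_counts(cases)
    b1, b0 = allele_counts(controls)
    n = a1 + a0 + b1 + b0
    # chi-square shortcut for the 2x2 table [[a1, a0], [b1, b0]]
    chi2 = n * (a1 * b0 - a0 * b1) ** 2 / (
        (a1 + a0) * (b1 + b0) * (a1 + b1) * (a0 + b0))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p
```

Note the allelic test assumes HWE in the combined sample; under deviation from HWE its type I error can be inflated, which is one reason genotype-level trend tests are often preferred.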

Minor issues:

L59: “recource” -- “resource”

L138: “set of simulation” -- “set of simulations”

L161: “We refer this table consisted of” -- “This table consists of”

L162: delete “as an expected table”.

L167, L202: “were reported” -- “are reported”

L262: capitalize “Kendall”

References:

Graffelman, J. (2015). Exploring diallelic genetic markers: the HardyWeinberg package. Journal of Statistical Software, 64(3), 1–23.

Hosking, L., Lumsden, S., Lewis, K., Yeo, A., McCarthy, L., Bansal, A., Riley, J., Purvis, I., & Xu, C. (2004). Detection of genotyping errors by Hardy-Weinberg equilibrium testing. European Journal of Human Genetics, 12, 395–399.

Laird, N. M., & Lange, C. (2011). The fundamentals of modern statistical genetics. Springer.

Leal, S. M. (2005). Detection of genotyping errors and pseudo-SNPs via deviations from Hardy-Weinberg equilibrium. Genetic Epidemiology, 29, 204–214.

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A. R., Bender, D., Maller, J., Sklar, P., de Bakker, P. I. W., Daly, M. J., & Sham, P. C. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81(3), 559–575.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

**********


PLoS One. 2023 Feb 2;18(2):e0280809. doi: 10.1371/journal.pone.0280809.r006

Author response to Decision Letter 2


16 Dec 2022

The Word document of the response letter is also included in this submission.

Reviewer #2: This article concerns a power comparison of the Cochran-Armitage trend test and the non-parametric Jonckheere-Terpstra trend test, under common genetic models (additive, dominant, recessive), different minor allele frequencies and sample sizes, and for bi-allelic genetic variants. The article is concise and well written, and the conclusions are clear. I have some major and minor concerns detailed below.

Thank you very much for your concise summary and encouraging comment. We very much appreciate your time and effort.

Major concerns:

The example in the Real data analysis section on page 9 is not clearly a case of dominance; in fact, a co-dominant model seems to fit best, where the heterozygote has the highest risk. Variant rs20541 is thus a poor example for making the case for an advantage of the Jonckheere-Terpstra test. In fact, the clearest case of dominance is rs2398162 in Table S2. If the authors wish to practically illustrate the advantage of the Jonckheere-Terpstra test, then rs2398162 would be a better choice.

Thank you for this suggestion. In this revision we updated this section by using the variant rs2398162 from the hypertension study in the Real data analysis as the example of a dominant model. The relevant text and Table S2 were also updated to reflect this change.

On line 192 the authors state "variant rs10900589 in ATP2B4 was associated with the disease in the Ghanaian samples and the association was replicated in the Gambian samples". This statement is obviously FALSE: Figure 2 shows a very significant association for the Ghanaian sample but a clearly non-significant association for the Gambian sample. Please correct the sentence.

Thank you for this observation, and we apologize that the previous statement was inaccurate. Loley et al. (2013, EJHG 21:1442-1448) showed that this signal was significant in the Gambian group only when coded under a recessive model (Table 1). In this revision we corrected the sentence to state: "This association was also evaluated in the Gambian samples and it was significant under a recessive model but insignificant under dominant and additive models."

In genetic association studies it is common to test the SNPs involved for Hardy-Weinberg equilibrium. If there are significant deviations from HWE, the results of the association tests may be questioned, because disequilibrium is potentially indicative of the presence of genotyping errors (Hosking et al, 2004; Leal, 2005). The HWE testing can be done with many genetic data analysis software packages such as PLINK (Purcell et al., 2007) or the R package HardyWeinberg (Graffelman, 2015). It is recommended to test cases and controls separately, and report exact Hardy-Weinberg p-values for each group. Exact testing is the preferred approach, for it has the highest power. I suggest that the authors include HW test results in the paper, for this can only improve the credibility of their conclusions.

Thank you for this suggestion. In the revision we conducted the exact test for HWE among the cases and controls for each of the variants using the suggested R package (HardyWeinberg). The p-values are summarized in S3 Table, and we concluded that “Results showed that the p-values of the HWE tests for all the variants were larger than 0.01, with only two between 0.01 and 0.05, suggesting that there was little evidence of genotyping error among the variants”.
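The exact HWE test recommended here can be sketched with stdlib Python. This is an illustration of the standard conditional distribution of heterozygote counts (Wigginton et al., 2005), not the HardyWeinberg package's code; the function name is invented here.

```python
from math import comb

def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test: condition on the allele counts and sum
    the probabilities of all heterozygote counts that are no more likely
    than the observed one."""
    n = n_aa + n_ab + n_bb          # number of individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A

    # P(het heterozygotes | n, n_a) under HWE:
    # multinomial genotype count * 2^het, divided by C(2n, n_a)
    def prob(het):
        hom_a = (n_a - het) // 2
        return (comb(n, het) * comb(n - het, hom_a) * 2 ** het) / comb(2 * n, n_a)

    # valid heterozygote counts share the parity of n_a
    hets = range(n_a % 2, min(n_a, 2 * n - n_a) + 1, 2)
    p_obs = prob(n_ab)
    return min(1.0, sum(prob(h) for h in hets if prob(h) <= p_obs * (1 + 1e-12)))
```

When the observed heterozygote count is the modal one, the p-value is 1 by construction; extreme deficits or excesses of heterozygotes give small p-values.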

The authors emphasize the power gain of the Jonckheere-Terpstra test under the dominant model. However, the power loss of this test under the recessive model is generally much larger than the gain under the dominant model (see Figure 1, Table 2). In practice one often does not know the correct genetic model. One could thus say that, a priori and overall, the Cochran-Armitage test may be the best choice, at least if the MAF is not low. This conclusion should be added to the evaluation of the tests in the Discussion section.

Thank you for raising this excellent point. In this revision, we added guidance on choosing between the JT and CA trend tests when the true genetic model is unknown at the end of the Discussion.
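For reference, the Cochran-Armitage trend statistic compared above can be sketched as follows. This is a minimal stdlib illustration with the usual additive scores w = (0, 1, 2), not the authors' implementation, and it assumes the normal approximation is adequate for the counts involved.

```python
from math import erfc, sqrt

def ca_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.
    cases[i]/controls[i] are counts for genotype i (0, 1, 2 minor
    alleles); returns (Z, two-sided p)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [c + d for c, d in zip(cases, controls)]
    # T = sum_i w_i * (S * r_i - R * s_i)
    T = sum(w * (S * r - R * s) for w, r, s in zip(weights, cases, controls))
    # Var(T) = (R*S/N) * [sum_i w_i^2 n_i (N - n_i)
    #                     - 2 * sum_{i<j} w_i w_j n_i n_j]
    var = (R * S / N) * (
        sum(w * w * ni * (N - ni) for w, ni in zip(weights, n))
        - 2 * sum(weights[i] * weights[j] * n[i] * n[j]
                  for i in range(len(n)) for j in range(i + 1, len(n)))
    )
    z = T / sqrt(var)
    p = erfc(abs(z) / sqrt(2))   # two-sided p-value from N(0, 1)
    return z, p
```

Changing the scores to (0, 1, 1) or (0, 0, 1) gives the dominant and recessive versions, which is precisely where the model-misspecification issue discussed above arises.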

In genetic association studies with SNPs, it is also common to test for association not only at the genotype level, but also at the level of alleles. A standard test for this purpose is the so-called alleles test (Laird & Lange, 2011). For all 9 empirical SNPs (Figure 2, Table S2) the alleles test leads to the same conclusion as the Jonckheere-Terpstra test. This could at least be stated, as it strengthens the conclusions of the “Real data analysis” section.

Thank you for this suggestion. We conducted the allelic test for the variants included in the Real data analysis section and confirmed that its results were consistent with those of the discussed methods. The results of the allelic test were included in S3 Table.
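The alleles test reduces to a 1-df chi-square test on the 2 x 2 table of allele counts. A minimal stdlib sketch (illustrative; the function name is invented here and the authors' actual computation may differ):

```python
from math import erfc, sqrt

def allelic_test(cases, controls):
    """Allelic association test: collapse a 2 x 3 genotype table into a
    2 x 2 allele-count table and apply a 1-df chi-square test.
    cases/controls are counts for genotypes with 0, 1, 2 minor alleles."""
    # each heterozygote contributes one copy of each allele
    a_case = 2 * cases[2] + cases[1]        # minor alleles in cases
    b_case = 2 * cases[0] + cases[1]        # major alleles in cases
    a_ctrl = 2 * controls[2] + controls[1]
    b_ctrl = 2 * controls[0] + controls[1]
    table = [[a_case, b_case], [a_ctrl, b_ctrl]]
    total = a_case + b_case + a_ctrl + b_ctrl
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = (sum(table[i]) * sum(row[j] for row in table)) / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    p = erfc(sqrt(chi2 / 2))   # survival function of chi-square, 1 df
    return chi2, p
```

Note that the allelic test treats the 2N alleles as independent, so it implicitly assumes HWE; this is one reason the HWE checks discussed above matter for its validity.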

Minor issues:

L59: “recource” -- “resource”

L138: “set of simulation” -- “set of simulations”

L161: “We refer this table consisted of” -- “This table consists of”

L162: delete “as an expected table”.

L167, L202: “were reported” -- “are reported”

L262: capitalize “Kendall”

Thank you for these observations. We have made corresponding changes in this revision.

Attachment

Submitted filename: Response letter.docx

Decision Letter 3

Mehdi Rahimi

10 Jan 2023

A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: the Jonckheere-Terpstra trend test

PONE-D-21-39215R3

Dear Dr. Zhou,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Mehdi Rahimi, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: This article concerns a power comparison of the Cochran-Armitage trend test and the non-parametric Jonckheere-Terpstra trend test, under common genetic models (additive, dominant, recessive), different minor allele frequencies and sample sizes, and for bi-allelic genetic variants. The article has improved after the previous round of review. The authors have addressed most of my concerns satisfactorily. I have only some minor points for improvement left which are detailed below.

L75: “as Table” --- “as in Table”

L82: The null hypothesis is supposed to apply to all i (0, 1 or 2); that should be stated.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

**********

Acceptance letter

Mehdi Rahimi

24 Jan 2023

PONE-D-21-39215R3

A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: the Jonckheere-Terpstra trend test

Dear Dr. Zhou:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Associate Prof. Mehdi Rahimi

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Power comparison between T_JT and T_CA-Add for the additional simulation settings (q = 0.4).

    (DOCX)

    S2 Table. Comparison between the Jonckheere-Terpstra trend test (T_JT) and the Cochran-Armitage trend test (T_CA-Add) on SNPs that were reported to be associated with hypertension.

    (DOCX)

    S3 Table. Exact p-values of the Hardy-Weinberg equilibrium (HWE) tests among cases and controls of the variants and the allelic test statistics in the Real data analysis.

    (DOCX)

    Attachment

    Submitted filename: Reviewer-Points-Comments.docx

    Attachment

    Submitted filename: Plos One response letter.docx

    Attachment

    Submitted filename: Plos One response letter.docx

    Attachment

    Submitted filename: Response letter.docx

    Data Availability Statement

    Real data analyzed in this study can be obtained from the cited articles (Liu et al., 2000; Timmann et al., 2012; Loley et al., 2013).


    Articles from PLOS ONE are provided here courtesy of PLOS
