PLOS ONE. 2021 Sep 15;16(9):e0257472. doi: 10.1371/journal.pone.0257472

Applications of statistical experimental designs to improve statistical inference in weed management

Steven B Kim 1, Dong Sub Kim 2,3,*, Christina Magana-Ramirez 1
Editor: Ahmet Uludag
PMCID: PMC8443065  PMID: 34525126

Abstract

In a balanced design, researchers allocate the same number of experimental units to each treatment group. Balanced allocation has been treated as a rule of thumb by some researchers in agriculture, yet an unbalanced design sometimes outperforms a balanced one. Given a specific parameter of interest, researchers can design an experiment that distributes experimental units unevenly to increase the statistical information about that parameter. A further improvement is an adaptive design (e.g., spending the total sample size in multiple steps). Designing an experiment well requires some knowledge about the parameter of interest, so in the initial phase a researcher may spend a portion of the total sample size to learn about the parameter; in the later phase, the remaining portion can be allocated to gain more information about it. Although such ideas have long existed in the statistical literature, they have not been applied broadly in agricultural studies. In this article, we use simulations to demonstrate the advantage of these experimental designs over balanced designs in three practical situations: comparing two groups, studying a dose-response relationship with right-censored data, and studying a synergistic effect of two treatments. The simulations show that an objective-specific design yields smaller error in parameter estimation and higher statistical power in hypothesis testing than a balanced design. We also conducted an adaptive dose-response experiment with right-censored data to quantify the effect of ethanol on weed control, and retrospective simulations supported the benefit of this adaptive design as well. Researchers face different practical situations, and appropriate experimental designs help them utilize available resources efficiently.

1. Introduction

Successful weed management is key to improving crop productivity and quality. Researchers have used various response variables in weed control studies, such as the viability of weed seeds [1, 2], the germination of weed seeds [3, 4], weed emergence [5, 6], weed density per unit area by hand count [7, 8], and the proportion of area covered by green colors [9, 10]. To identify the effect of treatments for weed control, the analysis of variance (ANOVA) and similar statistical methods have been used. Traditionally, if an ANOVA test rejects the null hypothesis (that all group means are equal), Tukey's test (also known as Tukey's honestly significant difference test) is often used to determine which groups differ significantly [11, 12]. Duncan's multiple range test and Fisher's least significant difference test appear to be common alternatives among researchers [13–19]. These statistical tests assume that data are observed from normal distributions (normality assumption) with equal variance (homogeneity assumption). The violation of normality is a minor concern in large-sample studies, but a large sample size cannot resolve the issue of unequal variances. The Duncan test does not control the family-wise rate of Type I error [20]: the probability of falsely claiming a difference between any two treatments increases as the number of treatment groups increases, so the test is not recommended when many treatments are compared. To guard against the inflated rate of Type I error in the comparison of multiple treatments, Tukey's test or a correction method for multiple testing should be considered.

Balanced designs are commonly used in agricultural experiments [21–23]. A balanced design mitigates the consequences of violating the homogeneity assumption [24]. In addition, if the homogeneity assumption is true, a balanced design maximizes the statistical power of hypothesis tests comparing groups. In a two-sample t-test under the homogeneity assumption, it can be shown by calculus that the standard error, $\sqrt{\sigma^2(1/n_1+1/n_2)}$, is minimized when $n_1=n_2$ for a fixed total sample size $n_1+n_2$. A balanced design may end up unbalanced for unexpected reasons (e.g., incorrect implementation of a treatment, invasion of pests, or missing samples). Losing a few data points during an experiment is unfortunate, but it is often not a serious problem unless the original sample size was extremely small. The Tukey-Kramer method adjusts the calculation of the standard error to account for unequal sample sizes [25, 26].

In some cases, the homogeneity assumption is not plausible because treatments have different expected outcomes. Count data tend to vary more when the mean is higher, and the standard error for comparing two group means, $\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}$, is minimized when $n_2=n_1\sigma_2/\sigma_1$. In this case, however, researchers may be uncomfortable guessing $\sigma_1^2$ and $\sigma_2^2$ before collecting data. Furthermore, while it is not always possible to increase the sample size, it is possible to control how a fixed maximum sample size is allocated. If researchers believe that $\sigma_1^2$ and $\sigma_2^2$ differ substantially, an adaptive design may be a practical suggestion. For example, a researcher may start with a balanced design that spends a portion of the available (fixed) sample size, and then spend the remaining portion to minimize the standard error based on the estimated $\sigma_1^2$ and $\sigma_2^2$. An adaptive design is not limited to two phases; it may consist of two or more phases to improve precision in parameter estimation. Although a large number of phases may improve precision from a statistical perspective, such a long adaptive design may not be feasible in practice. Throughout the article, an adaptive design refers to an experiment designed in two phases before drawing a final statistical inference.

Ronald Fisher realized that randomization is needed to satisfy the assumption of independent errors, and he introduced the principles of randomization in his book, Statistical Methods for Research Workers [27, 28]. Randomization is an important component of experimental design for reducing bias in parameter estimation. It is a misconception that a small sample size leads to bias; rather, a small sample size inflates the variance, the other important component of estimation error. A lower variance can be attained by increasing the sample size, but researchers often have limited resources, time, and labor capacity. Given a fixed sample size, we can still lower the variance by choosing an unbalanced design carefully. For example, suppose a researcher assumes that two numeric variables X (e.g., treatment level) and Y (e.g., response) are linearly related as Y = β0 + β1 X + ϵ. Further assume that the researcher has four numeric levels of treatment x1 < x2 < x3 < x4 and can afford a total sample size of n1 + n2 + n3 + n4 = 20, where ni is the number of experimental units assigned to xi for i = 1, 2, 3, 4. If the parameter of interest is β1, which quantifies the linear relationship between X and Y, the unbalanced design (n1, n2, n3, n4) = (10, 0, 0, 10) yields a smaller variance for the estimate of β1 than the balanced design (n1, n2, n3, n4) = (5, 5, 5, 5). As such, when a researcher has a target parameter to be estimated or tested, the researcher may seek an unbalanced design that minimizes the variance in parameter estimation. However, an optimal design for β1 (optimal from the theoretical perspective) may not be recommended in practice because it is optimal only under the strong assumption of linearity. Even if the linearity assumption is plausible as an approximation, weed scientists may be (or should be) interested in an adequate strength of a treatment from a variety of perspectives, such as the effectiveness of weed control, the impact on the environment, and the cost. In this regard, finding an adequate concentration (or any quantification of treatment strength) is an important research objective. In a later section of this article, we demonstrate a statistical model and an experimental design for finding such a parameter in terms of delaying weed emergence.

The primary focus of this article is to demonstrate that there are many practical situations in agricultural studies in which unbalanced designs are better than balanced designs. Here, "better" means a smaller mean square error (which accounts for both bias and variance) in parameter estimation and a greater statistical power in hypothesis testing (while respecting a fixed significance level α). In particular, we demonstrate three practical situations where the parameter of interest is the difference between two group means (Section 2), the effective concentration of an active treatment at which the median time to weed emergence is doubled relative to the control (Sections 3 and 4), or the synergistic effect of two treatments (Section 5).

In addition, by spending an available total sample size in two phases (referred to as an adaptive design in this article), a researcher may improve the precision of parameter estimation by correcting an assumption made prior to the experiment. For instance, a Bayesian optimal design or a locally optimal design requires the researcher's guess about model parameters prior to collecting data [29–31]. In practice, it is challenging to specify an informative prior for model parameters, and an informative prior may deviate severely from the truth. In such a case, a researcher may regret committing to a design all at once with scarce knowledge. To address this caveat of designing an experiment entirely before collecting data, a researcher may use a non-informative prior to allocate a portion of the total sample size and then design the experiment for the remaining portion using the updated posterior. In this Bayesian approach, the key idea is to design each phase of the experiment by utilizing all available knowledge (Sections 4 and 5). The benefit of adaptive designs is demonstrated via simulations in later sections. Many statistical methods have been developed to make adaptive decisions in clinical trials [32–35]. The idea of an adaptive design is not new in scientific communities, and in this article it is discussed in the context of agricultural studies.

2. Comparing two groups

2.1. Assumptions

We assume that there are two treatments to be compared and that the response variable is generated from normal distributions: $N(\mu_1,\sigma_1^2)$ for the first treatment group (group 1) and $N(\mu_2,\sigma_2^2)$ for the second treatment group (group 2). The null hypothesis is $H_0: \mu_1-\mu_2=0$, and the alternative hypothesis is $H_1: \mu_1-\mu_2\neq 0$. Let $n_1$ and $n_2$ be the sample sizes for group 1 and group 2, respectively. Consider the two-sample t-test without the equal variance assumption, where the degrees of freedom for the T statistic are estimated by the Welch-Satterthwaite equation.

Suppose Researcher 1 performs the t-test using the balanced design with a total sample size of $n_1+n_2=40$, so $n_1=n_2=20$. Among the possible choices $(n_1,n_2)=(2,38),(3,37),\dots,(37,3),(38,2)$, the balanced design minimizes the standard error (SE), $\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}$, when $\sigma_1=\sigma_2$. In general, if the researcher knew the true values of $\sigma_1$ and $\sigma_2$, the optimal choice for minimizing the SE would be $n_1=40\sigma_1/(\sigma_1+\sigma_2)$ and $n_2=40-n_1$, rounded to integers.

Suppose Researcher 2 chooses an adaptive design (given the total sample size of 40) as follows. Let $n_1^{(1)}$ and $n_2^{(1)}$ be the first-phase sample sizes for group 1 and group 2, respectively, and let $n_1^{(2)}$ and $n_2^{(2)}$ be the respective second-phase sample sizes. The researcher initiates a balanced design with $n_1^{(1)}=n_2^{(1)}=10$, estimates $\sigma_1^2$ and $\sigma_2^2$ by the sample variances $S_1^2$ and $S_2^2$, respectively, and then chooses $n_1^{(2)}$ and $n_2^{(2)}=20-n_1^{(2)}$ to minimize the estimated SE, $\sqrt{S_1^2/(10+n_1^{(2)})+S_2^2/(10+n_2^{(2)})}$, over $n_1^{(2)}=0,1,\dots,20$. In other words, the researcher spends one half of the total sample size of 40 to learn about $\sigma_1^2$ and $\sigma_2^2$, then spends the remaining half to reduce the SE in the two-sample t-test. The allocation rule is sketched in code below.
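This rule is simple enough to express in a few lines of R; the following is a minimal sketch (function and argument names are ours, not taken from the authors' code):

```r
# Choose the second-phase allocation minimizing the estimated standard error,
# given first-phase sample variances s1sq and s2sq (10 units per group observed).
choose_phase2 <- function(s1sq, s2sq, n_first = 10, n_second = 20) {
  n1p <- 0:n_second                      # candidate second-phase sizes for group 1
  n2p <- n_second - n1p
  se_hat <- sqrt(s1sq / (n_first + n1p) + s2sq / (n_first + n2p))
  c(n1 = n1p[which.min(se_hat)], n2 = n2p[which.min(se_hat)])
}

choose_phase2(s1sq = 100, s2sq = 2500)   # e.g., S1 = 10, S2 = 50: most units go to group 2
```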

2.2. Simulations

To compare statistical power between the design of Researcher 1 (balanced design) and the design of Researcher 2 (adaptive design), simulation scenarios were set at μ1 = 10, σ1 = 10, μ2 = 10, 20, …, 100, and σ2 = 10, 20, 50, 100, with significance level α = 0.05. Each scenario was simulated 10,000 times. The simulation process for the balanced design is as follows:

  1. Fix the values of μ1, μ2, σ1, and σ2.

  2. Generate a random sample of size $n_1=20$ from $N(\mu_1,\sigma_1^2)$.

  3. Generate a random sample of size $n_2=20$ from $N(\mu_2,\sigma_2^2)$.

  4. Perform the two-sample t-test and calculate the p-value.

  5. Repeat Steps 2 to 4 10,000 times.

  6. Calculate the proportion of times when p-value <0.05 to estimate the probability of concluding H1 (power at the significance level α = 0.05).

The simulation process for the adaptive design is as follows:

  1. Fix the values of μ1, μ2, σ1, and σ2.

  2. Generate a random sample of size $n_1^{(1)}=10$ from $N(\mu_1,\sigma_1^2)$, and estimate $\sigma_1^2$ by the sample variance $S_1^2$.

  3. Generate a random sample of size $n_2^{(1)}=10$ from $N(\mu_2,\sigma_2^2)$, and estimate $\sigma_2^2$ by the sample variance $S_2^2$.

  4. Let $n_2^{(2)}=20-n_1^{(2)}$, evaluate $\widehat{SE}=\sqrt{S_1^2/(10+n_1^{(2)})+S_2^2/(10+n_2^{(2)})}$ for $n_1^{(2)}=0,1,\dots,20$, and choose the value of $n_1^{(2)}$ that minimizes $\widehat{SE}$.

  5. Given the chosen value of $n_1^{(2)}$, generate a random sample of size $n_1^{(2)}$ from $N(\mu_1,\sigma_1^2)$.

  6. Given $n_2^{(2)}=20-n_1^{(2)}$, generate a random sample of size $n_2^{(2)}$ from $N(\mu_2,\sigma_2^2)$.

  7. Combine the random samples in Steps 2, 3, 5, and 6 to perform the two-sample t-test and calculate the p-value.

  8. Repeat Steps 2 to 7 10,000 times.

  9. Calculate the proportion of times when p-value <0.05 to estimate the probability of concluding H1 (power at α = 0.05).

All computational work was performed in R Version 4.0.2 [36], and all code was written by the authors using built-in functions such as t.test; a condensed sketch of the adaptive-design simulation is shown below. The simulation results are shown graphically in Fig 1. There is no meaningful difference between the two designs when σ1 = σ2 (upper left panel of Fig 1), and the adaptive design provides greater statistical power than the balanced design as σ2 deviates further from σ1 (lower right panel of Fig 1).
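For reference, a condensed R sketch of the adaptive-design power simulation (our own reconstruction of the steps above, reusing the choose_phase2 helper sketched in Section 2.1):

```r
set.seed(1)
power_adaptive <- function(mu1, mu2, sigma1, sigma2, reps = 10000) {
  pvals <- replicate(reps, {
    y1 <- rnorm(10, mu1, sigma1)                # phase 1, group 1
    y2 <- rnorm(10, mu2, sigma2)                # phase 1, group 2
    n2 <- choose_phase2(var(y1), var(y2))       # allocate the remaining 20 units
    y1 <- c(y1, rnorm(n2["n1"], mu1, sigma1))   # phase 2, group 1
    y2 <- c(y2, rnorm(n2["n2"], mu2, sigma2))   # phase 2, group 2
    t.test(y1, y2)$p.value                      # Welch t-test (unequal variances)
  })
  mean(pvals < 0.05)                            # estimated power at alpha = 0.05
}

power_adaptive(mu1 = 10, mu2 = 40, sigma1 = 10, sigma2 = 50)
```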

Fig 1. Power analysis.


This figure compares statistical power between the balanced design and the adaptive design with respect to μ2μ1 at σ1 = 10 and σ2 = 10, 20, 50, 100.

Departures from the normality assumption were examined under the following distributions: t(10), t(5), Beta(5, 10), Beta(10, 5), and χ²(3). To match the means (μ1 and μ2) and standard deviations (σ1 and σ2) of each scenario, these distributions were standardized, scaled by the standard deviations, and then shifted by the means. The overall pattern, the superiority of the adaptive design over the balanced design, was similar to Fig 1. The adaptive design showed a higher Type I error rate (about 0.08) than the balanced design when data were generated from the chi-square distribution with σ1 = σ2 = 10, and the balanced design showed a higher Type I error rate (about 0.07) when data were generated from the chi-square distribution with σ1 = 10 and σ2 = 100.

2.3. Note

An adaptive design is more applicable in pilot studies that require relatively little time. A large branch of agricultural science relies on field experiments with annual crops, so there may be practical challenges to using an adaptive design. On the other hand, conclusions of scientific studies are more convincing when data show a consistent pattern across two seasons, and some journals, reviewers, and researchers prefer to see consistent results from a repeated experiment. In addition, count data (e.g., weed counts) often violate the normality and homogeneity assumptions. A large sample size often mitigates the violation of normality but not the violation of homogeneity. If a large sample size is available, there is no reason to make the homogeneity assumption in the two-sample t-test. In this case, an adaptive design (when feasible) gives experimenters an opportunity to increase statistical power by seeking an optimal distribution of experimental units between groups; the unequal variances can be estimated after the first phase of an experiment in order to plan the second phase.

There may be other kinds of weed control studies and related studies. In particular, count data naturally involve non-normality and heterogeneity (i.e., the data do not follow normal distributions with equal variance), and generalized linear models can properly account for the uncertainty associated with count data [37–39]; a brief illustration is given below.
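As a minimal illustration of such a model, a quasi-Poisson fit in R for over-dispersed weed counts (the data frame below is invented for demonstration):

```r
# Quasi-Poisson regression for over-dispersed weed counts (hypothetical data).
weeds <- data.frame(
  count = c(12, 18, 9, 30, 4, 7, 2, 11),                   # weed counts per plot
  trt   = factor(rep(c("control", "treated"), each = 4))
)
fit <- glm(count ~ trt, family = quasipoisson, data = weeds)
summary(fit)   # dispersion is estimated from the data rather than fixed at 1
```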

3. Comparing time to weed emergence

Traditionally, treatments for weed control have been compared using the average count of weeds per given area [2, 7, 8], the average biomass of weeds per given area [2, 7], and the proportion of area covered by weed colors [9, 10], with the response variables recorded at an arbitrary time point. In such a cross-sectional assessment, the quantification of effect size may heavily depend on the time of assessment. For example, as shown in Fig 2, the effect of an active treatment may be similar to the control at the beginning of an experiment, the relative effect size may become large for a period of time, and then the relative effect size eventually becomes the same as the control because weeds may eventually emerge even where pesticides and fumigants have been applied.

Fig 2. Treatment effect with respect to time since treatment application.


This hypothetical scenario compares the weed density (proportion of area covered by weeds) between the treatment group and the control group with respect to time since treatment application. The quantification of treatment effect (relative to control) may be highly sensitive to the time of data collection.

From farmers’ perspective, the primary interest would be how long a treatment delays the weed emergence relative to control. In addition, if a treatment is known to be effective, the question of interest would be how strong (concentration or frequency of an active treatment) the treatment should be in order to balance among cost, effect, and other practical considerations. In this section, we discuss an experimental design to estimate a parameter which quantifies the treatment effect in terms of the time to weed emergence.

3.1. Model assumptions

Let T be the waiting time (in days) to weed emergence, and let x be the ethanol concentration (fixed by the researcher), where x = 0 denotes a concentration of 0% (control) and x = 1 denotes a concentration of 100%. We assume $\ln(T)\sim N(\mu_x,\sigma^2)$, where $\mu_x=\beta_0+\beta_1 x+\beta_2 x^2$ with parameter space $-\infty<\beta_0<\infty$, $\beta_1>0$, $\beta_2>0$, and $\sigma>0$. The two inequalities $\beta_1>0$ and $\beta_2>0$ imply that $\mu_x$ increases in x for 0 < x < 1, and these assumptions simplify some mathematical subtleties. Let Δ be the concentration such that $\mu_\Delta=\mu_0+\ln(2)$, as illustrated in Fig 3.

Fig 3. The relation between Δ and μΔ.


The y-axis represents the logarithmic time to weed emergence, and the x-axis represents the ethanol concentration. The value of Δ corresponds to the concentration such that the expected time to weed emergence increases by ln(2). In other words, the median time doubles at the concentration of Δ when compared to the zero (control) concentration.

The choice of log-normal distribution allows the following interpretation. Under the model assumption, the median of ln(T), denoted by M[ln(T)], and the expectation of ln(T), denoted by E[ln(T)], are equal. Therefore,

$$M[\ln(T)\mid x=\Delta]-M[\ln(T)\mid x=0]=E[\ln(T)\mid x=\Delta]-E[\ln(T)\mid x=0]=\mu_0+\ln(2)-\mu_0=\ln(2). \tag{1}$$

Further note that

$$M[\ln(T)\mid x=\Delta]-M[\ln(T)\mid x=0]=\ln[M(T\mid x=\Delta)]-\ln[M(T\mid x=0)]=\ln\!\left[\frac{M(T\mid x=\Delta)}{M(T\mid x=0)}\right]. \tag{2}$$

From Eqs (1) and (2), we obtain

$$\frac{M(T\mid x=\Delta)}{M(T\mid x=0)}=2.$$

In the subsequent applied example (Section 4), the primary parameter of interest is Δ, the concentration that corresponds to the doubled median waiting time when compared to the control. Note that

$$\mu_\Delta-\mu_0=\beta_1\Delta+\beta_2\Delta^2=\ln(2)$$

is equivalent to the quadratic equation

$$\beta_2\Delta^2+\beta_1\Delta-\ln(2)=0.$$

Under the model assumptions, by the quadratic formula, the parameter Δ has the closed-form expression

$$\Delta=\frac{-\beta_1+\sqrt{\beta_1^2+4\ln(2)\,\beta_2}}{2\beta_2}. \tag{3}$$

Note that the choice of the constant 2 (doubled median waiting time) is arbitrary. For any constant k > 1 (an increase in the median waiting time by a factor of k), ln(2) in Eq (3) can be replaced by ln(k). A small helper implementing Eq (3) is shown below.
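A one-line R implementation of Eq (3) (our own helper; the k argument generalizes the doubling constant):

```r
# Delta from Eq (3): the concentration at which the median waiting time
# increases k-fold relative to control (k = 2 doubles the median).
delta_from_beta <- function(beta1, beta2, k = 2) {
  (-beta1 + sqrt(beta1^2 + 4 * log(k) * beta2)) / (2 * beta2)
}

delta_from_beta(beta1 = 0.5, beta2 = 2.0)   # Scenario 1 of Table 1: about 0.477
```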

3.2. Experimental design

Let xi be a fixed concentration, and let mi be the number of units allocated to xi for i = 1, …, k. Suppose n is the total sample size available to the experimenter, who designs the experiment by choosing (m1, …, mk) such that m1 + ⋯ + mk = n. Let Tij be the waiting time observed at concentration xi for j = 1, …, mi.

Assume $\ln(T_{ij})\sim N(\mu_{x_i},\sigma^2)$, where $\mu_{x_i}=\beta_0+\beta_1 x_i+\beta_2 x_i^2$ is the assumed quadratic regression, and let $\phi_{ij}$ be the normal probability density function of $\ln(T_{ij})$. Given the model parameters $\theta=(\beta_0,\beta_1,\beta_2,\sigma^2)$ and under the independence assumption, the likelihood function is $L(\theta)=\prod_{i=1}^{k}\prod_{j=1}^{m_i}\phi_{ij}$. Given the likelihood function, the Fisher information is defined as

$$I(\theta)=-E\!\left(\frac{\partial^2\ln L(\theta)}{\partial\theta\,\partial\theta^T}\right),$$

and $V(\theta)=h^T(\theta)[I(\theta)]^{-1}h(\theta)$ approximates the variance of the maximum likelihood estimator of Δ in Eq (3), where

$$h^T(\theta)=\left(\frac{\partial\Delta}{\partial\beta_0},\ \frac{\partial\Delta}{\partial\beta_1},\ \frac{\partial\Delta}{\partial\beta_2},\ \frac{\partial\Delta}{\partial\sigma^2}\right),\qquad \frac{\partial\Delta}{\partial\beta_0}=0,\qquad \frac{\partial\Delta}{\partial\beta_1}=-\frac{\Delta}{\sqrt{\beta_1^2+4\ln(2)\,\beta_2}},\qquad \frac{\partial\Delta}{\partial\beta_2}=-\frac{1}{\beta_2}\left(\Delta-\frac{\ln(2)}{\sqrt{\beta_1^2+4\ln(2)\,\beta_2}}\right),\qquad \frac{\partial\Delta}{\partial\sigma^2}=0.$$

This experimental design is referred to as the c-optimal design; it has been introduced and applied in other regression models [34, 40–42]. The c-optimal design minimizes the expected asymptotic variance of the maximum likelihood estimator of the parameter of interest [40, 43]. The primary aim is to increase the precision of the estimate of Δ by seeking the allocation (m1, …, mk) that minimizes the expected value of V(θ) under prior knowledge modeled by a prior distribution f(θ). In other words, the c-optimal design minimizes $\int V(\theta)f(\theta)\,d\theta$ with respect to (m1, …, mk). A simplified search for a locally c-optimal allocation is sketched below.
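The Bayesian version used in this paper averages V(θ) over prior draws; the sketch below (our own simplification) instead searches for a locally c-optimal allocation at a single fixed θ using a coordinate-exchange heuristic:

```r
# Asymptotic variance V of the ML estimator of Delta at allocation m, under the
# quadratic log-normal model. Because the gradient of Delta with respect to
# sigma^2 is zero and the information is block-diagonal, only the beta block
# of I(theta) is needed.
v_delta <- function(m, x, beta1, beta2, sigma2 = 1, k = 2) {
  G <- cbind(1, x, x^2)                         # regression basis at each dose
  info <- crossprod(G * sqrt(m)) / sigma2       # sum_i m_i g_i g_i^T / sigma^2
  r <- sqrt(beta1^2 + 4 * log(k) * beta2)
  d <- (-beta1 + r) / (2 * beta2)               # Delta, Eq (3)
  h <- c(0, -d / r, -(d - log(k) / r) / beta2)  # gradient of Delta w.r.t. beta
  tryCatch(drop(t(h) %*% solve(info) %*% h), error = function(e) Inf)
}

# Coordinate exchange: repeatedly move one unit between doses while V decreases.
c_optimal <- function(x, n, beta1, beta2) {
  m <- rep(n / length(x), length(x))            # start from the balanced design
  repeat {
    best <- v_delta(m, x, beta1, beta2); moved <- FALSE
    for (i in seq_along(x)) for (j in seq_along(x)) {
      if (i != j && m[i] > 0) {
        m2 <- m; m2[i] <- m2[i] - 1; m2[j] <- m2[j] + 1
        v <- v_delta(m2, x, beta1, beta2)
        if (v < best) { m <- m2; best <- v; moved <- TRUE }
      }
    }
    if (!moved) break
  }
  m
}

c_optimal(x = c(0, 1/8, 1/4, 1/2, 1), n = 100, beta1 = 0.5, beta2 = 2)
```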

In agricultural studies, the balanced design (equal replication per group) appears to be the norm, yet the c-optimal design and other optimal designs often outperform it for parameter estimation. When researchers have a specific parameter to be estimated, the c-optimal design is devised for that purpose. For situations in which multiple criteria must be optimized, robust designs have been discussed [40].

3.3. Simulations

To demonstrate the performance of the c-optimal design relative to the balanced design (i.e., mi = n/k for i = 1, 2, …, k), four simulation scenarios were designed as shown in Fig 4. In the figure, the curves represent the expected time to weed emergence in the original unit (days) under the assumption of σ = 1. For each scenario, k = 5 concentrations were fixed at x1 = 0, x2 = 1/8, x3 = 1/4, x4 = 1/2, and x5 = 1, and the total sample size was fixed at n = 100. We compared the balanced design m = (20, 20, 20, 20, 20) with the c-optimal design under a flat prior $f(\theta)\propto 1$, which allocated m = (42, 0, 0, 50, 8) to the fixed concentrations.

Fig 4. Simulation scenarios.


The curves are designed by the values of regression parameters (β0, β1, β2) given in Table 1. The true values of Δ (the parameter to be estimated) are 0.477, 0.203, 0.241, and 0.267 in Scenarios 1, 2, 3, and 4, respectively.

Each scenario was repeated 1,000 times. All computational work in this and the subsequent simulations was done with code written using built-in functions in R Version 4.0.2 [36]. The bias, variance, and mean square error (MSE) of the posterior mean of Δ were compared between the two experimental designs, as shown in Table 1. The c-optimal design outperforms the balanced design on all three criteria.

Table 1. Simulation results.

Scenario | β0  | β1  | β2   | Δ     | Balanced: Bias / Variance / MSE | c-optimal: Bias / Variance / MSE
1        | 1.0 | 0.5 | 2.0  | 0.477 | +0.003 / 0.104 / 0.104          | +0.001 / 0.078 / 0.078
2        | 1.0 | 3.0 | 2.0  | 0.203 | +0.012 / 0.052 / 0.053          | +0.006 / 0.035 / 0.035
3        | 1.0 | 3.0 | -0.5 | 0.241 | +0.029 / 0.082 / 0.087          | +0.010 / 0.054 / 0.055
4        | 1.0 | 3.0 | -1.5 | 0.267 | +0.056 / 0.127 / 0.139          | +0.025 / 0.080 / 0.084

4. Right-censored time to weed emergence

In the previous section, we considered the time to weed emergence as the response variable of interest. In practice, it is implausible to wait for weed emergence in all experimental units because doing so would require too long a study. Suppose instead that the maximum observation time is fixed before initiating an experiment. For instance, in the small-scale experiment introduced in Section 4.3, we fixed the maximum observation time at 30 days, and weeds did not emerge within 30 days in some experimental units. In such cases we do not know the exact time of weed emergence, only that it is at least 30 days. This type of data is referred to as right-censored data, and we revisit the regression model of Section 3 to account for it.

4.1. Model assumptions

We maintain all assumptions made in Section 3.1 and introduce the following notation. Let $T_{ij}$ denote the actual time of weed emergence in the j-th experimental unit at the i-th concentration level for i = 1, …, k and j = 1, …, mi. Let $C_{ij}=1$ if $T_{ij}\le 30$ (the actual time of weed emergence is observed) and $C_{ij}=0$ if $T_{ij}>30$ (the actual time is not observed). The likelihood function given in Section 3.1 is modified as $L(\theta)=\prod_{i=1}^{k}\prod_{j=1}^{m_i}\phi_{ij}^{\,c_{ij}}(1-\Phi_{ij})^{1-c_{ij}}$, where $\phi_{ij}$ is the probability density function of $\ln(T_{ij})\sim N(\mu_{x_i},\sigma^2)$ and $\Phi_{ij}$ is its cumulative distribution function evaluated at the censoring time [44, 45]. A sketch of this likelihood in code is given below.
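A minimal R sketch of this censored log-normal likelihood (our own code; variable names are illustrative):

```r
# Negative log-likelihood for right-censored log-normal waiting times.
# t: observed times (set to 30 for censored units); x: concentrations;
# cens: 1 if emergence was observed by day 30, 0 if right-censored.
negloglik <- function(par, t, x, cens) {
  mu    <- par[1] + par[2] * x + par[3] * x^2    # beta0 + beta1*x + beta2*x^2
  sigma <- exp(par[4])                           # log-scale parameter for optim
  obs <- dnorm(log(t), mu, sigma, log = TRUE)    # log density, observed units
  cen <- pnorm(log(30), mu, sigma, lower.tail = FALSE, log.p = TRUE)  # log survival
  -sum(ifelse(cens == 1, obs, cen))
}

# Hypothetical usage, e.g., after the first phase of the experiment:
# fit <- optim(c(2, 1, 1, 0), negloglik, t = t, x = x, cens = cens, hessian = TRUE)
```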

4.2. Prior specification

Instead of the flat prior $f(\theta)\propto 1$, we modeled an informative prior before starting the experiment (introduced in Section 4.3). Under the regression model $\mu_{x_i}=\beta_0+\beta_1 x_i+\beta_2 x_i^2$, $\beta_0$ is interpreted as $E[\ln(T)\mid x=0]$, which equals $\ln[M(T\mid x=0)]$ because of the log-normal assumption. Instead of specifying a prior on $\beta_0$ directly, we specified a prior on $e^{\beta_0}=M(T\mid x=0)$, the median time to weed emergence at the control dose. (This parameterization made it easier to elicit our prior knowledge.) We assumed that the median time at the control is shorter than 7 days with probability 0.5, $P(e^{\beta_0}<7)=P(\beta_0<\ln 7)=0.5$. We were fairly certain that the median time is shorter than 30 days, and we chose $P(e^{\beta_0}<30)=P(\beta_0<\ln 30)=0.975$ for computational simplicity. Using a normal prior $\beta_0\sim N(a_0,b_0^2)$, we calculated $a_0=\ln 7=1.95$ and $b_0=0.5\ln(30/7)=0.73$ to reflect these prior assumptions on the median time $e^{\beta_0}$.

Under the regression model, it was challenging to elicit a joint prior distribution on β1 > 0 and β2 > 0 in a tractable way. For the sake of simplicity, we specified β1 ∼ Exp(d1) and β2 ∼ Exp(d2) independently. The hyper-parameters d1 and d2 were chosen by trial and error such that $P(\Delta<0.5)\approx 0.95$ and $P(\Delta<1)\approx 1$, where Δ is the transformation of β1 and β2 given in Eq (3). After several iterations of trial and error, we found d1 = 0.2 and d2 = 0.2 to be reasonable. For the standard deviation σ > 0, an independent flat prior was chosen. This calibration can be checked by simulation, as sketched below.
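A Monte Carlo check of the calibration (our own sketch; it assumes d1 and d2 are the rate parameters of the exponential priors):

```r
# Check the prior calibration: with beta1, beta2 ~ Exp(rate = 0.2) independently,
# the induced prior on Delta should satisfy P(Delta < 0.5) near 0.95.
set.seed(1)
b1 <- rexp(1e5, rate = 0.2)
b2 <- rexp(1e5, rate = 0.2)
delta <- (-b1 + sqrt(b1^2 + 4 * log(2) * b2)) / (2 * b2)   # Eq (3)
mean(delta < 0.5)   # approximately 0.94
mean(delta < 1)     # approximately 0.99
```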

4.3. Adaptive experimental design (applied example)

Typical weed control treatments include pre- and/or post-emergence pesticides, fumigants, biofumigants, solarization, flaming, and hand hoeing [1]. While pesticide-based weed controls are biologically efficacious and economically efficient, most are harmful to the environment. Consumers have raised concerns and shown strong interest in organic products, and regulations on pesticide use have been tightened. Ethanol (EtOH), contained in plants or synthesized in factories, is an easily available, low-toxicity solvent. Although EtOH is not registered as a biological control agent, researchers have reported that it inhibits the germination of weed seeds; for instance, germination of morning glory seeds was reduced after exposure to 1% v/v EtOH [46]. Since EtOH is a natural product, it may become available as a biological control agent. A high concentration of EtOH appears promising, and our objective was to find an adequate concentration. We acknowledge that more powerful herbicides exist in weed science; this section is devised to demonstrate the adaptive experimental design for estimating the parameter Δ. An experiment with EtOH was conducted to find the concentration that doubles the median time to weed emergence relative to the control, the parameter denoted by Δ as in Section 3.

Each flowerpot contained 10 seeds of ryegrass (Lolium multiflorum). At the center of each flowerpot, 15 mL of 0% (non-treated control), 12.5%, 25%, 50%, or 100% EtOH was applied; that is, we fixed the experimental concentrations at x1 = 0, x2 = 0.125, x3 = 0.25, x4 = 0.5, and x5 = 1. The original plan was a sample of size n = 100, but only 50 flowerpots were available at a time. We therefore performed an adaptive experimental study, and we fixed a maximum waiting time of 30 days per phase because ryegrass emergence could take an extremely long time at a high concentration of EtOH.

For the first phase of this experiment, we applied the c-optimal design using the prior in Section 4.2. The c-optimal design allocated m1 = 11 flowerpots at x1 = 0, m4 = 25 flowerpots at x4 = 0.5, and m5 = 14 flowerpots at x5 = 1. All flowerpots were monitored daily. All 11 flowerpots at x1 = 0 had ryegrass emerge within 30 days (average 12.45 days), 13 of the 25 flowerpots at x4 = 0.5 had ryegrass emerge within 30 days, and none of the 14 flowerpots at x5 = 1 had ryegrass emerge within 30 days.

After collecting the data in the first phase, we combined the prior f(θ) with the likelihood L(θ) to obtain the posterior, and we applied the c-optimal design for the next 50 flowerpots by minimizing the posterior expectation of V(θ). Note that we used the likelihood L(θ) of the form given in Section 4.1 to account for the right-censored data. For the second phase, the c-optimal design allocated 32 flowerpots at x1 = 0 and 18 flowerpots at x4 = 0.5. In other words, the c-optimal design suggested stopping observation at the maximum (100%) concentration, and it attempted to gather more information at the control than at the 50% concentration in order to reduce uncertainty about Δ. For the observed data, see S1 and S2 Data.

Fig 5 presents the change in knowledge about Δ before the experiment (prior) and after the first and second phases of the experiment (posteriors). The point estimates of Δ, using the mean of each distribution, were 0.21, 0.46, and 0.39, respectively. The respective 95% credible intervals were (0.04, 0.7), (0.35, 0.6), and (0.33, 0.45); the degree of uncertainty about Δ decreased as more data were collected.

Fig 5. Prior and posterior inference for Δ.


This figure demonstrates that the uncertainty about Δ decreases as data are accumulated.

4.4. Retrospective simulations

After the second phase, the posterior means of β0, β1, and β2 were 2.2, 1.0, and 1.9, respectively. Six simulation scenarios were designed around these parameter values in a retrospective simulation study investigating what would have happened had the c-optimal design been performed at once instead of in two phases. Each retrospective scenario was replicated 1,000 times. As shown in Table 2, the adaptive design resulted in lower MSEs in all six scenarios.

Table 2. Simulation results.

Scenario | β0  | β1   | β2 | Δ     | Fixed: Bias / Variance / MSE | Adaptive: Bias / Variance / MSE
1        | 2.5 | 0.07 | 1  | 0.798 | -0.064 / 0.148 / 0.161       | -0.054 / 0.150 / 0.160
2        | 2.5 | 0.07 | 3  | 0.469 | -0.073 / 0.048 / 0.087       | -0.051 / 0.049 / 0.070
3        | 2.5 | 0.7  | 1  | 0.553 | -0.024 / 0.087 / 0.090       | -0.020 / 0.082 / 0.084
4        | 2.5 | 0.7  | 3  | 0.378 | -0.037 / 0.046 / 0.059       | -0.025 / 0.042 / 0.049
5        | 2.5 | 1.7  | 1  | 0.340 | +0.017 / 0.056 / 0.058       | +0.016 / 0.050 / 0.053
6        | 2.5 | 1.7  | 3  | 0.275 | -0.007 / 0.037 / 0.037       | -0.004 / 0.033 / 0.033

4.5. Note

In a large-scale field experiment, random effects may arise from variable germination times and other environmental factors. Such additional sources of random variation can be modeled via a mixed-effects model. An experimental design under a mixed-effects model requires a more sophisticated variance structure, and the underlying mathematical formulas are much more technical [47–49]. In addition, survival analysis (the analysis of time-to-event data) in weed science has been discussed in the literature [50–52].

5. Synergistic effect

Sometimes researchers seek a synergistic effect of two treatments [53–56]. Suppose an outcome is coded as 1 or 0 (e.g., 1 for suppressed germination, 0 otherwise), and let π be the probability of the outcome coded as 1. Let x be the concentration of treatment A and z be the concentration of treatment B. The logistic regression is given by

$$\pi(x,z)=\frac{e^{\beta_0+\beta_1 x+\beta_2 z+\beta_3 xz}}{1+e^{\beta_0+\beta_1 x+\beta_2 z+\beta_3 xz}},$$

or equivalently

$$\ln\!\left(\frac{\pi(x,z)}{1-\pi(x,z)}\right)=\beta_0+\beta_1 x+\beta_2 z+\beta_3 xz.$$

Under this model, the parameter of interest is β3, which captures the presence of a synergistic or antagonistic effect between the two treatments. The null hypothesis β3 = 0 implies the absence of such an effect, and the alternative hypothesis β3 ≠ 0 implies its presence.

For the purpose of demonstration, suppose a researcher has four concentrations of treatment A, say x = 0, 0.25, 0.5, 1, and four concentrations of treatment B, say z = 0, 0.25, 0.5, 1. If the researcher can afford a total sample size of 160, a balanced design allocates 10 units to each possible combination (x, z). Instead of the balanced design, the researcher may consider the d-optimal design, which maximizes the determinant of the Fisher expected information (FEI) matrix and is devised to increase the amount of information about the model parameters $\beta=(\beta_0,\beta_1,\beta_2,\beta_3)$ globally. Alternatively, the c-optimal design minimizes the asymptotic variance of the maximum likelihood estimator of β3 (i.e., the (4,4)-th element of the inverted FEI) and is devised to maximize the information about the target parameter β3 for testing the synergistic or antagonistic effect. A sketch of both criteria follows.
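Under the logistic model the FEI has a standard closed form, $I(\beta)=\sum_i m_i\,\pi_i(1-\pi_i)\,g_i g_i^T$ with $g_i=(1,x_i,z_i,x_i z_i)^T$. The R sketch below (our own) evaluates the two criteria for a candidate allocation at an assumed β:

```r
# Fisher expected information for the interaction logistic model, and the
# d-criterion (maximize) and c-criterion for beta3 (minimize) of a design.
fei <- function(m, x, z, beta) {
  G <- cbind(1, x, z, x * z)                     # covariate vector at each point
  p <- plogis(drop(G %*% beta))                  # success probabilities pi_i
  crossprod(G * sqrt(m * p * (1 - p)))           # sum_i m_i pi_i (1-pi_i) g_i g_i^T
}

grid <- expand.grid(x = c(0, 0.25, 0.5, 1), z = c(0, 0.25, 0.5, 1))
beta <- c(-1.5, 0.75, 1.5, 1)                    # an assumed parameter guess
I <- fei(m = rep(10, 16), grid$x, grid$z, beta)  # balanced design: 10 per point
det(I)                                           # d-criterion
solve(I)[4, 4]                                   # c-criterion: asy. variance of beta3-hat
```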

5.1. Prior specifications for simulation study

For the d- and c-optimal designs, we need a prior specification on β. Agricultural researchers may collaborate with statisticians to express a prior (the researchers' knowledge before the experiment) via a probability distribution. Instead of directly expressing a prior on β, which is difficult to interpret in the context of the research, prior knowledge can be expressed on the probability of an outcome at four arbitrary concentration points (four being the number of regression parameters). For the purpose of demonstration, we considered the four points (0, 0), (1, 0), (0, 1), and (1, 1), and specified π(0,0) ∼ Beta(1, 1), π(1,0) ∼ Beta(1, 1), π(0,1) ∼ Beta(1, 1), and π(1,1) ∼ Beta(1, 1) independently to express a high degree of uncertainty. This non-informative prior is referred to as prior 1 in this section. The independent priors on the π's can be transformed into a joint prior on $\beta=(\beta_0,\beta_1,\beta_2,\beta_3)$, as shown in the left panel of Fig 6. This method of eliciting a prior distribution on β is known as the conditional mean prior [57], and a code sketch is given below. To express less uncertainty, we specified π(0,0) ∼ Beta(2, 8), π(1,0) ∼ Beta(5, 5), π(0,1) ∼ Beta(5, 5), and π(1,1) ∼ Beta(8, 2) independently. This prior is referred to as prior 2, and it leads to smaller prior variances on β, as shown in the right panel of Fig 6.
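A brief R sketch of the conditional mean prior transformation (our own; shown here with the prior 2 beta distributions):

```r
# Transform independent beta priors on pi at the four corner points into a
# joint prior on beta: the model is saturated at (0,0), (1,0), (0,1), (1,1).
set.seed(1)
p00 <- rbeta(5000, 2, 8); p10 <- rbeta(5000, 5, 5)
p01 <- rbeta(5000, 5, 5); p11 <- rbeta(5000, 8, 2)
b0 <- qlogis(p00)                  # logit pi(0,0) = beta0
b1 <- qlogis(p10) - b0             # logit pi(1,0) = beta0 + beta1
b2 <- qlogis(p01) - b0             # logit pi(0,1) = beta0 + beta2
b3 <- qlogis(p11) - b0 - b1 - b2   # logit pi(1,1) = beta0 + beta1 + beta2 + beta3
pairs(cbind(b0, b1, b2, b3))       # compare with the right panel of Fig 6
```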

Fig 6. Joint prior distributions of model parameters.


The scatter plots graphically demonstrate the joint prior distribution of β=(β0,β1,β2,β3) induced from the independent beta priors on π(0,0), π(1,0), π(0,1), and π(1,1). The figure on the left demonstrates prior 1 (non-informative prior), and the figure on the right demonstrates prior 2 (informative prior).

5.2. Experimental designs

The c-optimal design is sensitive to the prior specification, but the d-optimal design is not. In Fig 7, the four experimental designs (balanced, d-optimal, c-optimal with prior 1, and c-optimal with prior 2) are compared in terms of the relative proportion of units (out of the total of 160) allocated to the 16 possible combinations of the two treatments. The d-optimal design spreads the total sample size of 160 evenly over the four concentration points (x, z) = (0, 0), (0, 1), (1, 0), and (1, 1); in other words, it spreads the units widely over the entire concentration space [0, 1] × [0, 1] in order to learn about all model parameters $\beta=(\beta_0,\beta_1,\beta_2,\beta_3)$ globally. The c-optimal design with prior 1 balances the extreme and middle concentrations to some degree, but the c-optimal design with prior 2 resembles the d-optimal design.

Fig 7. Experimental designs.


Unlike the balanced design (the same proportion of experimental units across all concentration points), the d-optimal design and the c-optimal design with prior 2 distribute experimental units at the extreme points (0, 0), (0, 1), (1, 0), and (1, 1). The c-optimal design with prior 1 (non-informative prior) seeks information at a variety of concentration points.

5.3. Simulations

To compare the four designs, seven simulation scenarios were considered. The values of β were chosen as β0 = −1.5, β1 = 0.75, β2 = 1.5, and β3 = 0, 0.5, 1, 3, 5, 10, 15 to vary the degree of synergistic effect. Each scenario was replicated 1,000 times to approximate the statistical power for testing β3 = 0 versus β3 ≠ 0 at significance level 0.05 for each design. As shown in the left panel of Fig 8, the d-optimal design outperforms the balanced design up to β3 = 10, and the c-optimal design with prior 1 outperforms the d-optimal design. The power of the c-optimal design with prior 2 is similar to that of the d-optimal design because the two designs were very similar, as shown in Fig 7. When β3 = 15, the balanced design outperformed the d-optimal design and the c-optimal design with prior 2.

Fig 8. Statistical power.


The figure on the left demonstrates that the c-optimal design and d-optimal design outperform the balanced design when β3 is relatively close to the null value 0. The figure on the right demonstrates that the c-optimal design with prior 2 can be improved by the adaptive design.

The power of the c-optimal design with prior 2 was substantially lower than the power with prior 1 at high values of β3 because the strong prior deviated substantially from the true simulation scenarios. The power could be improved by an adaptive design: in the first phase, 80 units were allocated based on prior 2, and the remaining 80 units were allocated based on the posterior (prior 2 combined with the collected data). The two-step procedure helped correct the initial c-optimal design, and the power increased noticeably for β3 ≥ 3, as shown in the right panel of Fig 8.

5.4. Note

Binomial counts are typically over-dispersed, meaning the data are more variable than assumed under the standard logistic regression discussed in this section. Over-dispersion can be addressed via a mixed-effects model or quasi-binomial logistic regression. The quasi-binomial model includes a dispersion parameter that scales the standard errors obtained under the standard logistic regression, as sketched below. Regardless, the c-optimal design has the same objective: to reduce the uncertainty associated with the estimation of the parameter of interest.
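As a brief illustration, a quasi-binomial fit in R with hypothetical germination-suppression counts (10 seeds per pot, echoing the EtOH experiment; the data are invented for demonstration):

```r
# Quasi-binomial logistic regression with an interaction term; the estimated
# dispersion scales the standard errors relative to the standard binomial fit.
pots <- data.frame(
  x = rep(c(0, 0.5, 1), each = 4),             # concentration of treatment A
  z = rep(c(0, 1), times = 6),                 # concentration of treatment B
  suppressed = c(1, 3, 2, 4, 4, 7, 5, 8, 6, 9, 7, 10)   # out of 10 seeds per pot
)
fit <- glm(cbind(suppressed, 10 - suppressed) ~ x * z,
           family = quasibinomial, data = pots)
summary(fit)$dispersion   # values > 1 indicate over-dispersion
```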

6. Discussion

A clear objective of an experiment should be specified before choosing an appropriate experimental design [58]. This point was demonstrated in Section 5.3: when the objective is to investigate the interaction between two treatments, the c-optimal design results in higher statistical power than the d-optimal design. Sometimes a researcher has multiple objectives, and this situation has been discussed in the context of a non-monotonic dose-response relationship in toxicology [40]. Choosing an objective-specific experimental design is not a new idea; it has been practiced among engineers and drug developers [59]. Like other research areas, agricultural data are expensive in terms of time and effort given a fixed amount of resources. Therefore, a careful experimental design is worth considering before initiating an experiment.

We hope this article alleviates some misconceptions about balanced designs. A balanced design is optimal in specific cases, such as when two groups have the same variance in the two-sample t-test, but whether σ1 = σ2 or σ1 ≠ σ2 is beyond the researcher's control. After gaining information about σ1 and σ2, via a pilot study or the first phase of a multi-phase experiment, the researcher may attempt to balance $\sigma_1^2/n_1$ and $\sigma_2^2/n_2$ by choosing appropriate n1 and n2. In practice, some treatments may be more difficult or more expensive to run than others [60]. An experimental design is therefore a practical problem of balancing statistics and logistics.

In this article, given a specific objective formulated as a model parameter, we discussed adaptive designs that address uncertainty in the researcher's prior knowledge. Scientific research requires some assumptions prior to data collection, and an adaptive design provides an opportunity to correct those assumptions before exhausting all available resources. If the initial assumption is reasonably close to the truth, an adaptive design is not detrimental, as shown in the simulations in this article. Although an adaptive design poses a practical challenge because it lengthens the total time of an experiment, we believe its benefit is clear from a statistical perspective.

In agricultural studies, it is common to collect data for two seasons to confirm a hypothesis [1, 2, 8, 61]. This is also an opportunity to consider an adaptive design or some variation of one, as there is no single statistical strategy that fits all situations. Collaborations between agricultural researchers and statisticians are highly encouraged to find an appropriate strategy for a given research objective under practical and logistical considerations. Simulating data and comparing multiple candidate plans under likely scenarios is a recommended practice.

7. Conclusion

A research question can be formulated via a statistical parameter (a quantity that measures the treatment effect), and an experiment can be designed to increase the amount of information about that parameter of interest. In practice, increasing the sample size is not always feasible, so researchers fix a sample size at their maximal capacity. The simulations demonstrated that unbalanced and adaptive designs provide smaller error in parameter estimation and higher statistical power in hypothesis testing than balanced and fixed designs. Therefore, researchers facing different practical situations can utilize available resources efficiently by choosing appropriate experimental designs.

Supporting information

S1 Data. Data observed after the first phase of the experiment.

(CSV)

S2 Data. Data observed after the second phase of the experiment (combined with the first phase).

(CSV)

Data Availability

All relevant data are within the manuscript and its Supporting information files.

Funding Statement

The authors received no specific funding for this work.

References

  1. Kim DS, Hoffmann M, Kim S, Scholler BA, Fennimore SA. Integration of steam with allyl-isothiocyanate for soil disinfestation. HortScience. 2020;55(6):920-925. doi: 10.21273/HORTSCI14600-20
  2. Kim DS, Kim S, Fennimore SA. Evaluation of broadcast steam application with mustard seed meal in fruiting strawberry. HortScience. 2021;56(4):500-505. doi: 10.21273/HORTSCI15669-20
  3. Batlla D, Benech-Arnold RL. Weed seed germination and the light environment: implications for weed management. Weed Biology and Management. 2014;14(2):77-87. doi: 10.1111/wbm.12039
  4. Fawcett RS, Slife FW. Effects of field applications of nitrate on weed seed germination and dormancy. Weed Science. 1978;26(6):594-596. doi: 10.1017/S0043174500064626
  5. Moore MJ, Gillespie TJ, Swanton CJ. Effect of cover crop mulches on weed emergence, weed biomass, and soybean (Glycine max) development. Weed Technology. 1994;8(3):512-518. doi: 10.1017/S0890037X00039609
  6. Ogg AG Jr., Dawson JH. Time of emergence of eight weed species. Weed Science. 1984;32(3):327-335. doi: 10.1017/S0043174500059087
  7. Fennimore SA, Martin FN, Miller TC, Broome JC, Dorn N, Greene I. Evaluation of a mobile steam applicator for soil disinfestation in California strawberry. HortScience. 2014;49(12):1542-1549. doi: 10.21273/HORTSCI.49.12.1542
  8. Samtani JB, Ajwa HA, Weber JB, Browne GT, Klose S, Hunzie J, et al. Evaluation of non-fumigant alternatives to methyl bromide for weed control and crop yield in California strawberries (Fragaria ananassa L.). Crop Protection. 2011;30(1):45-51. doi: 10.1016/j.cropro.2010.08.023
  9. Kim DS, Kim SB, Fennimore SA. Incorporating statistical strategy into image analysis to estimate effects of steam and allyl isocyanate on weed control. PLoS ONE. 2019;14(9):e0222695. doi: 10.1371/journal.pone.0222695
  10. Kim SB, Kim DS, Mo X. An image segmentation technique with statistical strategies for pesticide efficacy assessment. PLoS ONE. 2021;16(3):e0248592. doi: 10.1371/journal.pone.0248592
  11. Camprubi A, Estaún V, El Bakali MA, Garcia-Figueres F, Calvet C. Alternative strawberry production using solarization, metham sodium and beneficial soil microbes as plant protection methods. Agronomy for Sustainable Development. 2007;27:179-184. doi: 10.1051/agro:2007007
  12. Díaz-Pérez M, Camacho-Ferre F, Diánez-Martínez F, De Cara-García M, Tello-Marquina JC. Evaluation of alternatives to methyl bromide in melon crops in Guatemala. Microbial Ecology. 2009;57(2):379-383. doi: 10.1007/s00248-008-9460-1
  13. García-Méndez E, García-Sinovas D. Chemical alternatives to methyl bromide for weed control and runner plant production in strawberry nurseries. HortScience. 2008;43(1):177-182. doi: 10.21273/HORTSCI.43.1.177
  14. Gilreath JP, Motis TN, Santos BM, Noling JW, Locascio SJ, Chellemi DO. Resurgence of soilborne pests in double-cropped cucumber after application of methyl bromide chemical alternatives and solarization in tomato. HortTechnology. 2005;15(4):797-801. doi: 10.21273/HORTTECH.15.4.0797
  15. López-Aranda JM, Miranda L, Medina JJ, Soria C, de los Santos B, Romero F, et al. Methyl bromide alternatives for high tunnel strawberry production in southern Spain. HortTechnology. 2009;19(1):187-192. doi: 10.21273/HORTSCI.19.1.187
  16. Mao L, Jiang H, Zhang L, Zhang Y, Sial MU, Yu H, et al. Replacing methyl bromide with a combination of 1,3-dichloropropene and metam sodium for cucumber production in China. PLoS ONE. 2017;12(11):e0188137. doi: 10.1371/journal.pone.0188137
  17. Mao LG, Wang QX, Yan DD, Xie HW, Li Y, Guo MX, et al. Evaluation of the combination of 1,3-dichloropropene and dazomet as an efficient alternative to methyl bromide for cucumber production in China. Pest Management Science. 2012;68(4):602-609. doi: 10.1002/ps.2303
  18. Ślusarski C, Pietr SJ. Combined application of dazomet and Trichoderma asperellum as an efficient alternative to methyl bromide in controlling the soil-borne disease complex of bell pepper. Crop Protection. 2009;28(8):668-674. doi: 10.1016/j.cropro.2009.03.016
  19. Uhlig RE, Bird G, Richardson RJ, Zandstra BH. Soil fumigants to replace methyl bromide for weed control in ornamentals. HortTechnology. 2007;17(1):111-114. doi: 10.21273/HORTTECH.17.1.111
  20. Petrinovich LF, Hardyck CD. Error rates for multiple comparison methods: Some evidence concerning the frequency of erroneous conclusions. Psychological Bulletin. 1969;71(1):43-54. doi: 10.1037/h0026861
  21. Ahn MG, Kim DS, Ahn SR, Sim HS, Kim S, Kim SK. Characteristics and trends of strawberry cultivars throughout the cultivation season in a greenhouse. Horticulturae. 2021;7(2):30. doi: 10.3390/horticulturae7020030
  22. Caser M, Demasi S, Caldera F, Dhakar NK, Trotta F, Scariot V. Activity of Ailanthus altissima (Mill.) Swingle extract as a potential bioherbicide for sustainable weed management in horticulture. Agronomy. 2020;10(7):965. doi: 10.3390/agronomy10070965
  23. Kim DS, Lee DU, Lim JH, Kim S, Choi JH. Agreement between visual and model-based classification of tomato fruit ripening. Transactions of the ASABE. 2020;63(3):667-674. doi: 10.13031/trans.13812
  24. Lix LM, Keselman JC, Keselman HJ. Consequences of assumption violations revisited: A quantitative review of alternatives to the one-way analysis of variance "F" test. Review of Educational Research. 1996;66(4):579-619. doi: 10.3102/00346543066004579
  25. Benjamini Y, Braun H. John W. Tukey's contributions to multiple comparisons. The Annals of Statistics. 2002;30(6):1576-1594. doi: 10.1214/aos/1043351247
  26. Kramer CY. Extension of multiple range tests to group means with unequal numbers of replications. Biometrics. 1956;12(3):307-310. doi: 10.2307/3001469
  27. Fisher RA. Statistical Methods for Research Workers. Edinburgh: Oliver & Boyd; 1970.
  28. Verdooren LR. History of the statistical design of agricultural experiments. Journal of Agricultural, Biological and Environmental Statistics. 2020;25(3):457-486. doi: 10.1007/s13253-020-00394-3
  29. Chaloner K, Verdinelli I. Bayesian experimental design: a review. Statistical Science. 1995;10(3):273-304. doi: 10.1214/ss/1177009939
  30. Chernoff H. Locally optimal designs for estimating parameters. The Annals of Mathematical Statistics. 1953;24(4):586-602. doi: 10.1214/aoms/1177728915
  31. Zhai Y, Fang Z. Locally optimal designs for some dose-response models with continuous endpoints. Communications in Statistics - Theory and Methods. 2018;47(16):3803-3819. doi: 10.1080/03610926.2017.1361996
  32. Braun TM. Generalizing the TITE-CRM to adapt for early- and late-onset toxicities. Statistics in Medicine. 2006;25(12):2071-2083. doi: 10.1002/sim.2337
  33. O'Quigley J, Pepe M, Fisher L. Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics. 1990;46(1):33-48. doi: 10.2307/2531628
  34. Whitehead J, Brunier H. Bayesian decision procedures for dose determining experiments. Statistics in Medicine. 1995;14(9):885-899. doi: 10.1002/sim.4780140904
  35. Yin G, Yuan Y. Bayesian model averaging continual reassessment method in phase I clinical trials. Journal of the American Statistical Association. 2009;104(487):954-968. doi: 10.1198/jasa.2009.ap08425
  36. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2020. https://www.R-project.org/
  37. Dror H, Steinberg D. Sequential experimental designs for generalized linear models. Journal of the American Statistical Association. 2008;103(481):288-298. doi: 10.1198/016214507000001346
  38. Gbur EE, Stroup WW, McCarter KS, Durham S, Young LJ, Christman M, et al. Analysis of generalized linear mixed models in the agricultural and natural resources sciences. Madison, WI: American Society of Agronomy, Soil Science Society of America, Crop Science Society of America; 2012.
  39. Russell KG. Design of experiments for generalized linear models. Boca Raton: Chapman and Hall/CRC; 2018.
  40. Dette H, Pepelyshev A, Wong WK. Optimal experimental design strategies for detecting hormesis. Risk Analysis. 2011;31(12):1949-1960. doi: 10.1111/j.1539-6924.2011.01625.x
  41. Elfving G. Optimum allocation in linear regression theory. The Annals of Mathematical Statistics. 1952;23(2):255-262. doi: 10.1214/aoms/1177729442
  42. Pukelsheim F. On linear regression designs which maximize information. Journal of Statistical Planning and Inference. 1980;4(4):339-364. doi: 10.1016/0378-3758(80)90020-8
  43. Kim SB, Gillen DL. A Bayesian adaptive dose-finding algorithm for balancing individual- and population-level ethics in Phase I clinical trials. Sequential Analysis. 2016;35(4):423-439. doi: 10.1080/07474946.2016.1238250
  44. Klein JP, Moeschberger ML. Survival analysis: techniques for censored and truncated data. New York: Springer; 2003.
  45. Patti S, Biganzoli E, Boracchi P. Review of the maximum likelihood functions for right censored data. A new elementary derivation. COBRA Preprint Series. 2007;21.
  46. Holm RE. Volatile metabolites controlling germination in buried weed seeds. Plant Physiology. 1972;50(2):293-297. doi: 10.1104/pp.50.2.293
  47. Ankenman BC, Avilés AI, Pinheiro JC. Optimal designs for mixed-effects models with two random nested factors. Statistica Sinica. 2003;13(2):385-401.
  48. Dette H, Holland-Letz T. A geometric characterization of c-optimal designs for heteroscedastic regression. The Annals of Statistics. 2009;37(6B):4088-4103. doi: 10.1214/09-AOS708
  49. Mentré F, Mallet A, Baccar D. Optimal design in random-effects regression models. Biometrika. 1997;84(2):429-442. doi: 10.1093/biomet/84.2.429
  50. Onofri A, Carbonell EA, Piepho H-P, Mortimer AM, Cousens RD. Current statistical issues in weed research. Weed Research. 2010;50(1):5-24. doi: 10.1111/j.1365-3180.2009.00758.x
  51. Scheiner SM, Gurevitch J. Design and analysis of ecological experiments. 1st ed. Oxford, UK: Oxford University Press; 1993.
  52. Scott SJ, Jones RA, Williams WA. Review of data analysis methods for seed germination. Crop Science. 1984;24(6):1192-1199. doi: 10.2135/cropsci1984.0011183X002400060043x
  53. Casey M, Gennings C, Carter WH Jr., Moser VC, Simmons JE. Ds-optimal designs for studying combinations of chemicals using multiple fixed-ratio ray experiments. Environmetrics. 2005;16(2):129-147. doi: 10.1002/env.666
  54. Holland-Letz T, Kopp-Schneider A. Optimal experimental designs for estimating the drug combination index in toxicology. Computational Statistics & Data Analysis. 2018;117:182-193. doi: 10.1016/j.csda.2017.08.006
  55. Sperrin M, Thygesen H, Su TL, Harbron C, Whitehead A. Experimental designs for detecting synergy and antagonism between two drugs in a pre-clinical study. Pharmaceutical Statistics. 2015;14(3):216-225. doi: 10.1002/pst.1676
  56. Straetemans R, O'Brien T, Wouters L, Van Dun J, Janicot M, Bijnens L, et al. Design and analysis of drug combination experiments. Biometrical Journal. 2005;47(3):299-308. doi: 10.1002/bimj.200410124
  57. Bedrick EJ, Christensen R, Johnson W. A new perspective on priors for generalized linear models. Journal of the American Statistical Association. 1996;91(436):1450-1460. doi: 10.1080/01621459.1996.10476713
  58. Inouye BD. Response surface experimental designs for investigating interspecific competition. Ecology. 2001;82(10):2696-2706. doi: 10.1890/0012-9658(2001)082[2696:RSEDFI]2.0.CO;2
  59. Shivhare M, McCreath G. Practical considerations for DOE implementation in quality by design. BioProcess International. 2010.
  60. Chatterjee K, Georgiou SD, Koukouvinos C. A2-optimal designs: the nearly-balanced case. Statistics. 2017;51(2):235-246. doi: 10.1080/02331888.2016.1239726
  61. Swegarden HR, Sheaffer CC, Michaels TE. Yield stability of heirloom dry bean (Phaseolus vulgaris L.) cultivars in midwest organic production. HortScience. 2016;51(1):8-14. doi: 10.21273/HORTSCI.51.1.8

Decision Letter 0

Ahmet Uludag

21 Jul 2021

PONE-D-21-17452

Applications of statistical experimental designs to improve statistical inference in weed management

PLOS ONE

Dear Dr. Kim,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please take into consideration every single point that the three referees raised.

Please submit your revised manuscript within one month. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ahmet Uludag, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following financial disclosure:

“The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

At this time, please address the following queries:

a) Please clarify the sources of funding (financial or material support) for your study. List the grants or organizations that supported your study, including funding received from your institution.

b) State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

c) If any authors received a salary from any of your funders, please state which authors and which funders.

d) If you did not receive any funding for this study, please state: “The authors received no specific funding for this work.”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3. Thank you for stating the following in the Acknowledgments Section of your manuscript:

“The authors declare that there is no funding associated with this manuscript.”

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

“The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

I have received the reviews of three referees. One recommended rejection and the other two recommended minor revision. I think this is an important paper, but it needs to be improved. Please follow the suggestions of all three reviewers. There are good suggestions from Reviewer 2 in particular; do your best to address them where appropriate, as I think he gave good insights that will add to your paper.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript aims to introduce and explain methodology for better planning of agricultural experiments through the use of optimal designs. This is a good idea, as optimal designs aren't used much, including in the agricultural sciences. However, the manuscript could provide even more explanations and insights if the intention is to have more practitioners use these methods.

Specifically, the manuscript looks at 3 practical situations, which occur frequently in practice. In each situation, theory is introduced very briefly, with some concepts left unexplained, followed by simulation results and possibly a practical example. For instance, no literature on mixture models or optimal design for mixture models is cited; this is disappointing.

Specific comments:

lines 26-27: Are estimation of standard errors really modified?!

lines 59-64: It would be helpful if it was spelled out that three "practical situations" are dealt with in detail in the manuscript.

lines 98-107: Which statistical software was used?

line 160: No reference for the c-optimal design.

lines 244-310: No references to existing literature on mixture models or optimal design for mixture are provided. One place to start:

https://doi.org/10.1002/env.666

https://doi.org/10.1016/j.csda.2017.08.006

Reviewer #2: This paper considers several examples, modeled on applications in weed science, where a good design involves unequal replication. This finding may come as a surprise to some weed scientists, because it is very common in practice to use equally replicated designs.

While the intention of this paper is very laudable, I find most of the examples unconvincing because they are rather artificial. Perhaps the most important limitation of the proposed methods is the need to run an experiment twice, using the first to optimize the design for the second. Most experimenters in weed science will simply not have the time to run two experiments in order to get a single experiment's worth of results. The paper lacks convincing examples, where such designs would really be practical and where weed scientists would indeed be willing to use them.

Major comments

p.1: The Duncan test does not control the family-wise error rate and should not be recommended. The t-test does not either, but the Tukey test does. I think it is important to mention these facts when introducing these popular tests. The current wording "seem to be alternative choices among researchers" leaves the authors' view open at this point and may even raise the impression that the choice does not matter.

p.2: To introduce the idea of an unbalanced design, the authors consider the case of a linear regression with four equally spaced doses. The fact that the best design allocates half the observations to the smallest and largest dose and none to the intermediate ones does not mean the design is unbalanced; it's quite balanced between the highest and lowest dose. It is a well-known fact that with linear regression, only two doses are needed to estimate the regression coefficient and that this is the optimal design. This design does rest on the assumption that there is no departure from linearity, and this is a very strong assumption that many practitioners may not be willing to make. Thus, I would hardly ever recommend such a design to a weed scientist in practice.
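
The algebra behind the reviewer's point is easy to verify: for a straight-line model, the variance of the slope estimator is proportional to 1/Σ(x − x̄)², so concentrating the runs at the two extreme doses maximizes the information about the slope. A minimal R check (our own illustration with arbitrary dose levels, not material from the manuscript):

    # Variance factor 1 / sum((x - mean(x))^2) for two 12-run designs.
    x_equal   <- rep(c(0, 1/3, 2/3, 1), each = 3)   # balanced over four doses
    x_extreme <- rep(c(0, 1), each = 6)             # half at each extreme dose
    1 / sum((x_equal   - mean(x_equal))^2)          # 0.60: larger slope variance
    1 / sum((x_extreme - mean(x_extreme))^2)        # 0.33: smaller slope variance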

p.2, bottom: The authors use the term "two-phase design" for an approach that takes a first experiment to inform about the parameters of the model to be fitted and then, in a second phase, adapts the sample size to achieve a desired precision. I know at this point what the authors are intending but notice at the same time that the term "two-phase design" can have quite a different meaning in other contexts. See, e.g.,

Brien, C. J. (2019). Multiphase experiments with at least one later laboratory phase. II. Nonorthogonal designs. Australian & New Zealand Journal of Statistics, 61:234-268.

I am more familiar with the term "adaptive design" for the approach the authors are pursing. These types of design are very common in clinical trials, and there is a very rich literature on this kind of design. The authors do not seem to be aware of this literature. Perhaps they can consider adopting this term to avoid confusion. Moreover, delving into the relevant literature may be useful. The authors do mention some work along these lines in a cursory manner at the very end of the paper, but it would be appropriate to state this from the very start, making it clear that the paper presents nothing that is in itself new and that much of the relevant theory was developed in the context of clinical trials, where one can find ample guidance for the types of design the authors are advocating.

p.3: Having introduced linear regression as one example of an "unbalanced design" in the introduction, it is a bit surprising that Section 2 focuses on the comparison of two means. The connection with linear regression is not at all clear. Also, in the introduction the authors mention various examples of traits that involve counts and as such do not obey the normality or homogeneity assumption. Readers would expect that methods for count data are considered, e.g. generalized linear models. It's actually not clear after the introduction what will be the overall structure of the paper and how many different cases will be considered.

p.3: Section 3 really lacks an agricultural example to motivate and illustrate the method. There is a large branch in agricultural sciences that relies primarily on field experiments with annual crops. In these settings there is rarely the luxury to use an adaptive design. The authors are still assuming normality here, whereas the introduction motivated the need to assume heterogeneity of variance using the example of count data. As pointed out before, there are tailored methods for count data, which would be preferred in this setting, and based on the introduction, this is what I would have expected in Section 2.

In Section 3, the authors consider a second example, i.e. a dose-response relationship where the interest is in finding the dose (denoted as Delta) that doubles the time to emergence of a weed. The authors suggest what they denote as a c-optimal design, but they give no reference whatsoever for this standard design problem. They assert that the information about Delta can be increased by minimizing the prior expectation of the variance of the parameter vector, but I am not seeing any proof of this claim. Moreover, what we really want is the optimal design for estimating Delta, not a design that "increases the information about Delta" (compared to what baseline design?). Finally, it is not even explained how the "optimal" numbers of replications are to be determined, let alone the number of x levels and their placement along the x-axis. All of this looks to be a standard design problem in nonlinear regression for which canned solutions are available, but the authors make it seem like an entirely novel problem. Showing in simulations that the equally replicated designs can be beaten is knocking down a strawman. Incidentally, there is quite a lot of literature on time-to-event data as used in weed science, and there are several good R packages for this kind of data. The lognormal is but one distributional model for such data, and I am wondering why the authors picked this particular distribution. Perhaps because some of the algebra is straightforward? What if the times-to-event follow a Weibull or Gompertz distribution, for example?

Section 4.1: I have tried hard to understand the rationale of the computations leading to the prior for this example but failed. Section 4.2 only mentions right-censoring in passing, focusing on the derivation of a design under a right-censored model. Such a model was never mentioned in Section 3, so as a reader I am confused here and do not know what is going on. Why censoring all of a sudden? I think this kind of modelling needs a much more thorough introduction for the intended audience, which I think is agronomists and weed scientists. Censoring is a key property of such data, so just mentioning this in passing as if everybody knew this does not seem appropriate for the intended audience. Regarding the example, I note that the authors are assuming 10 seeds per pot. Each seed may have a different germination time, and it is most informative to analyse individual plant data. These data are clustered, however, and this would require fitting random pot effects.

Section 5 considers a logistic regression model to assess synergistic effects between two quantitative inputs. The response is binary, and germination of individual seeds is considered as a case in point. This is just a hypothetical example; no real experiment is considered. The main challenge with this kind of data, again, is clustering. Several seeds are usually placed on the same experimental unit (pot, petri dish, filter paper), and the number of germinated counts is assessed per experimental unit. These binomial counts are very typically overdispersed, which is the main challenge in the analysis. The whole derivation in this section ignores overdispersion, so I am afraid it is not relevant for most practical purposes.

In summary, my impression is that the authors do not have a very intimate connection with experimenters and real experiments in the weed sciences and that the methods they propose are not very practical. What I find mainly lacking are convincing real examples that fully motivate the proposed methods. Most of the derivations are based on rather hypothetical and artificial settings.

I do applaud the general intention of the paper, which I would say is focused on the optimal design for regression models in weed science. I believe, however, that this work would benefit greatly from a closer collaboration with weed scientists. If examples can be presented where the proposed designs have really been found to work for weed scientists, this could make a much more convincing case. This may also mean rather different designs presented as desirable in the end.

Reviewer #3: The study is focused on one of the most important issues in weed science, experimental designs. The concepts were first described and given their theoretical background. Although the data presented by the authors to clarify their ideas were sufficient, more realistic and powerful results would be obtained from herbicides instead of EtOH. Under different scenarios, the two-phase design may yield lower MSEs than a single-stage design.

Please check the sentence in line 80-81 “The null hypothesis is H0: µ1- µ 2 = 0, and the alternative hypothesis is H1: µ1- µ 2 = 0”.

Please check the value in line 216 “x3 = 0.5”.

Please add separate captions for each figure.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Ahmet Tansel Serim

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Sep 15;16(9):e0257472. doi: 10.1371/journal.pone.0257472.r002

Author response to Decision Letter 0


26 Jul 2021

Reviewer #1: The manuscript aims to introduce and explain methodology for better planning of agricultural experiments through the use of optimal designs. This is a good idea as optimal designs aren't used much, also not in agricultural sciences. However, the manuscript could provide even more explanations and insights if it's the intention to have more practitioners use these methods.

Specifically, the manuscript looks at 3 practical situations, which occur frequently in practice. In each situation, theory is introduced very briefly, with some concepts left unexplained, followed by simulation results and possibly a practical example. For instance, no literature on mixture models or optimal design for mixture models is cited; this is disappointing.

Response: Thank you for your feedback. We added more theoretical and conceptual explanations, particularly in Section 4. The section is revised substantially to thoroughly explain the concepts. Furthermore, we added references related to mixed-effects models and optimal designs. [e.g., Lines 205-208; 326-329]

Specific comments:

lines 26-27: Are estimation of standard errors really modified?!

Response: The Tukey-Kramer method adjusts the standard error to account for unequal sample sizes when compared to the traditional Tukey’s method (which was devised for equal sample size). We edited the language to avoid confusion. Thank you for your comment. [Lines 30-33]
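
As a small illustration of this adjustment (a hedged sketch with invented data and treatment labels, not taken from the manuscript), base R's TukeyHSD() applies the Tukey-Kramer standard error automatically when group sizes differ:

    # Hypothetical unbalanced experiment: three treatments with
    # unequal replication (invented data, for illustration only).
    set.seed(1)
    trt <- factor(rep(c("A", "B", "C"), times = c(4, 8, 8)))
    y   <- rnorm(length(trt), mean = c(10, 12, 12)[as.integer(trt)], sd = 2)
    fit <- aov(y ~ trt)
    # With unequal n, TukeyHSD() uses the Tukey-Kramer standard error
    # sqrt((MSE / 2) * (1/n_i + 1/n_j)) for each pairwise comparison.
    TukeyHSD(fit)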

lines 59-64: It would be helpful if it was spelled out that three "practical situations" are dealt with in detail in the manuscript.

Response: We think that this is a great idea for readers. We spelled out the three practical situations (parameters of interest) and provided associated section numbers. [Lines 81-85]

lines 98-107: Which statistical software was used?

Response: R Version 4.0.2 was used. It is added and cited in the revised manuscript. [Lines 127-128]

line 160: No reference for the c-optimal design.

Response: References for the c-optimal design are added. Thank you. [Lines 205-206]

lines 244-310: No references to existing literature on mixture models or optimal design for mixture are provided. One place to start:

https://doi.org/10.1002/env.666

https://doi.org/10.1016/j.csda.2017.08.006

Response: Thank you for the suggested references. In addition to your suggested references, we added a couple more. [Line 331]

Reviewer #2: This paper considers several examples, modeled on applications in weed science, where a good design involves unequal replication. This finding may come as a surprise to some weed scientists, because it is very common in practice to use equally replicated designs.

While the intention of this paper is very laudable, I find most of the examples unconvincing because they are rather artificial. Perhaps the most important limitation of the proposed methods is the need to run an experiment twice, using the first to optimize the design for the second. Most experimenters in weed science will simply not have the time to run two experiments in order to get a single experiment's worth of results. The paper lacks convincing examples, where such designs would really be practical and where weed scientists would indeed be willing to use them.

Response: Thank you for your careful review. We agree that some researchers cannot afford the time to run two experiments. This article demonstrates the potential benefit of a two-phase (adaptive) design. We (the authors) did not mean to argue that all experiments must be run twice. Via simulations, we demonstrated that a single-phase optimal design outperforms the balanced design which is commonly seen in the literature. We recommend that researchers consider an optimal design (or any appropriate variation) based on their clear objective (a parameter of interest) to increase statistical power. Based on our academic and working experiences, some journal reviewers and practitioners believe that the effect of a treatment is more convincing when the experiment was replicated (two seasons of data). The demonstrated two-phase (adaptive) design can be beneficial in this situation.

Major comments

p.1: The Duncan test does not control the family-wise error rate and should not be recommended. The t-test does not either, but the Tukey test does. I think it is important to mention these facts when introducing these popular tests. The current wording "seem to be alternative choices among researchers" leaves the authors' view open at this point and may even raise the impression that the choice does not matter.

Response: This is a great point. We added the issue of family-wise error rate to the introduction. [Lines 16-21]

p.2: To introduce the idea of an unbalanced design, the authors consider the case of a linear regression with four equally spaced doses. The fact that the best design allocates half the observations to the smallest and largest dose and none to the intermediate ones does not mean the design is unbalanced; it's quite balanced between the highest and lowest dose. It is a well-known fact that with linear regression, only two doses are needed to estimate the regression coefficient and that this is the optimal design. This design does rest on the assumption that there is no departure from linearity, and this is a very strong assumption that many practitioners may not be willing to make. Thus, I would hardly ever recommend such a design to a weed scientist in practice.

Response: This is also a great point. The example of the linear model was to help readers understand the concept of an experimental design (distribution of experimental units). Weed scientists may be (or should be) interested in an adequate strength of a treatment from a variety of perspectives such as the effectiveness of weed control, the impact on the environment, and the cost. In this regard, finding an adequate concentration (or any quantification of treatment strength) would be an important research objective. In a later section, we demonstrate a statistical model and an experimental design to find such a parameter in terms of delaying weed emergence. This point is added to the introduction of the revised manuscript. [Lines 67-76]

p.2, bottom: The authors use the term "two-phase design" for an approach that takes a first experiment to inform about the parameters of the model to be fitted and then, in a second phase, adapts the sample size to achieve a desired precision. I know at this point what the authors are intending but notice at the same time that the term "two-phase design" can have quite a different meaning in other contexts. See, e.g.,

Brien, C. J. (2019). Multiphase experiments with at least one later laboratory phase. II. Nonorthogonal designs. Australian & New Zealand Journal of Statistics, 61:234-268.

I am more familiar with the term "adaptive design" for the approach the authors are pursuing. These types of design are very common in clinical trials, and there is a very rich literature on this kind of design. The authors do not seem to be aware of this literature. Perhaps they can consider adopting this term to avoid confusion. Moreover, delving into the relevant literature may be useful. The authors do mention some work along these lines in a cursory manner at the very end of the paper, but it would be appropriate to state this from the very start, making it clear that the paper presents nothing that is in itself new and that much of the relevant theory was developed in the context of clinical trials, where one can find ample guidance for the types of design the authors are advocating.

Response: We appreciate your expertise on this. Throughout the paper, we replaced "two-phase design" with "adaptive design" and introduced the common use of adaptive designs in clinical trials early in the paper (in the introduction). [Lines 43-48] We made it clear that adaptive designs are not new in scientific communities and that they are discussed in this article in the context of agricultural studies. [Lines 97-100]

p.3: Having introduced linear regression as one example of an "unbalanced design" in the introduction, it is a bit surprising that Section 2 focuses on the comparison of two means. The connection with linear regression is not at all clear. Also, in the introduction the authors mention various examples of traits that involve counts and as such do not obey the normality or homogeneity assumption. Readers would expect that methods for count data are considered, e.g. generalized linear models. It's actually not clear after the introduction what will be the overall structure of the paper and how many different cases will be considered.

Response: You made a valid point. We did not intend to connect the comparison of two means with linear regression. To avoid such confusion, we edited our language in the example of linear regression in the introduction (so it does not sound like a connected example). [Lines 57-62] In addition, we could not include all possible kinds of variable types in this article. We briefly mentioned the different nature of count data (non-normality and/or heterogeneity) and added some references related to experimental designs under GLMs. [Lines 148-151] Finally, we introduced the overall structure via the three parameters of interest discussed in the paper. [Lines 81-85]

p.3: Section 3 really lacks an agricultural example to motivate and illustrate the method. There is a large branch in agricultural sciences that relies primarily on field experiments with annual crops. In these settings there is rarely the luxury to use an adaptive design. The authors are still assuming normality here, whereas the introduction motivated the need to assume heterogeneity of variance using the example of count data. As pointed out before, there are tailored methods for count data, which would be preferred in this setting, and based on the introduction, this is what I would have expected in Section 2.

Response: We assume that this comment refers to Section 2 (comparing two groups), not Section 3. Our intention was to start with the simplest case for readers at diverse levels. Some journals, reviewers, researchers, and practitioners believe that a conclusion is more convincing when results are reproduced (e.g., two seasons of data). As mentioned in the introduction, the impact of violating the normality assumption diminishes as the sample size increases (e.g., a Poisson distribution approaches a normal distribution), but a large sample size does not mitigate a violation of the homogeneity of variance. In large-sample studies, the homogeneity assumption is not required, and the two-sample t-test works under the unequal-variance assumption. In this regard, researchers can still benefit from an adaptive design (if it is feasible) by estimating the unequal variances after the first phase of an experiment. We added a small section to address these points at the end of Section 2. [Section 2.3]
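
As a minimal sketch of that last idea (invented numbers; an illustration rather than code from the paper), the standard deviations estimated in the first phase can be plugged into a Neyman-style allocation, which minimizes Var(x̄1 − x̄2) = σ1²/n1 + σ2²/n2 by sending more second-phase units to the noisier group:

    # Phase 1: a small pilot with equal allocation (invented data).
    set.seed(2)
    pilot1 <- rnorm(6, mean = 10, sd = 1)   # group 1: small variance
    pilot2 <- rnorm(6, mean = 12, sd = 4)   # group 2: large variance
    s1 <- sd(pilot1)
    s2 <- sd(pilot2)
    # Phase 2: split the remaining units in proportion to the estimated
    # standard deviations (Neyman-style allocation).
    n_remaining <- 28
    n1 <- round(n_remaining * s1 / (s1 + s2))
    n2 <- n_remaining - n1
    c(n1 = n1, n2 = n2)   # more units go to the noisier group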

In Section 3, the authors consider a second example, i.e. a dose-response relationship where the interest is in finding the dose (denoted as Delta) that doubles the time to emergence of a weed. The authors suggest what they denote as a c-optimal design, but they give no reference whatsoever for this standard design problem. They assert that the information about Delta can be increased by minimizing the prior expectation of the variance of the parameter vector, but I am not seeing any proof of this claim. Moreover, what we really want is the optimal design for estimating Delta, not a design that "increases the information about Delta" (compared to what baseline design?). Finally, it is not even explained how the "optimal" numbers of replications are to be determined, let alone the number of x levels and their placement along the x-axis. All of this looks to be a standard design problem in nonlinear regression for which canned solutions are available, but the authors make it seem like an entirely novel problem. Showing in simulations that the equally replicated designs can be beaten is knocking down a strawman. Incidentally, there is quite a lot of literature on time-to-event data as used in weed science, and there are several good R packages for this kind of data. The lognormal is but one distributional model for such data, and I am wondering why the authors picked this particular distribution. Perhaps because some of the algebra is straightforward? What if the times-to-event follow a Weibull or Gompertz distribution, for example?

Response: References for the c-optimal design have been added in the revised manuscript. [Lines 204-205] The c-optimal design is devised to minimize the expected asymptotic variance of the maximum likelihood estimator for the parameter of interest. This is what we meant by "increase the information about Delta," but it was not accurate language, so we corrected it in the revised manuscript. In addition, we further explained (using calculus notation) how the optimal number of replications is to be determined (with an explanation in lay language). [Lines 205-211] There are many available experimental designs (not only the c-optimal design) which can easily outperform the balanced (equally replicated) design, and we wanted to inform agricultural researchers, who commonly choose the balanced design, of this fact and to demonstrate it. [Lines 212-216] In addition, we added references for the discussion of survival analysis in weed science. The choice of the lognormal distribution is for mathematical convenience and practical interpretation. [Lines 177-178]
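
To make the estimand concrete: under the lognormal model, the mean of log time is β0 + β1x, so increasing the dose by Δ multiplies the median emergence time by exp(β1Δ), and the dose that doubles it is Δ = log(2)/β1. The sketch below (invented data and parameter values; survreg() from R's survival package is one standard way to fit a right-censored lognormal model, not necessarily the authors' implementation) estimates Δ with a delta-method standard error:

    # Fit a right-censored lognormal dose-response model (invented data).
    library(survival)
    set.seed(3)
    x      <- rep(c(0, 0.25, 0.75, 1), each = 10)          # doses
    logT   <- 1.0 + 0.9 * x + rnorm(length(x), sd = 0.5)   # true beta1 = 0.9
    time   <- pmin(exp(logT), 21)                          # study ends on day 21
    status <- as.numeric(exp(logT) <= 21)                  # 0 = right-censored
    fit <- survreg(Surv(time, status) ~ x, dist = "lognormal")
    b1  <- unname(coef(fit)["x"])
    Delta <- log(2) / b1                  # dose that doubles the median time
    # Delta method: Var(Delta) ~ (log 2)^2 * Var(b1) / b1^4
    se <- sqrt(log(2)^2 * vcov(fit)["x", "x"] / b1^4)
    c(Delta = Delta, SE = se)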

Section 4.1: I have tried hard to understand the rationale of the computations leading to the prior for this example but failed. Section 4.2 only mentions right-censoring in passing, focusing on the derivation of a design under a right-censored model. Such a model was never mentioned in Section 3, so as a reader I am confused here and do not know what is going on. Why censoring all of a sudden? I think this kind of modelling needs a much more thorough introduction for the intended audience, which I think is agronomists and weed scientists. Censoring is a key property of such data, so just mentioning this in passing as if everybody knew this does not seem appropriate for the intended audience. Regarding the example, I note that the authors are assuming 10 seeds per pot. Each seed may have a different germination time, and it is most informative to analyse individual plant data. These data are clustered, however, and this would require fitting random pot effects.

Response: We appreciate your comment on the structure of the paper. To explain the right-censored model and the prior specification more thoroughly, we restructured Section 4. In the revised manuscript, we first introduce the right-censored model under the log-normal assumption (Section 4.1) and the prior specification (Section 4.2); we then introduce the applied example (Section 4.3) and retrospective simulations (Section 4.4). The experiment in Section 4.3 was a small-scale (pilot) study conducted on a balcony, and we tried hard to reduce random effects due to soil characteristics, irrigation, and location. In these major changes, comments of the other reviewers are also addressed, and some typos are corrected.

In a large-scale field experiment, as you pointed out, we believe that substantial random effects would exist due to the clustering, and they should be accounted for via a more complex model such as a mixed-effects model. This is beyond the scope of this article, and it is discussed at the end of Section 4 with references related to optimal designs under mixed-effects models (Section 4.5). [Section 4 is substantially revised and restructured.]

Section 5 considers a logistic regression model to assess synergistic effects between two quantitative inputs. The response is binary, and germination of individual seeds is considered as a case in point. This is just a hypothetical example; no real experiment is considered. The main challenge with this kind of data, again, is clustering. Several seeds are usually placed on the same experimental unit (pot, petri dish, filter paper), and the number of germinated counts is assessed per experimental unit. These binomial counts are very typically overdispersed, which is the main challenge in the analysis. The whole derivation in this section ignores overdispersion, so I am afraid it is not relevant for most practical purposes.

Response: Your point is well taken, though it is beyond the scope of this article. We added sentences to highlight the importance of over-dispersion, with relevant references, at the end of Section 5. Thank you for your expertise and valuable feedback. [Lines 397-404]
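
For readers who meet this issue, one standard remedy (our illustration, not part of the manuscript; the data are invented) is a quasi-binomial GLM, which estimates a dispersion parameter and inflates the standard error of the synergy (interaction) term accordingly:

    # Hypothetical pot-level germination counts out of 10 seeds each.
    set.seed(4)
    x1 <- rep(c(0, 1), each = 20)                  # level of treatment 1
    x2 <- rep(rep(c(0, 1), each = 10), times = 2)  # level of treatment 2
    p  <- plogis(-0.5 + 0.8 * x1 + 0.8 * x2 + 0.6 * x1 * x2)
    germ <- rbinom(40, size = 10, prob = p)
    # family = quasibinomial estimates a dispersion parameter, so the
    # standard errors are inflated when the counts are overdispersed.
    fit <- glm(cbind(germ, 10 - germ) ~ x1 * x2, family = quasibinomial)
    summary(fit)$coefficients["x1:x2", ]           # the synergy term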

In summary, my impression is that the authors do not have a very intimate connection with experimenters and real experiments in the weed sciences and that the methods they propose are not very practical. What I find mainly lacking are convincing real examples that fully motivate the proposed methods. Most of the derivations are based on rather hypothetical and artificial settings.

I do applaud the general intention of the paper, which I would say is focused on the optimal design for regression models in weed science. I believe, however, that this work would benefit greatly from a closer collaboration with weed scientists. If examples can be presented where the proposed designs have really been found to work for weed scientists, this could make a much more convincing case. This may also mean rather different designs presented as desirable in the end.

Response: We understand that there is a gap between real experiments and the simulations/examples provided in this article. At the same time, we see a gap between current practice (e.g., balanced designs) and a general understanding of experimental designs from the statistical perspective (even under simplified hypothetical situations). The primary point of this article is to demonstrate the importance of experimental designs (breaking the misconception of the superiority of the balanced design) and adaptive designs (i.e., the entire set of experimental units does not have to be spent at once, and adaptive decisions can improve the precision of parameter estimation). We greatly appreciate your expert perspective, and most of your feedback is addressed by pointing readers to the relevant references. Thank you.

Reviewer #3: The study is focused on one of the most important issues in weed science, experimental designs. The concepts were first described and given their theoretical background. Although the data presented by the authors to clarify their ideas were sufficient, more realistic and powerful results would be obtained from herbicides instead of EtOH. Under different scenarios, the two-phase design may yield lower MSEs than a single-stage design.

Response: Thank you for your valuable perspective. We agree that there are more realistic and powerful herbicides in weed science. The EtOH experiment was conducted under a restricted budget and logistics (during the pandemic), and it was used for the purpose of demonstrating the experimental design and parameter estimation. In the revised manuscript, this point is made clear in Section 4.3. [Lines 280-282]

Please check the sentence in line 80-81 “The null hypothesis is H0: µ1- µ 2 = 0, and the alternative hypothesis is H1: µ1- µ 2 = 0”.

Response: Thank you for catching the typo. We meant H1: µ1 − µ2 ≠ 0. It is corrected. [Line 107]

Please check the value in line 216 “x3 = 0.5”.

Response: Thank you for catching the typo. We meant x4 = 0.5. It is corrected, and associated typos are also corrected. [Lines 296, 298, and 306]

Please add separate captions for each figure.

Response: Thank you for your suggestion which is helpful for readers. We added separate captions (explanations) for each figure. [See captions of Fig 1-8.]

Attachment

Submitted filename: Response to Reviewers 2021-07-26.pdf

Decision Letter 1

Ahmet Uludag

16 Aug 2021

PONE-D-21-17452R1

Applications of statistical experimental designs to improve statistical inference in weed management

PLOS ONE

Dear Dr. Kim,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Your last version is pretty good. I am looking for some additional changes that will improve the quality of the paper; I expect they will bring it to publishable quality.

Please submit your revised manuscript by Sep 30 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ahmet Uludag, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments (if provided):

I would like to accelerate the publication of your paper. If you make the relevant changes requested by the third referee and prepare a detailed rebuttal, I will check the revision myself after the corrections. I think the suggestions are very clear.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

Reviewer #4: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #3: (No Response)

Reviewer #4: Article: Applications of statistical experimental designs to improve statistical inference in weed management

General Comment

Beyond the weed science branch, this is a useful study that will adapt easily to the broader agricultural field. The article is written from an innovative perspective. Although the study is theoretically rich, there are doubts about its transfer to practice. The number of iterations (1000) of the simulation study was kept at an insufficient level; it is known in the literature that results change as the number of simulations increases, so the number of iterations should be increased. The simulation steps can be made more understandable by showing them in a diagram. The distributions used in the simulation study do not cover all conditions in terms of the shape of the distribution; therefore, the t(10), t(5), β(5,10), β(10,5) and χ2(3) distributions in particular should be included in the study. In addition, when generating random numbers, they should be produced and reported taking different effect sizes into account. Could different criteria be used besides the bias, variance, and MSE arguments (Tables 1 and 2)? Are these criteria a sufficient basis for comparing the results? Comparing the power (1 − β) and Type I error rate (α) of each test in the scenarios created would strengthen the study. Which R packages were used for the simulations? If no package was used and new functions were written, this should be stated in the manuscript.
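
A minimal sketch of the kind of check being requested (the group sizes, distribution, and iteration count below are our illustrative assumptions) estimates the empirical Type I error rate of the two-sample t-test under a skewed χ2(3) distribution:

    # Empirical Type I error under chi-square(3) data; H0 is true.
    set.seed(5)
    n_iter <- 10000
    reject <- replicate(n_iter, {
      g1 <- rchisq(10, df = 3)
      g2 <- rchisq(10, df = 3)          # same distribution, equal means
      t.test(g1, g2)$p.value < 0.05
    })
    mean(reject)                        # should be close to the nominal 0.05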

Specific comments

Abstract

This part is written very generally; the materials of the study, the methods applied, and the results should be mentioned. The authors need to rewrite this section and better convey its novelty.

Introduction

Lines 16–17: The criticism made of the Duncan multiple comparison test does not reflect the truth. In recent simulation studies, Duncan's superiority over other methods has been shown. These statements should be supported by the literature.

Line 22: The sentences do not flow well at the start of the paragraph. Please rephrase.

Lines 22 and 25: These statements should be supported by the literature.

Lines 30 and 33: These statements should be supported by the literature.

Lines 43 and 44: First, why wasn't the sample size determined using power analysis instead? Second, instead of dividing the dataset and estimating group variances, why not use cross-validation by dividing the dataset into testing and training sets?
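
For reference, such a power analysis takes one call in base R (the effect size, standard deviation, and target power below are illustrative assumptions, not values from the study):

    # Sample size per group to detect a mean difference of 2 units with
    # sd = 2, 80% power, and a 5% two-sided significance level.
    power.t.test(delta = 2, sd = 2, sig.level = 0.05, power = 0.80)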

The differences between the two-stage design and the Bayesian approach should be clearly stated.

Materials and Methods and Results

This section is difficult to follow; the results, especially, are not well structured or clearly reported.

Only the normal distribution was used in the simulations for two-group comparisons. It is known that the shape of the distribution affects the power of the test. Therefore, the distributions N(0,1), t(10), t(5), β(5,10), β(10,5) and χ2(3) should be used.

The simulation steps can be made more understandable by showing them with a schematic diagram.

It is useful to give the power of the test in tables as well as graphs.

Discussion

The discussion of the study's results in this section is rather poor; it reads more like a conclusion. This section needs revision, as only 8 references are cited in it.

Conclusion

This part does not exist. What is the take-home message of the current study? It should address the main aim (and the topic).

Reference

Taslan, E., Koşkan, Ö., & Altay, Y. (2021). Determination of the sample size on different independent K group comparisons by power analysis. Türkiye Tarımsal Araştırmalar Dergisi, 8(1), 34-41.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: Yes: Ahmet Tansel Serim

Reviewer #4: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Sep 15;16(9):e0257472. doi: 10.1371/journal.pone.0257472.r004

Author response to Decision Letter 1


21 Aug 2021

We greatly appreciate the opportunity to improve our manuscript. We have responded to all comments made by the reviewer, and our point-by-point responses are attached as a separate PDF file in our submission. Thank you.

Attachment

Submitted filename: Response to Reviewers 2021-08-17.pdf

Decision Letter 2

Ahmet Uludag

2 Sep 2021

Applications of statistical experimental designs to improve statistical inference in weed management

PONE-D-21-17452R2

Dear Dr. Kim,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ahmet Uludag, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Congratulations. But please check the minor problems the reviewer pointed out.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #4: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #4: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #4: Please consider the following minor corrections.

In Line 161, "t(10), t(5), β(5, 10), β(10, 5)" should be used instead of "T(10), T(5), Beta(5, 10), Beta(10, 5)".

In Line 361, is the number of iterations 1000 or 10000? Please check.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #4: No

Acceptance letter

Ahmet Uludag

6 Sep 2021

PONE-D-21-17452R2

Applications of statistical experimental designs to improve statistical inference in weed management 

Dear Dr. Kim:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ahmet Uludag

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Data. Data observed after the first phase of the experiment.

    (CSV)

    S2 Data. Data observed after the second phase of the experiment (combined with the first phase).

    (CSV)

    Attachment

    Submitted filename: Response to Reviewers 2021-07-26.pdf

    Attachment

    Submitted filename: Response to Reviewers 2021-08-17.pdf

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting information files.


    Articles from PLoS ONE are provided here courtesy of PLOS

    RESOURCES