Abstract
When applying analysis of variance, the sample sizes may not be previously known, so it is more appropriate to consider them as realizations of random variables. A motivating example is the collection of observations during a fixed time span in a study comparing, for example, several pathologies of patients arriving at a hospital. This paper extends the theory of analysis of variance to those situations considering mixed effects models. We will assume that the occurrences of observations correspond to a counting process and the sample dimensions have Poisson distribution. The proposed approach is applied to a study of cancer patients.
Keywords: Random sample sizes, mixed effects, L extensions models, F-tests, counting processes, cancer registries
2010 Mathematics Subject Classifications: 62J12, 62J10, 62J99
1. Introduction
In some applications of analysis of variance in medicine, the social sciences, economics or agriculture, it is more appropriate to regard the sample sizes as random variables. These situations commonly occur when there is a fixed time span for collecting the observations; other examples arise when some other resource is limited. A motivating example is the collection of data from patients with several pathologies arriving at a hospital during a fixed time span. The number of patients for each pathology is not known in advance, and a replication of the study during a different time period of the same length would result in samples of different sizes. Therefore, if we plan to conduct just one study to compare the pathologies, it is more appropriate to consider the sample sizes as realizations of random variables [15,17,20]. Another important case arises when one of the pathologies is rare since, in that case, the desired number of patients in the sample may not be achieved [19]. In the cited studies, fixed effects ANOVA was applied. Here we extend those results to mixed effects models with random sample sizes.
The current approach must be based on an adequate choice of the distribution of . In this paper, we will assume that the occurrence of observations corresponds to independent counting processes. An illustrative example of this is the aforementioned case, concerning the comparison of pathologies. This leads us to consider the assumption of being independent and Poisson distributed with parameters , , [12,15,17–20]. Since we need to have at least one observation per treatment, we will consider the random variables , , obtained truncating the random variables for , (see Appendix 1). Through the independence of , the variable has truncated Poisson distribution with parameter
In other situations, it will be more appropriate to consider other discrete distributions for the random sample sizes, such as:
- the Binomial distribution, when there exists an upper bound for the sample sizes which may not be attained (either owing to occurrences of failures or for some other reason). An illustrative example is when a planned number of patients are approached but only a proportion of them give consent to be included in the study [16,17];
- the Negative Binomial distribution, which can be used as an alternative to the Poisson distribution when the observations are overdispersed with respect to a Poisson distribution.
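As a small illustration of the zero-truncated Poisson assumption adopted in this paper, the following sketch evaluates the probability function of a Poisson variable conditioned on being at least 1 and checks its mean numerically. The function names and the rate `lam = 1.5` are hypothetical and chosen only for illustration; they do not come from the study.

```python
import math


def truncated_poisson_pmf(n, lam):
    """P(N = n | N >= 1) for N ~ Poisson(lam): the zero class is omitted."""
    if n < 1:
        return 0.0
    log_p = -lam + n * math.log(lam) - math.lgamma(n + 1)
    return math.exp(log_p) / (1.0 - math.exp(-lam))


def truncated_poisson_mean(lam):
    """Closed-form mean of the zero-truncated Poisson distribution."""
    return lam / (1.0 - math.exp(-lam))


lam = 1.5  # hypothetical rate, for illustration only
support = range(1, 60)  # the mass beyond 60 is negligible for this rate
probs = [truncated_poisson_pmf(n, lam) for n in support]
total = sum(probs)
mean = sum(n * p for n, p in zip(support, probs))
print(total, mean, truncated_poisson_mean(lam))
```

The truncated mean is always larger than the untruncated rate, which is one reason the truncation has to be carried through to the distribution of the test statistics.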
This paper is structured as follows. In Section 2, we present the formulation of the mixed models in the context of random sample sizes. The test statistics and their conditional and unconditional distributions are obtained in Section 3. Section 4 presents an application based on real medical data, namely on patients affected by cancer, in order to illustrate the usefulness of our approach. Finally, some concluding remarks are made in Section 5.
2. Model
When the sample sizes in mixed models are random variables, we will very likely get different numbers of observations per treatment (combination of factor levels), that is, an unbalanced design. To cope with unbalanced situations, a broader class of models, designated as L extensions or L models, was developed in [3] and [14]. Using the L extensions in the formulation of the mixed models with random sample sizes allows us to deal with the lack of orthogonality caused by the unbalanced design.
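Let us suppose that the m components of $Z$ correspond to the treatments of a linear model and let

$$L(\boldsymbol{n}) = D(\mathbf{1}_{n_1}, \dots, \mathbf{1}_{n_m}) \tag{1}$$

be the block diagonal matrix with principal blocks $\mathbf{1}_{n_1}, \dots, \mathbf{1}_{n_m}$, where $\mathbf{1}_n$ denotes the vector with all n components equal to 1 and $\boldsymbol{n} = (n_1, \dots, n_m)^\top$. Then

$$Y = L(\boldsymbol{n}) Z + e \tag{2}$$

corresponds to a model with sample sizes $n_1, \dots, n_m$, where $e$ is the error vector with null mean vector and variance–covariance matrix $\sigma^2 I_n$, with $I_n$ the $n \times n$ identity matrix and $n = \sum_{i=1}^{m} n_i$.

Let us consider that

$$Z = X_0 \beta_0 + \sum_{i=1}^{w} X_i \beta_i \tag{3}$$

where $\beta_0$ is fixed with $c_0$ components and $\beta_1, \dots, \beta_w$ are random and independent, with null mean vectors and variance–covariance matrices $\sigma_i^2 I_{c_i}$, where $c_i$, $i = 1, \dots, w$, denote the number of components of $\beta_i$. Thus $Z$ has mean vector and variance–covariance matrix given by

$$\mu = X_0 \beta_0, \qquad V = \sum_{i=1}^{w} \sigma_i^2 M_i,$$

with $M_i = X_i X_i^\top$, $i = 1, \dots, w$, where the matrices $X_i$ have m rows and $c_i$, $i = 0, \dots, w$, columns, see e.g. [5,8,23]. We point out that $Z$ and $Y$ are random vectors with m and n components, respectively, since $L(\boldsymbol{n})$ is an $n \times m$ matrix.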
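To make the role of the block diagonal matrix of vectors of ones more concrete, the sketch below builds it for a small hypothetical design and applies its Moore-Penrose inverse to a vector of observations. Because the matrix has full column rank, its Moore-Penrose inverse has blocks equal to the transposed vectors of ones divided by the sample sizes, so it maps the observations to the treatment sample means. The names `build_L`, `pinv_L` and the data are assumptions made for this illustration.

```python
def build_L(sizes):
    """n x m block diagonal matrix whose i-th block is a column of n_i ones."""
    m = len(sizes)
    rows = []
    for i, n_i in enumerate(sizes):
        for _ in range(n_i):
            row = [0.0] * m
            row[i] = 1.0
            rows.append(row)
    return rows


def pinv_L(sizes):
    """Moore-Penrose inverse of build_L(sizes): row i holds 1/n_i on block i."""
    n = sum(sizes)
    out, start = [], 0
    for n_i in sizes:
        row = [0.0] * n
        for k in range(start, start + n_i):
            row[k] = 1.0 / n_i
        out.append(row)
        start += n_i
    return out


def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


sizes = [2, 3, 1]                      # hypothetical sample sizes
y = [1.0, 3.0, 4.0, 5.0, 6.0, 10.0]    # hypothetical observations
means = matvec(pinv_L(sizes), y)       # treatment sample means
fitted = matvec(build_L(sizes), means) # each observation replaced by its mean
print(means, fitted)
```

With unequal sample sizes the columns of the matrix have different lengths, which is exactly the unbalance that the L extensions are meant to handle.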
3. Test statistics and their distributions
In this section, we obtain the test statistics and their conditional distribution and unconditional distribution, under the assumption that we have random sample sizes. We will start by presenting some important results about L extensions.
Let us assume that has orthogonal block structure, so the matrices commute and they will be linear combinations of pairwise orthogonal projection matrices , see [2]. Thus we have
and
where , . With , and , we also have
see e.g. [1,2,4,6]. Let's consider that the row vectors of , , constitute an orthonormal basis for the range space of , , , then we have
with .
Let be the Moore–Penrose inverse of matrix ; then the orthogonal projection matrices (OPM) on and on its orthogonal complement are [22]
So, with , we have
When is independent of , i.e. is normal with null mean vector and variance–covariance matrix , then and are also independent, since they have normal joint distribution and null cross-covariance matrices. Therefore
and
are independent.
Since the column vectors of are linearly independent we have [22]
So we can consider [3]
since , independent of , then independent of
(4) |
where has chi-square distribution with
degrees of freedom, .
Let us now observe that has mean vector and variance–covariance matrix given by
With , we will have
and
has mean vector and variance–covariance matrix
Being and the OPM on and , with rank and , , respectively and and the matrices which the row vectors constitute an orthonormal base to and , , we have
3.1. Fixed sample sizes
Let us now address the hypothesis tests for the canonical variance components [13], , assuming that, with
So, let's consider
which has null mean vector and variance–covariance matrix with
We intend to test the hypothesis
(5) |
When holds, we have
and consequently
Therefore, when holds, has null mean vector and variance–covariance matrix and has chi-square distribution with degrees of freedom, , [10].
Since is independent of S, is also independent of S, Due to this, when holds, the statistic
(6) |
has central F distribution with , , and degrees of freedom, , named as conditional distribution, and might be used as the test statistic [21]. Moreover, the tests with the statistic , are unbiased, e.g. [9,10].
3.2. Random sample sizes
Let us consider that is the realization of a random vector , which means that the samples will have random dimensions. In this section, we will focus on the case where
for this reason the previous results need to be unconditioned in order to .
Let us now suppose that we intend to test the hypothesis
where is a general parameter, and the test is unbiased whatever . So, denoting by [] the probability of rejecting for a significance level α, given and the parameter [the probability of rejecting , given and ], we have
(7) |
Unconditioning (7) in order to , we still obtain
and the test remains unbiased.
So, since the tests for the hypothesis are unbiased whatever , we can conclude that they still remain unbiased after unconditioning.
Let us assume that the occurrence of observations corresponds to independent counting processes, which lead us to consider that have truncated Poisson distribution with parameters , . Furthermore, to perform inference we also consider that .
In order to avoid unbalanced cases we will assume that we have a global minimum dimension for the samples [12,20]. Therefore, considering , with , we may take the probability
where
(8) |
as defined in (A1), Appendix 1, which is dedicated to the truncated Poisson distribution.
Consequently, the unconditional distribution of , when the hypothesis holds, will be given by, e.g. [12,20],
(9) |
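The unconditioning step can be sketched numerically. The code below assumes that the unconditional distribution is a mixture of central F distributions with g and n - m degrees of freedom, weighted by the probabilities of a Poisson total sample size n conditioned on being at least the global minimum dimension. This is one reading of the construction described above, not a transcription of Equation (9), and all function names are hypothetical. The F distribution function is computed from the regularized incomplete beta function using only the standard library.

```python
import math


def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc_reg(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_cdf(z, d1, d2):
    """Distribution function of the central F distribution with (d1, d2) d.f."""
    if z <= 0.0:
        return 0.0
    return betainc_reg(d1 / 2.0, d2 / 2.0, d1 * z / (d1 * z + d2))


def truncated_poisson_weights(lam, n_min, n_cap):
    """P(N = n | N >= n_min) for n = n_min..n_cap, with N ~ Poisson(lam).

    The upper tail is summed directly up to n_cap, so n_cap must be large
    enough for the mass beyond it to be negligible.
    """
    pm = [math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
          for n in range(n_min, n_cap + 1)]
    total = sum(pm)
    return [p / total for p in pm]


def unconditional_cdf(z, g, m, lam, n_min, n_cap=None):
    """Assumed mixture form: sum over n of p_n * F(z | g, n - m)."""
    if n_cap is None:
        n_cap = n_min + 200
    w = truncated_poisson_weights(lam, n_min, n_cap)
    return sum(w_n * f_cdf(z, g, n - m)
               for n, w_n in zip(range(n_min, n_cap + 1), w))


# Hypothetical inputs: g = 3, m = 6 treatments, rate 535/365, minimum 12.
print(unconditional_cdf(3.289, 3, 6, 535 / 365, 12))
```

The normalizing constant is obtained by summing the upper tail directly rather than as one minus the lower tail, which avoids cancellation when the minimum dimension lies far in the tail of the Poisson distribution.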
4. An application to real data
In this section, we apply the proposed methodology to a dataset of patients affected by cancer. The data were collected from the U.S. Cancer Statistics Working Group [24] according to official guidelines and refer to the age at disease detection in 2009. We compare the results obtained using our approach with those of the common ANOVA.
We will consider a mixed model with one fixed effects factor and one random effects factor. The fixed effects factor is Gender, with two levels (Male and Female). Due to the large number of cancer types, we used simple random sampling to select three types of cancer from the available list. Thus the random effects factor is the Type of Cancer, and the selected types constitute a random sample.
Table 1 shows the selected types of cancer, the number of patients and the mean ages at the time of disease detection. This leads to six different treatments (two genders by three types of cancer). The global frequencies of these three types of cancer, for males and females, are provided in Appendix 2.
Table 1. Number of patients and sample mean ages.
Type of cancer | Patients (male) | Patients (female) | Sample mean age (male) | Sample mean age (female) |
---|---|---|---|---|
Stomach (digestive system) | 44 | 30 | 70.523 | 68.833 |
Melanomas of the skin | 134 | 99 | 63.791 | 57.303 |
Non-Hodgkin lymphoma | 123 | 105 | 63.382 | 66.286 |
According to (3), in this particular example we have
(10) |
where is fixed and and are random, independent, corresponding, respectively, to the random effects factor (Type of cancer) and interaction between the two factors. We have the design matrices
where ⊗ denotes the Kronecker product, and
Let's assume that
which means that
and consequently the matrices j=1,2, will be given by
and
The matrices , j=1,2, which are the OPM on , j=1,2, will be given by
with and
Moreover, and . Besides this, the OPM on , j=1,2, are
We will test the hypotheses
which are the hypotheses of absence of random effects and interaction between the two factors.
Given , when , j=1,2 holds, the conditional distribution of
is a central F distribution with , j=1,2, and degrees of freedom, .
In the calculations, we assume that
which means that, with high probability, we have , so is the global minimum dimension for the samples. Therefore the unconditional distribution of the statistics will be given by
(11) |
Besides this, due to the monotonicity property of the F distribution [12], when , we have
(12) |
so that
which gives us a lower bound for . Thus, from , we can obtain upper bounds for the quantiles of the unconditional distributions , j=1,2. If we use these upper bounds as critical values, we will have tests with sizes that do not exceed the theoretical values.
Remark
We can use these upper bounds for a preliminary test. If the test statistic exceeds the upper bound, it also exceeds the real critical value (obtained when using the unconditional distribution). When the test statistic is lower than the upper bound, one must compute the critical value by solving the equation , for z, j=1,2. To solve it we may truncate the series in Equation (11) according to the rule established in [11,19]. This way, restricting the sum to the term , with , where are the realizations of the , , we will have
Considering ε small, we choose each such that
(13) |
This inequality will be used to obtain the minimum value of needed to be a good approximation for the distribution , i=1,2, [11].
Usually the analysis starts with a test of interaction and proceeds to the tests of the main effects whenever the interaction is not significant. We do not follow this approach since we are interested in showing how these tests could be carried out through unconditioning [20].
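The truncation rule in the Remark can be illustrated as follows. The sketch assumes a simple reading of the rule: keep terms of the series up to the smallest index whose neglected tail probability, under the Poisson distribution conditioned on the global minimum dimension, does not exceed ε. The exact inequality of [11,19] may differ, and the function names, rate and tolerance below are hypothetical.

```python
import math


def tail_mass(lam, n_min, n_star, n_cap=None):
    """P(N > n_star | N >= n_min) for N ~ Poisson(lam), summed directly.

    Assumes lam is small relative to n_cap, so the mass beyond n_cap
    is negligible.
    """
    if n_cap is None:
        n_cap = n_star + 500

    def pmf(n):
        return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

    head = sum(pmf(n) for n in range(n_min, n_star + 1))
    tail = sum(pmf(n) for n in range(n_star + 1, n_cap + 1))
    return tail / (head + tail)


def truncation_point(lam, n_min, eps):
    """Smallest n_star >= n_min whose neglected tail mass is at most eps."""
    n_star = n_min
    while tail_mass(lam, n_min, n_star) > eps:
        n_star += 1
    return n_star


lam, n_min, eps = 535 / 365, 12, 1e-6  # hypothetical inputs
n_star = truncation_point(lam, n_min, eps)
print(n_star, tail_mass(lam, n_min, n_star))
```

When the rate is small compared with the minimum dimension, the weights decay quickly and only a handful of terms are needed.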
4.1. Random effects factor
For the second factor, we have
where
with the vector of the sample means with components 70.523, 68.833, 63.791, 57.303, 63.382, 66.286 and
So, for the numerator of the statistic we obtain
When , is the product by of a central chi-square with degrees of freedom, . In this case, we obtained
Therefore, the statistic's value, , is given by
If we use the common conditional distribution of , which corresponds to , since n=535, we will obtain the quantiles given in Table 2.
Table 2. The quantiles of the conditional distribution.
Values of α | 0.1 | 0.05 | 0.01 |
---|---|---|---|
Quantile | 2.094 | 2.622 | 3.819 |
So, since , we do not reject for the usual levels of significance.
Let's assume that we have 12 [16 and 19] observations as global minimum dimensions for the samples, which means that we consider [ and ]. Table 3 shows the upper bounds for the quantiles with probability , , of the unconditional distribution .
Table 3. Upper bounds for the quantiles.
Global minimum dimension | α = 0.1 | α = 0.05 | α = 0.01 |
---|---|---|---|
12 | 3.289 | 4.757 | 9.779 |
16 | 2.728 | 3.708 | 6.552 |
19 | 2.560 | 3.410 | 5.739 |
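The values in Table 3 can be checked with a short computation. The sketch below assumes that each upper bound is the quantile of a central F distribution with g = 3 and (minimum dimension - m) degrees of freedom, with m = 6 treatments. These values of g and m are inferred from the design and from the tabulated numbers rather than stated explicitly alongside the table, so they should be read as assumptions. The quantile is found by bisection on an F distribution function built from the regularized incomplete beta function.

```python
import math


def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def f_cdf(z, d1, d2):
    """Distribution function of the central F distribution with (d1, d2) d.f."""
    if z <= 0.0:
        return 0.0
    a, b = d1 / 2.0, d2 / 2.0
    x = d1 * z / (d1 * z + d2)
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_quantile(p, d1, d2):
    """Quantile of the F distribution, found by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, d1, d2) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, d1, d2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


g, m = 3, 6  # assumed numerator d.f. and number of treatments
alphas = (0.1, 0.05, 0.01)
bounds = {n_min: [f_quantile(1.0 - a, g, n_min - m) for a in alphas]
          for n_min in (12, 16, 19)}
for n_min, row in bounds.items():
    print(n_min, [round(q, 3) for q in row])
```

The bounds decrease as the minimum dimension grows, because the denominator degrees of freedom increase.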
The quantiles for random sample sizes (obtained when using the unconditional distribution) are expected to exceed the classical ones (obtained when using the common conditional distribution), since the former take into account a new source of variation. Since in this case we do not reject the hypothesis using the classical quantiles, the same result is expected when using the quantiles for random sample sizes and, consequently, the upper bound approach. This interpretation leads us not to reject the null hypothesis.
The quantiles for the unconditional distribution are approximated by truncation of the infinite series indicated in Equation (11). We obtained the minimum value for a truncation error not greater than (). To carry out the computation, we assumed that , , are the daily average of occurrences per year. So we have .
The obtained quantiles with probability , , of the truncated unconditional distribution
(14) |
are presented in Table 4.
Table 4. The quantiles of the truncated unconditional distribution.
Global minimum dimension | α = 0.1 | α = 0.05 | α = 0.01 |
---|---|---|---|
12 | 3.255 | 4.693 | 9.583 |
16 | 2.720 | 3.695 | 6.518 |
19 | 2.555 | 3.402 | 5.722 |
The results in Table 4 agree with those in Table 3, i.e. the null hypothesis is not rejected; therefore, the random factor is not significant.
4.2. Interaction
For the interaction between the fixed factor and the random one, we have
and
For the numerator of the statistic , we obtain
Therefore, the statistic's value, , is given by
If we use the common conditional distribution of , which corresponds to , we obtain the quantiles given in Table 2. Since , we reject for the usual levels of significance.
Considering the truncated unconditional distribution defined in (14), we obtained the quantiles given in Table 4. The results in this table lead us to:
- reject the null hypothesis for α = 0.1 and 0.05, and not reject it for α = 0.01, considering a global minimum dimension of 12;
- reject the null hypothesis for the usual levels of significance, considering a global minimum dimension of 16 or 19.
Table 3 shows the upper bounds for the quantiles with probability , , of the unconditional distribution. These results agree with those based on the quantiles of the truncated unconditional distribution. Assuming the values of the test statistic remain unchanged, then we should have the total sample sizes presented in Table 5 for ensuring rejection.
Table 5. Minimum value that leads to reject the hypothesis .
Values of α | 0.1 | 0.05 | 0.01 |
---|---|---|---|
Minimum value | 8 | 9 | 15 |
Since for higher values of we would get lower values for the quantiles, we have for all . In this case, we reject considering the usual levels of significance, which means that the interaction between factors is significant.
4.3. Conclusion
Our discussion shows the relevance of the unconditional approach in avoiding false rejections. As we saw, the inference results in some situations depend on the approach. Since the unconditional approach is more conservative, when testing the interaction the null hypothesis is not rejected when α = 0.01 and the global minimum dimension is 12, whereas the common conditional approach would lead to a false rejection.
The results in Tables 3 and 4 show that for higher minimum sample sizes, we get smaller upper bounds and quantiles of the unconditional distribution. Due to this, we may conclude that with the increase of the minimum sample sizes, the decision based on both approaches is similar.
To finish we would like to note that all the computations were performed using the R software.
5. Final remarks
The approach followed in this paper is more realistic than the usual F tests in situations where it is not possible to know the sample sizes in advance. To apply it, we have to make assumptions regarding the distribution of the sample sizes, based on previous knowledge of the sample collection, and incorporate this source of variation into the mixed model. We chose the Poisson distribution since it corresponds to Poisson processes for the collection of observations, and the underlying assumptions for these (independent and stable increments and no clustering) seem realistic. Moreover, the L extensions fit easily with the assumption of random sample sizes. This model formulation has been used to handle the unbalance caused by different numbers of observations per treatment, which causes non-orthogonality in fixed and mixed effects models. We included an application with cancer data to illustrate how straightforward it is to apply our approach in a medical context. The comparative results show that, when random sample sizes are considered, the critical values may exceed those of classical ANOVA (obtained when using the common conditional F distribution). So, we can conclude that this approach avoids working with incorrect critical values and thus carrying out tests without the proper level. We would also like to highlight that our methodology is not restricted to the medical domain and may be applied to several other research areas.
Acknowledgments
The authors would like to thank the anonymous referees for useful comments and suggestions.
Appendices.
Appendix 1. Truncated Poisson distributions
This appendix presents some results about the truncated Poisson distribution, which are useful in obtaining the unconditional distribution of the test statistics.
Since we need to have at least one observation per treatment, we will consider the common form of truncated Poisson distribution, which corresponds to the omission of the zero class, e.g. [7]. So we have . To perform inference, we also consider that where
As previously mentioned, we assumed that , and . So we have
Therefore, the moment generating function of , when , , will be
and the probability generating functions
With , , the truncated variables , , when , and considering
we will obtain the probability generating function
where and denotes the cardinal of , any subset of .
Therefore we will have
It is interesting to observe that we have
where denotes the derivative of order s, which results from
have one or more null components and ,
Indeed, with the family of partitions of s with cardinal m, we have
and, if s<m, whatever ,
since has at least one null component. So, since , we obtain
Furthermore, the only non-null term of corresponds to , so
and
Considering the random vector with components , we have , which means there exists at least one if and only if so
We also have
(A1) |
Appendix 2. Frequency tables of types of cancer
Table A1. Males with stomach (digestive system) cancer.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 1 | 2 | 4 | 5 | 6 | 7 | 7 | 6 | 5 |
Table A2. Females with stomach (digestive system) cancer.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 2 | 2 | 2 | 3 | 3 | 3 | 4 | 4 | 5 |
Table A3. Males with melanomas of the skin.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 0 | 0 | 1 | 2 | 2 | 4 | 6 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 8 | 12 | 14 | 17 | 16 | 16 | 14 | 12 | 10 |
Table A4. Females with melanomas of the skin.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 0 | 1 | 2 | 4 | 4 | 6 | 7 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 10 | 10 | 10 | 10 | 8 | 7 | 7 | 6 | 7 |
Table A5. Males with non-Hodgkin lymphoma.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 1 | 1 | 1 | 2 | 2 | 3 | 5 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 8 | 10 | 12 | 14 | 15 | 14 | 14 | 12 | 9 |
Table A6. Females with non-Hodgkin lymphoma.
Age | 1–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 |
---|---|---|---|---|---|---|---|---|---|
Mean age | 2 | 7 | 12 | 17 | 22 | 27 | 32 | 37 | 42 |
Patients | 0 | 0 | 0 | 1 | 1 | 1 | 2 | 2 | 3 |
Age | 45–49 | 50–54 | 55–59 | 60–64 | 65–69 | 70–74 | 75–79 | 80–84 | 85 |
Mean age | 47 | 52 | 57 | 62 | 67 | 72 | 77 | 82 | 87 |
Patients | 5 | 7 | 9 | 11 | 13 | 13 | 13 | 12 | 12 |
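The sample means reported in Table 1 can be recovered from these frequency tables as grouped-data means, using the tabulated mean age of each class as its representative value. The sketch below does this for Tables A1 and A2; the function name is hypothetical, and the counts are copied from the tables above.

```python
def grouped_mean(class_means, counts):
    """Mean and size of a sample given class representative values and counts."""
    n = sum(counts)
    return sum(a * c for a, c in zip(class_means, counts)) / n, n


# Mean ages of the 18 age classes: 2, 7, 12, ..., 87.
class_means = [2] + list(range(7, 88, 5))

# Patient counts per age class (Tables A1 and A2).
males_stomach = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 4, 5, 6, 7, 7, 6, 5]
females_stomach = [0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5]

mean_m, n_m = grouped_mean(class_means, males_stomach)
mean_f, n_f = grouped_mean(class_means, females_stomach)
print(n_m, round(mean_m, 3))  # 44 70.523
print(n_f, round(mean_f, 3))  # 30 68.833
```

The same computation applied to the remaining tables gives the other sample sizes and means used in the application.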
Funding Statement
This work was partially supported by the FCT- Fundação para a Ciência e Tecnologia, under the projects UID/MAT/00212/2019 and UID/MAT/00297/2019.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
- 1. Bailey R.A., Ferreira S.S., Ferreira D., and Nunes C., Estimability of variance components when all model matrices commute, Linear Algebra Appl. 492 (2016), pp. 144–160. doi: 10.1016/j.laa.2015.11.002
- 2. Carvalho F., Mexia J.T., Santos C., and Nunes C., Inference for types and structured families of commutative orthogonal block structures, Metrika 78 (2015), pp. 337–372. doi: 10.1007/s00184-014-0506-8
- 3. Ferreira S., Ferreira D., Moreira E., and Mexia J.T., Inference for L orthogonal models, J. Interdiscip. Math. 12 (2009), pp. 815–824. doi: 10.1080/09720502.2009.10700666
- 4. Ferreira S.S., Ferreira D., Nunes C., and Mexia J.T., Estimation of variance components in linear mixed models with commutative orthogonal block structure, Rev. Colomb. Estadist. 36 (2013), pp. 261–271.
- 5. Heinzl F. and Tutz G., Clustering in linear mixed models with approximate Dirichlet process mixtures using EM algorithm, Stat. Modell. 13 (2013), pp. 41–67. doi: 10.1177/1471082X12471372
- 6. Houtman A.M. and Speed T.P., Balance in designed experiments with orthogonal block structure, Ann. Statist. 11 (1983), pp. 1069–1085. doi: 10.1214/aos/1176346322
- 7. Johnson N.L. and Kotz S., Discrete Distributions, John Wiley & Sons, New York, 1969.
- 8. Khuri A.I., Mathew T., and Sinha B.K., Statistical Tests for Mixed Linear Models, John Wiley & Sons, New York, 1998.
- 9. Lehmann E.L., Testing Statistical Hypotheses, John Wiley & Sons, New York, 1959.
- 10. Mexia J.T., Best linear unbiased estimates, duality of F tests and the Scheffé multiple comparison method in presence of controlled heterocedasticity, Comput. Stat. Data Anal. 10 (1990), pp. 271–281. doi: 10.1016/0167-9473(90)90007-5
- 11. Mexia J.T. and Moreira E., Randomized sample size F tests for the one-way layout, 8th International Conference on Numerical Analysis and Applied Mathematics 2010, AIP Conf. Proc. 1281(II), 2010, pp. 1248–1251.
- 12. Mexia J.T., Nunes C., Ferreira D., Ferreira S.S., and Moreira E., Orthogonal fixed effects ANOVA with random sample sizes, Proceedings of the 5th International Conference on Applied Mathematics, Simulation, Modelling (ASM'11), 2011, pp. 84–90.
- 13. Michalski A. and Zmyślony R., Testing hypothesis for variance components in mixed linear models, Statistics 27 (1996), pp. 297–310. doi: 10.1080/02331889708802533
- 14. Moreira E., Mexia J.T., Fonseca M., and Zmyślony R., L models and multiple regressions designs, Statist. Papers 50 (2009), pp. 869–885. doi: 10.1007/s00362-009-0255-3
- 15. Moreira E.E., Mexia J.T., and Minder C.E., F tests with random sample size. Theory and applications, Stat. Probab. Lett. 83 (2013), pp. 1520–1526. doi: 10.1016/j.spl.2013.02.020
- 16. Nunes C., Capistrano G., Ferreira D., Ferreira S.S., and Mexia J.T., One-way fixed effects ANOVA with missing observations, Proceedings of the 12th International Conference on Numerical Analysis and Applied Mathematics, AIP Conf. Proc. 1648, 2015, p. 110008.
- 17. Nunes C., Capistrano G., Ferreira D., Ferreira S.S., and Mexia J.T., Exact critical values for one-way fixed effects models with random sample sizes, J. Comput. Appl. Math. 354 (2019), pp. 112–122. doi: 10.1016/j.cam.2018.05.057
- 18. Nunes C., Ferreira D., Ferreira S.S., and Mexia J.T., F tests with random sample sizes, 8th International Conference on Numerical Analysis and Applied Mathematics, AIP Conf. Proc. 1281(II), 2010, pp. 1241–1244.
- 19. Nunes C., Ferreira D., Ferreira S.S., and Mexia J.T., F-tests with a rare pathology, J. Appl. Stat. 39 (2012), pp. 551–561. doi: 10.1080/02664763.2011.603293
- 20. Nunes C., Ferreira D., Ferreira S.S., and Mexia J.T., Fixed effects ANOVA: An extension to samples with random size, J. Stat. Comput. Simul. 84 (2014), pp. 2316–2328. doi: 10.1080/00949655.2013.791293
- 21. Scheffé H., The Analysis of Variance, Wiley Series in Probability and Statistics, John Wiley & Sons, New York, 1959.
- 22. Schott J.R., Matrix Analysis for Statistics, John Wiley & Sons, New York, 1997.
- 23. Searle S.R., Casella G., and McCulloch C.E., Variance Components, Wiley Series in Probability and Statistics, John Wiley & Sons, New York, 1992.
- 24. U.S. Cancer Statistics Working Group, United States Cancer Statistics: 1999–2010 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute, 2013. Available at https://nccd.cdc.gov/uscs/.