SUMMARY
The bootstrap, introduced in Efron (1979. Bootstrap methods: another look at the jackknife. The Annals of Statistics 7, 1–26), is a landmark method for quantifying variability. It uses sampling with replacement with a sample size equal to that of the original data. We propose the upstrap, which samples with replacement either more or fewer observations than the original sample size. We illustrate the upstrap by solving a hard, but common, sample size calculation problem. The data and code used for the analysis in this article are available on GitHub (2018. https://github.com/ccrainic/upstrap).
Keywords: Bootstrap, sampling, sample size calculation
1. Algorithm
Consider a data set where the observed vectors of observations at the subject level are $x_i$, $i = 1, \ldots, n$, $\theta$ is a parameter of interest, and $\widehat{\theta}_n$ is an estimator of $\theta$. For a fraction $r > 0$, we are interested in estimating the distribution of $\widehat{\theta}_{rn}$, the estimator based on a resampled fraction $r$ of the original sample size. The upstrap algorithm is as follows: sample $rn$ observations with replacement from the original data, compute the estimator on the resampled data, and repeat these two steps many times.
The upstrap algorithm provides the entire distribution of the estimator for a dataset of size $r$ times the sample size of the original data. The distribution can be used to estimate its characteristics, such as the mean or standard deviation. One might ask about the added value of resampling $rn$ rather than $n$ observations. We explore this question and illustrate the added benefit for a challenging but routine question in applied regression.
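The resampling procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `upstrap`, its parameters, the rounding of $rn$ to an integer, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def upstrap(data, estimator, r, n_resamples=1000, seed=0):
    """Return n_resamples estimates, each computed on round(r * n) rows drawn with replacement."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    m = max(1, int(round(r * n)))  # upstrap sample size (assumption: rounded to an integer)
    estimates = []
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=m)  # subject indices sampled with replacement
        estimates.append(estimator(data[idx]))
    return np.asarray(estimates)

# Synthetic toy data (not from the article): 200 draws from a standard normal.
x = np.random.default_rng(1).normal(size=200)

# Standard deviation of the sample mean at half, equal, and four times the sample size.
sd_half = upstrap(x, np.mean, r=0.5).std()
sd_same = upstrap(x, np.mean, r=1.0).std()  # r = 1 is the ordinary bootstrap
sd_quad = upstrap(x, np.mean, r=4.0).std()
print(sd_half, sd_same, sd_quad)
```

For the sample mean, the three standard deviations should shrink roughly in proportion to one over the square root of the resampled size, so quadrupling the sample size about halves the variability.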
| Predictor | Estimate | Std. error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| Gender | 1.157 | 0.079 | 14.61 | <2e-16 |
| Age | 0.037 | 0.005 | 7.38 | 1.61e-13 |
| BMI | 0.138 | 0.007 | 18.67 | <2e-16 |
| HTN | 0.758 | 0.473 | 1.60 | 0.109 |
| Age:HTN | 0.009 | 0.007 | 1.22 | 0.221 |
2. Sample size calculation for regression
We now show how to calculate the sample size using the upstrap in a relatively simple regression scenario for which there are no standard methods. Consider a binary regression problem where the outcome is whether or not a person has moderate to severe sleep apnea and the predictors are gender, age, BMI, hypertension status (coded HTN), and the hypertension by age interaction. The data come from the Sleep Heart Health Study (SHHS) (Quan and others, 1997; Redline and others, 1998) and are publicly available as part of the National Sleep Research Resource (NSRR) (Dean and others, 2016). Moderate to severe sleep apnea is defined as a respiratory disturbance index at 4% oxygen desaturation (labeled rdi4p in the SHHS dataset) greater than or equal to
. We use data from visit one of the SHHS, which contains
individuals.
The regression results based on these data are shown in the table (intercept not shown) indicating that hypertension (coded HTN) is not significant at the
level.
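The z value and p-value reported for HTN follow from the Wald statistic, the estimate divided by its standard error, compared against a standard normal distribution. The short check below recomputes them from the rounded table entries; the helper name `wald_test` is an assumption made for this illustration.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# HTN row of the table: estimate 0.758, standard error 0.473.
z_htn, p_htn = wald_test(0.758, 0.473)
print(f"z = {z_htn:.2f}, p = {p_htn:.3f}")  # prints: z = 1.60, p = 0.109
```

Because the table entries are rounded, the same check on other rows reproduces the reported z values only approximately.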
The question that we would like to answer is at what sample size do we expect to identify a hypertension effect on having moderate to severe sleep apnea in this model using the two-sided Wald test at
with a power
? The idea is simple. We set a grid of fractions of the sample size; in this case, this grid is
and for every value of $r$ we upstrap
data sets of size $rn$. For every sample, we conduct the two-sided Wald test for HTN in the model above and reject the null hypothesis of no association if the corresponding p-value is less than
. Figure 1 shows the frequency with which the null hypothesis of no HTN effect is rejected as a function of the sample size multiplier, $r$.
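The power calculation described above can be sketched as follows. This is an illustration on synthetic data, not the SHHS analysis: the grid of multipliers, the number of resamples, the 0.05 significance level, the simulated covariates, and the function names are all assumptions chosen for the example, and the logistic regression is fitted with a plain Newton iteration so that only NumPy is needed.

```python
import math
import numpy as np

def wald_p_value(X, y, j, n_iter=25):
    """Fit a logistic regression by Newton's method; return the two-sided Wald p-value for coefficient j."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (prob * (1.0 - prob))[:, None])  # Fisher information
        beta = beta + np.linalg.solve(info, X.T @ (y - prob))
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ (X * (prob * (1.0 - prob))[:, None])
    std_error = math.sqrt(np.linalg.inv(info)[j, j])
    return math.erfc(abs(beta[j] / std_error) / math.sqrt(2.0))

def upstrap_power(X, y, j, grid, n_resamples, alpha, seed=0):
    """For each multiplier r, return the fraction of upstrap samples in which coefficient j is significant."""
    rng = np.random.default_rng(seed)
    n = len(y)
    power = {}
    for r in grid:
        m = int(round(r * n))  # upstrap sample size
        rejections = 0
        for _ in range(n_resamples):
            idx = rng.integers(0, n, size=m)  # sample subjects with replacement
            rejections += wald_p_value(X[idx], y[idx], j) < alpha
        power[r] = rejections / n_resamples
    return power

# Synthetic data (hypothetical): one continuous covariate and one binary exposure.
rng = np.random.default_rng(2024)
n = 500
covariate = rng.normal(size=n)
exposure = rng.binomial(1, 0.4, size=n)
logit = -0.5 + 0.5 * covariate + 0.6 * exposure
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)
X = np.column_stack([np.ones(n), covariate, exposure])

power = upstrap_power(X, y, j=2, grid=(0.25, 0.5, 1.0, 2.0, 4.0),
                      n_resamples=200, alpha=0.05)
for r, value in power.items():
    print(f"r = {r}: estimated power = {value:.2f}")
```

Plotting the estimated power against the multiplier gives a curve of the kind shown in Figure 1. In a real analysis one would likely use an established regression routine instead of the hand-written Newton iteration.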
Fig. 1. Power to detect the main effect of HTN (y-axis) as a function of the multiplier, $r$, of the original sample size (x-axis). Here, $r = 1$ on the x-axis corresponds to the original sample size, $n$, $r = 2$ corresponds to double the sample size, $2n$, and so on. The model uses moderate to severe sleep apnea (binary variable) as the outcome and gender, age, BMI, HTN, and the age by HTN interaction as predictors. Horizontal lines indicate powers equal to 0.8 and 0.9, respectively.
For example, for multiplier $r = 1$, we obtain the bootstrap p-value, the percent of times the null hypothesis of no association is rejected when sampling with replacement a data set of the same size as the original. For multiplier $r = 2$, we produced
samples with replacement from the SHHS data, each with twice the number of subjects
. For each sampled dataset, we ran the model and recorded whether the p-value for HTN was smaller than
. For this sample size, we obtained that the HTN effect was identified in
of the samples. We also obtained that the power was equal to
at the sample size multiplier
and
at multiplier
, indicating that the power
would be attained at
subjects. There are very few methods to estimate the sample size in such examples and we contend that the upstrap is a powerful and general method to conduct such calculations. Similar approaches could be used in many other situations, including sample size calculations for detecting a treatment effect in the context of longitudinal clinical trial data or gene by environment interactions in genomics studies.
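One possible way to turn an estimated power curve into a required sample size is to interpolate linearly between the two grid points that bracket the target power. The sketch below assumes this interpolation rule, which is not necessarily how the authors obtained their figure; the function name and all numbers in the example are hypothetical.

```python
import numpy as np

def sample_size_for_power(multipliers, powers, target, n):
    """Interpolate the power curve linearly; return the sample size reaching target, or None if never reached."""
    multipliers = np.asarray(multipliers, dtype=float)
    powers = np.asarray(powers, dtype=float)
    reached = np.nonzero(powers >= target)[0]
    if reached.size == 0:
        return None  # target power not attained on this grid
    k = reached[0]
    if k == 0:
        r_star = multipliers[0]
    else:
        r0, r1 = multipliers[k - 1], multipliers[k]
        p0, p1 = powers[k - 1], powers[k]  # p0 < target <= p1, so p1 - p0 > 0
        r_star = r0 + (target - p0) * (r1 - r0) / (p1 - p0)
    return int(round(r_star * n))

# Hypothetical power curve for an original sample of 1000 subjects.
n_needed = sample_size_for_power([1, 2, 3, 4], [0.30, 0.55, 0.75, 0.85],
                                 target=0.8, n=1000)
print(n_needed)  # 3500 for these made-up numbers
```

A finer grid of multipliers near the crossing point would reduce the dependence on the interpolation rule.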
One limitation of this approach is that it can produce large sample size estimates that may be impractical in some applications. However, the approach makes such calculations possible and can support the decision of whether or not to initiate such a study. As a last point, we believe that the upstrap is safe to use in all problems where the bootstrap is used. However, more simulations and theoretical work are necessary to establish this assertion.
Acknowledgments
Conflict of Interest: None declared.
Funding
This work was supported by the National Heart, Lung, and Blood Institute, National Institutes of Health (5R01HL123407-04 to C.M.C.); National Institute of Neurological Disorders and Stroke, National Institutes of Health (5R01NS060910-10 to C.M.C.).
References
- Dean D. A., Goldberger A. L., Mueller R., Kim M., Rueschman M., Mobley D., Sahoo S. S., Jayapandian C. P., Cui L., Morrical M. G. and others (2016). Scaling up scientific discovery in sleep medicine: the National Sleep Research Resource. Sleep 39, 1151–1164.
- Efron B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics 7, 1–26.
- GitHub (2018). https://github.com/ccrainic/upstrap.
- Quan S. F., Howard B. V., Iber C., Kiley J. P., Nieto F. J., O’Connor G. T., Rapoport D. M., Redline S., Robbins J., Samet J. M. and others (1997). The Sleep Heart Health Study: design, rationale, and methods. Sleep 20, 1077–1085.
- Redline S., Sanders M. H., Lind B. K., Quan S. F., Iber C., Gottlieb D. J., Bonekat W. H., Rapoport D. M., Smith P. L. and Kiley J. P. (1998). Methods for obtaining and analyzing unattended polysomnography data for a multicenter study. Sleep Heart Health Research Group. Sleep 21, 759–767.










