Author manuscript; available in PMC: 2011 Jun 1.
Published in final edited form as: J Biomed Inform. 2009 Dec 1;43(3):358–364. doi: 10.1016/j.jbi.2009.11.007

Validating pathophysiological models of aging using clinical electronic medical records

David P Chen 1,2,3, Alexander A Morgan 1,2,3, Atul J Butte 1,2,3
PMCID: PMC2878870  NIHMSID: NIHMS167562  PMID: 19958842

Abstract

Bioinformatics methods that leverage the vast amounts of clinical data promise to provide insights into underlying molecular mechanisms that help explain human physiological processes. One of these processes is adolescent development. The utility of predictive aging models generated from cross-sectional cohorts and their applicability to separate populations, including the clinical population, has yet to be completely explored. In order to address this, we built regression models predictive of adolescent chronological age from 2001–2002 National Health and Nutrition Examination Survey (NHANES) data and validated them against independent 2003–2004 NHANES data and clinical data from an academic tertiary-care pediatric hospital. The results indicate distinct differences between male and female models, with both alkaline phosphatase and creatinine as predictive biomarkers for both genders, hematocrit and mean cell volume for males, and total serum globulin for females. We also suggest that the models are generalizable, are clinically relevant, and imply underlying molecular and clinical differences between males and females that may affect prediction accuracy. The integration of both epidemiological and clinical data promises to create more robust models that shed new light on physiological processes.

Keywords: aging, pediatric, biomarker, translational bioinformatics, age prediction, electronic medical record, adolescent development

INTRODUCTION

Methodological innovation that uses data from basic biology and translates it for clinical relevance has often been the crux of translational bioinformatics. It is less often that we use data collected in the clinical environment to gain a better understanding of the biological underpinnings of human physiological processes or to validate hypotheses that may be generated in the traditional manner. Although the majority of research surrounding clinical data deals with acquisition of data, in the form of electronic medical records [1], and the application of this data to improve patient care [2], the data collected inherently has the ability to shed light on novel biology [3]. Clinical measurements are a window into underlying molecular processes that change accordingly with the physiological characteristics of the individual, whether it is gender, age, disease status, or one of the countless states that an individual can take.

The term reverse translational bioinformatics has been coined recently to describe the use of clinically relevant data to gain insight into basic biological phenomena [4]. Fliss and colleagues show that the use of cross-sectional survey data in the form of the 3rd National Health and Nutrition Examination Survey can be used to study the interplay between age, gender, and blood biomarkers. While they were able to show differentiated clusters of individuals based upon age and gender, the applicability of models built on one data set to other data sets, including data from clinical electronic medical records, remains unknown. In this study we examine the applicability of model building to various data sources, including clinical data from an academic tertiary-care pediatric hospital, in the context of pediatric development and aging.

Well-known markers for aging in the pediatric population include anthropometric measurements such as the comparison between individuals and landmarks on the human growth curve, examining secondary sexual characteristics [5–7], dental development [8–10], and skeletal development [11, 12]. In fact, chronological age can be estimated from these markers. The average error between the estimated age and the actual chronological age has been shown to range from −0.32 years to 0.36 years for skeletal development [13] and −1.18 years to 0.72 years for dental development [14]. Moreover, these methods, some of which have been used for almost half a century, are the standard of care when diagnosing individuals with developmental disorders. These anthropometric measures are well known and commonly used, but are markers for secondary changes and not primary aging.

The molecular and physiological process of aging is studied predominantly in animal models. For instance, it has been shown that maximum lifespan can be extended due to caloric restriction in organisms including yeast, worms, flies, and rodents [15–17]. While caloric restriction has been implicated in reducing the rate of age-associated muscle loss in monkeys [18], its true effect in increasing the maximum age of primates and humans is still unknown. This is due to factors including malnutrition, which may be associated with caloric restriction [19], and the difficulty of studying aging in longer-lived species. The release of sex steroids via the hypothalamic-pituitary-gonadal axis and its effect in the fetus, childhood, and pubertal phases has been studied for decades [20, 21]. More recently, mutations found in gonadotropins suggest the clinical consequences of aberrant or delayed sexual differentiation and development [22]. Other primary effectors of aging include IGF-1 and IGFBP-3, which have patterns of change throughout childhood as a function of both age and pubertal stage [23–25]. It has also been shown that IGF-1 and IGFBP-3 have a positive relationship with bone remodeling levels as indicated by serum levels of bone alkaline phosphatase and CrossLaps [26]. However, there may be other physiological changes in early aging beyond what we observe (primary and secondary sex characteristics), sex steroid levels, height, and other anthropometric changes.

We propose the use of a broad panel of physiological measurements obtained for cross-sectional research to discover biomarkers that can correlate with age and development. Gender-specific multivariate linear regression models predicting adolescent chronological age are built from 2001–2002 National Health and Nutrition Examination Survey (NHANES) data. Models are then validated against independent 2003–2004 NHANES data, as well as clinical data from an academic tertiary-care pediatric hospital. A linear regression using alkaline phosphatase, creatinine, hematocrit, and mean cell volume was found to be predictive of male chronological age while a model built on alkaline phosphatase, creatinine, and total serum globulin was found to be predictive in the female population. We also suggest that models built from one data source are applicable to independent cohorts implying the greater role for clinical data in translational biomedical informatics.

METHODS AND RESULTS

The objective of this study was to gain insight into the primary process of aging in the pediatric population by generating a multivariate linear model that uses non-radiographic biomarker measurements (complete blood count and blood biochemistry profiles) to predict pediatric age. We also wanted to determine whether the models generated were applicable to data collected from the clinical environment, specifically patients seen at the Lucile Packard Children’s Hospital (LPCH) and its outpatient clinics (Fig. 1).

Figure 1. Schematic of the model building and prediction pipeline

Model Building Using NHANES

NHANES is a biennial, nationally representative, cross-sectional health survey of the non-institutionalized population of the United States conducted by the Centers for Disease Control and Prevention [27]. Due to the lack of blood biochemistry profiles for respondents less than 12 years of age, we limited all analyses to individuals between 12 and 18 years of age. Data for respondents whose age at the mobile examination center (MEC) examination date was within this range were aggregated and joined by unique identifiers. Biomarker measurements were extracted from the Complete Blood Count with 5-part Differential and the Standard Biochemistry Profile, Follicle Stimulating Hormone, and Luteinizing Hormone data tables. Duplicate biomarker measurements due to unit transformations were removed. Luteinizing hormone and follicle stimulating hormone were removed due to lack of coverage (50%), reflective of their rare usage in pediatric care.

In total, NHANES 2001–2002 contained 1,831 individuals aged between 12 and 18 years with complete blood count and blood biochemistry data. Thirty-nine biomarkers were examined (Table 1). Due to the large number of potential biomarkers, it was desirable to reduce the number of parameters for the model. While there are many methods that allow for the selection of a parsimonious set of features for prediction of the response variable, we chose Least Angle Regression (LARS) [28]. LARS is a computationally efficient method for model selection akin to forward selection. The end result of applying this method is an ordered list of covariates to include in the model. One benefit is that this allows for the most informative biomarkers to be chosen, which tend to be independent of one another. A second is that having fewer biomarkers necessary for prediction potentially translates to fewer tests done in the clinical setting. Individuals with any missing data were excluded. A total of 1,653 individuals had complete data: 793 males and 860 females. 90% of the individuals in both groups were randomly sampled and used for model building (n = 774 females, 713 males).
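The idea of producing an ordered list of covariates can be illustrated with a minimal sketch. The code below is not LARS and is not the authors' implementation; it uses plain greedy forward selection, which the text describes LARS as being akin to, on synthetic data. The function names (`solve`, `rss`, `forward_selection`) and the toy data are assumptions made purely for illustration.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit on `cols` plus an intercept."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((t - sum(b * v for b, v in zip(beta, x))) ** 2 for x, t in zip(X, y))

def forward_selection(rows, y, n_features):
    """Greedily add the column that most reduces the residual sum of squares."""
    chosen, remaining = [], list(range(len(rows[0])))
    for _ in range(n_features):
        best = min(remaining, key=lambda c: rss(rows, y, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Synthetic example (hypothetical data): column 0 is strongly informative,
# column 2 weakly informative, column 1 is pure noise.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3 * r[0] + r[2] + 0.1 * rng.gauss(0, 1) for r in rows]
order = forward_selection(rows, y, 2)
```

Here `order` lists column indices in the order they were added, which plays the same role as the ordered list of covariates described above. A real analysis would more likely rely on an established LARS implementation than on hand-written least squares.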

Table 1.

Top 10 most informative biomarkers along with Pearson’s correlation coefficient between biomarker and age (r), p-value of the correlation with the null hypothesis that there is no linear association (r is equal to 0), and average least angle regression rank with 10-fold cross validation. Left: male, Right: female. Shaded regions are biomarkers used in their respective models. ALK, alkaline phosphatase (U/L); AST, aspartate aminotransferase (U/L); BIC, bicarbonate (mmol/L); BILI, bilirubin (mg/dL); CA, total calcium (mg/dL); CHOL, cholesterol (mg/dL); CR, creatinine (mg/dL); FE, iron (µg/dL); GGT, gamma-glutamyl transpeptidase (U/L); GLOB, total serum globulin (g/dL); HCT, hematocrit (%); K, potassium (mmol/L); MCV, mean cell volume (fL); SNP, segmented neutrophils percent (%); URIC, uric acid (mg/dL)

Male

| Biomarker | r | p-value | Average LARS rank |
|---|---|---|---|
| ALK | −0.6659 | < 0.0001 | 1 |
| CR | 0.6367 | < 0.0001 | 2 |
| HCT | 0.4809 | < 0.0001 | 3 |
| MCV | 0.3385 | < 0.0001 | 5.9 |
| CA | −0.0768 | 0.0404 | 6.1 |
| URIC | 0.3127 | < 0.0001 | 6.8 |
| BILI | 0.2131 | < 0.0001 | 6.9 |
| BIC | 0.2514 | < 0.0001 | 7.3 |
| GGT | 0.1674 | < 0.0001 | 7.9 |
| AST | −0.1293 | 0.0005 | 11.4 |

Female

| Biomarker | r | p-value | Average LARS rank |
|---|---|---|---|
| ALK | −0.6637 | < 0.0001 | 1 |
| CR | 0.3483 | < 0.0001 | 2 |
| GLOB | 0.1451 | 0.0001 | 3.1 |
| CHOL | 0.0683 | 0.0575 | 5 |
| SNP | 0.174 | < 0.0001 | 6 |
| URIC | −0.0558 | 0.1206 | 6.3 |
| GGT | 0.067 | 0.0625 | 7.2 |
| FE | −0.0547 | 0.1281 | 8.3 |
| CA | −0.1529 | < 0.0001 | 10.4 |
| K | −0.1044 | 0.0036 | 11.7 |

Feature selection using LARS with 10-fold cross validation was applied to each subgroup. Ranks of each biomarker were recorded along with the root mean squared error (RMSE) incurred by the addition of each feature. After observing the RMSE trend across the 10-fold validation sets, the male subgroup exhibited 4 features before the RMSE leveled off while the female subgroup exhibited 3 features (Fig. 2). The four biomarkers used in the male model were alkaline phosphatase, creatinine, hematocrit and mean cell volume. The three biomarkers used in the female model were alkaline phosphatase, creatinine, and total serum globulin. Upon further examination of the biomarkers used in our models, we noticed that alkaline phosphatase had a non-linear relationship with age. Due to its exponentially decreasing nature we used a natural logarithm transformation of alkaline phosphatase. The other biomarkers had a linear trend in the age range of interest. A multivariate linear model was built for each gender subgroup independently using their respective features:

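$$
\begin{aligned}
\mathrm{Age}_{\mathrm{Male}} &= 17.4679 + (-1.6705 \times \ln(\text{Alkaline Phosphatase})) + (2.86545 \times \text{Creatinine}) \\
&\quad + (0.0624 \times \text{Hematocrit}) + (0.01625 \times \text{Mean Cell Volume}) \\
\mathrm{Age}_{\mathrm{Female}} &= 24.0455 + (-2.4407 \times \ln(\text{Alkaline Phosphatase})) + (1.4435 \times \text{Creatinine}) \\
&\quad + (0.4122 \times \text{Total Serum Globulin})
\end{aligned}
$$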
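As a worked illustration, the two models can be written as short functions. This is a sketch under one stated assumption: the alkaline phosphatase term is taken to enter with a negative sign, which is what the Discussion's reference to negative alkaline phosphatase coefficients and the negative correlations in Table 1 indicate. The function names and the example input values are hypothetical; units follow Table 1 (U/L, mg/dL, %, fL, g/dL).

```python
import math

def predict_age_male(alk, creatinine, hematocrit, mcv):
    """Male model; assumes a negative coefficient on ln(alkaline phosphatase)."""
    return (17.4679
            - 1.6705 * math.log(alk)
            + 2.86545 * creatinine
            + 0.0624 * hematocrit
            + 0.01625 * mcv)

def predict_age_female(alk, creatinine, globulin):
    """Female model; assumes a negative coefficient on ln(alkaline phosphatase)."""
    return (24.0455
            - 2.4407 * math.log(alk)
            + 1.4435 * creatinine
            + 0.4122 * globulin)

# Hypothetical input values chosen only to exercise the formulas.
male_age = predict_age_male(alk=200.0, creatinine=0.7, hematocrit=42.0, mcv=85.0)
female_age = predict_age_female(alk=100.0, creatinine=0.6, globulin=3.0)
```

With these illustrative inputs both predictions fall inside the 12 to 18 year range the models were built for, and a higher alkaline phosphatase value lowers the predicted age, as expected for a marker of active bone growth.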

Figure 2. Root mean squared error plots with confidence intervals in relation to the number of features chosen. 0 indicates random. Left: female, right: male.

The models were first used to determine the r², residual standard error, and p-value when applied to the training data. The null hypothesis in a multivariate linear regression is that all of the partial regression coefficients are equal to 0, which can be tested via an analysis of variance. The male subgroup model (adjusted r² = 0.6301, residual standard error = 1.062, p-value < 0.0001) and female subgroup model (adjusted r² = 0.5274, residual standard error = 1.195, p-value < 0.0001) were then used to predict age from the remaining 10% of the NHANES 2001–2002 data (Figure 3a). The female population (n = 86) resulted in a mean error (difference between predicted age and expected age) of 0.0978 years with a standard deviation of 1.2648 years and an adjusted r² of 0.5008. The male population (n = 80) resulted in a mean error of 0.0308 years with a standard deviation of 0.9617 years and an adjusted r² of 0.6667 (Table 2a).

Figure 3. Actual age versus predicted age for the three data sets with the 95% confidence interval for the regression line: a) NHANES 2001–2002 untrained, b) NHANES 2003–2004, c) LPCH

Table 2.

Mean error, standard error, standard deviation, root mean squared error of the differences between predicted and expected chronological age, adjusted r², and p-values for the correlation between predicted and expected ages for a) NHANES 2001–2002, 10% untrained, b) NHANES 2003–2004, and c) LPCH

| Gender | Mean error (years) | SE (years) | SD (years) | RMSE (years) | Adjusted r² | p-value |
|---|---|---|---|---|---|---|
| (a) NHANES 2001–2002 (10% untrained) | | | | | | |
| Female (n=86) | 0.0978 | 0.1364 | 1.2648 | 1.2612 | 0.5008 | < 0.0001 |
| Male (n=80) | 0.0308 | 0.1075 | 0.9617 | 0.9562 | 0.6667 | < 0.0001 |
| (b) NHANES 2003–2004 | | | | | | |
| Female (n=712) | 0.0667 | 0.0459 | 1.2241 | 1.2251 | 0.5027 | < 0.0001 |
| Male (n=787) | 0.0699 | 0.0384 | 1.0768 | 1.0784 | 0.6042 | < 0.0001 |
| (c) LPCH | | | | | | |
| Female (n=1441) | −0.0379 | 0.0372 | 1.4129 | 1.4129 | 0.3168 | < 0.0001 |
| Male (n=867) | 0.0976 | 0.0396 | 1.1652 | 1.1687 | 0.5199 | < 0.0001 |
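The summary statistics reported in Table 2 can be computed from paired predicted and actual ages along the following lines. This is a minimal sketch, not the authors' code: the function name `error_summary` and the toy ages are assumptions, and the standard error is taken to be SD divided by the square root of n, which is consistent with the tabulated values (for example, 1.2648 / √86 ≈ 0.1364).

```python
import math
import statistics

def error_summary(predicted, actual):
    """Mean error, SD, SE, and RMSE of (predicted - actual) age differences."""
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    sd = statistics.stdev(errors)  # sample standard deviation
    return {
        "mean_error": statistics.fmean(errors),
        "sd": sd,
        "se": sd / math.sqrt(n),  # assumed definition: SD / sqrt(n)
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }

# Toy ages in years (hypothetical values).
summary = error_summary(predicted=[13.0, 15.0, 17.0, 16.0],
                        actual=[12.0, 15.0, 18.0, 16.0])
```

Because the mean errors are close to zero in every data set, the RMSE and SD columns are nearly identical, as the table shows.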

Model Evaluation

We evaluated the accuracy of the model by checking its performance on an independent yet similar set of subjects reflected in the NHANES 2003–2004 data set. The above multivariate model was applied to predict pediatric age. The female population (n = 712) resulted in a mean error of 0.0667 years with a standard deviation of 1.2241 years and an adjusted r² of 0.5027. The male population (n = 787) resulted in a mean error of 0.0699 years with a standard deviation of 1.0768 years and an adjusted r² of 0.6042 (Figure 3b, Table 2b). These similar results suggest that our model is applicable to independent NHANES data sets.

Model Validation on Clinical Samples

To assess how our models would perform on clinical data, we used de-identified biomarker measurements and diagnosis codes from LPCH, linked by synthetic patient identifiers and both labeled with the age in days of each patient. LPCH laboratory test and diagnosis data were retrieved via the Stanford Translational Research Integrated Database Environment (STRIDE). The use of de-identified individual data in this manner was approved by the Institutional Review Board of the Stanford University School of Medicine. Data for clinical biomarkers equivalent to those our models required were extracted. Individuals were excluded if they were diagnosed with any one of 23 ICD-9-CM codes that were directly related to developmental disorders (e.g. 259.0, Delay in sexual development and puberty, not elsewhere classified) within 6 months of the visit for which we had lab measurements. We also excluded any biomarker electronically flagged as abnormal by the hospital electronic health record system. Such flagging systems are commonplace in practice. Flagging of abnormal values is determined by using reference ranges such that the likelihood of a biochemical abnormality in a patient is small [29]. These reference ranges are often computed as the mean plus or minus two standard deviations of the normal sample population [30]. For individuals with multiple time stamps associated with their biomarker measurements, one was chosen at random.
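The two filtering steps described above, dropping values outside a mean ± 2 SD reference range and keeping one randomly chosen measurement per individual, can be sketched as follows. This is illustrative only: in the study the abnormal flags came from the hospital record system rather than being recomputed, and the function names, patient identifiers, and values below are hypothetical.

```python
import random
import statistics

def reference_range(normal_values, k=2.0):
    """Reference range as mean +/- k standard deviations of a normal sample."""
    center = statistics.fmean(normal_values)
    spread = statistics.stdev(normal_values)
    return center - k * spread, center + k * spread

def drop_flagged(values, low, high):
    """Keep only measurements that fall inside the reference range."""
    return [v for v in values if low <= v <= high]

def pick_one_visit(visits_by_patient, seed=0):
    """Choose one measurement at random for each patient identifier."""
    rng = random.Random(seed)
    return {pid: rng.choice(visits) for pid, visits in visits_by_patient.items()}

# Hypothetical creatinine-like values in mg/dL.
low, high = reference_range([0.5, 0.6, 0.7, 0.8, 0.9])
kept = drop_flagged([0.2, 0.7, 1.5], low, high)
chosen = pick_one_visit({"p1": [0.6, 0.7], "p2": [0.8]})
```

Fixing the seed makes the random choice reproducible, which is a convenience of this sketch rather than something the text specifies.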

Data from LPCH consisted of 19,537 individuals aged between 12 and 18 associated with one or more of a set of 6,416 ICD-9-CM codes. There were a total of 3,585,609 measurements across 1,936 biomarkers. Individuals not missing any biomarker measurements required by their respective models were retrieved and used to predict age. The female population consisted of 1,441 individuals and the male population consisted of 867 individuals. Our model, when applied to the female population, resulted in a mean error of −0.0379 years with a standard deviation of 1.4129 years and an adjusted r² of 0.3168. The application of our model to the male population resulted in a mean error of 0.0976 years with a standard deviation of 1.1652 years and an adjusted r² of 0.5119 (Figure 3c, Table 2c).

In order to assess the generalizability of our model to different data sets we used Levene’s test to examine the homogeneity of variance across samples [31]. One of the benefits of Levene’s test is that it does not require the underlying data to be normal. The null-hypothesis of Levene’s test is that the population variances are equal. We applied Levene’s test to determine whether the differences between the predicted and actual age among the male populations differ significantly from one data set to another and whether the differences between the predicted and actual age among the female populations differ significantly as well. The male populations show no significant differences across all three datasets with a p-value cutoff of 0.05. The female populations show no significant differences across the datasets with the exception of NHANES 2003–2004 and LPCH distributions. We also applied the non-parametric K-sample Anderson-Darling test (adk-test) [32]. The null hypothesis of the adk-test is that K samples, which may be of differing sizes, come from a common continuous distribution. The adk-test recapitulated the results from Levene’s test.
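For readers unfamiliar with Levene’s test, the statistic itself is simple to compute: it is a one-way analysis of variance on the absolute deviations of each observation from its group center. The sketch below uses the mean-centered form and returns only the W statistic; obtaining a p-value would additionally require the F distribution with (k − 1, N − k) degrees of freedom. The function name `levene_w` and the toy groups are assumptions, and which centering the authors used is not stated.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic (mean-centered form) for k groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Toy error samples (hypothetical): the second group is twice as spread out.
w = levene_w([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In practice a statistics library would normally be used instead; SciPy, for example, provides `scipy.stats.levene` and `scipy.stats.anderson_ksamp`, though whether the authors used these is not stated.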

A random distribution for each gender group was created by randomly sampling ages from the NHANES 2003–2004 dataset and computing the error. Each dataset was compared to random using Levene’s test as well as the adk-test and all were statistically significant (p-value < 0.05) (Figure 4). This suggests that our model is applicable to different data sets, including clinical data, and that the errors generated are significantly different than one would expect at random.
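One way to build such a random baseline is sketched below. The exact sampling procedure is not specified in the text, so drawing with replacement from the observed ages, as well as the function name `random_baseline_errors` and the toy ages, are assumptions of this illustration.

```python
import random

def random_baseline_errors(actual_ages, seed=0):
    """Errors obtained when each 'prediction' is an age drawn at random
    (with replacement) from the observed ages."""
    rng = random.Random(seed)
    return [rng.choice(actual_ages) - age for age in actual_ages]

# Toy ages in years (hypothetical).
ages = [12, 13, 14, 15, 16, 17, 18]
baseline = random_baseline_errors(ages)
```

The resulting error list can then be compared against the model's error distribution with the same tests used between data sets.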

Figure 4. Comparisons between the distribution of error between actual and predicted ages for NHANES 2001–2002, NHANES 2003–2004, LPCH, and Random. Lower triangle represents the CDF plot between the error distributions of the corresponding pairs. Upper triangle represents whether there is a significant difference (p-value < 0.05) between the error distributions of the corresponding pairs using Levene’s test and the K-sample Anderson-Darling test. N.S. means not significant. Left: female, right: male.

DISCUSSION

While it is often overlooked in traditional translational biomedical informatics, where emphasis has focused on genes and proteins and their relationships to clinical outcome, biomarker data from electronic medical record systems are records of molecular information that can also be related to physiological characteristics, which include aging and disease status. Using publicly available data collected by the Centers for Disease Control and Prevention in their biennial National Health and Nutrition Examination Survey (NHANES), we have proposed a set of physiological measurements that change with age. We have demonstrated the ability to estimate chronological age using these routinely obtained blood biomarkers for individuals between the ages of 12 and 18. Finally, we have been able to validate the gender-specific models using an independent population gathered from the NHANES 2003–2004 survey as well as clinical data from Lucile Packard Children’s Hospital.

Our analysis has identified both known and potentially novel biomarkers that are associated with age. Hematocrit and mean cell volume have been previously shown to be correlated with age. Daniel and colleagues showed that hematocrit levels in males increase with age while remaining flat in females [33]. This can be attributed to a greater increase in size and volume of muscle fibers leading to a rise in hematocrit values. Our models seem to recapitulate this. Whereas hematocrit is the third highest biomarker that is informative for age prediction in males, it is the last in females. Similarly, in the Bogalusa Heart Study, it was shown that mean cell volume and age, in the age range of 12–17, had a significant linear trend in males that was absent in females [34]. Alkaline phosphatase originates mainly from the bone and liver [35]. The negative coefficients associated with alkaline phosphatase for both the male and female subgroups indicate the steep decline in bone growth known to occur after age 12. Bennet and colleagues have shown that there is a relationship between serum alkaline phosphatase and sexual maturity rating and that serum alkaline phosphatase decreases after reaching peak height velocity of growth [36]. Creatinine is a byproduct of the breakdown of muscle. It is produced at a constant rate depending on the muscle mass of an individual [35]. The positive sign of the coefficients for both models reflects an increase in muscle mass that is expected as one develops and has been observed in previous studies [37]. This analysis has also been able to identify serum globulin as a potentially novel factor that is predictive of female aging. It has been previously shown that the gamma globulins IgA, IgG, and IgM increase in normal individuals [38] but to our knowledge the total serum globulin levels have never been associated with female adolescent development. 
In addition to the biomarkers we used for building our multivariate models, we have also provided an ordered list of biomarkers, many of which, to our knowledge, have not been implicated in pediatric development and deserve further investigation.

We acknowledge that a multivariate linear regression may not be the optimal method to develop a predictive model due to the non-linearity of clinical data. However, within our age range of interest and due to our desire to develop the most parsimonious solution, we believe that with slight data transformations, such as the natural logarithm transformation for alkaline phosphatase, we do not violate any of the requisites for the usage of linear regressions. While Fliss has shown that artificial neural networks perform better for age prediction when children and adults are mixed [4], our population is very homogeneous, with ages ranging from 12 to 18, and we would expect comparable performance, as was exhibited when using increasingly homogeneous age groups. The use of linear models in such a manner also allows for features that are more interpretable and which we can biologically pursue, in contrast to a complex model fit, such as an artificial neural network, which may obscure which individual biomarkers should be pursued biologically.

The models, when applied to LPCH, had greater predictive error due to factors including but not limited to the diagnostic status of the individuals, the technical variance associated with medical equipment, and the lack of truly normal data. While we only examined normal values in clinical data, which are based on clinical guidelines, the majority of these individuals have an associated illness that may affect the biomarker value. The motivation for using only normal values was that it excluded the outliers which are common in the hospital environment due to procedures and/or effects of disease. Other methods of approaching this problem may involve quartile thresholding or applying a z-score threshold.
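The two alternative outlier filters mentioned above could look roughly like this. Both are sketches with assumed thresholds (1.5 times the interquartile range and |z| ≤ 2), which are common conventions rather than values given in the text; the function names and sample values are hypothetical.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]

def zscore_filter(values, threshold=2.0):
    """Keep values whose absolute z-score does not exceed the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd <= threshold]

# Hypothetical measurements with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
by_iqr = iqr_filter(data)
by_z = zscore_filter(data)
```

Quartile-based fences are less sensitive to the outliers themselves than a z-score, whose mean and standard deviation are both pulled toward extreme values.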

While application of the models to LPCH data showed worse performance compared to the NHANES data sets, we have shown that the error distributions are not significantly different with one exception. This is noteworthy because both the populations as well as the status of the individuals were different. The majority of NHANES survey participants are healthy individuals. In contrast to this, the majority of patients seen at LPCH have underlying physiological conditions that may affect their biomarker measurements. Regardless of this, the lack of significant difference in error distributions suggests that clinical data can be used to validate models built using independent data sources.

Chronological age is predicted less accurately for the female cohort than for the male cohort in all data sets. To our knowledge, this is the first time that this phenomenon has been observed. An underlying physiological factor may account for this discrepancy, or it may be that the variance in females with regard to age is greater than that of males. We also propose that the male model is more likely to be directly applicable to male clinical populations. The root mean squared error and r² values change only slightly between the application of the male model to NHANES and to LPCH. The female model, however, shows a greater magnitude of change between NHANES and LPCH. This implies that there is a physiological or condition-specific occurrence in the clinical setting that affects the applicability of the model. We believe that this may be due to the nature of clinical data as well as gender-specific features that may require further investigation.

We show that models built on cross-sectional cohort data can be applied to the medical domain, with the caveat that care must be taken when applying the female model. Our analyses, in addition to the work done by Fliss and colleagues, demonstrate that valuable information can be gained when bioinformatics methods are directly applied to clinical data. Clinical biomarker data gathered from electronic medical records offer investigators a new source of information from which to better understand the underlying biology. While we only examined chronological aging in this analysis, it is often the difference between the chronological age and biological age that is indicative of a disease or condition. The ability to computationally predict chronological age can help shed light on these differences so that we can have a better understanding of diseases that may modulate aging. For example, one can examine how disease affects the aging process by looking for perturbations in the rate of aging among patient groups stratified by ICD-9-CM codes [39]. While translational bioinformatics most often focuses on taking knowledge gained from molecular data into the clinical setting, reverse translational bioinformatics enables the use of clinical data to shed light on physiological processes. We believe that the vast amounts of clinical data from electronic medical records should be leveraged for this purpose and that doing so will enable pertinent physiological discoveries.

CONCLUSION

We have developed gender-specific models that predict pediatric adolescent chronological age based on blood biomarker profiles. The models resulted in previously known and possibly novel physiological responses to pediatric aging. We have suggested that models generated from biomarkers gathered from cross-sectional surveys are applicable to other independent cohorts. Finally, we have shown that clinical data can be used to validate these models. We believe that these results imply that bioinformatics methods can be applied directly to clinical data from electronic medical records to better understand biological processes underlying human pathophysiology. The amount of clinical data is staggering and the ability to leverage human data for reverse translational bioinformatics will give greater insight into aging, diseases, and many other dynamic human physiological processes.

ACKNOWLEDGEMENTS

We thank Shai Shen-Orr, Silpa Suthram, and Michael Walker for their comments and suggestions. We thank Susan Weber, Todd Ferris, and Henry Lowe for providing us with de-identified clinical data. This work was funded by the Lucile Packard Foundation for Children’s Health, the William R. Hewlett Stanford Graduate Fellowship, NIBIB and NLM training grant T15 LM007033. We thank Alex Skrenchuk and Dr. Boris Oskotsky for IT support. We thank the Hewlett Packard Foundation for computational resources.


REFERENCES

  • 1.McDonald CJ, Overhage JM, Tierney WM, Dexter PR, Martin DK, Suico JG, Zafar A, Schadow G, Blevins L, Glazener T. The Regenstrief medical record system: a quarter century experience. International Journal of Medical Informatics. 1999;54:225–253. doi: 10.1016/s1386-5056(99)00009-x. [DOI] [PubMed] [Google Scholar]
  • 2.Hersh WR. Medical informatics: improving health care through information. Jama. 2002;288:1955. doi: 10.1001/jama.288.16.1955. [DOI] [PubMed] [Google Scholar]
  • 3.Payne PRO, Johnson SB, Starren JB, Tilson HH, Dowdy D. Breaking the translational barriers: the value of integrating biomedical informatics and translational research. Journal of Investigative Medicine. 2005;53:192. doi: 10.2310/6650.2005.00402. [DOI] [PubMed] [Google Scholar]
  • 4.Fliss A, Ragolsky M, Rubin E. Reverse Translational Bioinformatics: A Bioinformatics Assay Of Age, Gender And Clinical Biomarkers. In: Butte A, editor. 2008 Summit on Translational Bioinformatics. San Francisco, CA: 2008. [PMC free article] [PubMed] [Google Scholar]
  • 5.Tanner JM. Growth at adolescence, with a general consideration of the effects of hereditary and environmental factors upon growth and maturation from birth to maturity. 2d ed. Oxford: Blackwell Scientific Publications; 1962. [Google Scholar]
  • 6.Kulin HE, Bwibo N, Mutie D, Santner SJ. The effect of chronic childhood malnutrition on pubertal growth and development. Am J Clin Nutr. 1982;36:527–536. doi: 10.1093/ajcn/36.3.527. [DOI] [PubMed] [Google Scholar]
  • 7.Chaning-Pearce SM, Solomon L. A longitudinal study of height and weight in black and white Johannesburg children. S Afr Med J. 1986;70:743–746. [PubMed] [Google Scholar]
  • 8.Demirjian A, Goldstein H, Tanner JM. A new system of dental age assessment. Hum Biol. 1973;45:211–227. [PubMed] [Google Scholar]
  • 9.Demirjian A, Goldstein H. New systems for dental maturity based on seven and four teeth. Ann Hum Biol. 1976;3:411–421. doi: 10.1080/03014467600001671. [DOI] [PubMed] [Google Scholar]
  • 10.Eveleth PB, Tanner JM. Worldwide variation in human growth. 2nd ed. Cambridge [England]; New York: Cambridge University Press; 1990. [Google Scholar]
  • 11.Greulich WW, Pyle SI. Radiographic atlas of skeletal development of the hand and wrist [print] 2nd ed. Stanford, CA: Stanford University Press; 1959. [Google Scholar]
  • 12.Tanner JM. Assessment of skeletal maturity and prediction of adult height (TW2 method) [print] London; New York: Academic Press; 1975. [Google Scholar]
  • 13.Mora S, Boechat MI, Pietka E, Huang HK, Gilsanz V. Skeletal age determinations in children of European and African descent: applicability of the Greulich and Pyle standards. Pediatr Res. 2001;50:624–628. doi: 10.1203/00006450-200111000-00015.
  • 14.Maber M, Liversidge HM, Hector MP. Accuracy of age estimation of radiographic methods using developing teeth. Forensic Sci Int. 2006;159(Suppl 1):S68–S73. doi: 10.1016/j.forsciint.2006.02.019.
  • 15.Lee CK, Klopp RG, Weindruch R, Prolla TA. Gene expression profile of aging and its retardation by caloric restriction. Science. 1999;285:1390. doi: 10.1126/science.285.5432.1390.
  • 16.Sohal RS, Weindruch R. Oxidative stress, caloric restriction, and aging. Science. 1996;273:59. doi: 10.1126/science.273.5271.59.
  • 17.McCay CM, Crowell MF, Maynard LA. The Effect of Retarded Growth Upon the Length of Life Span and Upon the Ultimate Body Size: One Figure. Journal of Nutrition. 1935;10:63.
  • 18.Colman RJ, Beasley TM, Allison DB, Weindruch R. Attenuation of sarcopenia by dietary restriction in rhesus monkeys. Journals of Gerontology Series A: Biological and Medical Sciences. 2008;63:556. doi: 10.1093/gerona/63.6.556.
  • 19.Anderson RM, Shanmuganayagam D, Weindruch R. Caloric restriction and aging: studies in mice and monkeys. Toxicol Pathol. 2009;37:47–51. doi: 10.1177/0192623308329476.
  • 20.Kulin HE, Reiter EO. Gonadotropins during childhood and adolescence: a review. Pediatrics. 1973;51:260–271.
  • 21.Forest MG, Peretti E, Bertrand J. Hypothalamic-pituitary-gonadal relationships in man from birth to puberty. Clinical Endocrinology. 1976;5:551–569. doi: 10.1111/j.1365-2265.1976.tb01985.x.
  • 22.Themmen APN, Huhtaniemi IT. Mutations of gonadotropins and gonadotropin receptors: elucidating the physiology and pathophysiology of pituitary-gonadal function. Endocrine Reviews. 2000;21:551–583. doi: 10.1210/edrv.21.5.0409.
  • 23.Juul A, Bang P, Hertel NT, Main K, Dalgaard P, Jorgensen K, Muller J, Hall K, Skakkebaek NE. Serum insulin-like growth factor-I in 1030 healthy children, adolescents, and adults: relation to age, sex, stage of puberty, testicular size, and body mass index. Journal of Clinical Endocrinology & Metabolism. 1994;78:744–752. doi: 10.1210/jcem.78.3.8126152.
  • 24.Juul A, Flyvbjerg A, Frystyk J, Muller J, Skakkebaek NE. Serum concentrations of free and total insulin-like growth factor-I, IGF binding proteins-1 and-3 and IGFBP-3 protease activity in boys with normal or precocious puberty. Clinical Endocrinology. 1996;44:515–523. doi: 10.1046/j.1365-2265.1996.711531.x.
  • 25.Lofqvist C, Andersson E, Gelander L, Rosberg S, Blum WF, Albertsson Wikland K. Reference values for IGF-I throughout childhood and adolescence: a model that accounts simultaneously for the effect of gender, age, and puberty. J Clin Endocrinol Metab. 2001;86:5870–5876. doi: 10.1210/jcem.86.12.8117.
  • 26.Leger J, Mercat I, Alberti C, Chevenne D, Armoogum P, Tichet J, Czernichow P. The relationship between the GH/IGF-I axis and serum markers of bone turnover metabolism in healthy children. European Journal of Endocrinology. 2007;157:685. doi: 10.1530/EJE-07-0402.
  • 27.Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention.
  • 28.Efron B, Hastie T, Johnstone I, Tibshirani R. Least angle regression. Annals of Statistics. 2004;32:407–451.
  • 29.Chuang-Stein PC. Laboratory data in clinical trials: a statistician’s perspective. Controlled Clinical Trials. 1998;19:167–177. doi: 10.1016/s0197-2456(97)00123-2.
  • 30.Royston P, Matthews JNS. Estimation of reference ranges from normal samples. Statistics in Medicine. 1991;10. doi: 10.1002/sim.4780100503.
  • 31.Olkin I, Hotelling H. Contributions to probability and statistics; essays in honor of Harold Hotelling. Stanford, Calif.: Stanford University Press; 1960.
  • 32.Scholz FW, Stephens MA. K-sample Anderson-Darling tests. Journal of the American Statistical Association. 1987:918–924.
  • 33.Daniel WA. Hematocrit: maturity relationship in adolescence. Pediatrics. 1973;52:388–394.
  • 34.Bao W, Dalferes ER Jr, Srinivasan SR, Webber LS, Berenson GS. Normative distribution of complete blood count from early childhood through adolescence: the Bogalusa Heart Study. Preventive Medicine. 1993;22:825. doi: 10.1006/pmed.1993.1075.
  • 35.Fischbach FT, Dunning MB. A Manual of laboratory and diagnostic tests. 7th ed. Philadelphia: Lippincott Williams & Wilkins; 2004.
  • 36.Bennett DL, Ward MS, Daniel WA Jr. The relationship of serum alkaline phosphatase concentrations to sex maturity ratings in adolescents. The Journal of Pediatrics. 1976;88:633. doi: 10.1016/s0022-3476(76)80025-x.
  • 37.Schwartz GJ, Haycock GB, Spitzer A. Plasma creatinine and urea concentration in children: normal values for age and sex. The Journal of Pediatrics. 1976;88:828. doi: 10.1016/s0022-3476(76)81125-0.
  • 38.Ritchie RF, Palomaki GE, Neveux LM, Navolotskaia O. Reference distributions for immunoglobulins A, G, and M: a comparison of a large cohort to the world’s literature. Journal of Clinical Laboratory Analysis. 1998;12. doi: 10.1002/(SICI)1098-2825(1998)12:6<371::AID-JCLA7>3.0.CO;2-T.
  • 39.Chen DP, Weber SC, Constantinou PS, Ferris TA, Lowe HJ, Butte AJ. Novel integration of hospital electronic medical records and gene expression measurements to identify genetic markers of maturation. Pac Symp Biocomput. 2008:243–254.
