Table 1.
Experimental design and statistical considerations for maximizing the likelihood of reproducible biomarker discovery findings.
Study Design Step | Statistical Considerations |
---|---|
Study Goal | Define the population in which the protein marker will be used. Define the purpose of the protein marker (e.g., early detection or disease monitoring). |
Specimens | Select case and control specimens randomly. Select from a specimen biobank that was collected prospectively, before disease status was known. Select specimens from the relevant time point in the disease course. Avoid convenience samples. Avoid pooling specimens. |
Study Design | Plan sufficient sample size for discovery in light of realistic expected differences. Randomize specimens to assay run order. |
Differential Abundance Detection | Assess protein difference signals relative to variation. Incorporate statistical design into the analysis model. Apply correction for multiple comparisons. |
Panel or Signature Model Building | Finalize analysis plan in writing prior to beginning analyses. Employ optimism correction methods. Generate a fixed, locked-down algorithm. |
Validation | Perform verification of initial protein marker identifications. Perform internal model validation in the discovery sample set. Perform external model validation in an independent sample set. |
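
Two of the considerations above, randomizing specimens to assay run order and correcting for multiple comparisons, can be illustrated with a minimal Python sketch. The table does not name a specific correction method; the Benjamini-Hochberg false discovery rate procedure is used here only as one common choice, and the function names, specimen IDs, seed, and p-values are all hypothetical.

```python
import random


def randomize_run_order(specimen_ids, seed):
    """Return the specimen IDs in a reproducible random assay run order.

    A fixed seed (hypothetical here) lets the run order be documented
    and regenerated later.
    """
    rng = random.Random(seed)
    order = list(specimen_ids)
    rng.shuffle(order)
    return order


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used as an example correction; the source table
    only says to apply a correction for multiple comparisons.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    ranked = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = ranked[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical specimen IDs: cases and controls mixed before shuffling.
    specimens = ["case_01", "case_02", "ctrl_01", "ctrl_02"]
    print(randomize_run_order(specimens, seed=2024))

    # Hypothetical per-protein p-values from a differential abundance test.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Shuffling cases and controls together, rather than running one group after the other, keeps drift in instrument performance from being confounded with disease status. The adjusted p-values would then be compared against the false discovery rate threshold chosen in the written analysis plan.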