2012 Nov 3;2012:436–445.

Figure 1.

A graphical overview of the data-extraction process. From the data warehouse, which contains six types of records, and from the genetic data, we extract a history for each patient. We represent each patient history as a vector of binary variables. Each diagnosis, procedure, etc. is represented by a pair of variables: one indicating a recent occurrence and one indicating an occurrence that is not necessarily recent. We vary the representations considered by optionally (i) using a set of curated risk factors to limit the variables included in each patient vector, and (ii) adding variables that represent the genetic profile of each patient.
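
The encoding described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_patient_vector`, the `_ever`/`_recent` suffixes, the `gene_` prefix, and the 365-day window for "recent" are all assumptions made for illustration, since the caption does not specify them.

```python
from datetime import date, timedelta


def build_patient_vector(events, vocabulary, index_date,
                         recent_days=365, risk_factors=None, genetic=None):
    """Encode one patient history as a dict of binary variables.

    events       -- list of (code, date) pairs from the patient's records
    vocabulary   -- all codes that may appear as variables
    index_date   -- date at which the history is summarised
    recent_days  -- ASSUMED width of the "recent" window (not from the source)
    risk_factors -- optional set of curated codes; if given, only these are kept
    genetic      -- optional dict of binary genetic variables to append
    """
    cutoff = index_date - timedelta(days=recent_days)

    # Option (i): restrict the variables to the curated risk factors.
    codes = [c for c in vocabulary if risk_factors is None or c in risk_factors]

    vector = {}
    for code in codes:
        dates = [d for c, d in events if c == code and d <= index_date]
        # Pair of variables: occurred at any time vs. occurred recently.
        vector[f"{code}_ever"] = int(bool(dates))
        vector[f"{code}_recent"] = int(any(d >= cutoff for d in dates))

    # Option (ii): append variables describing the genetic profile.
    if genetic is not None:
        for name, value in genetic.items():
            vector[f"gene_{name}"] = int(bool(value))

    return vector
```

Called with `risk_factors=None` and `genetic=None`, the function produces the full clinical representation; passing either argument yields the restricted or genetically augmented variants that the caption describes.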