2019 Aug 7;2(8):e198719. doi: 10.1001/jamanetworkopen.2019.8719

Figure 1. Overview of the Pulmonary Embolism (PE) Prediction Pipeline.

A, Pulmonary Embolism Result Forecast Model (PERFORM): the proposed workflow begins with all raw structured electronic medical record (EMR) data from the 1 year before the encounter, which are arranged as a timeline into feature vectors. A machine learning model is then trained on the feature vectors, labeled with ground truth PE outcome data, to yield a model capable of predicting the probability of PE for an unseen patient from the holdout internal and external data sets. The model can be applied to provide a risk score for clinical decision support in patients referred for computed tomographic (CT) imaging for PE. B, Overview of the temporal feature engineering (each type of EMR data is color coded), with example encounters of 2 patients showing how the features are computed while preserving temporal sequence. CTA indicates CT angiography; ED, emergency department; ICD, International Classification of Diseases; and PERC, Pulmonary Embolism Rule-out Criteria.
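
The temporal feature engineering described in panel B can be illustrated with a minimal sketch. This is not the authors' implementation. The bin edges, the placeholder codes (`DX_A`, `RX_B`), the function name `temporal_features`, and the choice to exclude events on the encounter day are all assumptions made for illustration. Only the 1-year lookback window comes from the caption. The sketch counts each code per time bin and concatenates the bins in order, so that the resulting vector preserves temporal sequence.

```python
from collections import Counter
from datetime import date

LOOKBACK_DAYS = 365  # the caption's 1-year window before the encounter
# Hypothetical bin edges in days before the encounter (not from the source).
BIN_EDGES = [0, 30, 90, 180, 365]


def temporal_features(events, encounter_date, vocabulary):
    """Count each code per time bin, ordered from most recent to oldest bin.

    events: iterable of (event_date, code) pairs for one patient.
    vocabulary: ordered list of codes that defines the feature columns.
    """
    n_bins = len(BIN_EDGES) - 1
    counts = [Counter() for _ in range(n_bins)]
    for event_date, code in events:
        days_before = (encounter_date - event_date).days
        # Assumption: keep only events strictly before the encounter
        # and no older than the lookback window.
        if not 0 < days_before <= LOOKBACK_DAYS:
            continue
        for i in range(n_bins):
            if BIN_EDGES[i] < days_before <= BIN_EDGES[i + 1]:
                counts[i][code] += 1
                break
    # Concatenate per-bin counts so that bin order (time) is preserved.
    return [counts[i][code] for i in range(n_bins) for code in vocabulary]


# Made-up example patient with placeholder codes.
encounter = date(2019, 1, 1)
events = [
    (date(2018, 12, 20), "DX_A"),  # 12 days before: bin (0, 30]
    (date(2018, 10, 15), "RX_B"),  # 78 days before: bin (30, 90]
    (date(2018, 3, 1), "DX_A"),    # 306 days before: bin (180, 365]
    (date(2017, 6, 1), "DX_A"),    # older than 1 year: dropped
]
vocabulary = ["DX_A", "RX_B"]
vector = temporal_features(events, encounter, vocabulary)
print(vector)  # [1, 0, 0, 1, 0, 0, 1, 0]
```

Vectors of this kind, paired with PE outcome labels, could then be passed to any standard classifier. The caption does not specify the model type, so none is assumed here.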