Abstract
Recent studies have documented the importance of individuality and heterogeneity in care planning. In practice, varying behavioral responses are revealed in patients’ care management (CM) records. However, today’s care programs are structured around population-level evidence. What if care managers could take advantage of these revealed behavioral responses for personalization? The goal of this study is thus to quantify behavioral responses from CM records to inform individual-level intervention decisions. We present a Behavioral Response Inference Framework (BRIeF) for understanding the differential behavioral responses that are key to effective care planning. We analyze CM records from a healthcare network over a 14-month period and obtain a set of 2,416 intervention-goal attainment records. Promising results demonstrate that the individual-level care planning strategies learned from practice by BRIeF outperform population-level strategies, yielding significantly more accurate intervention recommendations for goal attainment. To our knowledge, this is the first study to learn practice-based evidence from CM records for care planning, suggesting that increased patient behavioral understanding could benefit augmented intelligence for care management decision support.
Keywords: Care management, personalization, causal inference, metric learning, patient-centered care, augmented intelligence
Introduction
Care management is inherently complex, affecting more than a quarter of the worldwide population, who stand to benefit from care delivery models that incorporate behavioral understanding1,2,3,4. The National Academy of Medicine recently reviewed 14 evidence-based care delivery models and indicated the need to support patient individuality and heterogeneity5,6. In fact, intervention decisions in care management are never black or white. Two patients with similar medical conditions are likely to have different behavioral barriers to goal attainment and would, in turn, respond differently to the same intervention. Care delivery models must accommodate behavioral understanding to be effective.
However, existing care management (CM) programs follow structured protocols and guidelines that advise care managers on goals and interventions for target patients before and during calls, without accounting for individual differences in behavioral responses. While structured programs have had prior success in reducing readmission risk and improving chronic disease management, they apply evidence generated under “control” conditions at the population level. Given an increasing demand to explore patient responses under “real-world” conditions, a common solution is to design surveys, such as the Patient Health Engagement Scale presented in Graffigna (2015)7, to tailor interventions to individual needs.
In this study, we aim to explore how to automatically quantify patient behavioral responses directly from observational data. In a more general sense, this search for real-world evidence is in a similar vein to efforts that leverage observational data as an imperfect continuous panel survey in the field of public health8, as well as those that differentiate behavioral expressions through Behavioral Signal Processing in the field of systems engineering9. In the field of health informatics, few studies have attempted to quantify behavioral responses to interventions, and those that do focus mainly on insurance claim data. For example, Hawkins (2015) studied how to estimate patients’ “propensity to succeed” for data-driven referral, i.e., identifying which patients in a high-risk care management program are likely to benefit from being referred for additional Medicare interventions10.
Because we expect CM records themselves to contain practice-based knowledge regarding how patients respond to interventions differently, in this study we focus on learning from the CM records. In practice, providers have been keeping the records of CM-patient interactions for population health management and administrative purposes11.
More recently, the secondary use of CM records has been investigated to fuel care coordination applications such as shared care plan generation and best practice learning5,12,13. For our goal of quantifying behavioral responses to interventions, the CM records capture not only what interventions have been assigned to whom at what time, but also whether the implementation of each assigned intervention has led to positive goal attainment. These records can potentially help us to evaluate the intervention effectiveness in patients with different behavioral characteristics.
In this paper, we then set out to explore the potential of generating practice-based evidence from CM records and to test the hypothesis that individual-level care planning (as compared to population-level care planning) can better help suggest goal-attaining interventions. To test the hypothesis, we collect real-life goal attainment records and integrate machine learning approaches with a potential outcome framework14,15,16 to estimate population- and individual-level effects over multiple interventions for evaluation. The estimations are used to predict the likelihood of goal attainment, and the resulting recommendations are evaluated against the ground truth as observed in real life. The results are expected to increase patient behavioral understanding and engagement, while further facilitating care managers in care plan personalization for effective goal attainment.
Background
The task of care plan personalization entails several methodological challenges. First, how can multiple interventions be systematically compared in terms of helping with outcomes? One common approach is a cross-over experiment: a number of subjects are given the interventions of interest on separate occasions, and proxy outcome measures are compared to determine their effectiveness20. However, experimenting on all interventions in a randomized controlled trial is not feasible and, in some cases, unethical. More recently, researchers have been incorporating the principle of the single-subject experiment21 into N-of-1 trials for precision medicine22. In the field of oncology, N-of-1 trials are applied to make treatment choices based on differential patient responses, e.g., the US National Cancer Institute’s MATCH trial23 and the Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination (BATTLE) trial24.
In domains where cross-over experiments are impractical, researchers have been exploring the idea of emulating experiments from observational data. However, the problems of confounding and selection bias persist in observational data-driven approaches. To tackle these problems, on the one hand, a variety of statistical inference methods have been proposed to capture and validate the causal structure among risk factors25,26,27. On the other hand, the potential outcome framework14,15,16 has led to the development of a large family of causal inference methods that estimate the difference in counterfactual outcomes that would occur if a patient received an intervention versus if not. Covariate adjustment methods (e.g., g-computation25) and data adjustment methods (e.g., propensity scoring28, doubly robust estimation29, targeted likelihood estimation30) were developed to estimate “as-treated” intervention effects on observational data in the presence of non-response31. Based on these estimates, alternative interventions can then be compared. For example, this is how public health researchers compared multiple smoking cessation policies, ranging from education, tax increases, and an indoor smoking ban to cessation service offerings32. In addition, epidemiologists have extended the framework to assess the risk ratio of one medication over another, e.g., for antiepileptic drugs33.
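To make the covariate adjustment idea concrete, the sketch below illustrates g-computation in its simplest form on simulated data: fit an outcome model that includes the intervention flag, then contrast the model's predictions with the flag set to 1 versus 0 for every unit. This is a generic illustration rather than the method of any cited study; the function name and simulated data are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def g_computation_ate(X, d, y):
    """Average intervention effect via outcome-model standardization:
    predict everyone's outcome under intervention (d=1) and under
    control (d=0), then take the difference of the means."""
    model = LogisticRegression(max_iter=1000)
    model.fit(np.column_stack([X, d]), y)
    ones, zeros = np.ones(len(X)), np.zeros(len(X))
    p1 = model.predict_proba(np.column_stack([X, ones]))[:, 1]
    p0 = model.predict_proba(np.column_stack([X, zeros]))[:, 1]
    return p1.mean() - p0.mean()

# Simulated confounded data: units with higher x are both more likely
# to receive the intervention and more likely to attain the outcome.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
d = (rng.random(n) < 1.0 / (1.0 + np.exp(-X[:, 0]))).astype(float)
p_y = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] + 1.5 * d)))
y = (rng.random(n) < p_y).astype(int)
ate = g_computation_ate(X, d, y)
```

Because the outcome model conditions on the confounding covariate, the standardized contrast recovers an effect estimate close to the true value even though treatment assignment was biased.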
While these studies aim to establish reliable estimators of the average intervention effect at the population level, in this study we care more about individual-level estimation: how can observational data be leveraged to estimate the intervention effect for each target patient? Although past studies have applied population-average estimates to individual intervention decisions34, they have also indicated the limitations of doing so. After all, we cannot observe parallel universes in which each alternative intervention is given to the same patient. Furthermore, we do not usually have sufficient counts in observational data to emulate a controlled trial for individual intervention effect estimation35.
To investigate individualized intervention effects, this study integrates machine-learning approaches with the potential outcome framework. The aim is to develop models of conditional probability distributions for individual-level effect estimation. Although individual treatment effect estimation is not novel35,36,37,38,39,40, to our knowledge, this is the first study of individualized intervention recommendation based on patient behavioral preferences revealed in CM history, and the first to test the hypothesis that an individual-level care planning strategy is preferable to population-level strategies in terms of recommendation accuracy for goal attainment.
Study Overview and Methodological Framework
Existing CM solutions do not support individualized intervention recommendation based on patient behavioral preferences revealed in CM history. As a first step, we empirically experiment with approaches that can quantify behavioral responses across multiple interventions and recommend interventions individually. The rest of the paper is organized as follows: (1) data collection and goal attainment record extraction; (2) problem formulation for patient behavioral response understanding and multi-intervention ranking; (3) evaluation of care planning strategies.
Data Collection
We analyze the goal attainment history in care management records from a private, not-for-profit healthcare network in the southeastern United States over a 14-month period and obtain 2,416 goal attainment records from patients’ care plans (“the GOAL dataset”). The GOAL dataset was collected between January 2016 and February 2017 for patients who had recently been discharged from an acute hospital admission and assigned to a transitional care program whose objective is to reduce hospital readmissions. Care managers follow a patient’s care plan to set goals that prompt certain actions, e.g., improving medication and discharge instruction adherence, addressing patient health and safety concerns, promoting health behavior change, and reinforcing the self-management regimens of chronic diseases. During the data curation process, we removed records without a clear interpretation, e.g., those with interventions that closed after the end date of an assigned goal and those that were still open at the time of data collection. Table 1 summarizes the characteristics of the GOAL dataset. Program experience is defined as the number of care program assignments to a patient in the data collection period; days in the program is the number of days a patient has stayed in the program; CM call on which weekday indicates the percentage of calls made on a specific weekday (Monday through Friday).
Table 1.
A summary of the GOAL dataset in this study.
| Characteristic | Value |
|---|---|
| Age, median (25%–75%) | 66 (56–75) |
| Female (%) | 46.58% |
| Program experience, median (25%–75%) | 1 (1–2) |
| Days in the program, median (25%–75%) | 13 (11–16) |
| CM call on which weekday (%) | |
| Monday | 36.26% |
| Tuesday | 34.08% |
| Wednesday | 27.18% |
| Thursday | 25.21% |
| Friday | 19.89% |
Using the dataset, we identify 16 goals covering a wide spectrum of care needs. The assigned goals exhibit widely varying goal attainment rates. Here, the goal attainment rate is defined as the percentage of assigned goals that reached the “met” status. Figure 1 shows the distribution of goal attainment status across goals. Some goals are similar in nature and can be further grouped into six focus areas: Education (e.g., post-discharge understanding), Medication (e.g., adherence), Reducing Risk (e.g., resolving care gaps), Self-care (e.g., heart failure home self-management), Implementation (e.g., installing fall prevention facilities), and Others (e.g., obtaining accurate patient information).
Figure 1.
Distribution of goal attainment status across a set of 27 goals in the GOAL dataset.
Goal Attainment Record Extraction
Similarly, in the dataset we identify 131 interventions and group them into six categories, where each category denotes a type of care coordination activity that can be used to address the intervention: Referral (e.g., referral to see a nutritionist for diabetic diet education), Education (e.g., educate patients on the importance of physical activity), Coordination (e.g., follow up with providers on refills), Screening (e.g., assess breathing symptoms), Coaching (e.g., provide a log for side effect recording), and Other (including following up on provider treatment).
To operationalize the goal attainment measure, we create a binary flag as the proxy outcome label: 1 if the goal is marked ‘met’ in the record, and 0 otherwise. Then, for each of the six intervention categories, we calculate its goal attainment success rate as the percentage of goal assignments that are successfully attained after implementing an intervention in the category. Table 2 shows the overall goal attainment rate of each of the six categories, revealing a drastic difference in effectiveness across interventions: Referral and Education reach over 90%, Coaching interventions are around 50%, and Screening interventions yield a success rate as low as 37%.
Problem formulation for patient behavioral response understanding
In this work, we develop the Behavioral Response Inference Framework (BRIeF), which applies the potential outcome framework for population-level and individual-level intervention effect estimation. (See Figure 3 for the schematic framework within a larger, general behavioral learning pipeline.) Formally, the observed goal attainment outcome of a patient i is denoted as Yi ∈ {0,1}, with a k-column vector of individual and contextual covariates Xi and an intervention indicator Di ∈ {0,1}, where Di = 0 represents the control group and Di = 1 represents the intervention group. The covariates include not only demographic information (such as age and gender), but also patient care program context (such as program experience and days in the program) and the interactions between care managers and patients (such as the day of the recorded call). To avoid collinearity, highly correlated covariates (with correlation coefficients greater than 0.8) are pre-filtered. The individual intervention effect τ(Xi) is then defined as the expected effect of being treated with an intervention versus not being treated, for an individual unit with observed covariates x:
τ(x) = E[ Yi(1) − Yi(0) | Xi = x ]   (1)
Figure 3.
BRIeF Framework in a Behavioral Learning pipeline
where Yi(0) represents the outcome of Patient i in the control group, and Yi(1) represents the outcome of the same patient in the intervention group. However, as Yi(1) and Yi(0) cannot be observed at the same time, they are estimated by sampling from p(Yt | x, t) and taking the difference in the mean outcomes of the samples under both conditions. In its simplest form, a logistic regression formulation is applied:
logit P(Yi = 1 | Xi, Di) = β0 + βᵀXi + γDi   (2)
where Di ∈ {0,1}, with Di = 0 representing Patient i in the control group and Di = 1 representing Patient i in the intervention group. For each intervention, we quantify the behavioral response as the likelihood of goal attainment after implementing the target intervention. Propensity score analysis is applied to calculate the conditional probability of receiving the intervention given the observed covariates. For covariate selection, we follow Imbens and Rubin (2015)16 and perform a sequence of likelihood ratio tests to select the covariates and higher-order terms for inclusion in the propensity score analysis. For effect estimation, we modify Austin (2011)28’s doubly robust weighting procedure to derive a pseudo-population by adjusting data unit weights, where the weight for Patient i is 1/ê(Xi) if Patient i is in the intervention group and 1/(1 − ê(Xi)) if Patient i is in the control group, with ê(Xi) denoting the estimated propensity score. The procedure has been shown in past research to be robust to the problem of model misspecification.
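The weighting step can be sketched as follows, assuming the standard inverse-probability-of-treatment weights that the pseudo-population construction implies (1/e(X) for intervention units, 1/(1 − e(X)) for controls, with e(X) the estimated propensity score). This is a simplified illustration, not the authors' implementation; the function names are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def iptw_weights(X, d):
    """Fit a propensity model e(X) = P(D=1 | X) and return the
    pseudo-population weights: 1/e(X) for the intervention group,
    1/(1 - e(X)) for the control group."""
    e = LogisticRegression(max_iter=1000).fit(X, d).predict_proba(X)[:, 1]
    e = np.clip(e, 0.01, 0.99)  # guard against extreme weights
    return np.where(d == 1, 1.0 / e, 1.0 / (1.0 - e))

def weighted_effect(d, y, w):
    """Weighted difference in mean goal attainment between groups."""
    t = d == 1
    return np.average(y[t], weights=w[t]) - np.average(y[~t], weights=w[~t])
```

On data where intervention assignment depends on the covariates, the weighted contrast recovers an estimate much closer to the true effect than the raw difference in group means, which is the sense in which the pseudo-population emulates a controlled trial.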
Problem formulation for multi-intervention recommendation
Once the population- and individual-level effects are estimated, the computed quantities can be used in the next step of BRIeF as the basis for comparing multiple interventions at each point of care. We then feed the estimated intervention effects (X) and goal attainment outcomes (y) into a Gaussian Naïve Bayes classifier to train our models for predicting behavioral response (as a binary flag of goal attainment) in each of the six intervention categories. At each point of care, when multiple interventions are available as options for goal attainment for a target patient, the system indicates whether each of the available interventions is predicted to be effective for this patient given the individual-level intervention effect estimation. A care manager can then use the predictions to determine which interventions would have better chances of helping the target patient achieve his or her care plan goal.
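A minimal sketch of this per-category prediction step, assuming one Gaussian Naïve Bayes model per intervention category (scikit-learn's GaussianNB); the data layout and function names are hypothetical, not the paper's API.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def fit_category_models(training_data):
    """Train one GNB model per intervention category.
    `training_data` maps category name -> (features X, labels y),
    where y is the binary goal attainment flag."""
    return {cat: GaussianNB().fit(X, y) for cat, (X, y) in training_data.items()}

def recommend(models, x):
    """For one patient with feature vector x, flag each available
    intervention category as predicted to attain the goal (1) or not (0)."""
    return {cat: int(m.predict(x.reshape(1, -1))[0]) for cat, m in models.items()}
```

A care manager facing several candidate interventions would read the resulting dictionary as a per-category go/no-go signal rather than a single forced choice.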
Evaluation of care planning strategies
To establish the baseline and test the hypothesis in an experiment, we empirically evaluate the following three care planning strategies on the GOAL dataset: (1) population-based effectiveness (BASELINE), (2) population-level BRIeF estimation (BRIeF-Pop), and (3) individual-level BRIeF estimation (BRIeF-Ind).
Population-level Effectiveness (Recommendation based on Population-based Effectiveness): To set up the baseline, we first assume a one-size-fits-all strategy (“no care planning”), by which care managers would always recommend the best-performing intervention category given population-level evidence. The best-performing intervention is determined by the effectiveness shown in previous records at the population level.
Population-level BRIeF (Personalization based on Population-level Estimation): With the baseline established, the next population-level care planning strategy is then achieved by BRIeF-generated, population-level behavioral response prediction (BRIeF-Pop). Simply put, this strategy guides multi-intervention recommendation with the predictions from a model that is trained with the input of population-level estimates.
Individual-level BRIeF (Personalization based on Individual-level Estimation): The next strategy is individual-level care planning, which is achieved by BRIeF-generated, individual-level behavioral response prediction (BRIeF-Ind). This strategy guides the process of multi-intervention recommendation with the predictions from a model that is trained with the input of individual-level estimates.
During the evaluation phase, we apply 5-fold cross validation to the task of predicting whether a goal would be achieved given the recommended intervention; the cross-validation setting also guards against over-fitting. The interventions are annotated with the BRIeF-generated behavioral response predictions. Based on the trained model, a binary decision of whether an intervention would reach goal attainment is assigned to each test record, and the performance of the model’s predictions on unseen records is evaluated. In each fold, 4/5 of the GOAL dataset is used to train the behavioral response prediction models, and the remaining 1/5 is used to retrospectively evaluate the choice of an intervention against the subsequent goal attainment status. A weighted average of the evaluation measures is taken to aggregate multiple intervention results at the strategy level.
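The fold structure described above can be sketched as follows, with scikit-learn assumed and the function standing in for the actual pipeline:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

def cross_validated_accuracy(X, y, n_splits=5, seed=0):
    """5-fold cross validation: train on 4/5 of the records, score the
    held-out 1/5, and aggregate fold scores with a weighted average
    by fold size."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores, sizes = [], []
    for train_idx, test_idx in folds.split(X):
        model = GaussianNB().fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))
        sizes.append(len(test_idx))
    return float(np.average(scores, weights=sizes))
```

Weighting by fold size mirrors the paper's strategy-level aggregation, where folds (and intervention categories) of different sizes should not count equally.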
Results
This section reports on results from (1) the BRIeF-generated intervention effects and (2) the evaluation of the three care planning strategies in terms of the evaluation metrics of person-specific recommendation prediction.
BRIeF-generated intervention effect
By applying the BRIeF framework to the GOAL dataset of 2,416 goal-intervention records, we estimated intervention effects after adjusting for covariate and data bias and made predictions of patient behavioral response in terms of goal attainment. (Please note that although the adjustment is achieved in a causal analysis fashion, we are not claiming to have performed a causal analysis under a strong ignorability assumption41.)
Based on the estimated intervention effects, we identified a new ranking of goal attainment effectiveness over the six intervention categories. Figure 2 reports the average estimates of intervention effects on goal attainment under BRIeF. The estimated effects are averaged over goal attainment records that received the same target intervention, using the control condition for comparison. The y-axis shows the ranked order of intervention categories, and error bars indicating 95 percent confidence intervals are added to each point.
Figure 2.
Sorted BRIeF-generated intervention effect toward goal attainment across intervention categories
Figure 2 suggests that patients have better success in achieving goals when an Education intervention is implemented, but struggle with Screening and Coaching interventions. This is similar to the population-level effectiveness ranking before adjustment (cf. Table 2). What is distinctively different between the two rankings is that the top-ranked Referral intervention in Table 2 is not as effective after introducing the bias reduction adjustment procedures. In fact, if we consider only population-level effectiveness, a naïve “no care planning” strategy would always suggest the Referral intervention, since this is the most effective intervention category based on the population-level evidence. However, without BRIeF, this naïve care planning strategy is likely to lead to disappointing results in goal attainment (as suggested by the after-the-fact practice-based evidence in Figure 2). This confirms the need for better behavioral understanding in care planning.
Table 2.
Goal attainment rates sorted across the six categories of interventions that have been recommended to patients by care managers to help meet patients’ care plan goals.
| Intervention Category | Population-level Goal Attainment Effectiveness (% of successfully attained goals) |
|---|---|
| Referral | 93.65% |
| Education | 92.64% |
| Coordination | 62.86% |
| Other | 52.94% |
| Coaching | 50.00% |
| Screening | 37.09% |
Evaluation of the care planning strategies on the recommendation task
To further validate and examine how BRIeF can help with care planning, we evaluate the three strategies on a real-life task: recommending interventions for goal attainment at the point of care. The recommendation accuracy is again measured under 5-fold cross-validation. In each fold, we apply the BRIeF framework to estimate effects at both the population and individual levels, and then identify which intervention to recommend to each target patient at the point of care. We feed the estimates into the Gaussian Naïve Bayes (GNB) classifier to train our models for behavioral response prediction for each of the six intervention categories. We also considered a non-parametric Support Vector Machine (SVM) classifier, but since it yielded results similar to the GNB classifier, we chose the computationally efficient GNB. The resulting recommendations are compared with the observed ground truth. The goal attainment rate is measured in terms of accuracy, i.e., the percentage of correct goal attainment predictions when following the BRIeF recommendations for care planning, along with other metrics including precision, recall, and F1 score.
To compute the baseline, the best-performing intervention ranked by the population-level evidence in the training set is considered the optimal intervention for all test cases. Hence, the predicted outcome is 1 for test cases where the applied intervention is the optimal intervention learned from the training data, and 0 otherwise. The predicted outcomes of the test data are compared with the observed outcomes to benchmark the performance.
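The baseline computation described above amounts to the following sketch, with list-based inputs for clarity; the function name is invented.

```python
import numpy as np

def baseline_predictions(train_cats, train_outcomes, test_cats):
    """One-size-fits-all baseline: find the intervention category with
    the highest goal attainment rate in the training fold, then predict
    attainment (1) only for test records that used that category."""
    rates = {c: float(np.mean([o for cat, o in zip(train_cats, train_outcomes)
                               if cat == c]))
             for c in set(train_cats)}
    best = max(rates, key=rates.get)
    return [1 if c == best else 0 for c in test_cats]
```

Because every test record not using the single "optimal" category is predicted 0, this baseline is bound to miss most attained goals, which is exactly the low recall reported in Table 3.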
Table 3 shows the evaluation metrics of the three strategies proposed for this experiment. Among them, the care planning strategy based on individual-level estimation outperforms both population-level strategies, yielding 87.24% correct recommendations. This validates our initial hypothesis that an individual-level care planning strategy is preferable. The baseline has high precision because the intervention with the highest goal attainment rate is always applied. The strong disadvantage of applying a “one-size-fits-all” rule based on population-level evidence is reflected in the very low accuracy and recall of the baseline approach.
Table 3.
Performance of the three strategies of intervention recommendation on goal attainment.
| Care Planning Strategy | Precision (%) | Recall (%) | F1 Score (%) | Accuracy (%) |
|---|---|---|---|---|
| Baseline (no care planning) | 93.29 | 18.26 | 30.36 | 28.98 |
| BRIeF-Pop personalization | 79.68 | 88.07 | 83.22 | 85.70 |
| BRIeF-Ind personalization | 83.62 | 91.27 | 86.61 | 87.24 |
Table 3 also provides a deeper dive into the other evaluation metrics, including recall and the aggregated metrics of F1 and accuracy. To generate the averaged results, we compare the predictions for all target interventions to the ground truth observations. The results show that performance varies widely from intervention to intervention, and suggest that a better care planning strategy should systematically account for the varying behavioral responses to the different intervention categories. Depending on the need of the recommendations, e.g., identifying as many goal-attaining interventions as possible or making as few mistakes in recommendations as possible, the most effective strategy might differ. This is indicated by the practice-based evidence found in care management records.
Discussion
The BRIeF framework infers the goal attainment outcome of a target intervention from a large observational dataset and compares multiple interventions for effective goal attainment. While previous studies commonly evaluated on simulated datasets to establish theoretical properties such as asymptotic consistency and generalization error bounds17,18,19, this study evaluates on observational data directly. In particular, this study evaluates the empirical evidence generated from care management records on a real-life task, i.e., recommending interventions for effective goal attainment. BRIeF estimates population-level and individual-level intervention effects for each target patient at the point of care, and then predicts whether an intervention is likely to help this patient reach their goal. The evaluation task is to test whether each of the recommended interventions would help with goal attainment. The ground truth is obtained from observational CM records to calculate precision (i.e., whether a goal would have been attained following the recommended intervention) and other evaluation metrics. We also use the assessed evaluation metrics to evaluate the individual-level care planning strategy under different intervention categories.
Previous research has proposed handling the bias inherent in observational data in two ways. One is covariate adjustment, e.g., through g-computation25. The other is data unit adjustment, applying propensity-scoring methods (e.g., regression, inverse probability of treatment weighting28) to derive a pseudo-population for controlled trial emulation. Doubly robust estimation methods29 have been proposed to account for both attribute and data bias at the same time. In addition, methods such as targeted maximum likelihood estimation30 are under development to further allow for data-adaptive estimation and targeted minimum loss-based estimation simultaneously, reducing data bias while also minimizing the risk of model misspecification. In this study, we take the doubly robust estimation approach and use serial likelihood ratio testing for covariate selection. A natural next step is to expand our covariates to include automatically generated features from care manager notes (e.g., 1-gram and 2-gram language features). We will then need to develop a high-dimensional approach that can systematically perform estimation and covariate selection simultaneously43,44. The quantified behavioral responses are obtained through not only information about the general effectiveness of an intervention in the overall population, but also how likely a patient is to follow through on the assigned intervention. Many questions remain open. For example, how can we identify who would respond better to one intervention over an alternative, and decide what intervention to recommend for an individual?
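The proposed covariate expansion could start from something as simple as the sketch below, which derives 1-gram and 2-gram count features from free-text notes with scikit-learn; the example notes are invented stand-ins for real care manager notes.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Invented example notes; real input would be care manager free text.
notes = [
    "patient reports good adherence to medication",
    "missed follow up call due to transportation barriers",
]

# 1-gram and 2-gram count features, one row per CM note.
vectorizer = CountVectorizer(ngram_range=(1, 2))
features = vectorizer.fit_transform(notes)
```

Even two short notes produce dozens of n-gram columns, which is why the text above calls for a high-dimensional approach that performs estimation and covariate selection simultaneously.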
To further investigate these open questions, in the future we will consider incorporating recent developments in modeling conditional probability distributions to estimate individual-level effects directly35,36,37, as well as developments in learning subgroups for identifying heterogeneous intervention effects38,39,40. Moreover, in some of our prior work, contextual bandit and reinforcement learning methods have also demonstrated their potential to facilitate personalized intervention choices under dynamic treatment regimens and time-varying exposures41,42. A comprehensive review is beyond the scope of this paper, but we believe there is a need for an overarching framework that incorporates more dynamic input and time-varying treatment in an interactive environment. Figure 3 shows an initial drawing of the behavioral learning pipeline, with BRIeF as the starting point of this investigation. This framework is important in guiding how to adapt interventions in a care plan according to the target patient’s changing context and needs. Meanwhile, this framework would also allow us to start considering how to incorporate practice-based evidence into the care flow. This requires not only incrementally increasing the understanding of behavioral phenotypes (e.g., location, literacy, education) from patient data curated over time, but also learning the differential performance levels of care managers across the various interventions observed in practice data. These are truly multidisciplinary questions. The next-generation paradigm would have health informaticists, epidemiologists, economists, and statisticians all working on this problem from different angles. We expect that more innovation in this direction will help us tap the full potential of understanding behavioral response differences for care planning.
Conclusion
Patient intervention choice for goal attainment could be improved if care managers leverage the BRIeF-generated recommendations for care plan personalization. Using BRIeF for patient behavior understanding and multi-intervention recommendation will help enable practice-based evidence generation and personalized care planning, closing more care gaps and leading to better behavioral and health outcomes for patients. In this study, we empirically validated the hypothesis that individual-level care planning is preferable to population-level planning in a real-life task. Our future goal is to systematically handle the risks of attribute and data bias, as well as model misspecification, by integrating machine learning approaches with the potential outcome framework in a scalable framework.
Table 4.
Performance of the BRIeF-Ind personalization strategy on the task of recommending interventions for goal attainment, using intervention-category-specific thresholds. (D=1: coaching; D=2: coordination; D=3: education; D=4: other; D=5: referral; D=6: screening)
| | | Precision (Goal attainment rate) | Recall | F1 Score | Accuracy |
|---|---|---|---|---|---|
| Intervention-category-specific thresholds | D = 1 | 60.11 | 64.97 | 54.53 | 56.67 |
| | D = 2 | 64.90 | 88.73 | 71.57 | 58.57 |
| | D = 3 | 92.26 | 100 | 95.96 | 92.26 |
| | D = 4 | 40.47 | 68 | 40.57 | 34.76 |
| | D = 5 | 93.64 | 100 | 96.70 | 93.64 |
| | D = 6 | 0 | 0 | 0 | 62.88 |
| Weighted average over intervention-category-specific thresholds (D=1 to D=6) | | 83.62 | 91.27 | 86.61 | 87.24 |
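The weighted-average row aggregates the per-category metrics using each category's record count as its weight. Since the per-category support counts are not listed in the table, the helper below is a generic sketch with hypothetical inputs rather than a reproduction of the reported numbers.

```python
def weighted_average(values, supports):
    """Support-weighted mean of a per-category metric.

    values:   per-category metric values (e.g., F1 for D=1..6)
    supports: number of records per category, used as weights
    """
    total = sum(supports)
    if total == 0:
        raise ValueError("supports must sum to a positive count")
    return sum(v * s for v, s in zip(values, supports)) / total

# Hypothetical example: two categories with metrics 60.0 and 90.0,
# holding 100 and 300 records respectively.
avg = weighted_average([60.0, 90.0], [100, 300])  # -> 82.5
```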
References
- 1.Tuomilehto J, Lindström J, Eriksson G, Valle T, Hämäläinen H, Ilanne-Parikka P, Keinänen-Kiukaanniemi S, Laakso M, Louheranta A, Rastas M, Salminen V. Prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. NEJM. 2001 May 3;344(18):1343–50. doi: 10.1056/NEJM200105033441801. [DOI] [PubMed] [Google Scholar]
- 2.West JA, Miller NH, Parker KM, Senneca D, Ghandour G, Clark M, Greenwald G, Heller RS, Fowler MB, DeBusk RF. A comprehensive management system for heart failure improves clinical outcomes and reduces medical resource utilization. American Journal of Cardiology. 1997 Jan 1;79(1):58–63. doi: 10.1016/s0002-9149(96)00676-5. [DOI] [PubMed] [Google Scholar]
- 3.Aubert RE, Herman WH, Waters J, Moore W, Sutton D, Peterson BL, Bailey CM, Koplan JP. Nurse case management to improve glycemic control in diabetic patients in a health maintenance organization: a randomized, controlled trial. Annals of internal medicine. 1998 Oct 15;129(8):605–12. doi: 10.7326/0003-4819-129-8-199810150-00004. [DOI] [PubMed] [Google Scholar]
- 4.Greineder DK, Loane KC, Parks P. A randomized controlled trial of a pediatric asthma outreach program. Journal of Allergy and Clinical Immunology. 1999 Mar 1;103(3):436–40. doi: 10.1016/s0091-6749(99)70468-9. [DOI] [PubMed] [Google Scholar]
- 5.Brown RS, Peikes D, Peterson G, Schore J, Razafindrakoto CM. Six features of Medicare coordinated care demonstration programs that cut hospital admissions of high-risk patients. Health Affairs. 2012 Jun 1;31(6):1156–66. doi: 10.1377/hlthaff.2012.0393. [DOI] [PubMed] [Google Scholar]
- 6.Long PV, editor. Effective Care for High-need Patients: Opportunities for Improving Outcomes, Value, and Health. National Academy of Medicine; 2017. [PubMed] [Google Scholar]
- 7.Graffigna G, Barello S, Bonanomi A, Lozza E. Measuring patient engagement: development and psychometric properties of the Patient Health Engagement (PHE) scale. Frontiers in psychology. 2015 Mar 27;6:274. doi: 10.3389/fpsyg.2015.00274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Diaz F, Gamon M, Hofman JM, Kiciman E, Rothschild D. Online and social media data as an imperfect continuous panel survey. PloS one. 2016 Jan 5;11(1):e0145406. doi: 10.1371/journal.pone.0145406. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Narayanan S, Georgiou PG. Behavioral signal processing: Deriving human behavioral informatics from speech and language. Proceedings of the IEEE. 2013 May;101(5):1203–33. doi: 10.1109/JPROC.2012.2236291. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Hawkins K, Ozminkowski RJ, Mujahid A, Wells TS, Bhattarai GR, Wang S, Hommer CE, Huang J, Migliori RJ, Yeh CS. Propensity to succeed: Prioritizing individuals most likely to benefit from care coordination. Population health management. 2015 Dec 1;18(6):402–11. doi: 10.1089/pop.2014.0121. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Hodach R, Grundy P, Weiner M, Handmaker KE. Provider-Led Population Health Management: Key Healthcare Strategies in the Cognitive Era. Wiley; 2016. [Google Scholar]
- 12.Haynes S, Kim KK. A mobile care coordination system for the management of complex chronic disease. Studies in health technology and informatics. 2016;225:505–9. [PubMed] [Google Scholar]
- 13.Whatcom County Pursuing Perfection Project. My Shared Care Plan. Institute for Healthcare Improvement. Available from: http://www.ihi.org/resources/Pages/Tools/MySharedCarePlan.aspx.
- 14.Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of educational Psychology. 1974 Oct;66(5):688. [Google Scholar]
- 15.Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream?. Epidemiology. 2006 Jul 1;17(4):360–72. doi: 10.1097/01.ede.0000222409.00878.37. [DOI] [PubMed] [Google Scholar]
- 16.Imbens GW, Rubin DB. Cambridge University Press; 2015. Causal Inference in Statistics, Social, and Biomedical Sciences. [Google Scholar]
- 17.Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, Robins J. Double/debiased machine learning for treatment and causal parameters. 2017 Jun. [Google Scholar]
- 18.Athey S, Imbens G. Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences. 2016 Jul 5;113(27):7353–60. doi: 10.1073/pnas.1510489113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Shalit U, Johansson F, Sontag D. 2016. Jun 13, Estimating individual treatment effect: generalization bounds and algorithms. arXiv preprint arXiv:1606.03976. [Google Scholar]
- 20.Senn SS. John Wiley & Sons; 2008. Feb 28, Statistical issues in drug development. [Google Scholar]
- 21.Smith JD. Single-case experimental designs: A systematic review of published research and current standards. Psychological methods. 2012 Dec;17(4):510. doi: 10.1037/a0029312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Schork NJ. Personalized medicine: time for one-person trials. Nature. 2015 Apr 30;520(7549):609–11. doi: 10.1038/520609a. [DOI] [PubMed] [Google Scholar]
- 23.Colwell J. NCI-MATCH Trial Draws Strong Interest. Cancer Discov. 2016;6:334. doi: 10.1158/2159-8290.CD-NB2016-018. [DOI] [PubMed] [Google Scholar]
- 24.Rashdan S, Gerber DE. Going into BATTLE: umbrella and basket clinical trials to accelerate the study of biomarker-based therapies. Annals of translational medicine. 2016 Dec;4(24) doi: 10.21037/atm.2016.12.57. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Pearl J. 2015. Detecting latent heterogeneity. Sociological Methods & Research; p. 0049124115600597. [Google Scholar]
- 26.Kleinberg S, Hripcsak GA. Review of causal inference for biomedical informatics. Journal of biomedical informatics. 2011 Dec 1;44(6):1102–12. doi: 10.1016/j.jbi.2011.07.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Ma S, Statnikov A. Methods for computational causal discovery in biomedicine. Behaviormetrika. 2017 Jan;44(1):165–191. [Google Scholar]
- 28.Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate behavioral research. 2011 May 31;46(3):399–424. doi: 10.1080/00273171.2011.568786. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, Davidian M. Doubly robust estimation of causal effects. American journal of epidemiology. 2011 Mar 8;173(7):761–7. doi: 10.1093/aje/kwq439. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Van der Laan MJ, Rose S. Springer Science & Business Media; 2011. Jun 17, Targeted learning: causal inference for observational and experimental data. [Google Scholar]
- 31.Hernán MA, Hernández-Díaz S. Beyond the intention-to-treat in comparative effectiveness research. Clinical Trials. 2012 Feb;9(1):48–55. doi: 10.1177/1740774511420743. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Glass TA, Goodman SN, Hernán MA, Samet JM. Causal inference in public health. Annual review of public health. 2013 Mar 18;34:61–75. doi: 10.1146/annurev-publhealth-031811-124606. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Devinsky O, Dilley C, Ozery-Flato M, Aharonov R, Goldschmidt YA, Rosen-Zvi M, Clark C, Fritz P. Changing the approach to treatment choice in epilepsy using big data. Epilepsy & Behavior. 2016 Mar;56:32–7. doi: 10.1016/j.yebeh.2015.12.039. [DOI] [PubMed] [Google Scholar]
- 34.Kent DM, Hayward RA. Limitations of applying summary results of clinical trials to individual patients: the need for risk stratification. Jama. 2007 Sep 12;298(10):1209–12. doi: 10.1001/jama.298.10.1209. [DOI] [PubMed] [Google Scholar]
- 35.Weiss J, Kuusisto F, Boyd K, Liu J, Page D. Machine learning for treatment assignment: Improving individualized risk attribution. In AMIA Annual Symposium Proceedings. 2015;Vol. 2015:1306. [PMC free article] [PubMed] [Google Scholar]
- 36.Hartford J, Lewis G, Leyton-Brown K, Taddy M. 2016. Dec 30, Counterfactual Prediction with Deep Instrumental Variables Networks. arXiv preprint arXiv:1612.09596. [Google Scholar]
- 37.Yoon J, Jordon J, van der Schaar M. GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets. ICLR; 2018. [Google Scholar]
- 38.Athey S, Imbens G. Recursive Partitioning for Heterogeneous Causal Effects. Proceedings of the National Academy of Sciences. 2016;113(27):7353–7360. doi: 10.1073/pnas.1510489113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Grimmer J, Messing S, Westwood SJ. Estimating heterogeneous treatment effects and the effects of heterogeneous treatments with ensemble methods. Political Analysis. 2017 Oct;25(4):413–34. [Google Scholar]
- 40.Shahn Z, Madigan D. Latent Class Mixture Models of Treatment Effect Heterogeneity. Bayesian Analysis. 2017;12(3):831–54. [Google Scholar]
- 41.Bastani H, Bayati M. Online decision-making with high-dimensional covariates. [Google Scholar]
- 42.Hu X, Hsueh PS, Qian M, Chen CH, Diaz KM, Cheung YK. A First Step Towards Behavioral Coaching for Managing Stress: A Case Study on Optimal Policy Estimation with Multi-stage Threshold Q-learning. AMIA; 2017. [PMC free article] [PubMed] [Google Scholar]
- 43.Heckman JJ, Lopes HF, Piatek R. Treatment effects: A Bayesian perspective. Econometric reviews. 2014 Feb 10;33(1-4):36–67. doi: 10.1080/07474938.2013.807103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Hahn PR, Carvalho CM, He J, Puelz D. Supplement to “Regularization and confounding in linear regression for treatment effect estimation.” Bayesian Analysis; 2016. [Google Scholar]


