Application of Machine Learning to Predict Patient No-Shows in an Academic Pediatric Ophthalmology Clinic

Jimmy Chen; Isaac H Goldstein; Wei-Chun Lin; Michael F Chiang; Michelle R Hribar

. 2021 Jan 25;2020:293–302.

PMCID: PMC8075453 PMID: 33936401

Abstract

Patient "no-shows" are missed appointments resulting in clinical inefficiencies, revenue loss, and discontinuity of care. Using secondary electronic health record (EHR) data, we used machine learning to predict patient no-shows in follow-up and new patient visits in pediatric ophthalmology and to evaluate features for importance. The best model, XGBoost, had an area under the receiver operating characteristics curve (AUC) score of 0.90 for predicting no-shows in follow-up visits. The key findings from this study are: (1) secondary use of EHR data can be used to build datasets for predictive modeling and successfully predict patient no-shows in pediatric ophthalmology, (2) models predicting no-shows for follow-up visits are more accurate than those for new patient visits, and (3) the performance of predictive models is more robust in predicting no-shows compared to individual important features. We hope these models will be used for more effective interventions to mitigate the impact of patient no-shows.

Introduction

Medical clinics rely on full schedules to maintain revenue and provide high quality, longitudinal care for patients. Patient "no-shows" are missed appointments without prior clinic notification that disrupt clinical care. No-shows increase clinical inefficiencies and create revenue loss by decreasing clinic volumes.^1,2 Additionally, missed appointments result in discontinuity of care and worse health outcomes for patients.^3-5 Reducing patient no-shows is a focus for most medical clinics today.

Developing predictive models and/or identifying significant variables associated with patient no-shows may result in more effective, targeted interventions to reduce no-show rates. Electronic health records (EHRs) contain many available variables for secondary use in these models; EHR data have been successfully reused for models in clinical research, quality assurance, predictive modeling, and scheduling simulations.^8 There are many prior studies of patient no-shows in a wide range of medical specialties; the average no-show rate reported in these studies is 23%.^9 The models developed in these studies found the following significant variables: younger age,^10-14 distance from clinic,^10,15 lead time (time from the scheduling date to the appointment date),^13,15-18 insurance carrier^19 (especially Medicaid),^11,14,15,18 and history of previous no-shows.^13,17,18,20-22 Currently, the best performing models in the literature have AUCs of 0.83-0.86.^17,22

The purpose of our study was to develop and validate a model to predict patient no-shows in a pediatric ophthalmology clinic. We chose pediatric ophthalmology as our subspecialty domain because vision loss due to common pediatric ophthalmology disorders (e.g., retinopathy of prematurity, strabismus, amblyopia) is often preventable with early intervention and regular follow-up. Furthermore, pediatrics is a specialty with historically high no-show rates,^14,15,24 but no models have been developed to predict no-shows or to identify variables significantly associated with them. We feel there are opportunities to improve on prior no-show models for use in pediatric ophthalmology by using a combination of patient clinical and demographic data, history of past no-shows, and EHR-specific variables (such as MyChart use); by stratifying our models to compare follow-up versus new patient visits; and by rigorously evaluating imbalanced data using state-of-the-art machine learning algorithms.

Methods

Study Institution

Oregon Health & Science University (OHSU) is a large academic medical center in Portland, Oregon, which includes over 50 faculty ophthalmology physicians who perform over 130,000 outpatient eye exams annually. The department provides primary eye care and serves as a major referral center in the Pacific Northwest and nationally. In 2006, OHSU implemented an institution-wide EHR system (EpicCare; Epic Systems, Madison, WI). All ambulatory practice management, including documentation, order entry, and billing, is performed using the EHR. This study was approved by the institutional review board at OHSU and adheres to the Declaration of Helsinki.

Dataset

Appointment data were extracted from OHSU's EHR datamart for all patient office visits from January 1, 2012 to December 31, 2018 for 7 pediatric ophthalmology providers. The outcome variable, a patient no-show, was defined as a visit in which the patient failed to arrive on the day of the appointment without contacting the clinic in advance. Appointments canceled prior to clinic were not counted as no-shows to ensure all missed appointments were truly not intervenable. The scheduled appointments in our modeling dataset were limited to one appointment per patient scheduled from January 1, 2015 to December 31, 2018. Specifically, the appointment was included if it was either the most recent follow-up office visit that the patient attended, a new patient visit, or the most recent no-show visit in that timeframe. Only one office visit per patient was included in this study. The restricted time frame of our appointment dataset ensured we had enough previous visit data in our larger dataset of visits from 2012 to 2018. Patients with missing insurance types and diagnoses were excluded from this study.
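The selection of one index visit per patient can be illustrated with a short sketch. The record fields, status labels, and the tie-breaking rule (keep the latest attended or no-show visit inside the 2015 to 2018 window) are assumptions made for illustration; the actual EHR extraction logic is not described at this level of detail.

```python
from datetime import date

# Hypothetical visit records; field names are illustrative, not the EHR schema.
visits = [
    {"patient": "A", "date": date(2014, 5, 1), "status": "attended"},
    {"patient": "A", "date": date(2016, 3, 9), "status": "attended"},
    {"patient": "A", "date": date(2017, 8, 2), "status": "no_show"},
    {"patient": "B", "date": date(2018, 1, 15), "status": "canceled"},
    {"patient": "B", "date": date(2017, 6, 20), "status": "attended"},
]

WINDOW_START, WINDOW_END = date(2015, 1, 1), date(2018, 12, 31)


def select_index_visits(visits):
    """Return {patient: visit}, keeping the latest attended or no-show visit in the window."""
    index = {}
    for v in visits:
        if v["status"] == "canceled":
            continue  # appointments canceled in advance are not no-shows
        if not (WINDOW_START <= v["date"] <= WINDOW_END):
            continue  # older visits only contribute to history features
        best = index.get(v["patient"])
        if best is None or v["date"] > best["date"]:
            index[v["patient"]] = v
    return index


index_visits = select_index_visits(visits)
# Outcome label: 1 for a no-show, 0 for an attended visit.
labels = {p: int(v["status"] == "no_show") for p, v in index_visits.items()}
```

In this toy example, patient A is labeled as a no-show and patient B as attended, and each patient contributes exactly one row to the modeling dataset.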

Model Features

We chose to include features that we hypothesized could potentially be associated with or have an impact on appointment attendance. These features were either readily extractable from the EHR or generated from existing encounter data and consisted of categorical and continuous data.

Categorical features were separated into the following areas: demographic, time-based, and diagnoses. Demographic features included age, gender, ethnicity, insurance type, clinic location, and English as the first language. Time-based features included appointment time of the day, day of the week, and month. Visit diagnoses were grouped into 15 categories based on ICD-9 and ICD-10 codes.

Continuous historical features generated from our larger extracted visit dataset included: number of previous visits, number of prior no-shows, lead time (time from when the appointment was scheduled to the visit date), average time between previous visits, number of previous cancels, and number of prior same-day cancels (defined as a visit canceled by the patient within 24 hours of the scheduled visit time).
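As a minimal sketch of how such historical features might be derived from encounter data, the function below summarizes a patient's prior visits. The status labels, the choice to count only attended visits as "previous visits," and the treatment of same-day cancels as a separate category are assumptions for illustration, not details taken from the study.

```python
from datetime import date


def history_features(prior_visits, scheduled_on, appointment_on):
    """Summarize a patient's history before the index appointment.

    prior_visits: list of (visit_date, status) tuples, where status is one of
    "attended", "no_show", "canceled", "same_day_cancel" (hypothetical labels).
    """
    attended = sorted(d for d, s in prior_visits if s == "attended")
    # Day gaps between consecutive attended visits.
    gaps = [(b - a).days for a, b in zip(attended, attended[1:])]
    return {
        "n_previous_visits": len(attended),
        "n_prior_no_shows": sum(s == "no_show" for _, s in prior_visits),
        "lead_time_days": (appointment_on - scheduled_on).days,
        "avg_days_between_visits": sum(gaps) / len(gaps) if gaps else None,
        "n_previous_cancels": sum(s == "canceled" for _, s in prior_visits),
        "n_same_day_cancels": sum(s == "same_day_cancel" for _, s in prior_visits),
    }


features = history_features(
    prior_visits=[
        (date(2016, 1, 10), "attended"),
        (date(2016, 7, 8), "attended"),
        (date(2017, 1, 4), "no_show"),
        (date(2017, 2, 1), "attended"),
    ],
    scheduled_on=date(2017, 2, 1),
    appointment_on=date(2017, 5, 12),
)
```

For this hypothetical patient the lead time is 100 days and the average gap between attended visits is 194 days.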

Predictive Model Development and Validation

Data analysis and model training were performed using R^25 (version 3.5.1) and Python^26 (version 3.7.3), specifically using the machine learning packages scikit-learn^27 and XGBoost.^28 We hypothesized that longitudinal, previous visit data would significantly impact a model's ability to predict patient no-shows and tested this hypothesis by training separate models on datasets of only follow-up visits and only new patient visits. In the latter, we excluded all continuous features that required calculations with previous visit data. All categorical features were one-hot encoded into binary variables (0 vs. 1) for each category. Each dataset was split into a training set and a test set in a 3:1 ratio with stratification to maintain the same proportion of no-shows in both sets. The training set was further split into 5 folds for cross-validation, and hyperparameters were tuned using a randomized grid search on 4 different algorithms: XGBoost, random forest, support vector regression (SVR), and least absolute shrinkage and selection operator (LASSO) regression. Cross-validation scoring was tuned to maximize the PR score. We chose to train on XGBoost and random forest to evaluate whether these ensemble techniques could identify strong relationships in our set of diverse features. Additionally, we chose two standard regression algorithms (SVR and LASSO regression, a type of regularized logistic regression) to compare performance against the aforementioned models. In particular, logistic regression has historically been used extensively in predicting patient no-shows.^10-13,18,20,21,23
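The one-hot encoding and stratified 3:1 split described above can be sketched in plain Python. This is an illustration of the idea only: the helper names `one_hot` and `stratified_split` are hypothetical, and scikit-learn provides equivalent functionality (for example, `train_test_split` with its `stratify` argument and `StratifiedKFold`), although the text does not state which functions were actually used.

```python
import random


def one_hot(rows, column):
    """Expand one categorical column into 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}={c}": int(row[column] == c) for c in categories}
        for row in rows
    ]


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) with the same class proportions in both."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Take the same fraction of each class for the test set.
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data: 20 visits, 4 of them no-shows (label 1).
labels = [1] * 4 + [0] * 16
train_idx, test_idx = stratified_split(labels)

encoded = one_hot(
    [{"insurance": "Commercial"}, {"insurance": "Medicaid"}], "insurance"
)
```

With 4 no-shows among 20 toy visits, the split places 1 no-show in the 5-visit test set and 3 in the 15-visit training set, so both sets keep the 20% no-show proportion.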

Resampled learning curves were generated to ensure our algorithms did not overfit. Machine learning metrics evaluating model performance, specifically the area under the receiver operating characteristics curve (AUC-ROC) scores and Precision-Recall (PR) scores, were calculated on the test set of the best performing model for each algorithm. We chose to include PR scores, sensitivity, and positive predictive value (PPV) to emphasize and assess the importance of accurately predicting the minority class (no-shows) and account for dataset imbalance.
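To make these metrics concrete, the sketch below computes sensitivity, PPV, AUC-ROC, and a precision-recall summary on a toy set of predictions. Average precision is used here as the PR summary; that choice, the 0.5 decision threshold, and the example scores are assumptions, since the exact PR computation is not specified in the text.

```python
def sensitivity_ppv(y_true, y_pred):
    """Sensitivity (recall) and positive predictive value (precision)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sens, ppv


def auc_roc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Toy example: 3 no-shows (label 1) among 8 visits, with predicted risk scores.
y_true = [1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]

sens, ppv = sensitivity_ppv(y_true, y_pred)
auc = auc_roc(y_true, scores)
ap = average_precision(y_true, scores)
```

Because AUC-ROC depends on how positives rank against the many negatives, it can stay high even when few no-shows are caught at a practical threshold, which is why the minority-class metrics are reported alongside it.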

Calculating Feature Importance

Feature importances were calculated for the models trained on XGBoost and random forest using default functionality provided in the scikit-learn^27 and XGBoost^28 packages. Because both algorithms are ensemble methods that use features to construct multiple decision trees, coefficients and effect sizes of individual features are not generated. For each decision tree, importance was defined as the impact of the feature on creating split points. Feature importance was calculated in the packages using a Gini impurity index and averaged across all decision trees in each model. Coefficients were also extracted from the models trained on LASSO regression.
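The idea behind impurity-based importance can be shown for a single split. The sketch below computes the Gini impurity of a node and the impurity decrease produced by one hypothetical split; tree ensembles accumulate such decreases per feature and average them across trees. The example labels are invented, and the packages' own importance calculations involve additional weighting and normalization (XGBoost also offers several importance types), so this is an illustration of the concept rather than the packages' exact procedure.

```python
def gini(labels):
    """Gini impurity of binary labels: 1 - p0^2 - p1^2."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


# Hypothetical node with 4 no-shows (1) and 6 attended visits (0),
# split by some feature threshold into two child nodes.
parent = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
left = [1, 1, 1, 0]
right = [1, 0, 0, 0, 0, 0]

gain = impurity_decrease(parent, left, right)
```

A feature that repeatedly produces large decreases like this one across many trees ends up with a high importance score, but the score says nothing about the direction of the feature's effect.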

Results

Dataset

Overall, 5188 follow-up office visits and 3606 new patient visits met inclusion criteria, with one visit per patient included. The overall no-show rate in this dataset was 13.4%; 794 (15.4%) follow-up visits and 385 (10.7%) new patient visits were no-shows. Features and their distributions are shown for both visit types in Table 1.

Table 2. Performance of Models by Algorithm. Four algorithms were trained on our training dataset of follow-up visits with 5-fold cross validation. For each algorithm, performance metrics were generated on our testing set. We chose to include Precision-Recall (PR) score, sensitivity, and positive predictive value (PPV) to account for data imbalance.

	AUC-ROC Score	Precision-Recall (PR) Score	Sensitivity	Positive Predictive Value (PPV)
Follow-up Patients
XGBoost	0.90	0.74	0.45	0.88
Random Forest	0.88	0.69	0.34	0.92
Support Vector Regression	0.81	0.50	0.46	0.52
LASSO Regression	0.79	0.46	0.41	0.54
New Patients
LASSO Regression	0.74	0.27	0.11	0.37
Random Forest	0.74	0.26	0.04	0.40
Support Vector Regression	0.71	0.21	0.15	0.28
XGBoost	0.64	0.26	0.14	0.25

Model Performance

Performance metrics (AUC-ROC score, PR score, sensitivity, and PPV) of the best performing models for both follow-up patients and new patients are shown in Table 2. For follow-up visits, the model trained with XGBoost had the highest performance on the testing set (AUC = 0.90, PR score = 0.74). Though sensitivities and PPV varied across models trained on all 4 algorithms, random forest had the second-best performance (PR = 0.69). AUC and PR curves for the model trained on XGBoost are shown in Figure 1. The four algorithms listed above were also used to train models on the new patient dataset. Overall, the model trained on LASSO regression had the highest performance in predicting new patient no-shows, though PR scores were low (AUC = 0.74, PR = 0.27). This was reflected across all algorithms trained on the new patient data (PR = 0.21-0.27). Low sensitivities and PPV for these models also suggest that the majority of no-shows were incorrectly predicted in the new patient dataset.

Feature Importance for Follow-up Visits

We report feature importance for follow-up visits only because the performance of these models was considerably higher than that of the new patient models. Table 3 shows the 6 most important features for our highest performing algorithms, XGBoost and random forest. Both algorithms agreed on the relative importance of the number of previous visits and the average number of days between visits in predicting patient no-shows. Though other variables varied in importance depending on the model used, other highly ranked features for both XGBoost and random forest included younger age, number of previous no-shows, lead time, insurance type, and number of previous canceled appointments.

Table 3. Most Important Features from XGBoost and Random Forest. Feature importance was extracted from the models trained on XGBoost and random forest. Overall, both algorithms ranked number of previous visits and average number of days between visits as the most important features associated with patient no-shows.

XGBoost		Random Forest
Feature	Importance	Feature	Importance
Number of previous visits	0.038	Average no. of days between visits	0.15
Average no. of days between visits	0.023	Number of previous visits	0.12
Insurance - Out of state Medicaid	0.021	Lead time	0.08
Cataract diagnosis	0.020	Day of the month	0.08
MyChart activated	0.019	Number of same day cancels	0.02
Age - toddler	0.019	Number of prior cancels	0.02

Discussion

This study used EHR data to develop machine learning models to identify factors and evaluate the performance of these models in predicting patient no-shows in pediatric ophthalmology. Our study has three key findings: (1) secondary use of EHR data can be used to build datasets for predictive modeling and successfully predict patient no-shows in pediatric ophthalmology, (2) models predicting no-shows for follow-up visits are more accurate than those for new patient visits, and (3) performance of predictive models is more robust in predicting no-shows compared to individual important features.

The first key finding is that secondary use of EHR data can be used to build datasets for predictive modeling and successfully predict patient no-shows in pediatric ophthalmology. Because ophthalmology is a busy, high-volume specialty, the EHR is an ideal domain to generate large datasets for modeling. The EHR contains both demographic and clinical patient data, as well as time-series variables that can be calculated from encounter data over patients' visit histories. Using these data, we were able to create a large, robust dataset that contained a broad set of features to predict patient no-shows.

To our knowledge, our best performing model, trained on follow-up data with XGBoost, performed better than the best performing models currently reported in the literature, with an AUC of 0.90 compared with a range of AUCs from 0.68-0.86 (Table 4). However, AUC likely overstates the performance of models focused on predicting patient no-shows because it is not sensitive to data imbalance. We report performance metrics such as PR score, sensitivity, and PPV to emphasize data imbalance in our datasets (15.4% and 10.7% no-show rates for follow-up and new patient datasets, respectively) as well as the importance and difficulty of correctly predicting patient no-shows over patients showing up to their appointments. Compared with published models assessed with these metrics, our best performing model also shows slight improvements. Though a few models have been developed for prediction of no-shows in primary care,^10,12,13,20,22 our findings suggest that EHR data can be used to build large, robust datasets for many specialties, including ophthalmology.

The second key finding is that models predicting no-shows for follow-up visits are more accurate than those for new patient visits. For example, when both the follow-up and new patient datasets were trained on XGBoost, the follow-up model outperformed the new patient model (PR score = 0.74 vs. 0.26), especially in its ability to predict patient no-shows (sensitivity = 0.45 vs. 0.14). Our findings suggest that while patient no-shows are largely attributed to random circumstances, there is an element of behavioral tendencies that longitudinal time-series data can be used to predict. Training models on longitudinal data included in follow-up visits resulted in better performance compared to models trained on new patient visits without this data. Though there are studies examining the relationship between specific variables such as lead time and no-shows in follow-up and new patients,³⁰ previously published no-show models have either only used new patient status as a variable¹⁷ or did not specifically exclude or stratify new patients from their studies. Since new patient no-shows are especially harmful to clinic efficiency, adding previous visit data from other specialties or clinics to their predictive no-show models may improve their accuracy.

The third key finding is that performance of predictive models is more robust in predicting no-shows compared to individual important features. There was some consistency in our feature importance: the most important features in our follow-up patient models were the number of previous visits and the average number of days between visits. Number of previous visits attended is likely a reflection of patient attendance history, and average number of days between visits is similar to a metric used in a predictive model developed by Mohammadi et al (number of days since last appointment).^17 Interestingly, the number of previous no-shows is a well-documented factor associated with no-shows^10-13,17,18,21,22 but was not found to be an important feature in our models. A summary of reviewed no-show models, their performance, and important features is shown in Table 4.

Table 4: Published Models Predicting Patient No-Shows. We conducted a brief literature review on published no-show models by querying PubMed and Google Scholar using the terms: 'no-show models,' 'missed appointments,' and 'appointment non-adherence.' Ancestor search was used to broaden our search. We extracted the following from each paper: the dataset domain, the top performing algorithm and its performance, and the 3 most important variables (either the three highest odds ratios or the first three listed variables). Variables such as prior no-show history, age, scheduling lead time, and insurance type were the most significant predictors of future no-shows. To our knowledge, performance ranged from an AUC of 0.68 to 0.86.

Author, Year	Domain	Top Performing Algorithm	Performance Metrics	Top 3 Important Variables
Odonkor et al, 2017¹¹	Anesthesia	Logistic regression	AUC = 0.74	Age < 65 years, ethnic minority, Medicaid or Medicare
Guzek et al, 2014¹⁵	Pediatric neurology	Univariate/multivariate logistic regressions	Odds ratio	Medicaid, distance from clinic, lead time
Mohammadi et al, 2018¹⁷	Community health centers	Naive Bayes classifier	AUC = 0.86, PPV = 0.45, Sensitivity = 0.73	Lead time, prior no-shows, number of days since last appointment
Ding et al, 2018²³	Multiple specialties	LASSO regression	AUC = 0.8	Depended on specialty
Luo et al, 2018²⁹	Surgery	Random forest	AUC = 0.68, Sensitivity = 0.62, PPV = 0.45	N/A
Goffman et al, 2017²¹	Multiple specialties	Logistic regression	AUC = 0.71	Prior no-shows, lead time, same day appointments
Huang and Hanauer, 2014¹²	Pediatrics	Logistic regression	Odds ratio	Visit type, younger age, prior no-show history
Torres et al, 2015¹³	Primary care	Logistic regression	Odds ratio	Prior no-shows, lead time, younger age
Shimoda et al, 2018²²	Primary care (Japan)	XGBoost	AUC = 0.83	Prior no-shows
Daggy et al, 2010¹⁰	Primary care	Logistic regression	Odds ratio	Younger age, lead time, distance from clinic
Lenzi et al, 2019²⁰	Primary care	Naive and mixed effect logistic regression	AUC = 0.81	Previous no-shows, same day appointments
Harvey et al, 2017¹⁸	Radiology	Logistic regression	AUC = 0.75	Previous no-shows, lead time, insurance type

On the other hand, there was significant variability in the other important features in our models such as lead time, age, MyChart use, and number of prior cancels. Disagreement in the relative importance of the other features used to build these predictive models may stem from how each algorithm incorporates features into its model. From our review, models reporting odds ratios of high-risk factors for patient no-shows are well reported in the literature.^10,12,13,15 However, odds ratios show associations with no-shows retrospectively and are not always validated on unseen prospective data. In practice, patients identified as having one of these high-risk factors (e.g., Medicaid insurance) may be double booked in a scheduling system to reduce the risk of idle clinic time, but this may result in frequently overbooked clinics. For example, out of 2543 follow-up patients with Medicaid, only 529 (20.8%) were no-shows, whereas our model trained on XGBoost accurately predicted 45% of patients who no-showed. Therefore, using the model to understand and predict patient no-show behavior may be more meaningful than interpreting feature importances alone.
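The contrast between a single-feature rule and the full model can be worked through with the reported numbers. The sketch below treats "flag every Medicaid follow-up patient" as a hypothetical overbooking rule and compares it with the XGBoost follow-up model. The comparison is only illustrative, because the Medicaid counts come from the full follow-up dataset while the model's sensitivity and PPV were measured on the held-out test set.

```python
# Counts reported for follow-up visits.
medicaid_followups = 2543
medicaid_no_shows = 529
total_followup_no_shows = 794

# Hypothetical rule: flag every Medicaid follow-up patient as a likely no-show.
rule_ppv = medicaid_no_shows / medicaid_followups
rule_sensitivity = medicaid_no_shows / total_followup_no_shows

# Reported test-set metrics for the XGBoost follow-up model (Table 2).
model_ppv = 0.88
model_sensitivity = 0.45


def per_100_flagged(ppv):
    """Of 100 flagged patients, how many would no-show and how many would attend."""
    no_shows = round(100 * ppv)
    return no_shows, 100 - no_shows


rule_outcome = per_100_flagged(rule_ppv)
model_outcome = per_100_flagged(model_ppv)
```

Under these assumptions, the Medicaid rule would capture about two-thirds of no-shows, but roughly 79 of every 100 flagged patients would actually attend and risk an overbooked slot. The model flags fewer no-shows (45%), but about 88 of every 100 flagged patients would be true no-shows.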

While many of our significant features correlate with those previously reported, the full prediction model will better detect patients at high risk of no-shows, allowing for better overbooking strategies. However, targeted interventions such as customized patient education or mitigation of social factors may also be effective strategies to improve appointment adherence and continuity of care for children at risk for vision loss who are also predicted to be at risk for missing appointments.

Our study has limitations that future work could address. First, our study was performed at a single academic center for a single subspecialty. Pediatrics is a domain in which patients are often brought to clinic by another person, introducing another layer of unpredictability. Additionally, it is unclear how generalizable a model trained on data from a pediatric ophthalmology clinic is to other ophthalmic subspecialties and institutions. However, it may be beneficial to produce models that focus on a local level, such as a subspecialty, due to the varying context of each medical specialty.^23 Second, feature importances only represent how much weight a variable has in discriminating patient no-shows vs. shows and do not give the direction of the variable's effect (positive or negative). Further work is needed to understand specifically how these variables affect patient no-shows.

Conclusion

In conclusion, machine learning models can be developed to accurately predict no-shows for follow-up visits in pediatric ophthalmology. Our findings reinforce the importance of incorporating features that take into account past behavior when developing predictive no-show models. While EHRs have been criticized for decreasing clinical efficiency, real-time integration of our predictive models into the EHR may optimize future clinic scheduling compared to current scheduling heuristics. We hope our findings will result in more efficient clinic operations and help mitigate the adverse effects of patient no-shows.

Acknowledgments

Funding: This work was supported by the National Institutes of Health (Bethesda, MD) R00LM12238 and P30EY10572 and by unrestricted departmental funding from Research to Prevent Blindness (New York, NY). Jimmy Chen is supported by the Research to Prevent Blindness Medical Student Fellowship (New York, NY). Wei-Chun Lin is supported by a National Library of Medicine training grant from National Institutes of Health (Bethesda, MD), T15LM007088.

Disclosures

Michael F. Chiang is a consultant for Novartis (Basel, Switzerland) and an equity owner in InTeleretina, LLC (Honolulu, HI).

Figures & Table

Table 1. Selected Features in the Follow-Up and New Patient Visit Datasets. Our dataset contained 5188 follow-up visits and 3606 new patient visits, with one visit included per patient. Time-related data (day of the week, month, etc) was not included in this table for brevity. Only categorical data that did not require previous visit information was included in modeling our new patient visits.

	Follow-Up Patients		New Patients
Categorical Variables	No.	%	No.	%
Gender
Female	2669	51.4%	1860	51.6%
Age
Infant (0-1 years old)	255	4.9%	431	12.0%
Toddler (1-3 years old)	968	18.7%	814	22.6%
Pre-School (3-6 years old)	915	17.6%	528	14.6%
School Age (6-12 years old)	1852	35.7%	1003	27.8%
Adolescent (12 years old-18 years old)	602	11.6%	343	9.5%
Adult (> 18 years old)	584	11.3%	487	13.5%
Insurance Type
Commercial	2877	55.5%	1711	47.4%
Oregon Medicaid	2103	40.5%	1392	38.6%
Out-of-state Medicaid	440	8.5%	221	6.1%
Medicare	250	4.8%	215	6.0%
Other	76	1.5%	65	1.8%
None	8	0.2%	2	0.1%
Ethnicity
Non-Hispanic	3983	76.8%	2836	78.6%
Hispanic	1084	20.9%	666	18.5%
Unknown	121	2.3%	104	2.9%
Location
Portland	4510	86.9%	3300	91.5%
Bend	66	1.3%	36	1.0%
Vancouver	612	11.8%	270	7.5%
Previous Visit Diagnosis^a
Strabismus	1896	36.5%	N/A
Amblyopia	615	11.9%	N/A
Oculoplastics	481	9.3%	N/A
Syndromic Malformation or Systemic Illness	330	6.4%	N/A
Refractive Error	209	4.0%	N/A
Retinopathy of Prematurity	191	3.7%	N/A
Other^b	1466	28.3%	N/A
MyChart Activated	1397	26.9%	799	22.2%
English First Language	4590	88.5%	3291	91.3%
Continuous Variables^c	Mean ± SD		Mean ± SD
Number of Previous Visits	2.8 ± 1.7		N/A
Number of Prior No-Shows	0.1 ± 0.4		N/A
Lead Time (Days)	102.6 ± 80.7		N/A
Time from Last Appointment (Days)	152.7 ± 124.3		N/A
Number of Previous Cancels	0.5 ± 0.9		N/A
Number of Prior Same Day Cancels	0.3 ± 0.6		N/A
Total	5188		3606

^a Visit diagnosis was not included for new patients because of overfitting to missing diagnoses as no-show patients.

^b "Other" encompasses the following diagnostic groupings: cornea, diplopia, glaucoma, retina, optic nerve, nystagmus, or non-ophthalmic (systemic) disease.

^c No continuous variables were included as features in the new patient dataset.

References

1.Berg BP, Murr M, Chermak D, Woodall J, Pignone M, Sandler RS, et al. Estimating the cost of no-shows and evaluating the effects of mitigation strategies. Med Decis Mak Int J Soc Med Decis Mak. 2013 Nov;33(8):976–85. doi: 10.1177/0272989X13478194. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Kheirkhah P, Feng Q, Travis LM, Tavakoli-Tabasi S, Sharafkhaneh A. Prevalence, predictors and economic consequences of no-shows. BMC Health Serv Res. 2016 Jan 14;16:13–13. doi: 10.1186/s12913-015-1243-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Nguyen DL, Dejesus RS, Wieland ML. Missed appointments in resident continuity clinic: patient characteristics and health care outcomes. J Grad Med Educ. 2011 Sep;3(3):350–5. doi: 10.4300/JGME-D-10-00199.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Nuti LA, Lawley M, Turkcan A, Tian Z, Zhang L, Chang K, et al. No-Shows to primary care appointments: subsequent acute care utilization among diabetic patients. BMC Health Serv Res. 2012 Sep 6;12:304–304. doi: 10.1186/1472-6963-12-304. [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Hwang AS, Atlas SJ, Cronin P, Ashburner JM, Shah SJ, He W, et al. Appointment “no-shows” are an independent predictor of subsequent quality of care and resource utilization outcomes. J Gen Intern Med. 2015 Oct;30(10):1426–33. doi: 10.1007/s11606-015-3252-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Sandhu E, Weinstein S, McKethan A, Jain SH. Secondary Uses of Electronic Health Record Data: Benefits and Barriers. Jt Comm J Qual Patient Saf. 2012 Jan 1;38(1):34–40. doi: 10.1016/s1553-7250(12)38005-7. [DOI] [PubMed] [Google Scholar]
7.Hribar MR, Read-Brown S, Goldstein IH, Reznick LG, Lombardi L, Parikh M, et al. Secondary use of electronic health record data for clinical workflow analysis. J Am Med Inform Assoc JAMIA. 2018 Jan 1;25(1):40–6. doi: 10.1093/jamia/ocx098. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Lin W-C, Goldstein IH, Hribar MR, Huang A, Chiang MF. Secondary Use of Electronic Health Record Data for Prediction of Outpatient Visit Length in Ophthalmology Clinics. AMIA Annu Symp Proc AMIA Symp. 2018 Dec 5;2018:1387–94. [PMC free article] [PubMed] [Google Scholar]
9.Dantas LF, Fleck JL, Cyrino Oliveira FL, Hamacher S. No-Shows in appointment scheduling – a systematic literature review. Health Policy. 2018 Apr 1;122(4):412–21. doi: 10.1016/j.healthpol.2018.02.002. [DOI] [PubMed] [Google Scholar]
10.Daggy J, Lawley M, Willis D, Thayer D, Suelzer C, DeLaurentis P-C, et al. Using no-show modeling to improve clinic performance. Health Informatics J. 2010 Dec 1;16(4):246–59. doi: 10.1177/1460458210380521. [DOI] [PubMed] [Google Scholar]
11.Odonkor CA, Christiansen S, Chen Y, Sathiyakumar A, Chaudhry H, Cinquegrana D, et al. Factors Associated With Missed Appointments at an Academic Pain Treatment Center: A Prospective Year-Long Longitudinal Study. Anesth Analg. [Internet] 2017;125(2) doi: 10.1213/ANE.0000000000001794. Available from: https://journals.lww.com/anesthesia- analgesia/Fulltext/2017/08000/Factors_Associated_With_Missed_Appointments_at_an.31.aspx . [DOI] [PubMed] [Google Scholar]
12.Huang Y, Hanauer DA. Patient no-show predictive model development using multiple data sources for an effective overbooking approach. Appl Clin Inform. 2014 Sep 24;5(3):836–60. doi: 10.4338/ACI-2014-04-RA-0026. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Torres O, Rothberg MB, Garb J, Ogunneye O, Onyema J, Higgins T. Risk Factor Model to Predict a Missed Clinic Appointment in an Urban, Academic, and Underserved Setting. Popul Health Manag. 2014 Oct 9;18(2):131–6. doi: 10.1089/pop.2014.0047. [DOI] [PubMed] [Google Scholar]
14.Fiorillo CE, Hughes AL, I-Chen C, Westgate PM, Gal TJ, Bush ML, et al. Factors associated with patient no- show rates in an academic otolaryngology practice. The Laryngoscope. 2018 Mar;128(3):626–31. doi: 10.1002/lary.26816. [DOI] [PMC free article] [PubMed] [Google Scholar]
15. Guzek LM, Fadel WF, Golomb MR. A Pilot Study of Reasons and Risk Factors for “No-Shows” in a Pediatric Neurology Clinic. J Child Neurol. 2014 Dec 10;30(10):1295–9. doi: 10.1177/0883073814559098.
16. McMullen MJ, Netland PA. Lead time for appointment and the no-show rate in an ophthalmology clinic. Clin Ophthalmol Auckl NZ. 2015 Mar 18;9:513–6. doi: 10.2147/OPTH.S82151.
17. Mohammadi I, Wu H, Turkcan A, Toscos T, Doebbeling BN. Data Analytics and Modeling for Appointment No-Show in Community Health Centers. J Prim Care Community Health. 2018;9:2150132718811692–2150132718811692. doi: 10.1177/2150132718811692.
18. Harvey HB, Liu C, Ai J, Jaworsky C, Guerrier CE, Flores E, et al. Predicting No-Shows in Radiology Using Regression Modeling of Data Available in the Electronic Medical Record. J Am Coll Radiol. 2017 Oct 1;14(10):1303–9. doi: 10.1016/j.jacr.2017.05.007.
19. Rosenthal BD, Hulst JB, Moric M, Levine BR, Sporer SM. The Effect of Payer Type on Clinical Outcomes in Total Knee Arthroplasty. J Arthroplasty. 2014 Feb 1;29(2):295–8. doi: 10.1016/j.arth.2013.06.010.
20. Lenzi H, Ben ÂJ, Stein AT. Development and validation of a patient no-show predictive model at a primary care setting in Southern Brazil. PLoS One. 2019 Apr 4;14(4):e0214869–e0214869. doi: 10.1371/journal.pone.0214869.
21. Goffman RM, Harris SL, May JH, Milicevic AS, Monte RJ, Myaskovsky L, et al. Modeling Patient No-Show History and Predicting Future Outpatient Appointment Behavior in the Veterans Health Administration. Mil Med. 2017 May 1;182(5–6):e1708–14. doi: 10.7205/MILMED-D-16-00345.
22. Shimoda A, Ichikawa D, Oyama H. Using machine-learning approaches to predict non-participation in a nationwide general health check-up scheme. Comput Methods Programs Biomed. 2018 Sep 1;163:39–46. doi: 10.1016/j.cmpb.2018.05.032.
23. Ding X, Gellad ZF, Mather C, 3rd, Barth P, Poon EG, Newman M, et al. Designing risk prediction models for ambulatory no-shows across different specialties and clinics. J Am Med Inform Assoc JAMIA. 2018 Aug 1;25(8):924–30. doi: 10.1093/jamia/ocy002.
24. Samuels RC, Ward VL, Melvin P, Macht-Greenberg M, Wenren LM, Yi J, et al. Missed Appointments: Factors Contributing to High No-Show Rates in an Urban Pediatrics Primary Care Clinic. Clin Pediatr (Phila). 2015 Feb 12;54(10):976–82. doi: 10.1177/0009922815570613.
25. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2014.
26. Python Software Foundation. Python Language Reference, version 3.7. Available at http://www.python.org.
27. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-Learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
28. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). New York, NY, USA: Association for Computing Machinery; 2016. pp. 785–794. Available from: https://doi.org/10.1145/2939672.2939785
29. Luo L, Zhang F, Yao Y, Gong R, Fu M, Xiao J. Machine learning for identification of surgeries with high risks of cancellation. Health Informatics J. 2018 Dec 5:1460458218813602. doi: 10.1177/1460458218813602.
30. Drewek R, Mirea L, Adelson PD. Lead Time to Appointment and No-Show Rates for New and Follow-Up Patients in an Ambulatory Clinic. Health Care Manag. 2017;36(1). doi: 10.1097/HCM.0000000000000148. Available from: https://journals.lww.com/healthcaremanagerjournal/Fulltext/2017/01000/Lead_Time_to_Appointment_and_No_Show_Rates_for_New.2.aspx

