Abstract
Objective
The study tests a community- and data-driven approach to homelessness prevention. Federal policies call for efficient and equitable local responses to homelessness. However, the overwhelming demand for limited homeless assistance is challenging without empirically supported decision-making tools and raises questions of whom to serve with scarce resources.
Materials and Methods
System-wide administrative records capture the delivery of an array of homeless services (prevention, shelter, short-term housing, supportive housing) and whether households reenter the system within 2 years. Counterfactual machine learning identifies which service most likely prevents reentry for each household. Based on community input, predictions are aggregated for subpopulations of interest (race/ethnicity, gender, families, youth, and health conditions) to generate transparent prioritization rules for whom to serve first. Simulations of households entering the system during the study period evaluate whether reallocating services based on prioritization rules improves efficiency and equity compared with services-as-usual.
Results
Homelessness prevention benefited households who could access it, while differential effects exist for homeless households that partially align with community interests. Households with comorbid health conditions avoid homelessness most when provided longer-term supportive housing, and families with children fare best in short-term rentals. No additional differential effects existed for intersectional subgroups. Prioritization rules reduce community-wide homelessness in simulations. Moreover, prioritization mitigated observed reentry disparities for female-headed households and unaccompanied youth without excluding Black households and families with children.
Discussion
Leveraging administrative records with machine learning supplements local decision-making and enables ongoing evaluation of data- and equity-driven homeless services.
Conclusions
Community- and data-driven prioritization rules more equitably target scarce homeless resources.
Keywords: child, public housing, homeless persons, machine learning, policy
BACKGROUND AND SIGNIFICANCE
Growing homelessness rates impact public health—disproportionately burdening underrepresented and marginalized populations. The overwhelming demand for scarce homeless resources challenges federal policies requiring communities to develop equitable and cost-efficient responses to homelessness. Local service providers continually face difficult decisions on whom to serve first. Although current HUD guidelines emphasize vulnerability, serving the worst-off first, the realities of severe resource constraints and inaccurate and racially biased risk assessments undermine decision-making.1–4 Recent public health research notes the inherent ethical tradeoffs of prioritizing 1 group over another for scarce services.5–8 For example, policies prioritizing homeless infants before the homeless elderly require communities to engage in values-driven system design that exceeds the data-driven capacities of coordinated assessment strategies. Questions remain regarding how best to design local homeless services that achieve equity and efficiency.8,9,10,11
The present study extends recent innovations in data-driven homeless system design to integrate advances in equity-based artificial intelligence.12–14 Based on precision public health,15–18 counterfactual machine learning identifies services most likely to benefit similar individuals by examining heterogeneous treatment effects from observational data.19 Community-wide administrative records capture an array of homeless services delivered with outcomes over time that feed algorithmic predictions of household-level response to interventions—moving further upstream than predicting risk in the absence of intervention.8,20 Ethical considerations—including unique issues raised by resource scarcity—prohibit fully automated service allocations;5,6,21 however, aggregating household-level predictions for subgroups of interest provides communities with additional evidence for decision-making. Any allocation of scarce resources inherently produces winners and losers, but services guided by transparent prioritization rules allow ongoing assessment of pre-determined thresholds for equity and efficiency tradeoffs.6,21 We offer a proof-of-concept case study based on long-standing work within St. Louis, MO, and consider central issues for local implementation.
MATERIALS AND METHODS
The study tracked all requests and delivery of homelessness services from 2009 to 2014 across St. Louis, MO—a medium-sized Midwestern legacy city facing historical segregation and ongoing socioeconomic challenges.22–24 The present study includes first-time entries into 4 available services: (1) homelessness prevention provides 1-time financial assistance for families at imminent risk of homelessness; (2) emergency shelter offers time-limited group accommodations to avoid staying on the streets; (3) rapid rehousing—initially implemented in October 2009—offers homeless households short-term community-based rentals for up to 12 months; and (4) transitional housing gives up to 2 years of congregate accommodations with social services. During the study period, first-time enterers did not receive permanent supportive housing, which offers longer term and more intensive support.
A homelessness management information system (HMIS) recorded homeless service interactions following federal guidelines for the standardized data collection on homeless individuals and families.25 The HMIS in St. Louis captured most homeless services, except for 1 emergency shelter that was not included in the system. Robust data entry monitoring provided complete information on core features with little missingness. The longitudinal data also capture reinvolvement with homeless services over time across the more than 75 community-based programs serving the homeless in St. Louis. Data represent homeless services before federally required coordinated assessment and entry initiatives, and households generally received services on a first-come-first-serve basis. Uniquely, St. Louis used a central hotline to coordinate all requests for housing assistance, and the HMIS recorded every request regardless of availability. Thus, the data captured community-wide demand for services unconstrained by capacity. Figure 1 illustrates the HMIS data architecture, which is based on the HMIS Data and Technical Standards maintained by the U.S. Department of Housing and Urban Development. Projects include the community-based agencies providing each type of homeless service (eg, emergency shelter). Providers collect standard elements that describe the programs, household demographics, service types and dates, and observed outcomes. The Standards also outline the data quality control, security criteria, and exports used in analysis and reporting.
Figure 1.
Data architecture of the Homelessness Management Information System (HMIS).
Analyses predict the binary outcome of whether households reentered homeless services—either requesting or receiving additional homeless services within 2 years of initial exit from the system. Features come from HUD and federally required HMIS universal and program-specific data elements. Predictors include 35 household characteristics collected upon initial entry into the services. Data record service requests, entry and exit dates, and an array of household sociodemographics (eg, age, gender, race/ethnicity, income), functioning (eg, chronic conditions, mental illness, substance abuse), household characteristics (eg, number of children, adults, prior living arrangements), and housing status (eg, homelessness, prior residence). Households assessed at the time of service request as homeless or whose prior place was in a homeless setting were defined as homelessness prevention ineligible.
In contrast, those in stable prior housing were deemed homelessness prevention eligible. Table 1 describes all features with summary statistics by initial homeless service received. Code and deidentified data from the study are publicly accessible on GitHub: https://github.com/amandakube/Community-And-Data-Driven_Homelessness_Prevention.
Table 1.
Descriptive statistics for the Homeless Management Information System (HMIS) administrative data by received service type
| Feature | Emergency shelter (n = 2997) | Transitional housing (n = 1469) | Rapid rehousing (n = 840) | Homelessness prevention (n = 4737) | Total (n = 10 043) |
|---|---|---|---|---|---|
| | %/M | %/M | %/M | %/M | %/M |
| Household characteristics | | | | | |
| Number of household members | 1.75 | 1.15 | 1.67 | 2.44 | 1.98 |
| Spouse present | 1.90 | 0.20 | 4.52 | 10.05 | 5.72 |
| Number of children | 0.73 | 0.14 | 0.60 | 1.25 | 0.88 |
| Number of children ages 0–2 | 0.26 | 0.08 | 0.09 | 0.15 | 0.17 |
| Number of children ages 3–5 | 0.17 | 0.03 | 0.11 | 0.18 | 0.15 |
| Number of children ages 6–10 | 0.17 | 0.02 | 0.15 | 0.31 | 0.21 |
| Number of children ages 11–14 | 0.09 | 0.01 | 0.11 | 0.24 | 0.15 |
| Number of children ages 15–17 | 0.03 | 0.00 | 0.07 | 0.17 | 0.10 |
| Number of unrelated adults | 0.00 | 0.00 | 0.03 | 0.07 | 0.04 |
| Number of unrelated children | 0.01 | 0.00 | 0.03 | 0.11 | 0.06 |
| Number of calls before entry | 4.52 | 4.14 | 3.13 | 1.07 | 2.72 |
| Wait before entry (in days) | 267.79 | 276.30 | 291.83 | 176.74 | 228.10 |
| Monthly income (in US Dollars) | 764.92 | 328.44 | 1627.27 | 2056.29 | 1328.73 |
| Head of household characteristics | | | | | |
| Female | 75.54 | 16.20 | 52.14 | 82.75 | 68.61 |
| Age (years) | 36.38 | 38.55 | 43.43 | 41.13 | 39.53 |
| White | 18.18 | 24.78 | 17.62 | 8.76 | 14.66 |
| African American | 79.85 | 73.66 | 81.43 | 90.10 | 83.91 |
| Hispanic or Latino ethnicity | 1.60 | 1.36 | 0.71 | 0.76 | 1.10 |
| Veteran | 3.14 | 8.51 | 7.02 | 2.48 | 3.92 |
| Disabling condition | 17.25 | 24.30 | 19.88 | 9.90 | 15.04 |
| Physical disability | 20.49 | 15.04 | 26.43 | 18.85 | 19.42 |
| Received physical disability services | 7.17 | 6.54 | 8.93 | 8.09 | 7.66 |
| Developmental disability | 3.47 | 2.25 | 4.76 | 1.90 | 2.66 |
| Received developmental disability services | 0.40 | 0.27 | 1.67 | 0.42 | 0.50 |
| Chronic health condition | 33.47 | 29.07 | 35.12 | 36.90 | 34.58 |
| Received chronic health services | 17.65 | 14.43 | 21.43 | 22.04 | 19.57 |
| HIV/AIDS | 0.70 | 0.48 | 0.95 | 0.53 | 0.61 |
| Received HIV/AIDS services | 0.37 | 0.20 | 0.83 | 0.13 | 0.27 |
| Mental health problem | 34.57 | 25.05 | 37.62 | 26.89 | 29.81 |
| Received mental health services | 12.85 | 11.71 | 15.71 | 10.83 | 11.97 |
| Alcohol abuse problem | 6.14 | 11.37 | 5.48 | 3.95 | 5.81 |
| Drug abuse problem | 13.71 | 24.10 | 11.55 | 10.39 | 13.48 |
| Both alcohol and drug abuse problem | 9.64 | 15.32 | 8.81 | 4.71 | 8.08 |
| Received substance abuse services | 10.08 | 32.40 | 9.52 | 10.32 | 13.41 |
| Domestic violence survivor | 1.23 | 0.48 | 1.07 | 0.61 | 0.82 |
| Chronically homeless | 0.27 | 2.86 | 17.98 | 1.03 | 2.49 |
| Homeless | 6.00 | 9.53 | 88.45 | 0.30 | 8.29 |
| At imminent risk of losing housing | 2.70 | 1.16 | 0.00 | 71.35 | 25.47 |
| At risk of homelessness | 0.17 | 0.07 | 0.00 | 24.17 | 8.91 |
| Stably housed | 0.03 | 0.14 | 10.36 | 4.15 | 2.88 |
| Coming from emergency shelter | 25.46 | 21.31 | 22.86 | 0.49 | 12.08 |
| Coming from transitional housing | 13.15 | 10.28 | 3.69 | 0.51 | 6.18 |
| Coming from substance abuse treatment | 15.48 | 13.55 | 0.83 | 0.11 | 8.06 |
| Coming from hospital/medical facility | 2.77 | 1.63 | 0.48 | 0.04 | 1.37 |
| Coming from a family member’s residence | 17.82 | 16.27 | 7.74 | 7.28 | 11.61 |
| Coming from a friend’s residence | 2.60 | 3.06 | 7.14 | 4.81 | 3.19 |
| Coming from a place not meant for habitation | 1.67 | 10.42 | 32.02 | 0.53 | 4.73 |
| Coming from a rental with subsidy | 0.00 | 0.20 | 0.36 | 4.11 | 1.42 |
| Coming from a rental without subsidy | 0.20 | 0.27 | 8.81 | 50.07 | 17.80 |
| Coming from a residence owned by client without subsidy | 0.03 | 0.00 | 1.19 | 27.08 | 19.61 |
Note: Among the households who received emergency shelter, rapid rehousing, and transitional housing (n = 5306) were 205 prevention-eligible families.
We develop and test the design of equity- and data-driven homeless systems in 4 phases, as visualized in Figure 2. First, counterfactual predictions generate household-level probabilistic estimates of system reentry given receipt of homelessness prevention, emergency shelter, rapid rehousing, or transitional housing. We use Bayesian Additive Regression Trees (BART), a machine-learning approach demonstrating bias reductions in observational and complex data. While unobserved confounding remains possible even with a rich set of covariates, BART can capture the complexity of the response surface to provide counterfactual predictions in a manner that typically makes it preferable to methods like propensity score or nearest-neighbor matching for causal inference,19,26 including in this domain.8,9
Figure 2.
Community- and data-driven procedure for efficient and equitable homeless services.
BART outputs 1000 sample estimates of the probability of reentry for each household, and average treatment effects (ATE) represent the differences in reentry probabilities between every pairwise combination of services (eg, shelter vs rapid rehousing, rapid rehousing vs transitional housing). Estimates include the mean and 2.5% and 97.5% quantiles for a 95% estimated credible interval of the pairwise differences. We restrict comparisons of homelessness prevention to households who did not meet federal criteria for homelessness upon initial entry into services; this addresses a potential confound in a counterfactual that not only provides prevention but also conceptually takes homes away from these households.
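As a minimal sketch of how a pairwise effect and its 95% credible interval could be summarized from posterior draws, the example below differences two sets of draws and reports the mean with the 2.5% and 97.5% quantiles. The draws are synthetic stand-ins generated for illustration, not BART output or study data, and the function names `summarize` and `pairwise_effect` are hypothetical.

```python
import random
import statistics

def summarize(draws):
    """Posterior mean with 2.5% and 97.5% quantiles (95% credible interval)."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return {"mean": statistics.fmean(draws), "lower": cuts[0], "upper": cuts[-1]}

def pairwise_effect(draws_a, draws_b):
    """Draw-by-draw difference in reentry probability, service A minus service B."""
    return summarize([a - b for a, b in zip(draws_a, draws_b)])

# Synthetic draws for one household (assumption: illustrative values only).
rng = random.Random(0)

def fake_draws(center):
    return [min(1.0, max(0.0, rng.gauss(center, 0.04))) for _ in range(1000)]

transitional = fake_draws(0.34)
rapid = fake_draws(0.40)

effect = pairwise_effect(transitional, rapid)
print(effect)  # negative mean: lower predicted reentry in transitional housing
```

Differencing draw by draw, rather than differencing two separately computed means, keeps the uncertainty of both predictions in the resulting interval.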
Second, we generate homeless service prioritization rules through iterative exchanges with community partners. Preferences were elicited as part of an ongoing community-engaged research partnership aimed at identifying and addressing gaps in a regional response to housing insecurity and homelessness in St. Louis, MO.27–29 Rooted in participatory action research,30–32 homeless service providers and consumers initially mapped system barriers for homelessness prevention through structured focus groups. System insights were presented to an oversight committee that included advocates, providers, and policymakers. Follow-up meetings with committee members elicited service preferences that were incorporated into modeling, with results fed back for further discussion. Initial hypotheses articulate the subpopulations most likely to do best in available services. No household is expected to do best in an emergency shelter, although some could be significantly harmed. Households with comorbid conditions (operationalized as self-reporting at least 2 of disability, mental health, and substance abuse problems) should be prioritized for transitional housing that attends to psychosocial barriers to stability. In contrast, families with children under 18 years of age are more likely to face socioeconomic drivers of instability better addressed with less intensive rapid rehousing.32,33 Community partners also consider service effectiveness for several subpopulations without clear hypotheses. Unaccompanied youth aged 18 to 24 years without children vary in need, with some youths appearing to do best with transitional housing and others in rapid rehousing depending on their circumstances. Likewise, concerns for minoritized and marginalized populations warrant comparisons across demographic variables, including race and gender.
For subpopulations of interest, we estimate conditional average treatment effects (CATE) that aggregate household-level ATE of transitional housing versus rapid rehousing and provide group-level means and 95% credible intervals. A review of CATE across subpopulations of interest informs the formulation of prioritization rules that preference groups who do better in transitional housing or rapid rehousing.
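The aggregation from household-level effects to a subgroup estimate can be sketched as follows. This assumes the household effects are stored as one list of posterior draws per household and that the subgroup mean is taken draw by draw before summarizing; the data layout, the toy values, and the name `cate` are illustrative assumptions rather than the study's implementation.

```python
import statistics

def cate(effect_draws, in_group):
    """Subgroup effect: average household-level draws within the group, draw by draw.

    effect_draws[h][d] is draw d of the transitional-minus-rapid effect for household h.
    in_group[h] is True when household h belongs to the subpopulation.
    """
    members = [draws for draws, flag in zip(effect_draws, in_group) if flag]
    if not members:
        raise ValueError("subgroup is empty")
    # zip(*members) walks across households for each posterior draw
    per_draw = [statistics.fmean(column) for column in zip(*members)]
    cuts = statistics.quantiles(per_draw, n=40, method="inclusive")
    return {"mean": statistics.fmean(per_draw), "lower": cuts[0], "upper": cuts[-1]}

# Toy example: 3 households with 3 draws each (illustrative values only).
effect_draws = [
    [-0.10, -0.06, -0.02],
    [-0.04, 0.00, 0.04],
    [0.02, 0.04, 0.06],
]
comorbid = [True, True, False]

result = cate(effect_draws, comorbid)
print(result)  # mean is about -0.03 for the two flagged households
```

A negative value follows the convention used in Table 2, where CATE below zero indicate better outcomes in transitional housing.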
Third, we evaluate the efficiency of prioritization rules by simulating optimal homeless service delivery under a set of different criteria. Simulations using an integer programming framework consider household-level service effectiveness and dynamic resource constraints limiting matching with services that most reduce homelessness. Specifically, we use BART out-of-sample system reentry predictions given a receipt of each service plus capacity limits—derived from multiplying the actual number of households who entered each service type in a week by the average weekly costs of each service derived from prior research and adjusted to 2022 inflation.34–37 A weighted bipartite matching algorithm assigns households entering services each week to 1 of the 4 services that minimize homeless reentries. Prioritization rules serve households from preferred subpopulations first until the exhaustion of resources. We assess prioritization allocations on system-wide efficiency (reentry reductions) and cost-effectiveness (total service expenditures) compared with (1) the original allocation (services-as-usual), (2) optimizing for minimal costs, and (3) optimizing for reducing reentry.
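To make the weekly allocation problem concrete, the sketch below searches every assignment of a small group of households to services and keeps the feasible one with the fewest expected reentries. It is an exhaustive search standing in for the integer program, which is practical only for tiny examples; the slot counts, weekly costs, budget, and probabilities are hypothetical values chosen for illustration.

```python
from itertools import product

def best_assignment(reentry_prob, slots, weekly_cost, budget):
    """Exhaustively search one week's assignments.

    reentry_prob[h][s]: predicted reentry probability for household h in service s.
    slots[s]: openings in service s; weekly_cost[s]: cost per household.
    Returns (expected_reentries, assignment) or None when nothing is feasible.
    """
    services = list(slots)
    best = None
    for combo in product(services, repeat=len(reentry_prob)):
        if any(combo.count(s) > slots[s] for s in services):
            continue  # violates a capacity limit
        if sum(weekly_cost[s] for s in combo) > budget:
            continue  # violates the weekly budget
        expected = sum(p[s] for p, s in zip(reentry_prob, combo))
        if best is None or expected < best[0]:
            best = (expected, combo)
    return best

# Hypothetical inputs for three households entering in the same week.
reentry_prob = [
    {"shelter": 0.45, "rapid": 0.40, "transitional": 0.30},
    {"shelter": 0.50, "rapid": 0.35, "transitional": 0.42},
    {"shelter": 0.40, "rapid": 0.38, "transitional": 0.39},
]
slots = {"shelter": 3, "rapid": 1, "transitional": 1}
weekly_cost = {"shelter": 350, "rapid": 190, "transitional": 540}

best = best_assignment(reentry_prob, slots, weekly_cost, budget=1100)
print(best)
```

With these inputs the single transitional and rapid rehousing openings go to the households predicted to benefit most from each, and the third household falls back to shelter.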
Finally, we evaluate the equity of the community- and data-driven prioritization rules. We operationalize equity using 2 commonly referenced metrics for scarce resource allocation—the utility gains and shortfalls experienced by subpopulations of interest.6,21,37 Utility represents the predicted probability of homelessness reentry under each potential service allocation. Gain compares the utility with a worst-case scenario of serving all households with the service predicted to prevent homelessness the least and, thus, measures the benefit gained from each allocation. Shortfall contrasts with the best-case scenario when all households receive the service most likely to prevent homelessness—assessing the loss generated by each assignment. We evaluate change in gain (ΔG) and shift in shortfall (ΔS) for allocation a compared with allocation r, which randomly allocates households to services; the differences are calculated for subpopulations of interest (s = 1) versus a comparison group (s = 0) as follows:
Random allocations represent the average results across 100 runs with all prevention-eligible households receiving prevention for consistency—the difference in gain and difference in shortfall equal zero when no bias exists across groups. By centering the fairness metrics, equity values of 0 indicate the allocation has equal bias to that of a random allocation. In our calculations, negative numbers indicate greater gain and shortfall for subpopulations of interest, while positive numbers signal comparison group preferences. We expect equitable homeless services to demonstrate greater gains and greater shortfalls for subpopulations of interest.
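One way to compute centered gain and shortfall differences is sketched below. It assumes gain is the worst-case reentry probability minus the probability under the assigned service, shortfall is the assigned probability minus the best-case probability, and each difference is the comparison-group mean minus the subpopulation mean, so that negative values indicate greater gain or shortfall for the subpopulation of interest. These operational choices, the function names, and the toy data are assumptions made for illustration; two fixed allocations stand in for the 100 random runs.

```python
from statistics import fmean

def gain(probs, service):
    """Benefit relative to the worst-case service for this household."""
    return max(probs.values()) - probs[service]

def shortfall(probs, service):
    """Loss relative to the best-case service for this household."""
    return probs[service] - min(probs.values())

def group_gap(metric, preds, alloc, group):
    """Mean metric for the comparison group (s = 0) minus the subpopulation (s = 1)."""
    by_group = {0: [], 1: []}
    for probs, service, s in zip(preds, alloc, group):
        by_group[s].append(metric(probs, service))
    return fmean(by_group[0]) - fmean(by_group[1])

def centered_difference(metric, preds, alloc, random_allocs, group):
    """Gap under allocation a minus the average gap under random allocations r."""
    baseline = fmean(group_gap(metric, preds, r, group) for r in random_allocs)
    return group_gap(metric, preds, alloc, group) - baseline

# Toy data: predicted reentry probabilities for 4 households (illustrative only).
preds = [
    {"TH": 0.30, "RRH": 0.40},
    {"TH": 0.35, "RRH": 0.45},
    {"TH": 0.40, "RRH": 0.38},
    {"TH": 0.42, "RRH": 0.40},
]
group = [1, 1, 0, 0]
alloc = ["TH", "TH", "RRH", "RRH"]
random_allocs = [["TH", "RRH", "TH", "RRH"], ["RRH", "TH", "RRH", "TH"]]

delta_g = centered_difference(gain, preds, alloc, random_allocs, group)
delta_s = centered_difference(shortfall, preds, alloc, random_allocs, group)
print(round(delta_g, 4), round(delta_s, 4))
```

Subtracting the random-allocation baseline is what centers the metric, so a value of 0 means the allocation is no more or less biased across groups than random assignment.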
We hypothesize the following:
Prioritization rule-based allocations increase assignment to appropriate services for preferred groups without increasing overall system reentries and service costs.
Prioritization achieves equitable access to appropriate services for traditionally underrepresented and marginalized subgroups.
RESULTS
Administrative records track 10 043 households who initially entered homelessness prevention or homeless services between 2009 and 2012 in St. Louis. Complete records were available for 30 of the 35 features included in the analyses, and missingness was relatively low for the remaining 5: race (1.2%), ethnicity (1.2%), disabling condition (2.1%), housing status (1.2%; asked of prevention-eligible households only), and prior residence (7.3%). Half of the households (49.3%) are eligible for homelessness prevention, while others require emergency housing. The prevention-ineligible population disproportionately includes Black (85.0%) and female-headed (66.6%) households aged 39.5 years on average (SD = 12.8), with 10.7% unaccompanied homeless youth aged 18–24 years. Families with children comprise 58.3% of households. Upon entry into services, substantial shares of household heads self-report a disabling health condition (15.0%), mental health problem (29.8%), or substance abuse (27.4%).
Overall, 27.5% of households request or receive additional services after their initial exit. Households initially referred to homelessness prevention (n = 4737) and transitional housing (n = 1469) reenter less (13.6% and 34.3%, respectively) compared with rates for households in emergency shelter (42.5% of 2997 households) and rapid rehousing (40.6% of 840 households). As is common in observational data, pre-existing differences at the time of service entry, as seen in Table 1, contribute to differences in reentry as well. BART has been shown to mitigate such bias and so was used to model complex interactions and nonlinearities in the data and generate counterfactual pairwise reentry predictions. The model demonstrates good fit and accuracy when predicting reentry, with an area under the receiver-operating-characteristic curve of 0.75, misclassification error rate of 0.2, precision of 0.6, recall of 0.3, and calibration of 0.9. All metrics suggest acceptable accuracy in the prediction task. Furthermore, the BART-generated out-of-sample counterfactual reentry predictions, given the actual service provided, correspond closely with the observed reentry rates across interventions.
We initially examine population average treatment effects (ATE) to inform prioritization rules. Counterfactual predictions suggest all 4942 homelessness prevention-eligible households do “best” (lowest probability of reentry) in prevention, with pairwise ATE showing a 5.54 percentage point (pp) reduction in reentry compared with transitional housing, 6.04 pp reduction for rapid rehousing, and 7.80 pp reduction for emergency shelter. Thus, prevention meets the needs of eligible families. Treatment effects for the 5101 prevention-ineligible households vary; 65% are predicted to do best in transitional housing, 30.1% in rapid rehousing, and 4.2% in emergency shelter. The treatment effect of transitional housing compared with rapid rehousing on homelessness reentry fell close to zero with considerable variation (ATE = −0.02, 95% credible interval = −0.12 to 0.07). Together, evidence supports the existing human and system heuristics for resource assignment for households needing low-intensity homelessness prevention but does less well at matching households into effective higher-intensity interventions.
Figure 3 illustrates the CATE of transitional housing compared to rapid rehousing for homeless subpopulations identified by community stakeholders. Treatment effects that fall further from the ATE, represented by the dotted line, indicate better response to rapid rehousing (above) and transitional housing (below). No subpopulation does best in shelter; thus, we ignore those pairwise estimates. Households with mental health problems do relatively worse in transitional housing, and therefore, we reoperationalize comorbidities as any 2 of alcohol abuse, drug abuse, or disability. CATE support community hypotheses that households with comorbid conditions exhibit lower reentry rates given transitional housing compared with rapid rehousing, averaging a 6.1 pp reduction. Results also show that families with children and without comorbidities exhibit a 3.2 pp reduction in reentry when receiving rapid rehousing, which supports the community hypothesis. The CATE for other subpopulations and intersectional identities fall close to the average treatment effect, indicating no clear service preference, as presented in Table 2. Generally, results show households with comorbidities do better in transitional housing across identities, while households with children and without comorbidities do better in rapid rehousing.
Figure 3.
Conditional average treatment effects of transitional housing versus rapid rehousing by subpopulation. The dotted line represents the ATE of transitional housing vs rapid rehousing for the entire population. CATE of zero indicate no treatment effect. CATE above the dotted line indicate the population performing better than average in rapid rehousing compared to transitional housing; CATE below the dotted line indicate the population performing better than average in transitional housing. Bayesian credible intervals (and not confidence intervals) show the range that contains each estimate with 95% posterior probability.
Table 2.
Conditional average treatment effects (CATE) with 95% credible intervals on homeless service reentry for subpopulations of interest
| Subpopulation | % of homeless (n = 10 043) | CATE | Lower | Upper |
|---|---|---|---|---|
| Comorbid conditions | 18.02 | −0.06 | −0.14 | 0.03 |
| Black—comorbid | 12.96 | −0.07 | −0.15 | 0.03 |
| Female—comorbid | 6.12 | −0.03 | −0.11 | 0.04 |
| Unaccompanied youth—comorbid | 1.10 | −0.05 | −0.10 | 0.01 |
| Families—comorbid | 2.35 | −0.001 | −0.06 | 0.06 |
| Black—no comorbidity | 65.22 | −0.02 | −0.11 | 0.07 |
| Female—no comorbidity | 48.09 | 0.01 | −0.07 | 0.08 |
| Unaccompanied youth—no comorbidity | 9.66 | −0.03 | −0.11 | 0.03 |
| Families—no comorbidity | 25.70 | 0.03 | −0.04 | 0.09 |
Note: CATE below zero indicate better outcomes in transitional housing, above zero show better outcomes in rapid rehousing.
Figure 4 visualizes the resulting community- and data-driven prioritization rules. Homelessness prevention-eligible households receive prevention. Non-prevention-eligible households with comorbid health, mental health, or substance abuse problems receive transitional housing if accessible in the week the household enters the system. Families with children under 18 years without comorbidities receive rapid rehousing if available. Other non-prevention-eligible households enter a lottery for emergency shelter, rapid rehousing, or transitional housing. Because the ATE for transitional housing versus rapid rehousing indicates the general population does better in transitional housing, the lottery assigns as many households as possible to transitional housing while maintaining the same cost of service provision as in the services-as-usual allocation. Table 3 reports the average weekly costs and the calculation of the overall capacity limit.
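The rules above can be sketched as a weekly allocation routine. The household keys, the slot counts, and the handling of priority households who find their preferred service full (here they join the lottery) are assumptions for illustration, as is the lottery's fallback order of transitional housing, then rapid rehousing, then shelter; the budget constraint is omitted for brevity.

```python
import random

def allocate_week(households, open_slots, seed=0):
    """Apply the prioritization rules to households entering in one week.

    households: dicts with hypothetical keys id, prevention_eligible, comorbid, has_children.
    open_slots: openings per service for the week (prevention is treated as uncapped).
    """
    rng = random.Random(seed)
    slots = dict(open_slots)  # copy so the caller's capacities are untouched
    result, lottery = {}, []

    def take(service):
        if slots.get(service, 0) > 0:
            slots[service] -= 1
            return True
        return False

    for h in households:
        if h["prevention_eligible"]:
            result[h["id"]] = "prevention"
        elif h["comorbid"] and take("transitional"):
            result[h["id"]] = "transitional"
        elif h["has_children"] and not h["comorbid"] and take("rapid"):
            result[h["id"]] = "rapid"
        else:
            lottery.append(h)  # no priority rule applied, or preferred service full

    rng.shuffle(lottery)  # random order decides who gets the remaining openings
    for h in lottery:
        for service in ("transitional", "rapid", "shelter"):
            if take(service):
                result[h["id"]] = service
                break
        else:
            result[h["id"]] = None  # no capacity left this week
    return result

week = [
    {"id": "A", "prevention_eligible": True, "comorbid": False, "has_children": True},
    {"id": "B", "prevention_eligible": False, "comorbid": True, "has_children": False},
    {"id": "C", "prevention_eligible": False, "comorbid": False, "has_children": True},
    {"id": "D", "prevention_eligible": False, "comorbid": False, "has_children": False},
    {"id": "E", "prevention_eligible": False, "comorbid": True, "has_children": False},
]
assignments = allocate_week(week, {"transitional": 1, "rapid": 1, "shelter": 5})
print(assignments)
```

Because each rule is a plain conditional, the routine stays transparent enough for stakeholders to audit which rule placed a given household.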
Figure 4.
Prioritization rules for community- and data-driven homeless services.
Table 3.
Calculation of service costs across the study period for each service type
| Service type | Monthly cost per household | Avg. months in service | Number of households | Total cost |
|---|---|---|---|---|
| Emergency shelter | $1510 | 0.45 | 2997 | $2 022 885.10 |
| Transitional housing | $2319 | 0.61 | 1469 | $2 078 032.70 |
| Rapid rehousing | $809 | 0.99 | 840 | $672 764.40 |
| Homelessness prevention | $119 | 1.23 | 4737 | $694 763.90 |
| Total | | | | $5 468 446.10 |
Note: Average monthly costs derived through a literature search31–34 and adjusted for 2022 inflation rates.
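As a rough check on how the totals in Table 3 appear to be built, the sketch below multiplies monthly cost, average months in service, and number of households. This formula is an inference from the column headings rather than a stated method. With the rounded durations printed in the table it reproduces the transitional housing and rapid rehousing totals and comes within about 1% of the other two rows, which suggests the published totals used unrounded durations.

```python
def total_cost(monthly_cost, avg_months, households):
    """Assumed formula: cost per month x average months in service x households served."""
    return monthly_cost * avg_months * households

# Inputs copied from Table 3: (monthly cost, avg. months, households, published total)
table3 = {
    "Emergency shelter": (1510, 0.45, 2997, 2_022_885.10),
    "Transitional housing": (2319, 0.61, 1469, 2_078_032.70),
    "Rapid rehousing": (809, 0.99, 840, 672_764.40),
    "Homelessness prevention": (119, 1.23, 4737, 694_763.90),
}

for service, (monthly, months, n, published) in table3.items():
    computed = total_cost(monthly, months, n)
    relative_gap = abs(computed - published) / published
    print(f"{service}: computed {computed:,.2f}, published {published:,.2f}, gap {relative_gap:.2%}")

published_total = sum(row[3] for row in table3.values())
print(f"Sum of published totals: {published_total:,.2f}")
```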
Table 4 reports simulation results assessing the implementation of prioritization rules compared with services-as-usual, random assignment, service-efficiency optimization, and cost-effectiveness optimization. Prioritization rules reduce system reentries by 1 pp compared with services-as-usual. The results correspond with cost-effectiveness optimization and yield similar budget savings compared with services-as-usual. The expected small efficiency gains from prioritization and cost-effectiveness optimization reflect budget constraints and ongoing demand that requires all households to receive assistance. Optimizing on service efficiency lowers reentries by 2.2 pp compared with services-as-usual; an additional 113 families could avoid homelessness for a cost of only $12 124 per household. This represents a fraction compared to the costs reported in the literature associated with policing, health, and social services for families who do become homeless.31–34 Prioritization results suggest minimal compromise on system efficiency.
Table 4.
Allocation comparison: expected cost and reentry percentages for the decision rule and budget allocations compared to the original and unconstrained allocations
| Allocation | Estimated cost | Estimated savings | (Expected) reentry percentage |
|---|---|---|---|
| Services-as-usual | $5 468 446.10 | – | 27.82 |
| Prioritization rules | $5 468 137.70 | $308.40 | 26.78 |
| Cost-effectiveness | $5 468 359.42 | $86.68 | 26.17 |
| Service efficiency | $6 838 410.00 | −$1 369 963.90 | 25.61 |
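The savings and per-household figures can be re-derived from Table 4 with simple arithmetic, sketched below. The costs and reentry percentages are copied from the table, and the count of 113 additional families comes from the text; the variable and function names are illustrative.

```python
BASELINE_COST = 5_468_446.10   # services-as-usual cost from Table 4
BASELINE_REENTRY = 27.82       # services-as-usual reentry percentage

allocations = {
    "Prioritization rules": (5_468_137.70, 26.78),
    "Cost-effectiveness": (5_468_359.42, 26.17),
    "Service efficiency": (6_838_410.00, 25.61),
}

def compare(cost, reentry_pct):
    """Savings and reentry reduction (percentage points) versus services-as-usual."""
    return {
        "savings": round(BASELINE_COST - cost, 2),
        "reduction_pp": round(BASELINE_REENTRY - reentry_pct, 2),
    }

summary = {name: compare(*values) for name, values in allocations.items()}
for name, row in summary.items():
    print(name, row)

# Extra spending under service-efficiency optimization, spread over the
# 113 additional families the text reports as avoiding homelessness.
extra_cost = -summary["Service efficiency"]["savings"]
cost_per_family = extra_cost / 113
print(round(cost_per_family))
```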
Figure 5 presents the equity of community- and data-driven prioritization rules. Bars plot the difference in gains (left) and shortfall (right) for subpopulations of interest versus majority groups (listed vertically). Results show that prioritization rules yield the largest gains and shortfalls for households with comorbidities compared with cost-effectiveness and services-as-usual. Thus, prioritization disproportionately favors households with comorbidities when giving their most and least useful service, which enhances equity. For families with children but without comorbidities, the utility gained from prioritization is similar to services-as-usual that favored families; however, allocating services based on efficiency disproportionately burdens families, making services less equitable. We also present equity metrics by race/ethnicity, gender, and unaccompanied youth. None of the allocation schemes disproportionately favor Black households, representing 4 of every 5 households. All allocations disproportionately burden female-headed households; however, the burden is substantially mitigated by prioritization rules compared with services-as-usual and cost-efficient assignments. Similarly, the burden on unaccompanied youth is mitigated by prioritization rules and cost-efficient assignments compared with services-as-usual. Overall, prioritization rules generally produce the most equitable homeless services—as intended.
Figure 5.
Group fairness bar chart comparing allocations on equity gains and shortfalls. Gain compares the utility with a worst-case scenario of serving all households with the service predicted to prevent homelessness the least, and thus, measures the benefit gained from each allocation. Shortfall contrasts with the best-case scenario when all households receive the service most likely to prevent homelessness—assessing the loss generated by each allocation. Bars pointing to the left of the figure indicate bias toward Group A and bars pointing to the right of the figure indicate bias toward Group B where groups are listed as Group A/Group B.
DISCUSSION
The study demonstrates the feasibility of an iterative community- and data-driven approach for effective and equitable homeless service delivery. Findings address an essential gap between policy and practice.3,8,9,20 Federal guidelines require communities to allocate scarce homeless assistance based on system-wide assessments of household risk;1 yet, little evidence supports the accuracy and cultural validity of existing tools currently used for coordinated entry.4,5 Furthermore, the scarcity of homeless services inherently requires homeless providers to make continual moral preferences on whom to serve first, with little ability to evaluate individual decision-making and system goals.5
Our approach elicits feedback from key stakeholders to define subpopulations of interest and relevant intersectional identities. In this pilot, target households initially included those with comorbid conditions, families with children, unaccompanied youth, African American households, and female-headed households. Leveraging historical administrative records, counterfactual machine learning shows transitional housing reduces reentries the most for households with comorbidities, and families with children and no comorbidities do best with rapid rehousing during the study period, regardless of race/ethnicity, gender, and age. The evidence informs transparent, easily implementable, and evaluable prioritization rules for targeting services that minimize system reentries. Simulations of prioritization rules demonstrate larger reductions for subpopulations of interest (ie, comorbidities and families) without perpetuating disparities by race/ethnicity, gender, age, and unaccompanied youth. Results demonstrate promise for efficient and equitable homeless service delivery incorporating community- and data-driven insights.
The use of historical data in the feasibility study raises interesting questions regarding the generalizability and implementation of community- and data-driven homeless services. Evidence from the Great Recession supports targeting individuals with comorbidities for transitional housing and families without comorbidities for rapid rehousing to reduce reentry into homeless services. However, local and temporal variations in economic conditions, policy priorities, service capacities, and other conditions could produce different results, which is a testable hypothesis. Likewise, prioritization depends on the goals of service delivery and the information available; rules could vary if optimizing on healthcare utilization, employment, criminal involvement, social connectedness, etc., as opposed to homelessness needs. Successful implementation requires meaningful engagement with diverse community stakeholders to evaluate the current service and data systems and develop goals for equity-driven decision-making. Engagement also provides an opportunity to foresee and plan for potential unintended consequences from new service processes that could perpetuate and exacerbate inequities.
Findings must be interpreted in the context of several conceptual and methodological limitations. First, targeting homeless services fails to address the lack of affordable housing that drives the overwhelming demand for housing assistance. Although we demonstrate an approach for articulating and evaluating ethical preferences for scarce resource allocation, reforms that make safe and affordable housing accessible to low-income households remain critical for just homeless service delivery. Second, we need to consider the implementation challenges of introducing data-driven decision supports into social services to demonstrate the approach's feasibility. Simulations automate resource allocation based on predicted success, but decision-making systems must incorporate caseworker insights that fail to appear in HMIS.38–40 Moreover, bringing data-driven decision supports into homeless services creates nontrivial dynamics in how information is interpreted and used that could generate unexpected outcomes.21,41 Rigorous research must consider the intended and unintended consequences of prioritizing scarce resources.
Finally, several technical issues limit insights from the modeling. Notably, the data predate federal initiatives around coordinated entry and housing first; instead, allocations functioned primarily on a first-come, first-served basis. Generating unconfounded treatment effects could prove difficult with current data that include changing system preferences. Likewise, federal system performance measures now focus on service receipt, not need. In contrast, our outcome considers all re-requests for assistance regardless of availability, which might be less prone to systematic exclusion from services. Lastly, the model building requires considerable local tailoring that meets community interests and the statistical assumptions necessary for counterfactual estimation, such as measurement quality, sample size, and statistical power. HMIS collects selected household features that might not capture the high-dimensional mechanisms underlying service delivery and intersectionalities of interest. Likewise, the local availability of linked predictors (eg, Census) and outcomes (eg, health outcomes) allows for tailored modeling that could expand upon service effects on homeless reentry. The iterative approach requires deep collaboration on the technical and substantive elements of model building.
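The need-based outcome discussed above, any re-request for assistance within 2 years, can be expressed as a short sketch. The function name, the choice of index date, and the 730-day window are illustrative assumptions; the study's exact operationalization in HMIS is not reproduced here.

```python
from datetime import date, timedelta

def reentered_within(index_date, later_request_dates, window_days=730):
    """True if any later request for assistance falls within the follow-up window.

    index_date:          date the follow-up clock starts (assumed; could be entry or exit).
    later_request_dates: dates of all recorded re-requests, whether or not a
                         service was actually available or received.
    """
    cutoff = index_date + timedelta(days=window_days)
    return any(index_date < d <= cutoff for d in later_request_dates)

# Toy records (hypothetical dates).
start = date(2010, 1, 15)
within = reentered_within(start, [date(2011, 6, 1)])
outside = reentered_within(start, [date(2012, 3, 1)])
none = reentered_within(start, [])
```

Counting requests rather than services received keeps households in the outcome even when no slot was available to them.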
CONCLUSION
In sum, the scarcity of homeless services introduces ethical tradeoffs between the efficiency and equity of service delivery. Study findings demonstrate the feasibility of community- and data-driven homeless service delivery that maximizes resources with explicit attention to disparities around minoritization and marginalization. The encouraging results require continued development of technical and ethical capacities for implementation. Accessible, affordable housing remains a fundamental issue for promoting housing security for low-income households.
ACKNOWLEDGMENTS
Special thanks go to the homeless persons represented in the data and our local partners who continue to collaborate on designing sustainable and responsive services that prevent homelessness.
Contributor Information
Amanda R Kube, Division of Data and Computational Sciences, Washington University in St. Louis, St. Louis, Missouri, USA.
Sanmay Das, Department of Computer Science, George Mason University, Fairfax, Virginia, USA.
Patrick J Fowler, Division of Data and Computational Sciences, Washington University in St. Louis, St. Louis, Missouri, USA; Brown School of Social Work, Public Health, and Social Policy, Washington University in St. Louis, St. Louis, Missouri, USA.
FUNDING
This work was supported by National Science Foundation awards 2127752 (SD/PJF), 2127754 (SD/PJF), and 1939677 (SD/PJF); and by Amazon through an NSF-Amazon Fairness in AI award (SD/PJF).
AUTHOR CONTRIBUTIONS
ARK contributed to Conceptualization, Data Curation, Formal Analysis, Methodology, Validation, Visualization, Writing – Original Draft, Writing – Review and Editing. SD contributed to Conceptualization, Data Curation, Formal Analysis, Methodology, Validation, Visualization, Writing – Original Draft, Writing – Review and Editing, Supervision, Funding Acquisition. PJF contributed to Conceptualization, Data Curation, Formal Analysis, Methodology, Validation, Visualization, Writing – Original Draft, Writing – Review and Editing, Supervision, Funding Acquisition, Project Administration, Resources. Authors have directly accessed and verified the underlying data reported in the manuscript.
CONFLICT OF INTEREST STATEMENT
None declared.
DATA AVAILABILITY
Code and deidentified data from the study are publicly accessible on GitHub: https://github.com/amandakube/Community-And-Data-Driven_Homelessness_Prevention.
REFERENCES
- 1. Department of Housing and Urban Development. Coordinated entry core elements. HUD Exchange. 2018. https://files.hudexchange.info/resources/documents/Coordinated-Entry-Core-Elements.pdf. Accessed September 10, 2022.
- 2. Gubits D, Shinn M, Wood M, Brown SR, Dastrup SR, Bell SH. What interventions work best for families who experience homelessness: impact estimates from the family options study. J Pol Anal Manage 2018; 37 (4): 835–66.
- 3. Shinn M, Brown SR, Spellman BE, Wood M, Gubits D, Khadduri J. Mismatch between homeless families and the homelessness service system. Cityscape 2017; 19 (3): 293–307.
- 4. Brown M, Cummings C, Lyons J, Carrión A, Watson DP. Reliability and validity of the Vulnerability Index-Service Prioritization Decision Assistance Tool (VI-SPDAT) in real-world implementation. J Soc Distress Homel 2018; 27 (2): 110–7.
- 5. Shinn M, Richard MK. Allocating homeless services after the withdrawal of the vulnerability index–service prioritization decision assistance tool. Am J Public Health 2022; 112 (3): 378–82.
- 6. Mashiat T, Gitiaux X, Rangwala H, Fowler PJ, Das S. Trade-offs between group fairness metrics in societal resource allocation. In: proceedings of the ACM Conference on Fairness, Accountability, and Transparency; 2022: 1095–105.
- 7. Estornell A, Das S, Liu Y, Vorobeychik Y. Unfairness despite awareness: group-fair classification with strategic agents. arXiv:2112.02746. 2021. https://arxiv.org/abs/2112.02746.
- 8. Kube A, Das S, Fowler PJ. Allocating interventions based on predicted outcomes: a case study on homelessness services. AAAI 2019; 33 (01): 622–9.
- 9. Kube A, Das S, Fowler PJ. Fair and efficient allocation of scarce resources based on predicted outcomes: implications for homeless service delivery. J Artif Intell Res. In press.
- 10. Aubry T, Bell M, Ecker J, Goering P. Screening for housing first. Canadian Observatory on Homelessness, Mental Health Commission of Canada. 2015. https://www.homelesshub.ca/resource/screening-housing-first-phase-one-assessment-road-map. Accessed September 11, 2022.
- 11. Vaithianathan R, Kithulgoda CI. Using Predictive Risk Modeling to Prioritize Services for People Experiencing Homelessness in Allegheny County: Methodology Update. Auckland, New Zealand: Centre for Social Data Analytics; 2020.
- 12. Azizi MJ, Vayanos P, Wilder B, Rice E, Tambe M. Designing fair, efficient, and interpretable policies for prioritizing homeless youth for housing resources. Lect Notes Comput Sci 2018; 10848: 35–51.
- 13. Yadav A, Wilder B, Rice E, et al. Bridging the gap between theory and practice in influence maximization: raising awareness about HIV among homeless youth. In: proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence; 2018: 5399–5403.
- 14. Chouldechova A, Putnam-Hornstein E, Benavides-Prado D, Fialko O, Vaithianathan R. A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions. Proc Mach Learn Res 2018; 81: 1–15.
- 15. Gillman MW, Hammond RA. Precision treatment and precision prevention: integrating “Below and Above the Skin”. JAMA Pediatr 2016; 170 (1): 9.
- 16. Dolley S. Big data’s role in precision public health. Front Public Health 2018; 6: 68.
- 17. Dowell SF, Blazes D, Desmond-Hellmann S. Four steps to precision public health. Nature 2016; 540 (7632): 189–91.
- 18. Khoury MJ, Iademarco MF, Riley WT. Precision public health for the era of precision medicine. Am J Prev Med 2016; 50 (3): 398–401.
- 19. Hill JL. Bayesian nonparametric modeling for causal inference. J Comput Graph Stat 2011; 20 (1): 217–40.
- 20. Rahmattalabi A, Vayanos P, Dullerud K, Rice E. Learning resource allocation policies from observational data with an application to homeless services delivery. arXiv:2201.10053. 2022. https://arxiv.org/abs/2201.10053.
- 21. Das S. Local justice and the algorithmic allocation of scarce societal resources. AAAI 2022; 36 (11): 12250–5.
- 22. Tighe JR, Ganning JP. The divergent city: unequal and uneven development in St. Louis. Urban Geogr 2015; 36 (5): 654–73.
- 23. Arroyo-Johnson C, Woodward K, Milam L, et al. Still separate, still unequal: social determinants of playground safety and proximity disparities in St. Louis. J Urban Health 2016; 93 (4): 627–38.
- 24. Murphy M. For the greater good of whom? Race, development, and civic values in the midwestern metropolis. J Plan Hist 2006; 5 (4): 355–64.
- 25. Department of Housing and Urban Development. HMIS Data and Technical Standards. HUD Exchange. 2021. https://www.hudexchange.info/programs/hmis/hmis-data-and-technical-standards/
- 26. Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Ann Appl Stat 2010; 4 (1): 266–98.
- 27. Fowler PJ, Purnell JQ. We know how to prevent homelessness due to COVID-19. The St. Louis American. May 12, 2020. http://www.stlamerican.com/news/columnists/guest_columnists/we-know-how-to-prevent-homelessness-due-to-covid-19/article_388e7fe6-9457-11ea-af5c-bbc16f106816.html. Accessed May 12, 2020.
- 28. Fowler PJ, Wright K, Marcal KE, et al. Capability traps impeding homeless services: a community-based system dynamics evaluation. J Soc Serv Res 2019; 45 (3): 348–59.
- 29. Trickett EJ. Community-based participatory research as worldview or instrumental strategy: is it lost in translation(al) research? Am J Public Health 2011; 101 (8): 1353–5.
- 30. Hovmand PS. Group Model Building and Community-based System Dynamics Process. Community Based System Dynamics. New York: Springer Science+Business Media; 2014.
- 31. Israel BA, Coombe CM, Cheezum RR, et al. Community-based participatory research: a capacity-building approach for policy advocacy aimed at eliminating health disparities. Am J Public Health 2010; 100 (11): 2094–102.
- 32. Shinn M, Khadduri J. In the Midst of Plenty: Homelessness and What to Do about It. Hoboken, NJ: Wiley; 2020.
- 33. Culhane DP, Metraux S, Park JM, et al. Testing a typology of family homelessness based on patterns of public shelter utilization in four US jurisdictions: implications for policy and program planning. Housing Policy Debate 2007; 18 (1): 1–28.
- 34. Khadduri J, Leopold J, Sokol B, Spellman B. Costs associated with first-time homelessness for families and individuals. HUD. 2010. https://www.huduser.gov/portal/publications/povsoc/cost_homelessness.html
- 35. Culhane DP, Park JM, Metraux S. The patterns and costs of services use among homeless families. J Community Psychol 2011; 39 (7): 815–25.
- 36. Gubits D, Shinn M, Bell S. Family Options Study: Short-term Impacts of Housing and Services Interventions for Homeless Families. Washington, DC: HUD, Office of Policy Development and Research; 2015.
- 37. Elster J. Local Justice: How Institutions Allocate Scarce Goods and Necessary Burdens. New York, NY: Russell Sage Foundation; 1992.
- 38. Lee MK, Kusbit D, Kahng A, et al. WeBuildAI: participatory framework for fair and efficient algorithmic governance. In: proceedings of the ACM Conference on Human-Computer Interaction; 2019.
- 39. Karusala N, Wilson J, Vayanos P, Rice E. Street-level realities of data practices in homeless services provision. In: proceedings of the ACM Conference on Human-Computer Interaction; 2019.
- 40. Vayanos P, McElfresh D, Ye Y, Dickerson J, Rice E. Active preference elicitation via adjustable robust optimization. arXiv:2003.01899. 2020. Preprint: not peer reviewed.
- 41. Kube A, Das S, Fowler P, Vorobeychik Y. Just resource allocation? How algorithmic predictions and human notions of justice interact. In: proceedings of the ACM Conference on Economics and Computation; 2022.