Abstract
Summary
Offender rehabilitation seeks to minimise recidivism. Drawing on their experience and actuarial-type risk assessment tools, probation officers in Singapore make recommendations on sentencing outcomes so as to achieve this objective. However, it is difficult for them to maximise the utility of the large amounts of data collected, a difficulty that predictive modelling informed by statistical learning methods could help resolve.
Findings
Data on youth offenders (N = 3744) referred to the Probation Service, Ministry of Social and Family Development, for rehabilitation were used to create a random forests model to predict recidivism. No assumptions were made about how the outcome was influenced by individual predictor values within the risk assessment tool or by other administrative data on an individual’s socio-economic status, such as level of education attained and dwelling type, collected in line with organisational requirements. Sixty per cent of the data was used to develop the model, which was then tested against the remaining 40%. With a classification accuracy of approximately 65% and an Area Under the Curve value of 0.69, it outperformed existing models analysing aggregated data using conventional statistical methods.
Application
This article identifies how analysis of administrative data at the discrete level using statistical learning methods is more accurate in predicting recidivism than using conventional statistical methods. This provides an opportunity to direct intervention efforts at individuals who are more likely to reoffend.
Keywords: Social work, youth offending, recidivism, quantitative research, heuristics, statistical learning methods, Singapore
The central objective of offender rehabilitation is to ensure that offenders do not become recidivists and are reintegrated into society. Given that individuals have different criminogenic risks and needs, it is therefore necessary to undertake risk assessment to prioritise areas that require more attention. In broad strokes, risk assessment can be divided into either unstructured or structured forms. As noted by Hanson and Morton-Bourgon (2009), the latter approach is becoming increasingly common in many jurisdictions, including Singapore. The rationale is that if it is possible to accurately anticipate the likelihood of an individual with a particular profile experiencing a negative outcome, interventions targeting specific issues can be put in place ahead of time. Consequently, probation officers in Singapore collect vast amounts of data in line with organisational requirements for case management and the completion of risk assessment tools, so as to make an informed recommendation to the courts on issues such as the suitability of an individual to be placed on probation.
However, interpreting the large amounts of routinely collected data can be very time consuming, and it also requires probation officers to rely on their experience and knowledge of theories of offending. As will be demonstrated, statistical learning methods, which can handle large amounts of data effectively, offer probation officers an additional tool for making more accurate risk assessments: their results are, at the very least, competitive with, if not superior to, those of conventional statistical methods such as logistic regression. Given that the recommendations of probation officers directly affect the types of interventions given to youth offenders, there is a strong argument for at least considering the use of statistical learning methods to ensure that their recommendations are as accurate as possible.
Having more accurate risk assessments is important because they can potentially reduce the recidivism rate and therefore lower the overall costs of rehabilitation. This is because higher risk individuals can be given more intensive and targeted intervention that may then disrupt the negative cycle of offending. Nevertheless, it is essential to state that these techniques neither displace traditional hypothesis-driven research using conventional methods nor replace the professional inputs of probation officers who play a crucial role in ensuring that accurate data are being collected at each stage.
Economic and social costs of incarceration
The financial costs of incarceration are very high (Welsh & Farrington, 2011). For instance, the Australian Government reported that for the 2011/12 financial year, the cost per prisoner per day was AUD$305 (Legal and Constitutional Affairs References Committee, 2013), or AUD$111,325 to house a prisoner for one year. According to the United States Bureau of Prisons, the per capita cost of incarceration in 2014 was USD$30,621 (James, 2016). Figures from the British Ministry of Justice (2013) showed that for the 2012/13 financial year, it cost £26,139 to house one prisoner for one year. Even though the cost differs across jurisdictions, making direct comparisons difficult, it is clear that incarceration is not cheap. The issue of costs is especially pertinent for those who recidivate and are sentenced to prison, because each incarceration stint costs money. In general terms, recidivism can be thought of as ‘reoffending after a prior contact with the criminal justice system’ (Farrington & Davies, 2007, p. 1). Elaborating on this definition, this article only regarded individuals as having reoffended when they were found guilty after undergoing a process of formal adjudication. In other words, the recidivism rate is not determined by police arrest.
From a public policy perspective, the ability to identify ahead of time those individuals who are more likely to offend may therefore provide an opportunity to disrupt this negative cycle of offending, thereby reducing the incurrence of such costs. At the same time, it is also important not to overlook the social costs associated with offending. Should offenders be incarcerated for their actions, families may lose their breadwinners and family bonds may also be negatively affected. Likewise, the impact of offending on victims, be it pain and suffering, mental distress and reduced quality of life can also be difficult to quantify. Nevertheless, these indirect social costs also arise due to criminal offending (Welsh & Farrington, 2011). Hence, it is important to disrupt this cycle and reduce the incurrence of such economic and social costs.
Between 2006 and 2009, the average three-year recidivism rate for all youth offenders who completed their probation and detention orders was 18.9% (Ministry of Social and Family Development, 2015). However, an objective comparison of Singapore’s juvenile recidivism rate with that of other jurisdictions is difficult due to the lack of data. For instance, due to substantial differences between the juvenile justice systems among the various Australian States and Territories (Australian Institute of Criminology, 2011), it is difficult to ascertain the juvenile recidivism rate in Australia accurately. Similarly, national data on juvenile recidivism from the United States are also not available due to the same constraints highlighted above (National Criminal Justice Reference Service, 2015).
Assessing the risk of reoffending: The Singapore context
In the Singapore context, young persons charged with crimes are referred to the Probation and Community Rehabilitation Service within the Ministry of Social and Family Development (MSF) for assessment. Probation officers are responsible for providing the courts with a Pre-Sentencing Report in which they recommend the type, length and intensity of the order on which the individual should be placed. Given that this occurs at the adjudicatory stage, probation officers would have had little prior contact with the youth offenders. Their recommendations are based largely on information obtained from two main sources – the Statement of Facts, which is based on information the youth provided to the police after being arrested, and the interviews they conduct upon referral. Secondary sources of information may also include school reports and/or psychiatric and psychological reports (Chua, Chu, Yim, Chong, & Teoh, 2014). The information is then used to complete the Youth Level of Service/Case Management Inventory 2.0 (henceforth referred to as the YLS), an evidence-based and theory-informed actuarial-type risk assessment tool. Strictly speaking, the YLS is not a purely actuarial tool, because the instrument provides an option for professional override; Hoge and Andrews (2011) believe that professionals should retain the final say in the management of their cases.
In broad strokes, the YLS is a tool that helps caseworkers to identify areas for intervention, as well as how to go about addressing them (MHS, 2004). By identifying the youth’s needs and risks, this tool helps the relevant stakeholders to design a tailored case management plan that maximises the effectiveness of rehabilitative efforts. The tool looks at the following eight domains:
Prior and Current Offenses/Disposition, which focuses on an individual’s prior contact with the justice system.
Family Circumstances/Parenting, which focuses on family functioning within the household.
Education/Employment, which focuses on an individual’s engagement with age-appropriate activity.
Peer Relations, which focuses on an individual’s social network.
Substance Abuse, which focuses on an individual’s frequency and patterns of drug and alcohol use.
Leisure/Recreation, which focuses on whether an individual participates in pro-social organised activities.
Personality/Behaviour, which focuses on whether an individual exhibits certain traits that contribute to heightened offending risk.
Attitudes/Orientation, which focuses on the extent to which an individual exhibits an antisocial mentality.
Improving professional judgement through actuarial assessment
The use of actuarial methods is very common in industries such as finance and insurance. In essence, they are statistical and mathematical methods that help to assess risk. For instance, finance companies use them to determine if borrowers are likely to default on their loans; insurance firms rely on them to determine the levels of premiums to charge. These examples are therefore not very different from predicting whether an individual with a certain profile will reoffend in the present context. For instance, research has shown that the YLS is useful in providing a preliminary assessment of a youth offender’s propensity for antisocial behaviour as well as flagging areas for intervention in the general social work context in Singapore (Chu et al., 2015; Chu, Yu, Lee, & Zeng, 2014).
It is hardly surprising that decisions reached via a structured decision-making process tend to be better than ones reached via an unstructured decision-making process (Corrado & Turnbull, 1992). As early as the 1950s, research showed that, on balance, statistical prediction based on structured assessments is superior to clinical prediction (Meehl, 1954). Providing further evidence, a meta-analysis conducted by Grove, Zald, Lebow, Snitz and Nelson (2000) found that statistical prediction outperforms clinical prediction by more than 10%. More recent work has also shown that actuarial predictions outperform clinical judgement by approximately 17% when predicting future violent or offending behaviour (Ægisdóttir et al., 2006). Similarly, a meta-analysis conducted by Hanson and Morton-Bourgon (2009) found that actuarial measures were more accurate than unstructured clinical judgement. Furthermore, when Onifade, Davidson, Campbell, Turke and Turner (2008) used logistic regression to ascertain the overall predictive validity of the YLS tool in predicting recidivism among a group of probationers (n = 328), the AUC value was approximately 0.62, and they concluded that the tool had a high accuracy rate. Similarly, regression analysis by Chu et al. (2015) using aggregated data, such as the total and sub-scale YLS scores, on a group of youth offenders (n = 3264) yielded an AUC value of 0.64.
For Hoge (2002a), one of the main weaknesses associated with unstructured clinical judgements is the inconsistent assessment of risks as well as the identification of treatment needs for youth offenders. This is because clinical judgement can be affected by various factors such as intuition and selective memory (Hilton, Harris, & Rice, 2006). Consequently, it is highly possible that one’s biases and prejudices may impact the decision-making process (Hoge, 2002b; Young, Moline, Farrell, & Bierie, 2006). Hence, the adoption of such validated structured assessment tools in the social work sector in Singapore signals a commitment by probation officers to use evidence-based and theory-informed tools to help them be more effective in their work.
Difficulty in maximising utility of data contained in risk assessment tools
However, despite the demonstrated predictive validity of actuarial-type instruments, it can be difficult for probation officers to make full use of the discrete predictor values contained within them. For instance, Part I of the tool alone contains 42 items covering eight categories, while Part III, which focuses on the assessment of other needs and considerations, contains more than 50 items. Furthermore, as part of the normal work flow, more administrative data are collected in line with organisational requirements, such as the youth offender’s demographic details. As highlighted in the preceding section, even though the individual domain scores and the overall risk rating provide a good indication of an individual’s risk level, there are many ways to reach a particular score. Two individuals may have the same total score, but it would be simplistic to assume that they share identical risk profiles and would therefore benefit from the same interventions: their scores for each category might differ, or the score in each category might be the same but with different items marked. To complicate matters further, the tool does not weight either the items or the domains. Consequently, it can be difficult for probation officers to formulate the reoffending risks of different youth offenders in view of the volume of data they collect for each case.
Theories of offending: Different theories, different focus
One potential way for probation officers to interpret the collected administrative data, as well as the ratings derived from the risk assessment tool, is to base their interpretation on theories of youth offending and recidivism. However, doing so is not as straightforward as it seems because different theories emphasise different factors. For instance, from the perspective of criminal propensity theory, Gottfredson and Hirschi (1990) argue that the lack of self-control is the most important factor; for interventions to be effective, they should address this particular trait. Approaching the issue from a slightly different vantage point, social control theorists focus on why most people do not come into contact with the justice system. To them, delinquency occurs when individuals’ bonds to society are undermined (Hirschi, 1969). From this perspective, individuals who are not meaningfully engaged with institutions such as families, schools and employment are more likely to commit crimes as well as to reoffend in later years. Rehabilitation efforts should therefore focus on ensuring that individuals participate in age-appropriate and pro-social activities. On the other hand, for those adhering to the general personality and cognitive social learning school of thought, criminal offending is strongly influenced by the ‘Big Four’ variables of antisocial cognition, past antisocial behaviour, antisocial personality patterns and antisocial associates. Together with substance use, family and marital relationships, education and/or employment, and leisure activities, these constitute the ‘Central Eight’ factors commonly associated with antisocial and offending behaviour (Andrews & Bonta, 2010).
It is clear that even from a cursory discussion, different theories put varying emphasis on a multitude of factors and processes. However, the objective of the preceding discussion is not to argue for the primacy of a particular theory; instead, the objective is to highlight the complex interplay of dynamic factors that contribute to offending. Given that resources are limited, it is not quite possible for rehabilitation efforts to target all risk factors at the same time. Probation officers and their organisations therefore still need to first determine and justify the theoretical perspective adopted, and second, to then interpret the data available to them from that angle. Hence, the initial difficulty, which is making sense of all the available data, remains.
However, statistical learning methods, which can handle large amounts of data from different sources, may be useful in this context. Not only can the adoption of such methods help foster an epistemic culture that rules out appeals to special knowledge or privilege, the use of such advanced statistical techniques can also aid in the inductive characterisation of previously unknown relationships in the data (Berk, 2013). It can therefore focus attention on previously overlooked interactions between diverse factors in a rigorous and reproducible manner, making the modelling process both highly transparent and defensible.
Differences between statistical learning methods and traditional research methods
From a conventional research perspective, a hypothesis is put forward, and analysis is undertaken to ascertain whether there is sufficient evidence to support it. For instance, if the hypothesis is that family factors (or any other factors) are responsible for explaining why individuals reoffend, variables related to family functioning are extracted and analysed. In general terms, the explanatory model described above is parsimonious, as it seeks to put forward an explanation by excluding weak or unrelated predictors, which are regarded as ‘nuisance variables’ and given ‘minimal attention’ (Cameron & Trivedi, 2005, p. 36). Hence, earlier research on the predictive validity of the YLS tool by Chu et al. (2015) only included aggregate variables such as the overall score and the total scores of each domain, and did not include the discrete predictor values in the analysis. Similarly, research by Onifade et al. (2008) on this tool also only incorporated the aggregate risk scores and risk levels.
On the other hand, statistical learning methods do not seek to reduce dimensionality. This is because the greater the number of predictor values, the greater the information available; the greater the number of predictor values, the greater the number of combinations (Breiman, 2001b). Hence, no hypothesis was put forward when constructing this predictive model; instead, a ‘greedy’ approach was adopted in which all relevant predictor values contained within the YLS tool in addition to administrative data collected by probation officers in line with organisational guidelines were used for model training and testing. A forecast is subsequently generated. In this context, there is ‘no model in the usual social science sense and no necessary account of why criminal behavior did or did not occur’ (Berk, 2012, p. 19).
In other words, the two main points of departure between the present analytical method and traditional methods are that predictor values are analysed at the discrete level rather than the aggregate level, and that the forecast is generated by statistical machinery. As noted by Cleophas and Zwinderman (2013, p. 2):
Traditional statistical tests are unable to handle large numbers of variables. The simplest method to reduce large numbers of variables is the use of add-up scores. But add-up scores do not account for the relative importance of the separate variables, their interactions and differences in units … If data sets involve multiple variables, data analyses will be complex, and modern computationally intensive methods will have to be applied for analysis.
However, as with conventional methodology, statistical learning methods can also generate outputs such as a confusion matrix. This provides users with a clear and concise summary of not just the overall classification accuracy of a particular model, but also its specificity and sensitivity. After all, statistical learning methods focus on accuracy, and so they have to provide evidence of their accuracy (Berk & Bleich, 2013). It is also straightforward to extract more informative measures of how well the model is performing by analysing the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve,1 which allows for an objective assessment of whether the model is fit for purpose.
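The AUC has a convenient probabilistic reading: it is the chance that a randomly chosen reoffender receives a higher predicted risk score than a randomly chosen non-reoffender. A minimal sketch of this rank-based calculation (in Python rather than the article's R, using invented toy scores) is:

```python
# Minimal sketch: AUC as the probability that a randomly chosen
# positive case (reoffender) is scored above a randomly chosen
# negative one. Scores and labels are illustrative, not study data.

def auc(scores, labels):
    """Rank-based AUC; a tie between a positive and a negative counts 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: higher scores should indicate higher reoffending risk.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1,   1,   0,   1,   0,   0]
print(round(auc(scores, labels), 3))  # 0.889
```

An AUC of 0.5 corresponds to random guessing, which is why values such as 0.62, 0.64 and 0.69 are meaningful improvements.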
Overview of data and statistical method used for modelling in this study
Ethical approval (RP-RR-2014-2012) for this study was granted by the Rehabilitation and Protection Group, and the research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors. As this study only made use of existing and routinely collected administrative data, no consent was sought. The data set used for this study contained 129 predictor variables for 3744 youth offenders who were given probation and detention orders between 2004 and 2008. Apart from data collected with the YLS/CMI risk assessment tool, the data set also contained detailed demographic information collected in line with organisational requirements. In other words, even though many discrete predictor values were used, there was a clear logic involved in the model construction. The dependent variable was whether the youth reoffended during the follow-up period. The participants were between 12 and 19 years of age (M = 15.30, SD = 1.21, Mdn = 15); 89% were males (n = 3327) and 11% were females (n = 417). The follow-up period ranged from 2.29 to 7.29 years (SD = 1.44, Mdn = 4.82).

All statistical analyses were done with R (R Core Team, 2015), supplemented by additional packages analogous to the add-on modules available in commercial statistical software such as IBM SPSS. The caTools package was used to partition the data into training and testing sets (Tuszynski, 2015); the randomForest package was used to run the random forests analysis (Liaw & Wiener, 2002); the AUC analysis was performed using the ROCR package (Singh, Sander, Beerenwinkel, & Lengauer, 2005); recursive partitioning was performed using the rpart package (Therneau, Atkinson, & Ripley, 2015); and plotting of the output was done with the rpart.plot package (Milborrow, 2016).
Random forests: A brief overview
In simple terms, a random forests model can be thought of as being made up of a series of classification trees in which the predictor values used to construct each independent tree are drawn at random from the training data set (Zhang & Singer, 2010). An example of a classification tree built using recursive partitioning is shown below.
In essence, this classification method seeks to identify rules that stratify the available predictor values into groups, maximising the number of observations predicted to have a particular outcome. The main advantage of classification trees is that the output is very easy to understand (Hastie, Tibshirani, & Friedman, 2009). Based on the example in Figure 1, the model predicts no recidivism if the individual’s total YLS score is below 14 points. Conversely, if someone scores 14 points or more, the next rule to consider is the individual’s gender: if the youth offender is female, the model predicts no recidivism; if male, it predicts recidivism.
Figure 1.
Classification tree for predicting recidivism based on recursive partitioning.
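The decision rules read off a tree such as the one in Figure 1 can be written as a short hand-coded function. The Python sketch below uses the 14-point threshold and gender split described in the text; the function name and argument names are my own, and this is the readable rule set, not the fitted model itself:

```python
# Hand-coded version of the two decision rules from the example tree:
# total YLS score below 14 -> no recidivism; otherwise split on gender.

def predict_recidivism(total_yls_score, gender):
    if total_yls_score < 14:
        return "no recidivism"
    # Score of 14 or more: the next rule considers gender.
    return "no recidivism" if gender == "female" else "recidivism"

print(predict_recidivism(10, "male"))     # low score: no recidivism
print(predict_recidivism(20, "female"))   # no recidivism
print(predict_recidivism(20, "male"))     # recidivism
```

This transparency is exactly why classification trees are easy for practitioners to audit and explain.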
When a large number of models are created, the outcome is a random forest, essentially a collection of classification trees. Breiman (2001a, p. 11) writes that the ‘simplest random forest with random features is formed by selecting at random, at each node, a small group of input variables to split on’. It is a form of ensemble machine learning method that averages the predictions over multiple models (Denil, Matheson, & Freitas, 2014, p. 1). The creation of a large number of classification trees, in addition to using a large number of random samples drawn from the training data set therefore:
… provides opportunities for otherwise overlooked relationships to be found. Associations that might appear to be weak in a given sample because of random sampling error may surface more importantly in other samples. Each sample provides another, and somewhat different look at features of the population. (Berk, 2012, p. 66)
Since its introduction in 2001, this algorithm has been used in diverse areas for both regression and classification purposes (Barnes & Hyatt, 2012; Berk & Bleich, 2013; Cutler et al., 2007; Schroff, Criminisi, & Zisserman, 2008; Svetnik et al., 2003). To further ensure that the final model does not overfit, the initial data set was divided into a training data set and a testing data set; in other words, the analysis was not performed on the entire data set. For this article, the initial model was created using the training data, which comprised 60% of the entire data set (n = 2246). The testing data, the remaining 40% (n = 1498), were then applied to the model built using the training data. The principle is that if the model is really accurate, it must be able to replicate that accuracy on new and unseen data (Kuhn & Johnson, 2013). Without the use of testing data, the model’s performance is likely to be overstated and the results overly optimistic, because if performance is measured using the same data used to build the model, it remains unknown whether the model has any generalisation ability (Cios, Pedrycz, Swiniarski, & Kurgan, 2007). Hence, the use of testing data is arguably the most crucial step in ensuring that the model constructed will perform well when deployed on new data.
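The 60/40 partition described above (performed in the study with the caTools R package) can be sketched in a few lines of Python. The random seed here is an arbitrary assumption; only the proportions come from the article:

```python
import random

# Sketch of a 60/40 train/test partition. The indices stand in for
# the study's 3744 cases; any record-level data would be selected
# with these index lists.

def split_indices(n, train_frac=0.6, seed=42):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # randomise assignment
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]

train_idx, test_idx = split_indices(3744)
print(len(train_idx), len(test_idx))  # 2246 1498, matching the article
```

Shuffling before cutting ensures that the split is random rather than, say, chronological, which would otherwise bias the test estimate.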
Evaluating the model’s predictive accuracy
To convince the relevant stakeholders to consider the use of statistical learning methods for identifying individuals who are more likely to reoffend, the model has to be accurate. In this context, the outcome is binary: an individual offender will either reoffend or not within the time frame under examination. Ceteris paribus, given that there are only two outcomes, the probability of guessing the correct outcome for a specific individual is 0.5. However, as can be seen in Table 1, the model, when applied to testing data not used in model construction, outperformed random guessing, with a classification accuracy of approximately 0.65.
Table 1.
Confusion matrix for model using testing data (n = 1498).
| | Predicted no recidivism | Predicted recidivism |
|---|---|---|
| No recidivism | 775 | 133 |
| Recidivism | 402 | 188 |
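For readers who wish to verify the summary measures, the standard quantities follow directly from the counts in Table 1. The short Python sketch below (variable names are mine) computes the overall accuracy, which works out to roughly 0.64, close to the reported approximate value, along with the model's sensitivity and specificity:

```python
# Summary measures from the Table 1 confusion matrix
# (rows = actual outcome, columns = predicted outcome).
tn, fp = 775, 133   # actual no recidivism: correctly cleared / falsely flagged
fn, tp = 402, 188   # actual recidivism: missed / correctly flagged

accuracy = (tn + tp) / (tn + fp + fn + tp)  # overall proportion correct
sensitivity = tp / (tp + fn)                # recidivists correctly identified
specificity = tn / (tn + fp)                # non-recidivists correctly identified

print(round(accuracy, 3), round(sensitivity, 3), round(specificity, 3))
# 0.643 0.319 0.854
```

Note the asymmetry: the model is much better at clearing non-recidivists (specificity 0.85) than at flagging recidivists (sensitivity 0.32), which is exactly the kind of detail a confusion matrix exposes beyond a single accuracy figure.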
Apart from examining a model’s overall classification accuracy, another method of assessing accuracy is to examine the area under the ROC curve. In this example, the random forests model’s performance on unseen data yielded an AUC value of 0.69,2 as seen in Figure 2. Recall that the logistic regression models by Onifade et al. (2008) and Chu et al. (2015), discussed earlier, yielded AUC values of 0.62 and 0.64, respectively. Hence, there is strong evidence that the random forests model, which yielded a higher AUC value of 0.69 on data not used in the initial model construction, outperformed the two logistic regression models in forecasting which individuals were more likely to reoffend.
Figure 2.
ROC curve for random forests predictive model.
Discussion of findings: Identifying significant variables that impact reoffending
Apart from providing easily understood statistical measures, ranging from the overall classification accuracy to the AUC, that demonstrate whether a predictive model is performing well, another major benefit of using statistical learning methods such as random forests is the ability to identify visually the variables that significantly affect the model’s accuracy, together with their respective importance for forecasting recidivism. The variable importance plot in Figure 3 shows the decrease in accuracy if a model did not consider a specific factor (Berk, 2012). In this context, the total YLS score was the most significant factor: a model that took into consideration a youth offender’s total YLS score was more accurate than one that did not include this particular predictor value. The second most significant variable was whether the youth offender was assessed to have difficulty in controlling behaviour, followed by the other predictor values shown in Figure 3.
Figure 3.
Variable importance plot for random forests predictive model. YLS: youth level of service.
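One common way to obtain the accuracy decreases shown in a variable importance plot is permutation importance: shuffle the values of one predictor, leaving everything else intact, and measure how much the model's accuracy drops. The Python sketch below uses an invented toy rule and data, not the study's fitted forest:

```python
import random

# Sketch of permutation importance: the drop in accuracy after
# shuffling one predictor column. Model, rows and labels below are
# illustrative assumptions.

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, col, seed=0):
    base = accuracy(model, rows, labels)
    shuffled = [r[col] for r in rows]
    random.Random(seed).shuffle(shuffled)
    permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
    return base - accuracy(model, permuted, labels)

# Toy rule: predict recidivism (1) when the first feature exceeds 13.
model = lambda r: int(r[0] > 13)
rows = [(20, 0), (5, 1), (18, 1), (3, 0)]
labels = [1, 0, 1, 0]
print(permutation_importance(model, rows, labels, col=1))  # 0.0: an unused feature has no importance
```

A predictor the model never consults loses nothing when shuffled, so its importance is zero; influential predictors, by contrast, produce large accuracy drops.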
Extracting more intelligence from the data, the partial dependence plot shown in Figure 4 provides a visual representation of the relationship between a youth offender’s total YLS score and the likelihood of recidivism. In brief, the partial dependence plot shows the marginal effect of a predictor value on the class probability for classification. As expected, the lower the total YLS score, the lower the likelihood that an individual will reoffend, because the YLS score represents the overall risk level as assessed by the probation officer.
Figure 4.
Dependence plot for YLS total score of random forests predictive model. YLS: youth level of service.
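The calculation behind a partial dependence plot can be sketched simply: fix the predictor of interest at each value on a grid for every record, and average the model's predicted risk at each grid point. The toy risk model and records below are illustrative assumptions, not the study's fitted forest:

```python
# Sketch of a partial dependence curve: for each grid value, overwrite
# the chosen feature in every record and average the predicted risk.

def partial_dependence(predict_prob, rows, col, grid):
    curve = []
    for v in grid:
        edited = [r[:col] + (v,) + r[col + 1:] for r in rows]
        curve.append(sum(predict_prob(r) for r in edited) / len(edited))
    return curve

# Toy risk model: probability rises linearly with the first feature
# (standing in for the total YLS score), capped at 1.
predict_prob = lambda r: min(1.0, r[0] / 40)
rows = [(12, 0), (25, 1), (8, 0)]
curve = partial_dependence(predict_prob, rows, col=0, grid=[0, 10, 20, 40])
print(curve)  # [0.0, 0.25, 0.5, 1.0]: risk increases with the score
```

Averaging over all records is what makes the curve a marginal effect: it shows how the predicted risk shifts with the YLS score while the other predictors vary as they do in the data.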
Analysing the data within the YLS instrument at the discrete level, in conjunction with other administrative data, rather than only using aggregated values such as domain scores and the total YLS score, therefore provides probation officers with a level of resolution that was previously unavailable when making recommendations for individuals who may share the same total YLS score or domain scores. Since statistical learning methods handle high-dimensional data well, it makes operational sense to adopt techniques that have the potential to help probation officers with their case management plans by allowing them to focus on individuals who are more likely to reoffend.
Benefits of using predictive modelling: Objective and data-driven approach
In order to create a predictive model, there must be a dependent or outcome variable. In the present context, the objective is to predict recidivism among youth offenders in Singapore. Hence, there is a clear and specific target in place. Furthermore, by having a clear objective, it also ensures that stakeholders know ahead of time how they can evaluate whether the model is fit for purpose. Predictive modelling will never be 100% accurate. There will always be errors since statistical models are only able to provide information on probabilities and not certainties. Hence, as noted by Wilson and Kerr (2015, p. 578):
… models are not designed to be authoritatively right in predictions. Rather, despite their limitations, they are instruments for assessment of the available data, often attempting to reconcile several sources of data together, to provide implications, inferences, and further insights with more rigorous predictions from the knowledge base than could be achieved otherwise through simple extrapolation of past trends or speculations.
Hence, statistical learning methods, by allowing for the analysis of the discrete-level data collected, are arguably a closer approximation of the contexts youth offenders face. This is because people (re)offend for different reasons: some may do so because of negative peer influence, whereas others may do so because of economic factors. Moreover, as part of routine case management, other stakeholders such as psychologists may administer other types of assessments and collect different information. As more information is gathered, it becomes increasingly difficult to comprehend, especially when it comes from a domain with which probation officers may not be familiar.
Predictive modelling, which is a data-driven approach, is highly beneficial in such an information-rich environment. It allows for the objective appraisal of data, one that is not driven by personal biases that may be different among the various assessors. Predictive modelling, with its ability to analyse vast amounts of data at the discrete level, can therefore serve as a tool to help probation officers derive intelligence from the data available to them. Generating accurate predictions is crucial in case management because it allows the relevant stakeholders to identify ahead of time individuals who may benefit from more intensive intervention so that a negative outcome – reoffending – could be prevented. Information gained from the predictive model therefore sets the foundation for the probation officers to concentrate on individuals who may have higher risks as well as specific areas that may require more attention. For instance, they do not have to rely on convenient and visible markers such as ethnicity and gender when identifying the target population for various interventions. This ensures that case management does not have to rely on profiling based on the simple cross-tabulation of data.
However, the current model, which is trained and then tested on the predictor values available at the start of the court order, cannot be absolutely right. This is because predicting recidivism is an inexact science:
Forecasts of criminal behaviour will often get it wrong. There are just too many factors involved within a highly nonlinear system. The proper benchmark, therefore should not be perfection. The proper benchmark is current practice. And in (the present) context, accuracy is not the only goal. (Berk, 2012, p. 112)
Other goals could include transparency and accountability. Decisions and policies made in the offender rehabilitation domain directly affect people’s lives, and it has to be acknowledged that many of the decisions made are discretionary in nature. Predictive modelling will not be error free, but at the very least there is a clear logic behind it. Hence, there is a strong ethical case for considering the use of advanced statistical learning methods, which have been shown to be more accurate than conventional modelling using traditional statistical methods in predicting recidivism. By doing so, probation officers are given an additional tool that has the potential to help them identify more accurately the individuals who may require more intensive intervention, and to make their recommendations more defensible.
Limitations of predictive modelling
The present objective is to argue that incorporating statistical learning methods can improve the accuracy of predicting recidivism, not that random forests or another method such as deep learning neural networks is the best method. Furthermore, despite the benefits of predictive modelling stated above, it must be acknowledged that the accuracy of the adopted model is still highly dependent on the experience and professional inputs of probation officers. No matter how advanced the method, the usual adage of ‘rubbish in, rubbish out’ still holds true. Hence, probation officers continue to play arguably the most crucial role in the risk assessment process. As noted by Russell (2015), human involvement is always required in collecting the data and interpreting the statistical output.
The current predictive model drew very heavily on the predictor values within the YLS tool, and it takes experience to accurately complete this tool. Youth offenders may not always be forthcoming in providing honest responses, and they may not always understand the questions being asked. Hence, probation officers are often required to paraphrase the questions and probe them in order to ensure accurate communication (Rubin & Rubin, 2011). One of the most effective ways of increasing the accuracy of the predictive model is to have accurate data that are relevant at the start of the modelling process. To this end, the importance of the probation officers’ skills and experience in eliciting and assessing information cannot be overstated. As noted by Fazel, Singh, Doll and Grann (2012), even though risk assessment tools are widely used in many jurisdictions, their accuracy is still very much dependent on the way they are being used.
Conclusion
In Singapore’s context, the overall risk rating as assessed by the YLS tool is one of the main factors probation officers consider when making their recommendations regarding the sentences youth offenders receive. Existing research has indicated that this tool does have predictive validity, and the current predictive model using random forests also supports this assertion. However, unlike previous research in which aggregate predictor values were used in model construction, thereby ignoring the relative significance of the variables as well as their possible interactions, the present model incorporated discrete predictor values in the construction and, more importantly, in the testing of the model. No claim is made that predictive modelling using random forests is the best tool available; the current model is only put forward to illustrate that statistical learning methods facilitating the analysis of high-dimensional data are at least competitive with, if not slightly superior to, traditional research using conventional statistical methods. Since recommendations on sentence outcomes can directly affect the lives of youth offenders, it is perhaps only ethically right for the relevant stakeholders to consider using tools that could potentially help them be more accurate in predicting recidivism in this context.
It also has to be noted that the use of predictive modelling or the traditional method does not mean that human involvement in the decision-making process is no longer needed. Be it conventional research methods or statistical learning methods, the model’s accuracy is directly affected by the quality and types of data being collected. In most organisations, the data collected are usually determined by evidence-based and/or theory-informed tools such as the YLS. Although statistical learning methods can handle high data dimensionality very well, this does not mean that all data are created equal and that no care needs to be exercised when deciding the predictor values to be included for constructing the model. Irrelevant data will only add noise; only relevant data will make the signal stronger. The current model performs well because data relevant to the rehabilitation of youth offenders were identified by the organisation and probation officers were able to elicit accurate data to complete the risk assessment tool, thereby establishing the framework for the use of predictive modelling.
Notes
Briefly, ROC analysis can be traced back to signal detection theory. It was developed during World War II to help radar operators determine whether a detection represented a true signal, such as a plane, or noise, such as a flock of birds. The ROC curve is a visual representation of the trade-off between sensitivity and specificity. Related to the ROC is the AUC, arguably one of the most widely used and accepted measures of a model’s discriminatory power. The larger the area under the ROC curve, the more discriminatory power the model has. A perfect model, with an AUC value of 1.0, is 100% sensitive and 100% specific.
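The properties described in this note can be verified with a toy example: a scorer that fully separates the classes attains an AUC of 1.0, while a scorer with no discrimination at all sits at 0.5. The scores below are invented purely for illustration.

```python
# Toy illustration of the ROC curve and AUC: sensitivity is traded off
# against 1 - specificity as the classification threshold varies.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1])
perfect = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])   # separates classes fully
random_like = np.array([0.5] * 6)                     # no discrimination at all

fpr, tpr, thresholds = roc_curve(y_true, perfect)     # points on the ROC curve
auc_perfect = roc_auc_score(y_true, perfect)          # 1.0: fully discriminating
auc_chance = roc_auc_score(y_true, random_like)       # 0.5: no better than chance
```

Because the AUC is rank-based, it summarises discrimination across all possible thresholds rather than at any single cut-off.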
For comparison, when the predictive model was trained and tested on the same data set, it yielded an AUC value of 0.85. However, the predictive performance is likely to be over-optimistic due to possible over-fitting.
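The over-fitting caveat in this note can be demonstrated on synthetic data: the same random forests model looks considerably better when scored on the data it was trained on than on held-out data. The data-generating process below is hypothetical and deliberately noisy.

```python
# Sketch of over-fitting: in-sample AUC is optimistic relative to
# held-out AUC for the same random forests model, on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 10))
logit = X[:, 0]                                  # weak signal, much noise
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

auc_train = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
auc_test = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
# auc_train sits near 1.0 while auc_test is far lower, the same kind of
# optimism as the 0.85 versus 0.69 gap reported above.
```

This is why the article tests its model on the 40% of cases withheld from development rather than on the development sample itself.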
Author Note
All the authors are also affiliated with the Ministry of Social and Family Development, Singapore.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
- Ægisdóttir S., White M. J., Spengler P. M., Maugherman A. S., Anderson L. A., Cook R. S., Rush J. D. (2006) The meta-analysis of clinical judgement project: Fifty-six years of accumulated research on clinical versus statistical prediction. The Counseling Psychologist 34(3): 341–382. doi: 10.1177/0011000006287390. [Google Scholar]
- Andrews D. A., Bonta J. (2010) The psychology of criminal conduct, Cincinnati, OH: Anderson. [Google Scholar]
- Australian Institute of Criminology. (2011). Measuring juvenile recidivism in Australia. Retrieved from http://www.aic.gov.au/publications/current%20series/tbp/41-60/tbp044/tbp44_measuring_juvenile_recidivism_in_australia.html.
- Barnes, G. C., & Hyatt, J. M. (2012). Classifying adult probationers by forecasting future offending. Retrieved from https://www.ncjrs.gov/pdffiles1/nij/grants/238082.pdf.
- Berk R. (2012) Criminal justice forecasts of risk: A machine learning approach, New York, NY: Springer. [Google Scholar]
- Berk R. (2013) Algorithmic criminology. Security Informatics 2(5): 1–14. [Google Scholar]
- Berk R., Bleich J. (2013) Statistical procedures for forecasting criminal behavior: A comparative assessment. Criminology & Public Policy 12(3): 513–544. doi: 10.1111/1745-9133.12047. [Google Scholar]
- Breiman L. (2001a) Random forests. Machine Learning 45(1): 5–32. doi: 10.1023/A:1010933404324. [Google Scholar]
- Breiman L. (2001b) Statistical modeling: The two cultures. Statistical Science 16(3): 199–231. [Google Scholar]
- Cameron A. C., Trivedi P. K. (2005) Microeconometrics: Methods and applications, Cambridge, UK: Cambridge University Press. [Google Scholar]
- Chu C. M., Lee Y., Zeng G., Yim G., Tan C., Ang Y., Ruby K. (2015) Assessing youth offenders in a non-Western context: The predictive validity of the YLS/CMI ratings. Psychological Assessment 27(3): 1013–1021. doi: 10.1037/a0038670. [DOI] [PubMed] [Google Scholar]
- Chu C. M., Yu H., Lee Y., Zeng G. (2014) The utility of YLS/CMI-SV for assessing youth offenders in Singapore. Criminal Justice and Behavior 41(12): 1437–1457. doi: 10.1177/0093854814537626. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chua J. R., Chu C. M., Yim G., Chong D., Teoh J. (2014) Implementation of the risk-need-responsivity framework across the juvenile justice agencies in Singapore. Psychiatry, Psychology and Law 21(6): 877–889. [Google Scholar]
- Cios K. J., Pedrycz W., Swiniarski R. W., Kurgan L. A. (2007) Data mining: A knowledge discovery approach, New York, NY: Springer. [Google Scholar]
- Cleophas, T. J., & Zwinderman, A. K. (2013). Machine learning in medicine (Vol. 1). New York, NY: Springer.
- Corrado R. R., Turnbull S. D. (1992) A comparative examination of the modified justice model in the United Kingdom and the United States. In: Corrado R. R., Bala N., Linden R., Blanc M. L. (eds) Juvenile justice in Canada: A theoretical and analytical assessment, Toronto, Canada: Butterworth, pp. 75–136. [Google Scholar]
- Cutler D. R., Edwards T. C., Beard K. H., Cutler A., Hess K. T., Gibson J. C. (2007) Random forests for classification in ecology. Ecology 88(11): 2783–2792. [DOI] [PubMed] [Google Scholar]
- Denil, M., Matheson, D., & Freitas, N. D. (2014). Narrowing the gap: Random forests in theory and in practice. Paper presented at the 31st International Conference on Machine Learning, Beijing, China.
- Farrington, D. P., & Daview, D. T. (2007). Repeated contacts with the criminal justice system and offender outcomes: Final report to statistics Canada. Retrieved from http://www.crim.cam.ac.uk/people/academic_research/david_farrington/statcanf.pdf.
- Fazel, S., Singh, J. P., Doll, H., & Grann, M. (2012). Use of risk assessment instruments to predict violence and antisocial behavior in 73 samples involving 24827 people: Systematic review and meta-analysis, BMJ, 345, 1-12, doi: 10.1136/bmj.e4692. [DOI] [PMC free article] [PubMed]
- Gottfredson M. R., Hirschi T. (1990) A general theory of crime, Redwood City, CA: Stanford University Press. [Google Scholar]
- Grove W. M., Zald D. H., Lebow B. S., Snitz B. E., Nelson C. (2000) Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment 12(1): 19–30. doi: 10.1037/1040-3590.12.1.19. [PubMed] [Google Scholar]
- Hanson R. K., Morton-Bourgon K. E. (2009) The accuracy of recidivism risk assessments for sexual offenders: A meta-analysis of 118 prediction studies. Psychological Assessment 21(1): 1–21. doi: 10.1037/a0014421. [DOI] [PubMed] [Google Scholar]
- Hastie T., Tibshirani R., Friedman J. (2009) The elements of statistical learning: Data mining, inference, and prediction, New York, NY: Springer. [Google Scholar]
- Hilton N. Z., Harris G. T., Rice M. E. (2006) Sixty-six years of research on the clinical versus actuarial prediction of violence. The Counseling Psychologist 34(3): 400–409. doi: 10.1177/0011000005285877. [Google Scholar]
- Hirschi T. (1969) Causes of delinquency, Berkeley: University of California Press. [Google Scholar]
- Hoge R. D. (2002a) The juvenile offender: Theory, research, and applications, Boston, MA: Kluwer. [Google Scholar]
- Hoge R. D. (2002b) Standardized instruments for assessing risk and need in youthful offenders. Criminal Justice and Behavior 29(4): 380–396. doi: 10.1177/009385480202900403. [Google Scholar]
- Hoge R. D., Andrews D. A. (2011) YLS/CMI 2.0: Youth level of service/case management inventory 2.0, Toronto, Canada: Multi-Health System. [Google Scholar]
- James, N. (2016). The federal prison population buildup: Options for congress. Retrieved from https://fas.org/sgp/crs/misc/R42937.pdf.
- Kuhn M., Johnson K. (2013) Applied predictive modeling, New York, NY: Springer. [Google Scholar]
- Legal and Constitutional Affairs References Committee. (2013). Value of a justice reinvestment approach to criminal justice in Australia. Retrieved from: http://www.aph.gov.au/Parliamentary_Business/Committees/Senate/Legal_and_Constitutional_Affairs/Completed_inquiries/2010-13/justicereinvestment/report/∼/media/wopapub/senate/committee/legcon_ctte/completed_inquiries/2010-13/justice_reinvestment/report/report.ashx.
- Liaw A., Wiener M. (2002) Classification and regression by randomForest. R News 2(3): 18–22. [Google Scholar]
- Meehl P. E. (1954) Clinical versus statistical prediction: A theoretical analysis and a review of the evidence, Minneapolis: University of Minnesota Press. [Google Scholar]
- MHS. (2004). Youth level of service/case management inventory. Retrieved from http://www.mhs.com/product.aspx?gr=saf&id=overview&prod=yls-cmi.
- Milborrow, S. (2016). rpart.plot: Plot ‘rpart’ models: An enhanced version of ‘rpart.plot’. R package version 2.1.0.
- Ministry of Justice. (2013). Costs per place and costs per prisoner. National Offender Management Service Annual Report and Accounts 2012–13 Management Information Addendum. Retrieved from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/251272/prison-costs-summary-12-13.pdf.
- Ministry of Social and Family Development. (2015). Juvenile delinquents: Recidivism rate. Retrieved from http://app.msf.gov.sg/Research-Room/Research-Statistics/Juvenile-Delinquents-Recidivism-Rate.
- National Criminal Justice Reference Service. (2015). What is the national juvenile recidivism rate? Retrieved from https://www.ncjrs.gov/app/QA/Detail.aspx?Id=113&context=9.
- Onifade E., Davidson W., Campbell C., Turke G., Malinowski J., Turner K. (2008) Predicting recidivism in probationers with the youth level of service case management inventory (YLS/CMI). Criminal Justice and Behavior 35(4): 478–483. doi: 10.1177/0093854807313427. [Google Scholar]
- R Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
- Rubin H. J., Rubin I. S. (2011) Qualitative interviewing: The art of hearing data, London, UK: Sage. [Google Scholar]
- Russell J. (2015) Predictive analytics and child protection: Constraints and opportunities. Child Abuse and Neglect 46: 182–189. [DOI] [PubMed] [Google Scholar]
- Schroff, F., Criminisi, A., & Zisserman, A. (2008, September 1–4). Object class segmentation using random forests. Paper presented at the British Machine Vision Conference, University of Leeds.
- Sing T., Sander O., Beerenwinkel N., Lengauer T. (2005) ROCR: Visualizing classifier performance in R. Bioinformatics 21(20): 3940–3941. doi: 10.1093/bioinformatics/bti623. [DOI] [PubMed] [Google Scholar]
- Svetnik V., Liaw A., Tong C., Culberson J., Sheridan R., Feuston B. (2003) Random forest: A classification and regression tool for compound classification and QSAR modeling. Journal of Chemical Information and Computer Sciences 43(6): 1947–1958. [DOI] [PubMed] [Google Scholar]
- Therneau, T., Atkinson, B., & Ripley, B. (2015). Rpart: Recursive partitioning and regression trees. R package version 4.1-10.
- Tuszynski, J. (2015). Package ‘caTools’. Retrieved from https://cran.r-project.org/.
- Welsh B. C., Farrington D. P. (2011) The benefits and costs of early prevention compared with imprisonment: Toward evidence-based policy. The Prison Journal 91(3 Suppl.): 120S–137S. doi: 10.1177/0032885511415236. [Google Scholar]
- Wilson D. P., Kerr C. (2015) Can we know in advance whether models will get it right? The Lancet Global Health 3(10): 577–578. doi: 10.1016/S2214-109X(15)00160-6. [DOI] [PubMed] [Google Scholar]
- Young D., Moline K., Farrell J., Bierie D. (2006) Best implementation practices: Disseminating new assessment technologies in a juvenile justice agency. Crime and Delinquency 52(1): 135–158. [Google Scholar]
- Zhang H., Singer B. H. (2010) Recursive partitioning and applications (2nd ed.). New York, NY: Springer. [Google Scholar]