Each rounded node represents an independent variable and each rectangular node represents one of two possible outcomes: offer (gold) or no offer (blue). Only those variables in
Figure 5B were included. In the case of binary variables such as funding and fellowships, ">0" indicates a "yes" and "<=0" indicates a "no". All other variables, except for h-index, were split based on counts. The outcome nodes are labeled with three pieces of information: (1) the number of applicants who fell into the given branch (n), (2) the most common outcome in that branch, and (3) the fraction of individuals with that outcome. For example, the rightmost branch shows applicants who had a career transition award and h-index >4. They constitute the largest group in our dataset (61 individuals). However, only 77% of these applicants received an offer. Similarly, the second and third largest groups included 51 applicants (63% with offer) and 42 applicants (67% with offer), respectively (see the eighth outcome box from the right and the leftmost box). These three groups accounted for 48.6% of our survey respondents. Note that while decision trees have often been used as prediction models, this tree reflects only our dataset and our choice of algorithm and parameters. We have used it solely for visualization purposes and advise against using it prospectively to evaluate chances of success on the job market, as there may be alternative trees that are equally plausible and accurate. In fact, the accuracy of the overall decision tree in distinguishing between candidates with offers and those without was only 58.5%. Furthermore, no group with more than two applicants consisted purely of those with offers or purely of those without. Even in the nine groups where the most common outcome was "no offer", on average, 25% of the applicants did receive offers.
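
As an illustration of how the three pieces of information in each outcome node relate to one another, the following minimal Python sketch computes them for a single branch. The function name `leaf_label` and the toy data are hypothetical and are not taken from the survey or from the analysis code; the sketch only shows that a branch is labeled by its majority outcome even when a substantial minority of applicants had the other outcome.

```python
from collections import Counter


def leaf_label(outcomes):
    """Return (n, most common outcome, fraction with that outcome) for one branch.

    `outcomes` is a non-empty list of outcome labels, one per applicant.
    """
    counts = Counter(outcomes)
    majority, majority_count = counts.most_common(1)[0]
    n = len(outcomes)
    return n, majority, majority_count / n


# Hypothetical branch: 10 applicants, 7 with an offer (toy data, not survey data).
toy_branch = ["offer"] * 7 + ["no offer"] * 3
n, majority, fraction = leaf_label(toy_branch)
print(f"n = {n}, {majority}, {fraction:.0%}")  # prints "n = 10, offer, 70%"
```

In this toy branch the node would be labeled "offer" although 30% of its members received none, which is the same kind of impurity described above for the groups in the figure.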