Fig. 1.

This scree plot graphs the within-cluster sum of squares (y-axis) that results from forcing the observations into each candidate number of clusters (x-axis). The sum of squares always decreases as clusters are added, but at a diminishing rate. The number of clusters to use is therefore chosen qualitatively, at the point where the reduction in sum of squares from one additional cluster is much smaller than the reduction gained from the previous cluster. This is referred to as the “elbow method” heuristic.
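The heuristic is normally applied by eye, but a rough numeric version can make the idea concrete. The sketch below is one possible way to quantify it, not the method used to produce Fig. 1: it assumes a list of sum-of-squares values for k = 1, 2, 3, ... and picks the k after which the next cluster's reduction is smallest relative to the reduction just gained. The function name `elbow_k` and the sample values are hypothetical and chosen only for illustration.

```python
def elbow_k(sum_of_squares):
    """Pick an elbow from sum-of-squares values, where
    sum_of_squares[i] is the value for k = i + 1 clusters.

    Illustrative rule (an assumption, not a standard definition):
    choose the k whose following cluster yields the smallest
    reduction relative to the reduction k itself delivered.
    """
    if len(sum_of_squares) < 3:
        raise ValueError("need values for at least three cluster counts")

    # drops[i] is the reduction gained by moving from k = i + 1 to k = i + 2.
    drops = [
        sum_of_squares[i] - sum_of_squares[i + 1]
        for i in range(len(sum_of_squares) - 1)
    ]

    best_k, best_ratio = None, float("inf")
    for i in range(len(drops) - 1):
        if drops[i] <= 0:
            continue  # no benefit to compare against; skip this k
        # Benefit of the next cluster relative to the benefit of this one.
        ratio = drops[i + 1] / drops[i]
        if ratio < best_ratio:
            best_k, best_ratio = i + 2, ratio

    if best_k is None:
        raise ValueError("sum of squares never decreased; no elbow found")
    return best_k


# Hypothetical sum-of-squares values for k = 1..6 (not taken from Fig. 1).
example = [1000.0, 400.0, 150.0, 120.0, 100.0, 90.0]
print(elbow_k(example))
```

With these made-up values the third cluster reduces the sum of squares by 250 while the fourth reduces it by only 30, so the rule selects k = 3. A single ratio like this is sensitive to noise in the curve, which is one reason the choice is usually left as a qualitative judgment.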