. 2019 Oct 9;5(2):vez039. doi: 10.1093/ve/vez039

Table 2.

Clustering performance of Phydelity on seasonal A/H3N2 influenza viruses collected by McCrone et al. (2018).

Basis	$n_{sample}$	$\%_{trans}$	Purity	$I_{G}$	ARI	NMI
High-quality transmission clusters	All		0.98	0.02	0.96	0.99
	52	25%	0.87	0.06	0.72	0.93
	52	45%	0.87	0.04	0.74	0.95
	52	70%	0.85	0.07	0.76	0.94
	93	25%	0.94	0.03	0.88	0.98
	93	45%	0.94	0.03	0.90	0.98
Household	All		0.89	0.08	0.79	0.96
	52	25%	0.56	0.29	0.35	0.82
	52	45%	0.73	0.16	0.56	0.90
	52	70%	0.82	0.11	0.74	0.93
	93	25%	0.75	0.16	0.64	0.92
	93	45%	0.87	0.11	0.80	0.95

Ground truth used for clustering assessment was based either on the identities of genetically validated, high-quality transmission clusters as defined by McCrone et al. or on the patients’ households. Besides analysing all of the viruses collected (rows labelled "All"), Phydelity was also applied to downsampled datasets of different sample sizes ( $n_{sample}$ ) and proportions of sequences derived from the aforementioned high-quality transmission pairs ( $\%_{trans}$ ). Adjusted Rand index (ARI) measures how accurately the output clusters correspond with the ground truth labels. Purity gives the average extent to which clusters contain only a single class. Modified Gini index ( $I_{G}$ ) is the probability that a randomly selected sequence would be incorrectly clustered. Normalised mutual information (NMI) accounts for the trade-off between clustering quality and number of clusters (see Section 2).
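
As an illustration of the four metrics named in the legend, the following is a minimal sketch using their common textbook forms. It is not the authors' implementation: the exact definitions used in the paper are given in its Section 2, and in particular the "modified" Gini index there may differ from the size-weighted Gini impurity assumed here. The NMI normalisation (arithmetic mean of the two entropies) is likewise an assumption, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import comb, log


def purity(truth, pred):
    """Fraction of sequences that fall in the majority class of their cluster."""
    majority = Counter()
    for (cluster, _), n in Counter(zip(pred, truth)).items():
        majority[cluster] = max(majority[cluster], n)
    return sum(majority.values()) / len(truth)


def gini_impurity(truth, pred):
    """Size-weighted Gini impurity of the clusters (assumed form of I_G)."""
    sizes = Counter(pred)
    squares = Counter()
    for (cluster, _), n in Counter(zip(pred, truth)).items():
        squares[cluster] += n * n
    # Per cluster: (n_c / N) * (1 - sum_k (n_ck / n_c)^2), summed over clusters.
    return sum(sizes[c] - squares[c] / sizes[c] for c in sizes) / len(truth)


def adjusted_rand_index(truth, pred):
    """Pair-counting agreement corrected for chance (needs >= 2 sequences)."""
    n_pairs = comb(len(truth), 2)
    same_both = sum(comb(n, 2) for n in Counter(zip(pred, truth)).values())
    same_pred = sum(comb(n, 2) for n in Counter(pred).values())
    same_truth = sum(comb(n, 2) for n in Counter(truth).values())
    expected = same_pred * same_truth / n_pairs
    maximum = (same_pred + same_truth) / 2
    if maximum == expected:  # degenerate case, e.g. every sequence is a singleton
        return 1.0
    return (same_both - expected) / (maximum - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a labelling."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def normalised_mutual_information(truth, pred):
    """Mutual information divided by the mean entropy of the two labellings."""
    n = len(truth)
    sizes_pred, sizes_truth = Counter(pred), Counter(truth)
    mi = sum(
        c / n * log(n * c / (sizes_pred[p] * sizes_truth[t]))
        for (p, t), c in Counter(zip(pred, truth)).items()
    )
    mean_entropy = (entropy(pred) + entropy(truth)) / 2
    return 1.0 if mean_entropy == 0 else mi / mean_entropy


# Hypothetical toy data: six sequences, three true classes, three output clusters.
toy_truth = ["A", "A", "A", "B", "B", "C"]
toy_pred = [0, 0, 1, 1, 1, 2]
for metric in (purity, gini_impurity, adjusted_rand_index,
               normalised_mutual_information):
    print(metric.__name__, round(metric(toy_truth, toy_pred), 3))
```

Under these assumed definitions, a clustering that matches the ground truth exactly yields purity, ARI and NMI of 1 and an impurity of 0, which is the direction of "better" for each column in the table.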