2023 Jan 5;39(1):btad002. doi: 10.1093/bioinformatics/btad002

Table 1.

Agreement across random partition of HIV V1V2 and V3 datasets

		Deduplicated abundance $\geq$ 1							Deduplicated abundance $\geq$ 2
Dataset	Method	Hap1	Hap2	(SD)	Jaccard	(SD)	Ruzicka	(SD)	Hap1	Hap2	(SD)	Jaccard	(SD)	Ruzicka	(SD)
V1V2	Calib	808	797	(21)	0.08	(0.00)	0.23	(0.00)	43	43	(4)	0.67	(0.03)	0.82	(0.02)
	DAUMI	90	90	(5)	0.72	(0.03)	0.84	(0.02)	37	37	(2)	0.77	(0.05)	0.87	(0.02)
	Naïve	293	303	(9)	0.29	(0.01)	0.57	(0.01)	43	43	(3)	0.75	(0.03)	0.88	(0.03)
	Starcode-umi	2304	2311	(48)	0.00	(0.00)	0.19	(0.00)	15	13	(1)	0.73	(0.05)	0.80	(0.02)
	UMI-tools	542	538	(16)	0.14	(0.00)	0.36	(0.00)	42	42	(3)	0.77	(0.04)	0.87	(0.01)
V3	Calib	5048	5021	(57)	0.06	(0.00)	0.11	(0.00)	64	66	(4)	0.55	(0.04)	0.71	(0.02)
	DAUMI	180	171	(7)	0.39	(0.01)	0.72	(0.01)	68	72	(4)	0.72	(0.03)	0.83	(0.01)
	Naïve	1561	1543	(10)	0.06	(0.00)	0.22	(0.00)	91	84	(4)	0.64	(0.04)	0.74	(0.02)
	Starcode-umi	15077	15097	(79)	0.02	(0.00)	0.09	(0.00)	96	103	(5)	0.08	(0.01)	0.77	(0.02)
	UMI-tools	3718	3724	(36)	0.06	(0.00)	0.14	(0.00)	72	74	(3)	0.64	(0.03)	0.75	(0.01)

Note: Values are means over five replicates, with standard deviations (SD) in parentheses. Hap1 and Hap2: number of inferred haplotypes in each subset (rounded to integer); Jaccard: Jaccard index of the inferred haplotype sets; Ruzicka: Ruzicka similarity, an abundance-weighted form of the Jaccard index. For DAUMI, ρ = 10 for V3 and ρ = 7 for V1V2, in both subsets. Best performance per dataset is bolded.
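
To make the two agreement measures concrete, the following minimal sketch computes them for a pair of haplotype abundance tables. It uses the standard definitions: the Jaccard index is |A ∩ B| / |A ∪ B| over the two haplotype sets, and the Ruzicka similarity is the sum of per-haplotype minimum abundances divided by the sum of per-haplotype maximum abundances. The function names, the toy sequences, and the use of raw counts (rather than, say, relative abundances) are assumptions made for illustration; the source does not state how abundances were scaled before comparison.

```python
def jaccard_index(a, b):
    """Jaccard index of the haplotype sets: |A ∩ B| / |A ∪ B|.

    `a` and `b` map haplotype sequence -> abundance (hypothetical layout).
    Only haplotypes with positive abundance count as present.
    """
    set_a = {h for h, n in a.items() if n > 0}
    set_b = {h for h, n in b.items() if n > 0}
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets agree fully
    return len(set_a & set_b) / len(union)


def ruzicka_similarity(a, b):
    """Ruzicka similarity: sum of minima over sum of maxima of abundances.

    Assumption: abundances are compared as given (raw counts here).
    """
    haplotypes = set(a) | set(b)
    numerator = sum(min(a.get(h, 0), b.get(h, 0)) for h in haplotypes)
    denominator = sum(max(a.get(h, 0), b.get(h, 0)) for h in haplotypes)
    if denominator == 0:
        return 1.0  # same convention as above for empty inputs
    return numerator / denominator


# Toy example (made-up sequences and counts, not data from the table).
subset1 = {"ACGT": 10, "ACGA": 3, "TCGA": 1}
subset2 = {"ACGT": 8, "ACGA": 4, "GGGA": 1}

print(jaccard_index(subset1, subset2))       # 2 shared / 4 total = 0.5
print(ruzicka_similarity(subset1, subset2))  # (8 + 3) / (10 + 4 + 1 + 1) = 0.6875
```

In this toy case the Ruzicka similarity exceeds the Jaccard index because the two haplotypes found in only one subset are rare, so they contribute little to the abundance-weighted measure. The same pattern appears throughout the table, where the Ruzicka column is consistently higher than the Jaccard column.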