2012 Nov 14;7(11):e49538. doi: 10.1371/journal.pone.0049538

Table 2. Statistical validation against patients in the brain HIV env sequence dataset of all HAD and non-HAD signatures generated by the PART algorithm.

Signature	Diagnosis	Patient Count: Total (HAD/non-HAD)	Matching Patients: HAD	Matching Patients: non-HAD	p-value
1_01 ^*	HAD	77 (39/38)	10	0	1.0E-03
1_02 ^*	non-HAD	77 (39/38)	1	9	6.8E-03
1_03 ^*	non-HAD	77 (39/38)	2	8	0.047
1_04 ^*	HAD	77 (39/38)	18	1	7.5E-06
1_05	non-HAD	77 (39/38)	7	12	0.19
1_06	non-HAD	51 (31/20)	11	5	0.54
1_07 ^*	non-HAD	77 (39/38)	9	23	1.2E-03
1_08	HAD	77 (39/38)	34	33	1
2_01 ^*	HAD	77 (39/38)	9	1	0.014
2_02 ^*	non-HAD	76 (38/38)	0	9	2.3E-03
2_03 ^*	HAD	77 (39/38)	14	2	1.4E-03
2_04 ^*	non-HAD	70 (33/37)	9	20	0.030
2_05 ^*	HAD	76 (38/38)	25	4	1.0E-06
2_06	non-HAD	49 (30/19)	13	8	1
2_07	non-HAD	77 (39/38)	4	8	0.22
2_08	non-HAD	77 (39/38)	18	16	0.82
2_09	HAD	77 (39/38)	12	5	0.098
2_10	non-HAD	77 (39/38)	23	25	0.64

The statistical significance of each HAD and non-HAD signature was determined using Fisher’s exact test on the distribution of patients in the brain dataset with matching sequences. Diagnosis indicates whether the signature was predictive of HAD or non-HAD. Patient count is the total number of patients with sequences spanning the amino acid positions in the relevant signature (e.g. signature 1_01 was tested in 77 patients because 1 patient has no sequences spanning positions 304 through 343, which are included in signature 1_01). The numbers of HAD and non-HAD patients in the brain dataset with sequences matching each signature are given, followed by the p-value of that patient distribution, calculated by Fisher’s exact test.

^* = p-value <0.05.
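
The p-values in the table can be approximated from the counts in each row. The following is a minimal sketch, not the authors' code: it assumes a two-sided Fisher's exact test on a 2x2 table whose rows are the diagnosis groups (HAD, non-HAD) and whose columns are patients matching or not matching the signature. The function names `fisher_exact_two_sided` and `signature_p_value` are hypothetical and introduced here only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: "two-sided" means summing the probabilities of every table
    with the same margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d      # group sizes (e.g. HAD, non-HAD)
    col1 = a + c                   # total in the first column (matching)
    total = comb(row1 + row2, col1)

    def weight(k):
        # Unnormalised hypergeometric weight of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Integer comparison keeps the "as or more extreme" test exact.
    extreme = sum(w for w in (weight(k) for k in range(lo, hi + 1)) if w <= observed)
    return extreme / total


def signature_p_value(had_match, had_total, non_match, non_total):
    """Build the 2x2 table from one row of Table 2 and return the p-value."""
    return fisher_exact_two_sided(
        had_match, had_total - had_match,
        non_match, non_total - non_match,
    )


# Signature 1_01: 10 of 39 HAD and 0 of 38 non-HAD patients match.
print(f"{signature_p_value(10, 39, 0, 38):.1e}")   # about 1.0e-03
# Signature 1_08: 34 of 39 HAD and 33 of 38 non-HAD patients match.
print(signature_p_value(34, 39, 33, 38))
```

Under these assumptions the sketch gives a value close to the 1.0E-03 reported for signature 1_01. Exact integer arithmetic is used for the comparison step so that ties between equally likely tables are not lost to floating-point rounding.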