2013 Jan 16;8:2. doi: 10.1186/1745-6150-8-2

Table 1.

Size of A. pompejana protein datasets

Dataset	Clustering identity (%)	Full-length	Partial with stop	Total
MPI (New data)	100	6 272	15 886	28 169
	98	5 778	14 893	26 992
	90	5 667	14 502	26 433
JGI + Genoscope (Existing data)	100	6 233	15 539	23 962
	98	5 360	13 365	19 890
	90	5 008	12 341	18 155
MPI + JGI + Genoscope (Combined data)	100	10 778	26 068	42 665
	98	9 359	23 131	38 185
	90	8 722	21 288	35 235

Numbers of full-length (with start and stop codons), partial (with stop codon), and total predicted protein sequences in the three datasets, each clustered at 100%, 98%, and 90% identity.
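To make the effect of clustering easier to compare across datasets, the counts in Table 1 can be expressed as the share of sequences retained relative to the 100% identity threshold. The following is a minimal sketch of that arithmetic only; the names `TABLE_1` and `fraction_retained` are hypothetical and introduced here for illustration, and the code is not part of the authors' pipeline.

```python
# Counts transcribed from Table 1 as (full-length, partial with stop, total),
# keyed by dataset and clustering identity threshold (%).
# The dictionary layout and names are illustrative assumptions.
TABLE_1 = {
    "MPI": {
        100: (6272, 15886, 28169),
        98: (5778, 14893, 26992),
        90: (5667, 14502, 26433),
    },
    "JGI + Genoscope": {
        100: (6233, 15539, 23962),
        98: (5360, 13365, 19890),
        90: (5008, 12341, 18155),
    },
    "MPI + JGI + Genoscope": {
        100: (10778, 26068, 42665),
        98: (9359, 23131, 38185),
        90: (8722, 21288, 35235),
    },
}


def fraction_retained(dataset, identity, column=2):
    """Return the count at `identity` divided by the count at 100% identity.

    `column` selects 0 = full-length, 1 = partial with stop, 2 = total.
    """
    baseline = TABLE_1[dataset][100][column]
    return TABLE_1[dataset][identity][column] / baseline


if __name__ == "__main__":
    for name in TABLE_1:
        share = fraction_retained(name, 90)
        print(f"{name}: {share:.1%} of total sequences remain at 90% identity")
```

Passing `column=0` or `column=1` gives the same ratio for the full-length or partial-with-stop counts.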