De novo sequencing length, coverage, and accuracy.
A, The x axis plots the distance (k) of a sequence call or gap from the closest end of its meta-contig sequence, and the y axis plots the average sequencing accuracy over all annotated calls at each k-distance. Among all annotated calls reported more than 8 positions from their closest end, there were 3 incorrect sequence calls in total, at k = 20, 21, and 22, all in a single meta-contig aligned to the aBTLA heavy chain (discussed in the Results section of Supplementary Materials).
B, Protein identifiers are: P1 - leptin precursor, P2 - kallikrein-related peptidase, P3 - GroEL, P4 - myoglobin, P5 - aprotinin, P6 - peroxidase, P7 - aBTLA light chain, and P8 - aBTLA heavy chain. Protein Length is the length of each reference protein in amino acid residues. Spectrum Coverage is the percent of each protein covered by peptides identified by MS-GFDB at 1% FDR. Coverage is taken over all mapped meta-contigs and Accuracy is taken over all identified meta-contigs. Mapped meta-contigs must be aligned to a reference protein as described in the text, whereas identified meta-contigs must assemble at least one identified spectrum whose peptide sequence is a substring of a reference protein. Sequencing Coverage is the percent of amino acids in each protein covered by at least one mapped meta-contig sequence. Coverage Redundancy is the average number of mapped meta-contig sequences covering each amino acid residue that is covered by at least one meta-contig sequence. Spectra Per Meta-contig is the average number of spectra assembled by each mapped meta-contig, whereas Peptides Per Meta-contig is the average number of peptides (spectra with distinct parent masses) assembled by each mapped meta-contig. Average Seq. Length is the average number of amino acid residues covered by each mapped meta-contig and Longest Sequence is the maximum number of amino acid residues covered by a mapped meta-contig.
Correct Sequence Calls is the percentage of annotated sequence calls that were correct in identified meta-contigs. Un-annotated Seq. Calls is the percentage of sequence calls that were un-annotated in identified meta-contigs.
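The Sequencing Coverage and Coverage Redundancy definitions above can be illustrated with a minimal sketch. It assumes each mapped meta-contig is summarized as a half-open interval (start, end) of 0-based residue positions on its reference protein. The function name `coverage_stats` and this interval representation are hypothetical and are not taken from the original analysis pipeline.

```python
def coverage_stats(protein_length, intervals):
    """Return (sequencing coverage in percent, coverage redundancy).

    protein_length: number of residues in the reference protein.
    intervals: (start, end) pairs, 0-based and half-open, one per mapped
               meta-contig (an assumed representation, for illustration only).
    """
    # Count how many meta-contigs cover each residue position.
    depth = [0] * protein_length
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, protein_length)):
            depth[pos] += 1

    # Keep only residues covered by at least one meta-contig.
    covered = [d for d in depth if d > 0]

    # Sequencing Coverage: percent of residues covered at least once.
    coverage_pct = 100.0 * len(covered) / protein_length

    # Coverage Redundancy: mean depth over the covered residues only.
    redundancy = sum(covered) / len(covered) if covered else 0.0
    return coverage_pct, redundancy


# Toy example: a 10-residue protein with two overlapping meta-contigs.
pct, red = coverage_stats(10, [(0, 4), (2, 6)])
```

In this toy case six of ten residues are covered, two of them twice, so the redundancy is averaged over those six residues rather than over the full protein length.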