2021 Aug 19;33(11):3421–3453. doi: 10.1093/plcell/koab211

Figure 8.

Examples of identified proteins not captured well in Araport11 but detected in TAIR10. For a complete list, see Supplemental Data Set S3, A and B and Table 6. A, AT3G12012.1 was identified in TAIR10 but is not annotated as a protein-coding gene in Araport11. The predicted protein sequence is shown with the identified residues marked in orange (left side). The specific peptides identified by MS/MS and their frequency of observation are shown (middle). The right-hand panel shows the predicted gene structure with three exons in the TAIR10 annotation. This short gene is positioned within the 5′UTR of the protein-coding gene AT3G12010.1 and is likely an expressed uORF with unknown function. AT3G12010.1 (annotated as C18orf8; 782 aa) is identical in TAIR10 and Araport11 and was identified at the canonical level (59% sequence coverage). B, Alternative protein model AT2G38255.1 (unknown protein with DUF239) with an extended C-terminus in TAIR10 exhibiting multiple detected peptides not found in the shorter Araport11 entry. This difference is due to an alternative STOP codon combined with a change in splicing. Consequently, AT2G38255 has seven exons in TAIR10 but six exons in Araport11; exon 7 is missing in Araport11 and exon 6 partially differs between TAIR10 and Araport11. The protein sequence alignment shows that the C-terminal region of the TAIR10 (333 aa) and Araport11 (218 aa) proteins has 218 residues; the two sequences are identical up to residue 208. Five distinct peptides match shared regions of the TAIR10 and Araport11 entries: SQIWLENGPR, TGCYNTNCPGFVIISR, LTIYWTADGYK, GELNSIQFGWAVHPR, and LYGDTLTR (see PeptideAtlas for details). C, Detection of the TAIR10 version of AT3G52130.1 (non-Type III lipid transfer protein), with no detection of the completely different sequence for AT3G52130.1 in Araport11. This difference is due to alternative START and STOP codons combined with a change in splicing; the coding frames of the two gene models are different.
Consequently, the N-terminal residue of the Araport11 model is an arginine and not a methionine. mRNA accumulation is limited to young flower buds, as displayed in BAR ePlant (yellow → red scale reflects low to high expression values). The primary sequences of the two proteins are:
TAIR10_AT3G52130.1 (125 aa): MMMKAMRVGLAMTLLMTITVLTIVAAQQEGLQQPPPPPMLPEEEVGGCSRTFFSALVQLIPCRAAVAPFSPIPPTEICCSAVVTLGRPCLCLLANGPPLSGIDRSMALQLPQRCSANFPPCDIIN
Araport11_AT3G52130.1 (123 aa): RSKRACNNHLHHQCCPRRKWEDAAGHFSPRWYSSYHVEQQLLLLARSHRPRYVALPLSHLVVLVFASLPMDLHSLALTAPWLFSSLRDALLISLPAISSTRKDISSFFSFLFSFTFLFNNLAA
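The comparison of the two AT3G52130.1 models in panel C can be reproduced with a few lines of code. The following is a minimal Python sketch and not part of the original analysis: the two sequences are copied from this caption, while the helper names `common_prefix_len` and `models_containing` are illustrative assumptions. It reports the length and first residue of each model, the length of their shared N-terminal prefix, and which model(s) contain a given peptide by exact substring matching (a simplification of real peptide-to-protein mapping).

```python
# Sequences copied from the caption (panel C).
TAIR10 = (
    "MMMKAMRVGLAMTLLMTITVLTIVAAQQEGLQQPPPPPMLPEEEVGGCSR"
    "TFFSALVQLIPCRAAVAPFSPIPPTEICCSAVVTLGRPCLCLLANGPPLS"
    "GIDRSMALQLPQRCSANFPPCDIIN"
)
ARAPORT11 = (
    "RSKRACNNHLHHQCCPRRKWEDAAGHFSPRWYSSYHVEQQLLLLARSHRP"
    "RYVALPLSHLVVLVFASLPMDLHSLALTAPWLFSSLRDALLISLPAISST"
    "RKDISSFFSFLFSFTFLFNNLAA"
)
MODELS = {"TAIR10": TAIR10, "Araport11": ARAPORT11}


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading residues shared by two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def models_containing(peptide: str, models: dict) -> list:
    """Return the names of all models that contain the peptide exactly.

    Hypothetical helper: exact substring matching only, with no handling
    of I/L ambiguity, modifications, or enzymatic cleavage rules.
    """
    return [name for name, seq in models.items() if peptide in seq]


if __name__ == "__main__":
    for name, seq in MODELS.items():
        print(f"{name}: {len(seq)} aa, first residue {seq[0]}")
    print("shared N-terminal prefix:", common_prefix_len(TAIR10, ARAPORT11))
```

Because the two models are read in different coding frames, a peptide that maps to one of them is expected to be absent from the other, which is what makes detected peptides diagnostic for the TAIR10 version here.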