2021 Mar 10;6(2):e01224-20. doi: 10.1128/mSphere.01224-20

TABLE 1. Protein prediction in the 95th percentile

| Category^a | COG(s) | Gene(s) | Protein information |
|---|---|---|---|
| Cellular processes and signaling [D],[M],[N],[O],[T],[U],[V],[W],[Y],[Z]^b | D | vapB47 | Antitoxin, TA group |
| | M | gid (6) | Associated with streptomycin resistance |
| | M | murD | Peptidoglycan biosynthesis |
| | O | prcA^c | Intermediary metabolism and respiration |
| | T | pstP^c | Regulatory proteins |
| | T | mazF3 | Toxin, TA group |
| | U | tatB^c (35) | Probable transmembrane transporter |
| Information storage and processing [A],[B],[J],[K],[L]^b | J | tsnR, mihF | Information pathways |
| | J | tadA | CMP-type deaminase domain protein |
| | K | mce1R, mprA, pknJ | Regulatory proteins |
| | K | cspA | Cold shock protein |
| | L | gyrA (6) | Associated with quinolone resistance |
| | L | ligC | DNA recombination and repair |
| Metabolism [C],[E],[F],[G],[H],[I],[P],[Q]^b | C, H | fadH^c, idsB | Lipid metabolism |
| | C, G, Q | cyp132^c, cyp51^c, ppdK^c, fucA, opcA | Intermediary metabolism and respiration |
| | C, E | aroG^c, glpQ1^c, lldD2^c, mpt53, gcvH | Intermediary metabolism and respiration |
| | E | proX | Transmembrane transporter activity |
| | E | aroG, cysK2, trpG^c, serB2^c | Amino acid biosynthesis |
| | F | purF | Purine biosynthesis/purine salvage |
| | F, P | ddlA^c, ceoB, uspA, pstA1, pitA^c | Cell wall and cell processes |
| | F, I, G, H | guaB3^c, lipR, idi, impA | Intermediate metabolism and respiration |
| | H | ribC, cobD, cobL | Riboflavin/cobalamin biosynthesis |
| | H, I | accD4^c, fadD5, fadE32 | Involved in lipid metabolism |
| | H, I | fadE33, pssA, fabG3 | Involved in lipid metabolism |
| | I, H | lspF^c, grcC2, lipJ | Intermediate metabolism and respiration |
| | P | pstA1, cysW^c | Transmembrane transporter activity |
| | P | bfrB | Iron storage protein |
| | Q | mce3C (25) | Virulence factor, Mce family |
| | Q | yrbE2A | Part of mce2 operon |
| Poorly characterized [R],[S]^b | S | apa | Immunogenic, cell wall and cell processes |
| | S/UC | hsaB | Cholesterol catabolism |
| | S/UC | vapC38, vapC40, vapB19, mazF8 | TA group |
| | S/UC | lppA, lppB, lpqQ, lppO | Possible lipoproteins |
| Unable to characterize (UC) | S/UC | esxO, esxL, esxW | ESAT-6-like protein |
| | S/UC | espK | ESX-1 secretion system |
| | S/UC | mmpS3 | Determinant of intrinsic M. tuberculosis AMR |
^a Classification of clusters of orthologous protein groups (COGs) in the 95th percentile, combined with information from Mycobrowser (18) and UniProt (19). Additional information from the literature is individually cited within the table. Genes related to basic COG categories (e.g., metabolism) were observed in both percentiles. However, certain families, such as the fadD and fadE genes (e.g., fadE33, fadD5, fadE32), associated with fatty acid and cholesterol metabolism, were observed only in the 95th percentile (17). Genes related to pathogenesis of TB disease (e.g., ESAT-6/ESX genes) and antibiotic resistance (e.g., gyrA, gidB) are present. ESAT-6/ESX family genes were predicted as poorly characterized. The classification involved proteins encoded by genetically characterized genes with reference to H37Rv. Protein prediction for the noncharacterized genes can be found in Data Set S1 in the supplemental material.

^b COG subcategories are explained in detail in the legend to Table S2 in the supplemental material.

^c A number of genes have been identified as high-confidence drug targets (22).
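
For readers who want to work with the table programmatically, the COG-letter groupings listed in the Category column can be written as a small lookup. This is only an illustrative sketch, not part of the original analysis: the names COG_CATEGORIES, LETTER_TO_CATEGORY, and categorize are hypothetical, and the letter sets are copied from the bracketed lists in Table 1.

```python
from typing import List

# Major COG categories and their one-letter subcategories,
# copied from the Category column of Table 1.
COG_CATEGORIES = {
    "Cellular processes and signaling": "DMNOTUVWYZ",
    "Information storage and processing": "ABJKL",
    "Metabolism": "CEFGHIPQ",
    "Poorly characterized": "RS",
}

# Reverse lookup: one COG letter -> major category.
LETTER_TO_CATEGORY = {
    letter: category
    for category, letters in COG_CATEGORIES.items()
    for letter in letters
}
# "UC" is the table's own label for entries that could not be characterized.
LETTER_TO_CATEGORY["UC"] = "Unable to characterize (UC)"


def categorize(cog_field: str) -> List[str]:
    """Return the major categories for a COG(s) cell such as 'C, H' or 'S/UC'.

    Categories are returned in order of first appearance, without duplicates.
    Raises KeyError for a token that is not a known COG letter or 'UC'.
    """
    categories: List[str] = []
    for token in cog_field.replace("/", ",").split(","):
        token = token.strip()
        if not token:
            continue
        category = LETTER_TO_CATEGORY[token]
        if category not in categories:
            categories.append(category)
    return categories
```

For example, categorize("F, I, G, H") yields a single entry, "Metabolism", while categorize("S/UC") yields both "Poorly characterized" and "Unable to characterize (UC)".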