Analysis of Ehrlichia tandem repeat proteins TRP32/47/120. TRP homologs of E. chaffeensis Arkansas were first identified among Ehrlichia spp. using BLASTP, and internal repeats were determined with XSTREAM (https://amnewmanlab.stanford.edu/xstream/). Colored boxes indicate distinct repeat sequences and lengths and are drawn to scale with the protein lengths. TRPs are highly variable, and the length and number of repeats differ among Ehrlichia spp.
a, E. chaffeensis TRP32 (ECH_0170; also known as variable-length PCR target, VLPT; 198 AA) contains four consecutive 30-AA VLPT repeats. However, no repeats or VLPT domains were detected in the homologs from Ehrlichia sp. HF (EHF_0893/EHF_RS04015; only 90 AA, with 45% identity to the C-terminus of ECH_0170), E. muris subsp. eauclairensis (EMUCRT_RS02860, 105 AA), or E. muris subsp. muris (EMUR_00520/MR76_RS00500, 112 AA).
b, E. chaffeensis TRP47 (ECH_0166, 316 AA) contains eight consecutive 19-AA repeats at its C-terminus. The TRP47 homolog in Ehrlichia sp. HF (EHF_0897/EHF_RS04625; annotation revised based on TBLASTN against the Ehrlichia sp. HF genome) is a smaller protein (255 AA) with 40% identity, conserved mostly at the N-terminus. However, no repeat sequences were identified in the TRP47 homologs of Ehrlichia sp. HF, E. muris subsp. eauclairensis (EMUCRT_0637/EMUCRT_RS04575, 252 AA), or E. muris subsp. muris (EMUR_00500/MR76_RS04630, 228 AA).
c, E. chaffeensis TRP120 (ECH_0039, 548 AA) contains 4⅓ consecutive 80-AA repeats. The TRP120 homolog in Ehrlichia sp. HF (EHF_0897/EHF_RS04625, 584 AA) contains 4¼ consecutive 100-AA repeats. A much larger protein was identified in E. muris subsp. muris AS145 (EMUR_0035/MR76_RS00035, 1,288 AA) with 12⅓ repeats (8 repeats of 67 AA and 4⅓ repeats of 56 AA). In E. muris subsp. eauclairensis, two ORFs (EMUCRT_0995 and EMUCRT_09731) that match the N- and C-terminus of E. chaffeensis TRP120, respectively, were identified in two contigs (NZ_LANU01000002 and NZ_LANU01000003) of the incomplete genome sequence. Nine repeats of 65 AA were identified in both proteins, whereas two shorter repeats of 38 AA were found in EMUCRT_0995 only.
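The fractional repeat counts in this legend (for example 4⅓ or 12⅓) express a run of tandem repeats as total repeat length divided by unit length. The Python sketch below illustrates that arithmetic on a toy sequence. It is not the XSTREAM algorithm: the function name `tandem_copy_number`, the exact-match extension rule, and the toy sequence are all assumptions made for illustration, whereas real repeat finders tolerate substitutions and indels.

```python
from fractions import Fraction


def tandem_copy_number(seq: str, start: int, period: int) -> Fraction:
    """Copies of the unit seq[start:start+period] repeated in tandem from `start`.

    Exact-match only (an assumption of this sketch): extension stops at the
    first residue that differs from the residue one period upstream.
    Returns 0 if a full first unit does not fit in the sequence.
    """
    if period <= 0 or start < 0 or start + period > len(seq):
        return Fraction(0)
    end = start + period  # the first unit always counts as one copy
    while end < len(seq) and seq[end] == seq[end - period]:
        end += 1
    return Fraction(end - start, period)


# Toy sequence (hypothetical, not an Ehrlichia protein): a 6-residue unit
# repeated 4 times plus its first 2 residues, flanked by unrelated residues.
unit = "SEPQTD"
toy = "MK" + unit * 4 + unit[:2] + "GGR"
copies = tandem_copy_number(toy, start=2, period=6)
print(copies)  # a Fraction; 26 repeat residues over a 6-residue unit

# Copy numbers from separate repeat blocks add up, as in panel c:
# 8 repeats plus 4 1/3 repeats gives 12 1/3 repeats in total.
total = Fraction(8) + Fraction(13, 3)
```

Using `Fraction` rather than a float keeps partial copies exact, so a count such as 4⅓ is not rounded to 4.33.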