ScsC sequence comparison. (a) Cartoon representation of the structural superposition of the catalytic domain of PmScsC (cyan, PDB entry 5id4; Furlong et al., 2017) with StScsC (grey, PDB entry 4gxz; Shepherd et al., 2013). The globular catalytic domains of these two proteins align well (r.m.s.d. of 1.7 Å for 174 Cα atoms aligned using TM-align; Zhang & Skolnick, 2005). However, PmScsC has a long N-terminal domain involved in its trimerization that is absent in StScsC. The N-terminus of the PmScsC model is labelled N-Term and the C-terminus is labelled C-Term. The catalytic cysteines and an adjacent threonine–cis-proline sequence of the thioredoxin domain are shown as pink sticks (highlighted with a pink circle).
(b) Sequence alignment of mature (no signal sequence) CcScsC (UniProt ID Q9A747), StScsC (UniProt ID H9L4C1) and PmScsC (UniProt ID B4EV21). The catalytic domains of the three sequences share high similarity (PmScsC shares 25% sequence identity with CcScsC and 53% sequence identity with StScsC). CcScsC and PmScsC both have an extra N-terminal domain which is absent in StScsC. Secondary-structure annotation based on the structure of CcScsC presented in this work is shown above the sequence alignment. Secondary-structure annotation based on the structure of PmScsC (PDB entry 5id4) is shown below the alignment: coils for α-helices and arrows for β-strands. Note that the first two residues as well as a seven-residue C-terminal TEV protease cleavage scar differ between the CcScsC UniProt sequence and the protein sequence used for crystallization. Similar residues are highlighted in yellow, identical residues are highlighted in red and catalytic cysteines and cis-prolines are highlighted with red arrows.
(c) The N-terminal region (50 residues after the signal peptide) of CcScsC was compared with those of other known trimeric thioredoxin-fold proteins: PmScsC, StBcfH (UniProt ID A0A0H3N7J9; 65 residues after the signal sequence were selected) and WpDsbA2 (UniProt ID Q73FL6). This alignment reveals a similarity between the N-terminal sequences of the different proteins (17.5% identity between CcScsC and WpDsbA2, 30% identity between CcScsC and PmScsC and 36% identity between CcScsC and StBcfH). The two large gaps in the alignment result from the additional residues of StBcfH.
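The pairwise identities quoted in (b) and (c) depend on how the denominator is chosen, which the caption does not specify. As a minimal sketch only, the function below computes percent identity from two rows of an existing alignment, assuming that columns containing a gap in either sequence are excluded from the count. The function name percent_identity, the gap character and the toy sequences are hypothetical and are not taken from the alignment in this figure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise or multiple alignment.

    Assumption: only columns where both sequences carry a residue are
    counted. Other conventions (e.g. dividing by the shorter sequence
    length or by the full alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep only columns in which neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0

    # Count columns in which the two residues are the same.
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy example (hypothetical sequences): 5 gap-free columns, 4 identical.
print(percent_identity("MKT-AYL", "MRTQA-L"))  # 80.0
```

Because the result changes with the denominator convention and with the region selected (for example 50 versus 65 residues), values computed this way are comparable only when the same convention is applied to every pair.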