Author manuscript; available in PMC: 2015 Oct 16.
Published in final edited form as: Nature. 2015 Jan 26;520(7547):383–387. doi: 10.1038/nature14100

Extended Data Table 1. Condensation Domains with a HHHxxxDG motif and an upstream predicted serine- or threonine-activating A Domain.

Condensation domains with a HHHxxxDG motif and an upstream putative serine- or threonine-activating A domain were identified through a tblastn search of the NCBI nucleotide database. An initial search using the A4PCP4C5 tri-domain from the nocardicin gene cluster as the query returned 25532 hits from 7164 sequences under an E-value threshold of 1E-20. These hits were then filtered to exclude sequences with query coverage of < 75%, yielding 3834. All sequences were translated and parsed for HHHxxxDG motifs. The resulting 37 sequences were submitted to NRPSPredictor2 (refs 32, 33) to identify upstream predicted serine- or threonine-activating A domains. After removal of duplicate sequences, a final set of 12 remained. From this final set, only those sequences with high-confidence predictions of upstream serine- or threonine-specific A domains were aligned with the nocardicin module 5 condensation domain. Analysis revealed that all sequences in the alignment displayed characteristic motifs predominantly observed in the DCL subtype (refs 34, 35) of condensation domains, even though nocardicin C5 exhibits LCL binding.
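The filtering steps described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline. The `Hit` record, its field names, and the helper functions are hypothetical, and the sketch assumes that BLAST results have already been parsed and translated into protein strings. The NRPSPredictor2 step is not modeled, and duplicate removal is done here on identical translated sequences, which is an assumption about how duplicates were defined.

```python
import re
from dataclasses import dataclass

# HHHxxxDG: three histidines, any three residues, then Asp-Gly.
MOTIF = re.compile(r"HHH.{3}DG")


@dataclass
class Hit:
    """Hypothetical record for one parsed tblastn hit."""
    seq_id: str            # sequence identifier (illustrative only)
    evalue: float          # BLAST E-value of the hit
    query_coverage: float  # query coverage in percent (0-100)
    protein: str           # translated amino-acid sequence


def passes_thresholds(hit, max_evalue=1e-20, min_coverage=75.0):
    """Keep hits at or below the E-value cutoff with coverage >= 75%."""
    return hit.evalue <= max_evalue and hit.query_coverage >= min_coverage


def motif_positions(protein):
    """Return 0-based start positions of HHHxxxDG matches in a protein."""
    return [m.start() for m in MOTIF.finditer(protein)]


def filter_hits(hits):
    """Apply thresholds, require the motif, and drop duplicate sequences."""
    kept = []
    seen = set()
    for hit in hits:
        if not passes_thresholds(hit):
            continue
        if not motif_positions(hit.protein):
            continue
        if hit.protein in seen:  # assumed duplicate criterion
            continue
        seen.add(hit.protein)
        kept.append(hit)
    return kept
```

With a list of such records, `filter_hits(hits)` returns the subset that would go on to A-domain specificity prediction in a workflow of this kind.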
