Fig. 9.
Tandem triplication of the putative anaBCD promoter region. a Alignment of the anaB gene and upstream promoter region between different assemblies of the Anabaena sp. WA102 culture. Promoters were identified with the Virtual Footprint online server, and only promoters with position weight matrix (PWM) alignment scores greater than 12 were plotted. The 5′ end of the anaB gene and the upstream promoter region are triplicated in the PacBio assembly. None of the Illumina assemblies correctly reconstructed the tandem triplication. Assembly of 100 bp reads by IDBA v1.1.1 failed to correctly assemble the anaB gene and the promoter region. Assembly by PriceTI v1.0.1, using the IDBA contig to seed the assembly, produced two alternate versions of the anaB region. In the first version, the anaB gene and the upstream promoter region are both improperly assembled. In the second, the anaB gene and the most proximal portion of the promoter region are correctly assembled, but the triplication is not. b Read coverage across the promoter region upstream of the anaB gene. Illumina reads from a toxic bloom metagenome in Anderson Lake (WA25, blue line), the Anabaena sp. AL93 culture (green line), and the Anabaena sp. WA102 culture are mapped across anaB and its upstream promoter region. Coverage is summed at each nucleotide. For the Anabaena sp. AL93 culture, the green line drops to zero at the two junctions formed between the triplicated copies, showing that these junctions are absent from that culture. In contrast, read coverage does not fall to zero at those loci for the Anabaena sp. WA102 culture or the Anderson Lake metagenome, so both contain the junctions formed by the triplication. Presence of the triplication in the Anderson Lake metagenome indicates that it formed in the Anabaena sp. WA102 genome nearly a year before the culture was established. The triplication has been under selection in the environment and continues to be selected for in culture.
*Read coverage values for the July 2012 Anderson Lake metagenome have been divided by 10 to facilitate comparison along the ordinate.
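
The coverage analysis described in panel b can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0-based half-open (start, end) alignment intervals, and the toy data are all assumptions made for illustration. It sums coverage at each nucleotide, reports stretches where coverage falls to zero (as at the triplication junctions in the AL93 culture), and rescales a coverage track by a constant divisor, as done for the metagenome in the footnote.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical 0-based, half-open (start, end) of a mapped read


def per_base_coverage(region_length: int, alignments: List[Interval]) -> List[int]:
    """Sum read coverage at each nucleotide of a region of the given length."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (region_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    coverage = []
    depth = 0
    for delta in diff[:region_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def zero_coverage_runs(coverage: List[int]) -> List[Interval]:
    """Return half-open (start, end) runs of positions with zero coverage."""
    runs = []
    run_start = None
    for pos, depth in enumerate(coverage):
        if depth == 0 and run_start is None:
            run_start = pos
        elif depth != 0 and run_start is not None:
            runs.append((run_start, pos))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(coverage)))
    return runs


def scale_coverage(coverage: List[int], divisor: float = 10.0) -> List[float]:
    """Divide coverage by a constant so tracks of different depth share one axis."""
    return [depth / divisor for depth in coverage]


# Toy example (invented data): no read spans positions 6-7.
toy_coverage = per_base_coverage(10, [(0, 4), (2, 6), (8, 10)])
toy_gaps = zero_coverage_runs(toy_coverage)
```

In this toy case no read covers positions 6 and 7, so `toy_gaps` contains a single zero-coverage run; in real data, a junction would count as present only if at least one read spans it.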