Complete genomic sequences of nine Bacillota isolated from Alaskan permafrost

Quincy Faber; Jaimie R West; Elizabeth J Corriveau; Robyn A Barbato

doi:10.1128/mra.00268-25

2025 Aug 15;14(9):e00268-25. doi: 10.1128/mra.00268-25

Editor: Irene L G Newton

PMCID: PMC12424323 PMID: 40815001

ABSTRACT

A total of nine Bacillota bacteria were isolated from Alaskan permafrost, and complete genomic sequences were obtained via hybrid assembly of long and short reads (Oxford Nanopore and Illumina paired-end sequencing, respectively). These genomes highlight the diversity of Arctic Bacillota and their potential applications in biotechnology.

KEYWORDS: permafrost, Arctic, psychrotrophs, Alaska, extremophile

ANNOUNCEMENT

Permafrost microorganisms have diverse stress responses and survival adaptations relevant to biotechnology (1). Here, we describe isolates under assessment for biotechnology applications (e.g., synthetic biology and ice modulation), as demonstrated in other psychrotrophic organisms (2).

Isolates were cultured from permafrost cores collected from within or above the Cold Regions Research and Engineering Laboratory (CRREL) Permafrost Tunnel (Fox, AK, USA; 64.9528, −147.6178) (3). Cores were aseptically subsampled after SIPRE coring (3), then stored at −10°C or −80°C. Soil was suspended in 0.1% sodium pyrophosphate, and 100 µL was plated on various media (Table 1) and incubated at 0°C–25°C. Colonies were restreaked twice and Gram-stained to confirm isolation.

TABLE 1.

Description of isolation conditions, genomic characteristics, and sequencing depth^c

Isolate name	Genome assembly accession number	SRA (long reads)	SRA (short reads)^d	Isolation media and temperature	Closest relative in GTDB (% ANI or MSA)	Taxonomic identification	Genome length (bp)	Contigs	N50 (bp)	Secondary contig length (bp)	Completeness (%)	Contamination (%)	Total coding sequences	GC content	Long-read sequencing (gigabases)	Long-read depth of coverage (theoretical)	Short-read sequences (R1 + R2)	Short-read depth of coverage (theoretical)
PTI6	ASM4857296v1	SRR32738811	SRR32736231	TSA and 25°C	Oceanobacillus kimchii (99.2% ANI)	O. kimchii	4,243,483	2	3,947,607	295,876	100	1.08	4,183	0.35	0.268	63×	3,978,106	141×
B-H-4^a	ASM4857297v1	SRR32738810	SRR32736230	M9 media and 25°C	Paenibacillus xylanexedens (96.85% ANI)	P. xylanexedens	7,026,530	1	7,026,530	NA	100	0.03	5,927	0.46	1.09	155×	4,764,542	100×
PTI5	ASM4857295v1	SRR32738809	SRR32736229	TSA and 25°C	Pristimantibacillus sp031157315 (89.1% ANI)	Pristimantibacillus	7,148,351	1	7,148,351	NA	100	0.78	6,268	0.48	0.857	120×	3,964,332	82×
B-C-2	ASM4857294v1	SRR32738808	SRR32736228	R2A + 7% glycerol and 10°C	Psychrobacillus psychrotolerans (98.32% ANI)	P. psychrotolerans	3,723,914	1	3,723,914	NA	99.99	0.19	3,707	0.36	0.839	225×	3,153,984	125×
B-E-6	ASM4857293v1	SRR32738807	SRR32736227	unknown	Neobacillus niacini (97.04% ANI)	N. niacini	5,999,678	2	5,989,031	10,647	100	2.44	5,691	0.38	1.074	179×	1,946,482	48×
B-A-8	ASM4857291v1	SRR32738806	SRR32736226	R2A and 25°C	Paenibacillus sp000758585 (98.86% ANI)	Paenibacillus sp000758585	6,900,689	1	6,900,689	NA	100	0.03	6,120	0.44	0.801	116×	3,673,962	78×
PTI13^b	ASM4857290v1	SRR32738805	NA	TSA and 25°C	Bacillus licheniformis (99.62% ANI)	B. licheniformis	4,349,288	1	4,349,288	NA	99.98	2.53	4,596	0.46	0.807	186×	NA^b	NA^b
B-H-3^b	ASM4857292v1	SRR32738804	NA	R2A and 25°C	Peribacillus sp001866725 (93.86% ANI)	Peribacillus	4,675,083	2	4,664,557	10,526	100	0.18	4,555	0.41	1.731	370×	NA^b	NA^b
BM2	ASM4857289v1	SRR32738803	SRR32736223	Permafrost soil extract and 4°C	Psychrobacillus sp018140925.1 (92.81% MSA)	Psychrobacillus	4,250,465	1	4,250,465	NA	100	0.53	4,098	0.37	0.696	164×	2,286,626	80×

^a Isolate was generated from permafrost above the CRREL Permafrost Tunnel. All others were generated from permafrost cores collected from inside the tunnel.

^b No short reads were used to generate assemblies for these isolates.

^c ANI indicates average nucleotide identity; Multiple Sequence Alignment (MSA) was used when the ANI circumscription radius was too distant. Permafrost extract medium was prepared by mixing 168 g permafrost with 420 mL water, incubating at 4°C overnight, autoclaving (1 h), and collecting the supernatant. Agar (1.5%) was added, pH adjusted to 6.8, and the medium was autoclaved again.

^d NA is not available.
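
The "theoretical" depth-of-coverage columns in Table 1 can be sanity-checked with a short calculation. The sketch below assumes the usual definition (total sequenced bases divided by genome length); the announcement does not state the exact formula used, so this is an illustration rather than the authors' method. The function name `theoretical_depth` is hypothetical, and the input values are copied from three rows of Table 1.

```python
def theoretical_depth(total_bases: float, genome_length_bp: int) -> float:
    """Theoretical depth of coverage: sequenced bases per genome position.

    Assumes depth = total bases / genome length (a common convention,
    not confirmed as the formula used for Table 1).
    """
    return total_bases / genome_length_bp


# (long-read yield in gigabases, genome length in bp), taken from Table 1
long_read_runs = {
    "PTI6": (0.268, 4_243_483),
    "B-C-2": (0.839, 3_723_914),
    "B-H-3": (1.731, 4_675_083),
}

for isolate, (gigabases, genome_length) in long_read_runs.items():
    depth = theoretical_depth(gigabases * 1e9, genome_length)
    print(f"{isolate}: {depth:.0f}x long-read depth")
```

Under this assumption, the three rows reproduce the long-read depths listed in Table 1 (63×, 225×, and 370×). The short-read column may rest on a different basis (for example, reads remaining after trimming), so the same formula is not guaranteed to match it exactly.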

Isolates were grown in liquid media at room temperature, then a cell pellet was collected using a Sorvall RC 6 centrifuge (Thermo Fisher Scientific, Waltham, MA, USA). DNA was extracted using a DNeasy UltraClean Microbial Kit and QIAcube Connect (QIAGEN, Hilden, Germany) with samples heated at 65°C for 10 min prior to 5 min vortexing for lysis. DNA was assessed using a Qubit 3.0 Fluorometer, Invitrogen Qubit dsDNA BR Assay Kit, and Nanodrop 2000 (Thermo Scientific, Waltham, MA, USA).

Long-read sequencing was completed using the Nanopore-Only Microbial Isolate Sequencing Solution protocol, Barcoding Kit V14 (ONT, Oxford, UK), and R10.4.1 MinION flow cells with no shearing or size selection. Sequencing was performed on an ONT GridION for 72 h using MinKNOW software v.24.06.14 and Dorado basecalling v.7.4.13 with super-accurate basecalling (v.4.3.0 at 400 bps) with a minimum Q score of 10.
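
For readers less familiar with Phred scaling, the minimum Q score of 10 can be translated into an error probability. The sketch below uses the standard Phred definition, Q = −10·log10(P); the function names are hypothetical, and whether the threshold was applied to the mean quality of each read is an assumption not stated in the text.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a per-base error probability back to a Phred quality score."""
    return -10 * math.log10(p)


min_q = 10  # minimum Q score reported for basecalling
p_error = phred_to_error_probability(min_q)
print(f"Q{min_q}: error probability {p_error:.2f}, accuracy {1 - p_error:.0%}")
```

A Q10 cutoff therefore corresponds to an expected accuracy of about 90% at the threshold; reads passing the filter are at or above that level.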

Short reads (except PTI6) were sequenced at Plasmidsaurus (Eugene, OR, USA) with 100 ng of genomic DNA prepared using a SeqWell ExpressPlex 2.0 Library Prep Kit (301170, Beverly, MA, USA). Short-read sequencing for PTI6 was completed at Argonne National Laboratory, where ~500 ng of genomic DNA was prepared with a TruSeq DNA PCR-free library prep kit and IDT for Illumina TruSeq Unique Dual indexes (Illumina, San Diego, CA, USA), with size selection using an S220 Focused-ultrasonicator (Covaris, Woburn, MA, USA; target insert size 280–480 bp). All short-read sequencing was performed on the Illumina NextSeq 2000 platform, yielding 2 × 150 bp paired-end reads, using P1 flow cells (20100982, Illumina, San Diego, CA, USA) and standard quality filtering.

Long reads <4,000 bp and 10% of reads with the lowest Phred quality scores were discarded using Filtlong v.0.2.1 (https://github.com/rrwick/filtlong). Trycycler v.0.5.4 (4) was used to generate consensus long-read genomic assemblies, which were “polished” with short reads (5) using Polypolish v.0.6.0 (6) and Pypolca v.0.3.1 (--careful option) (7). All genomes were polished except PTI13 and B-H-3, where short reads did not improve assemblies due to repetitive regions; therefore, only long reads were used to generate these two assemblies (5, 8). Short reads underwent adapter trimming and error correction using fastp v.0.23.4 (9). Genomic assemblies were annotated using Bakta v.1.9.4 (10). Completeness and contamination were assessed using CheckM2 v.1.0.2 (11), and putative taxonomy was assigned using GTDB-Tk v.2.4.0 (12). Default parameters were used except as noted. Genomic characteristics, assembly quality, and sequencing metrics are in Table 1.
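
The long-read filtering step described above maps naturally onto two Filtlong options, `--min_length` and `--keep_percent`. The sketch below only builds and prints such a command; it does not run it. The exact invocation used by the authors is not given in the text, so the option values are inferred from the description, and the input file name `reads.fastq.gz` is hypothetical.

```python
import shlex


def filtlong_command(reads_path: str,
                     min_length: int = 4000,
                     keep_percent: int = 90) -> list[str]:
    """Build a Filtlong argument list (assumed settings, not the authors' exact call).

    --min_length drops reads shorter than the threshold;
    --keep_percent 90 discards the lowest-scoring 10%.
    """
    return [
        "filtlong",
        "--min_length", str(min_length),
        "--keep_percent", str(keep_percent),
        reads_path,
    ]


cmd = filtlong_command("reads.fastq.gz")  # hypothetical input file
# Filtlong writes filtered reads to stdout, so output is normally redirected.
print(shlex.join(cmd) + " | gzip > filtered.fastq.gz")
```

Note that Filtlong applies `--keep_percent` by bases rather than by read count, so "10% of reads" in the text is best read as an approximation of that behavior.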

A total of nine complete Bacillota genomes (CheckM2 completeness scores ≥99.98%) spanning several genera (Fig. 1) were obtained, with depth of coverage from 63× to 370× for long reads and 49× to 165× for short reads. Genomes averaged 5,016 coding sequences, and three of the isolate genomes contained a plasmid.
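
The summary figures in the paragraph above can be recomputed from Table 1. In the sketch below, the values are copied from the table; treating every genome with a second contig as plasmid-bearing is an assumption made for illustration (the table lists only a "secondary contig length"), and the variable names are hypothetical.

```python
from statistics import mean

# isolate: (contigs, CheckM2 completeness %, total coding sequences), from Table 1
table1 = {
    "PTI6":  (2, 100.0, 4183),
    "B-H-4": (1, 100.0, 5927),
    "PTI5":  (1, 100.0, 6268),
    "B-C-2": (1, 99.99, 3707),
    "B-E-6": (2, 100.0, 5691),
    "B-A-8": (1, 100.0, 6120),
    "PTI13": (1, 99.98, 4596),
    "B-H-3": (2, 100.0, 4555),
    "BM2":   (1, 100.0, 4098),
}

mean_cds = mean(cds for _, _, cds in table1.values())
min_completeness = min(comp for _, comp, _ in table1.values())
# Assumption: a second contig corresponds to a plasmid.
with_second_contig = [name for name, (contigs, _, _) in table1.items() if contigs == 2]

print(f"Mean coding sequences: {mean_cds:.0f}")
print(f"Minimum completeness: {min_completeness}%")
print(f"Genomes with a second contig: {len(with_second_contig)} {with_second_contig}")
```

With these inputs the mean is about 5,016 coding sequences, the lowest completeness is 99.98%, and three genomes (PTI6, B-E-6, and B-H-3) carry a second contig, in line with the text.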

Fig 1.

A phylogenomic tree inferring evolutionary relationships of nine Bacillota and closest NCBI matches (accession number_phylum_genus_species). Escherichia coli is included as the outgroup. The tree was constructed using GToTree v.1.8.8 (13), which includes tree reconstruction with FastTree v.2.1.11 (14) using default parameters and an approximately maximum-likelihood method. Local support values were calculated using the Shimodaira–Hasegawa (SH-like) test.

ACKNOWLEDGMENTS

RAB received funding for this research from the Defense Advanced Research Projects Agency (DARPA) Ice Control for Cold Environments program and the US Department of Defense PE 0602144A Program Increase “Defense Resiliency Platform Against Extreme Cold Weather.” JRW had an appointment to the Department of Defense (DOD) Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the DOD. ORISE is managed by ORAU under DOE contract number DE-SC0014664. The views, opinions, and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the DOD, DOE, ORAU/ORISE, or the U.S. Government. Dr. Tom Douglas assisted with sample collection. Flora Laurent, Shaela Nestor, Joy O’Brien, and Logan Gonzalez assisted in generating the isolates described here.

Contributor Information

Robyn A. Barbato, Email: Robyn.A.Barbato@erdc.dren.mil.

Irene L. G. Newton, Indiana University Bloomington, Bloomington, Indiana, USA

DATA AVAILABILITY

These genomes and raw reads have been deposited in NCBI GenBank under BioProject accession PRJNA1223406. Approved for Public Release, Distribution Unlimited.

REFERENCES

1. Collins T, Margesin R. 2019. Psychrophilic lifestyles: mechanisms of adaptation and biotechnological tools. Appl Microbiol Biotechnol 103:2857–2871. doi: 10.1007/s00253-019-09659-5
2. Uko MP, Umana SI, Iwatt IJ, Udoekong NS, Mgbechidinma CL, Adie FU, Akan OD. 2024. Microbial ice-binding structures: a review of their applications. Int J Biol Macromol 275:133670. doi: 10.1016/j.ijbiomac.2024.133670
3. Barbato RA, Jones RM, Douglas TA, Doherty SJ, Messan K, Foley KL, Perkins EJ, Thurston AK, Garcia-Reyero N. 2022. Not all permafrost microbiomes are created equal: Influence of permafrost thaw on the soil microbiome in a laboratory incubation study. Soil Biol Biochem 167:108605. doi: 10.1016/j.soilbio.2022.108605
4. Wick RR, Judd LM, Cerdeira LT, Hawkey J, Méric G, Vezina B, Wyres KL, Holt KE. 2021. Trycycler: consensus long-read assemblies for bacterial genomes. Genome Biol 22:266. doi: 10.1186/s13059-021-02483-z
5. Bouras G, Judd LM, Edwards RA, Vreugde S, Stinear TP, Wick RR. 2024. How low can you go? Short-read polishing of Oxford Nanopore bacterial genome assemblies. Microb Genom 10:001254. doi: 10.1099/mgen.0.001254
6. Wick RR, Holt KE. 2022. Polypolish: short-read polishing of long-read bacterial genome assemblies. PLoS Comput Biol 18:e1009802. doi: 10.1371/journal.pcbi.1009802
7. Zimin AV, Salzberg SL. 2020. The genome polishing tool POLCA makes fast and accurate corrections in genome assemblies. PLoS Comput Biol 16:e1007981. doi: 10.1371/journal.pcbi.1007981
8. Luan T, Commichaux S, Hoffmann M, Jayeola V, Jang JH, Pop M, Rand H, Luo Y. 2024. Benchmarking short and long read polishing tools for nanopore assemblies: achieving near-perfect genomes for outbreak isolates. BMC Genomics 25:679. doi: 10.1186/s12864-024-10582-x
9. Chen S. 2023. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. Imeta 2:e107. doi: 10.1002/imt2.107
10. Schwengers O, Jelonek L, Dieckmann MA, Beyvers S, Blom J, Goesmann A. 2021. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom 7:000685. doi: 10.1099/mgen.0.000685
11. Chklovski A, Parks DH, Woodcroft BJ, Tyson GW. 2023. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat Methods 20:1203–1212. doi: 10.1038/s41592-023-01940-w
12. Chaumeil PA, Mussig AJ, Hugenholtz P, Parks DH. 2022. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics 38:5315–5316. doi: 10.1093/bioinformatics/btac672
13. Lee MD. 2019. GToTree: a user-friendly workflow for phylogenomics. Bioinformatics 35:4162–4164. doi: 10.1093/bioinformatics/btz188
14. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490. doi: 10.1371/journal.pone.0009490
