Data in Brief. 2024 Sep 21;57:110966. doi: 10.1016/j.dib.2024.110966

Metagenome assembly and annotation of data from the rhizosphere soil of drought-stressed CRN-3505 maize cultivar

Olubukola O Babalola, Rebaona R Molefe, Adenike E Amoo
PMCID: PMC11460481  PMID: 39381012

Abstract

This data article reports shotgun metagenomic data obtained from the rhizosphere of drought-stressed maize, sequenced on the Illumina NovaSeq platform and analyzed on the KBase online platform. After quality control, 428,339,852 high-quality sequences were retained, with an average GC content of 65.45 %. The samples came from Molelwane farm in Mafikeng, South Africa, and yielded 13 metagenome-assembled genomes (MAGs). Functional annotation of these MAGs indicated their involvement in functions essential to plant growth and development, such as sulfur and nitrogen metabolism. The dataset was deposited in the NCBI database, and the MAGs are available at DDBJ/ENA/GenBank under BioProject accession number PRJNA1017550.

Keywords: Illumina, Metagenomics, Drought tolerance, Plant-microbe interactions


Specifications Table

Subject: Microbial Ecology, Biological Sciences.
Specific subject area: Microbial Biotechnology
Type of data: Tables, Figures. Raw, Analyzed.
Data collection: Environmental samples were collected from maize rhizosphere soil. DNA was isolated from the samples using the DNeasy PowerSoil Pro kit (The Scientific Group (Pty) Ltd, Gauteng, South Africa), following the manufacturer's instructions. Shotgun metagenomic sequencing was then conducted on the Illumina NovaSeq platform.
Data source location: North-West University farm (Molelwane), Mafikeng, North-West Province, South Africa. GPS location: 25.85 S, 25.63 E
Data accessibility: Repository name: National Center for Biotechnology Information (NCBI)
Data identification number: SRR26065293 (Y60R1); SRR26065293 (Y60R2); SRR26065293 (Y60R3); SRR26074284 (Y80R1); SRR26074283 (Y80R2); SRR26074282 (Y80R3); SRR26074281 (Y100R1); SRR26074280 (Y100R2); SRR26074279 (Y100R3)
Direct URL to data: Raw sequencing data are available at the NCBI under BioProject PRJNA1017550 with the following Sequence Read Archive (SRA) accession numbers:
SRR26065293 (Y60R1) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26065293&display=metadata
SRR26065293 (Y60R2) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26065293&display=metadata
SRR26065293 (Y60R3) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26065293&display=metadata
SRR26074284 (Y80R1) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074284&display=metadata
SRR26074283 (Y80R2) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074283&display=metadata
SRR26074282 (Y80R3) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074282&display=metadata
SRR26074281 (Y100R1) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074281&display=metadata
SRR26074280 (Y100R2) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074280&display=metadata
SRR26074279 (Y100R3) https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26074279&display=metadata
The Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under BioProject accession PRJNA1017550. https://www.ncbi.nlm.nih.gov/datasets/genome/?bioproject=PRJNA1017550
Related research article: None

1. Value of the Data

  • The dataset provides comprehensive details on the impact of microbial communities in improving maize crops under drought conditions.

  • By examining the data, we gain an understanding of the microbial composition and their functions within the rhizosphere of drought-stressed maize.

  • This dataset provides valuable insights for farmers and scientists, enabling the development of novel methods and biotechnological approaches to enhance drought tolerance in maize cultivars.

2. Background

The significance of drought tolerance in maize farming cannot be overstated, as it directly impacts food security [1]. With climate change leading to severe drought, it becomes crucial to prioritize maize plants' ability to withstand water scarcity. This is essential for maintaining crop yields and minimizing the risk of food shortages [2,3]. The rhizosphere, the soil zone adjacent to plant roots, also contributes to food safety and security. According to Pathan et al. [4], the rhizosphere functions as a point of exchange of materials and of interaction between plant roots and different microorganisms. In agriculture, understanding the ecosystem that forms in the rhizosphere informs essential aspects such as plant resistance, soil health, and food safety regulations.

Metagenomic analysis, a powerful tool for studying the genetic material of entire microbial communities, holds immense potential to explore microbial genomes associated with the rhizosphere and identify beneficial microbes that can enhance crop productivity [5]. By harnessing this knowledge, we can develop innovative agricultural practices and biotechnological solutions to improve food security in the face of environmental challenges.

3. Data Description

The dataset consists of raw sequencing data collected through shotgun metagenomic sequencing of the rhizosphere microbiome from drought-stressed maize plants. This data was generated to investigate the dynamics of microbial communities under varying drought stress conditions. The data files, in FASTQ format, have been submitted to the NCBI. Table 1 provides statistical details of the metagenomic data within the drought-stressed maize rhizosphere, including the total number of raw and cleaned paired reads for each sample and the GC content percentage of the reads.
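The GC percentage reported for each sample is the share of G and C bases among the sequenced bases. As a minimal sketch of how such a value can be computed from FASTQ records, assuming plain four-line records (the function names and the two-read example are illustrative assumptions, not the KBase or FastQC implementation):

```python
from typing import Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each four-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 1:  # the second line of every record holds the bases
            yield line.strip().upper()


def gc_percent(sequences: Iterable[str]) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    gc = total = 0
    for seq in sequences:
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


# Hypothetical two-read FASTQ snippet, for illustration only.
example_fastq = [
    "@read1", "GGCCAT", "+", "IIIIII",
    "@read2", "ATGCNN", "+", "IIIIII",
]
example_gc = gc_percent(fastq_sequences(example_fastq))
print(f"GC content: {example_gc:.2f} %")
```

Ambiguous bases (N) are left out of the denominator here; that is a design choice of this sketch, and other tools may count them differently.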

Table 1.

Raw sequencing data metrics of maize rhizosphere metagenomes.

Samples  SRA accession  No. of raw reads (paired-end)  Total number of bases  GC (%)  Sequences retained (post QC)
Y60_R1   SRR26065293    47,101,174                     6,856,748,270          65.76   46,422,870
Y60_R2   SRR26065293    50,797,694                     7,371,839,876          63.66   49,952,504
Y60_R3   SRR26065293    46,795,626                     6,795,416,810          65.54   46,052,458
Y80_R1   SRR26074284    43,966,888                     6,398,765,178          66.17   43,325,578
Y80_R2   SRR26074283    41,868,690                     6,115,504,992          56.98   41,273,526
Y80_R3   SRR26074282    46,124,210                     6,709,777,865          65.56   45,438,932
Y100_R1  SRR26074281    55,866,854                     8,151,987,912          63.94   55,127,916
Y100_R2  SRR26074280    43,746,608                     6,351,001,072          66.13   43,035,382
Y100_R3  SRR26074279    58,579,456                     8,520,267,788          66.32   57,710,686
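The read counts in Table 1 can be cross-checked against the total reported in the abstract. The following is a minimal sketch that sums the counts and derives the share of reads retained after quality control; the dictionary layout and variable names are illustrative assumptions, while the numbers are copied from the table.

```python
# Per-sample counts copied from Table 1:
# (raw paired-end reads, sequences retained after QC).
TABLE_1 = {
    "Y60_R1": (47_101_174, 46_422_870),
    "Y60_R2": (50_797_694, 49_952_504),
    "Y60_R3": (46_795_626, 46_052_458),
    "Y80_R1": (43_966_888, 43_325_578),
    "Y80_R2": (41_868_690, 41_273_526),
    "Y80_R3": (46_124_210, 45_438_932),
    "Y100_R1": (55_866_854, 55_127_916),
    "Y100_R2": (43_746_608, 43_035_382),
    "Y100_R3": (58_579_456, 57_710_686),
}

total_raw = sum(raw for raw, _ in TABLE_1.values())
total_retained = sum(kept for _, kept in TABLE_1.values())

# Share of reads that survived quality control, per sample and overall.
for sample, (raw, kept) in TABLE_1.items():
    print(f"{sample}: {100 * kept / raw:.2f} % of reads retained")

print(f"Total retained after QC: {total_retained:,}")
print(f"Overall retention: {100 * total_retained / total_raw:.2f} %")
```

The retained total computed this way is 428,339,852, which agrees with the number of high-quality sequences given in the abstract.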

After quality control, the clean reads from the replicates of each treatment (for example Y60R1, Y60R2, and Y60R3) were merged to create three combined assemblies, namely Y60R, Y80R, and Y100R [6]. The metagenome-assembled genomes (MAGs) were recovered using MEGAHIT v1.2.9 and MaxBin2 v2.2.4 [7]. These MAGs had completeness greater than 90 % and contamination less than 5 %, and were selected using a dereplication, aggregation, and scoring approach [8]. The MAGs were deposited in GenBank under BioProject accession PRJNA1017550 (https://www.ncbi.nlm.nih.gov/datasets/genome/?bioproject=PRJNA1017550). Table 2 lists eleven bacterial MAGs and two archaeal MAGs. In Y60R, four MAGs were classified as members of Actinobacteriota and Proteobacteria. Y80R comprised seven MAGs classified into Thermoproteota, Gemmatimonadota, Actinobacteriota, Proteobacteria, and Bacteroidota. Lastly, Y100R included two MAGs classified as members of Actinobacteriota and Thermoproteota. To give a more detailed picture of the functional potential of these MAGs, Fig. 1a summarizes the distribution of genes involved in metabolic processes across the MAGs identified in this study, while Fig. 1b illustrates the presence or absence of key components of the electron transport chain.
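The quality thresholds stated above (completeness greater than 90 %, contamination less than 5 %) amount to a simple filter over candidate bins. The sketch below illustrates only that filter; the class name, bin names, and quality values are hypothetical and are not taken from this dataset, and the sketch does not reproduce the scoring that the binning tools perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    bin_id: str
    completeness: float   # percent
    contamination: float  # percent


def is_high_quality(
    candidate: BinQuality,
    min_completeness: float = 90.0,
    max_contamination: float = 5.0,
) -> bool:
    """Apply the thresholds from the text: >90 % complete and <5 % contaminated."""
    return (
        candidate.completeness > min_completeness
        and candidate.contamination < max_contamination
    )


# Hypothetical quality estimates, for illustration only.
candidate_bins = [
    BinQuality("bin_A", completeness=97.2, contamination=1.1),
    BinQuality("bin_B", completeness=88.4, contamination=0.9),  # too incomplete
    BinQuality("bin_C", completeness=93.0, contamination=7.5),  # too contaminated
]

kept = [b.bin_id for b in candidate_bins if is_high_quality(b)]
print(kept)
```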

Table 2.

Metagenome-assembled genomes in the drought-stressed maize rhizosphere.

Samples  No. of contigs  Average length (bp)  BioSample     Accession        Bin ID              GTDB lineage
Y60R     33,961          3,725.08             SAMN38082851  JAZDSL000000000  bin.001.fastaY60R   d__Bacteria; p__Actinobacteriota; c__UBA4738; o__UBA4738; f__HRBIN12
                                              SAMN38082852  JAZDSM000000000  bin.002.fastaY60R   d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Pseudomonadaceae; g__Pseudomonas_M
                                              SAMN38082853  JAZDSN000000000  bin.003.fastaY60R   d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter; s__Acinetobacter johnsonii
                                              SAMN38082854  JAZDSO000000000  bin.004.fastaY60R   d__Bacteria; p__Actinobacteriota; c__Acidimicrobiia; o__UBA5794; f__ZC4RG35; g__JACCTH01
Y80R     75              5,515.48             SAMN38082853  JAZDSN000000000  bin.002.fastaY80R   d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter; s__Acinetobacter johnsonii
                                              SAMN38082976  JAZDSR000000000  bin.003.fastaY80R   d__Bacteria; p__Gemmatimonadota; c__Gemmatimonadetes; o__Gemmatimonadales; f__GWC2-71-9
                                              SAMN38082977  JAZDSS000000000  bin.004.fastaY80R   d__Bacteria; p__Actinobacteriota; c__Acidimicrobiia; o__UBA5794; f__ZC4RG35; g__JACCTH01
                                              SAMN38082978  JAZDST000000000  bin.005.fastaY80    d__Bacteria; p__Bacteroidota; c__Bacteroidia; o__Flavobacteriales; f__Weeksellaceae; g__Chryseobacterium; s__Chryseobacterium aquaticum
                                              SAMN38082979  JAZDSU000000000  bin.006.fastaY80R   d__Bacteria; p__Actinobacteriota; c__Rubrobacteria; o__Rubrobacterales; f__Rubrobacteraceae; g__SCSIO-52909
                                              SAMN38082976  JAZDSR000000000  bin.007.fastaY80R   d__Bacteria; p__Gemmatimonadota; c__Gemmatimonadetes; o__Gemmatimonadales; f__GWC2-71-9; g__JACDDX01
                                              SAMN38082974  JAZDSP000000000  bin.001.fastaY80R   d__Archaea; p__Thermoproteota; c__Nitrososphaeria; o__Nitrososphaerales; f__Nitrososphaeraceae; g__Nitrososphaera
Y100R    45,645          3,691.14             SAMN38082977  JAZDSS000000000  bin.001.fastaY100R  d__Bacteria; p__Actinobacteriota; c__Acidimicrobiia; o__UBA5794; f__ZC4RG35; g__JACCTH01
                                              SAMN38082974  JAZDSS000000000  bin.002.fastaY100R  d__Archaea; p__Thermoproteota; c__Nitrososphaeria; o__Nitrososphaerales; f__Nitrososphaeraceae; g__Nitrososphaera
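Each lineage in Table 2 packs up to seven ranks into one semicolon-separated string, with every taxon prefixed by a rank letter (d, p, c, o, f, g, s). As a minimal sketch of how such strings can be split into named ranks for downstream counting, assuming that prefix convention (the function and constant names are illustrative, and the parser accepts either underscores or dashes after the rank letter because the separator can vary with how the table is rendered):

```python
import re

RANKS = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}

# Rank letter, then one or more underscores or dashes (hyphen, U+2013, U+2014),
# then the taxon name.
_FIELD = re.compile(r"^([dpcofgs])[_\-\u2013\u2014]+\s*(.*)$")


def parse_lineage(lineage: str) -> dict[str, str]:
    """Split a rank-prefixed lineage string into a {rank name: taxon} mapping."""
    parsed = {}
    for field in lineage.split(";"):
        match = _FIELD.match(field.strip().rstrip("."))
        if match and match.group(2):
            parsed[RANKS[match.group(1)]] = match.group(2).strip()
    return parsed


example = parse_lineage(
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
    "o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter; "
    "s__Acinetobacter johnsonii"
)
print(example["phylum"], "/", example["species"])
```

Ranks that are absent from a lineage, such as the genus and species of a MAG classified only to family level, are simply missing from the returned mapping.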

Fig. 1.

The interactive heatmap was constructed to show the presence of metabolic functions (a), modules' coverage, and electron transport chain components (b).

4. Experimental Design, Materials and Methods

4.1. Experimental design

This study used a drought-sensitive maize cultivar (CRN-3505) grown in a greenhouse at the North-West University farm in Molelwane, South Africa. The seeds were sterilized with 5 % (v/v) hypochlorite solution and planted in plastic pots. The maize plants were cultivated under a controlled environment with a 14-h light/10-h dark photoperiod, simulating a natural day-night cycle. The temperature was maintained at 24 °C during the day and 18 °C at night. Replicate rhizosphere soil samples were collected from maize plants under three watering regimes chosen to simulate different degrees of water availability: a well-watered control for comparison, moderate drought (80 % of field capacity), and severe drought (60 % of field capacity). The samples were collected on 4 March 2022 at a diameter of 8 cm and a depth of 15 cm around the maize plants, transported to the laboratory, and stored until further use.

4.2. DNA extraction and Illumina sequencing

DNA was extracted from the maize rhizosphere samples using a DNeasy PowerSoil® Pro kit following the manufacturer's instructions. Libraries were prepared with the Nextera DNA Flex Library Preparation Kit from 50 ng of purified DNA. Library size was assessed using an Agilent 2100 Bioanalyzer; after pooling and dilution, the libraries were sequenced paired-end for 300 cycles on a NovaSeq 6000 [9]. Shotgun metagenomic libraries were generated for all rhizosphere samples from well-watered and drought-induced maize plants.

4.3. Metagenomic data analysis and genome annotation

The DOE Systems Biology Knowledgebase (KBase) was used to analyze the metagenomic sequences [6]. All software was run with default settings. Following the KBase metagenomics analysis approach, read library quality was checked with FastQC v0.11.5 [10], and barcode sequences and low-quality regions were removed using Trimmomatic v0.36 [11]. Clean reads were then assembled into contigs using MEGAHIT v1.2.9 [12]. MaxBin2 v2.2.4 was employed to bin the contigs, recovering the metagenome-assembled genomes (MAGs) [7]. High-quality MAGs were selected using DAS Tool v1.1.2 [8]. Taxonomic classification of the MAGs was accomplished using the GTDB-Tk toolkit (Table 2) [13]. Subsequently, DRAM was used to annotate the MAGs, providing genome metabolic summaries and an interactive heatmap to compare the metabolic profiles of each genome [14].
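The order of the analysis steps described above can be captured as a small provenance record, which is convenient when reporting or repeating the workflow. This is a minimal sketch: the `Step` class and `describe` helper are illustrative assumptions, the tool names and versions are those given in the text, and it documents the workflow rather than running any of the tools (the study ran them as KBase apps with default settings).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    purpose: str
    tool: str
    version: Optional[str]  # None where the text gives no version


# Steps in the order they are described in Section 4.3.
WORKFLOW = [
    Step("read quality check", "FastQC", "0.11.5"),
    Step("barcode and low-quality trimming", "Trimmomatic", "0.36"),
    Step("assembly into contigs", "MEGAHIT", "1.2.9"),
    Step("binning of contigs", "MaxBin2", "2.2.4"),
    Step("selection of high-quality MAGs", "DAS Tool", "1.1.2"),
    Step("taxonomic classification", "GTDB-Tk", None),
    Step("functional annotation", "DRAM", None),
]


def describe(workflow: List[Step]) -> List[str]:
    """Return one numbered, human-readable line per workflow step."""
    lines = []
    for number, step in enumerate(workflow, start=1):
        version = f" v{step.version}" if step.version else " (version not stated)"
        lines.append(f"{number}. {step.purpose}: {step.tool}{version}")
    return lines


print("\n".join(describe(WORKFLOW)))
```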

Limitations

The data presented are based on metagenomics, and functions predicted from MAGs are not necessarily indicators of actual activity; metatranscriptomic data would be required to confirm them.

Ethics Statement

The authors have read and followed the ethical requirements for publication in Data in Brief and confirmed that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.

CRediT Author Statement

O.O. Babalola: Conception, Supervision, Writing – review & editing. R.R. Molefe: Data curation, Writing – original draft, Visualization, Investigation. A.E. Amoo: Software, Validation, Writing – review & editing.

Acknowledgments

OOB would like to thank the National Research Foundation of South Africa for grants (Grant Refs: UID123634; UID 132595 OOB) awarded to her. RRM would like to thank North-West University for the PhD bursary.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability

References

  • 1. McMillen M.S., Mahama A.A., Sibiya J., Lübberstedt T., Suza W.P. Improving drought tolerance in maize: tools and techniques. Front. Genet. 2022;13. doi: 10.3389/fgene.2022.1001001.
  • 2. Meseka S., Menkir A., Bossey B., Mengesha W. Performance assessment of drought tolerant maize hybrids under combined drought and heat stress. Agronomy. 2018;8(12):274. doi: 10.3390/agronomy8120274.
  • 3. Field C.B. Cambridge University Press; 2012. Managing the Risks of Extreme Events and Disasters to Advance Climate Change adaptation: Special Report of the Intergovernmental Panel On Climate Change.
  • 4. Pathan S., Ceccherini M.T., Sunseri F., Lupini A. Springer; Singapore: 2020. Rhizosphere as hotspot for plant-soil-microbe interaction; pp. 17–43.
  • 5. Nwachukwu B.C., Babalola O.O. Metagenomics: a tool for exploring key microbiome with the potentials for improving sustainable agriculture. Front. Sustain. Food Syst. 2022;6.
  • 6. Babalola O.O., Molefe R.R., Amoo A.E. Metagenome assembly and metagenome-assembled genome sequences from the rhizosphere of maize plants in Mafikeng, South Africa. Microbiol. Resour. Announcem. 2021;10(8). doi: 10.1128/mra.00954-20.
  • 7. Wu Y.-W., Tang Y.-H., Tringe S.G., Simmons B.A., Singer S.W. MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome. 2014;2:1–18. doi: 10.1186/2049-2618-2-26.
  • 8. Sieber C.M., Probst A.J., Sharrar A., Thomas B.C., Hess M., Tringe S.G., Banfield J.F. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 2018;3(7):836–843. doi: 10.1038/s41564-018-0171-1.
  • 9. Babalola O.O., Alawiye T.T., Lopez C.R., Ayangbenro A.S. Shotgun metagenomic sequencing data of sunflower rhizosphere microbial community in South Africa. Data Br. 2020;31:105831. doi: 10.1016/j.dib.2020.105831.
  • 10. Andrews S., Krueger F., Segonds-Pichon A., Biggins L., Krueger C., Wingett S. FastQC. A quality control tool for high throughput sequence data 370. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  • 11. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 12. Li D., Liu C.-M., Luo R., Sadakane K., Lam T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674–1676. doi: 10.1093/bioinformatics/btv033.
  • 13. Chaumeil P.-A., Mussig A.J., Hugenholtz P., Parks D.H. Oxford University Press; 2020. GTDB-Tk: a Toolkit to Classify Genomes With the Genome Taxonomy Database.
  • 14. Chivian D., Jungbluth S.P., Dehal P.S., Wood-Charlson E.M., Canon R.S., Allen B.H., Clark M.M., Gu T., Land M.L., Price G.A. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat. Protoc. 2023;18(1):208–238. doi: 10.1038/s41596-022-00747-x.

Articles from Data in Brief are provided here courtesy of Elsevier