Abstract
Streptococcus pyogenes, the group A streptococcus (GAS), is a predominant human pathogen that causes over 600 million infections annually. The lack of genomic data on GAS from India limits the understanding of its virulence and antimicrobial resistance determinants. The genomes of 12 GAS isolates from clinical samples collected at Navi Mumbai, India were sequenced on the Ion Torrent PGM platform and annotated. The annotated S. pyogenes genomes ranged in size from ~1.69 to ~1.85 Mb, with coverage of 38× to 189×. Most of the isolates carried msr(D) and mef(A), and four isolates carried the erm(B) gene for macrolide resistance. All GAS genomes harboured multiple virulence factors, including exotoxins, as well as phage elements. Four isolates belonged to sequence type ST28, seven to ST36 and one to ST55.
Value of the data
- Group A streptococcus (GAS) is a predominant human pathogen that causes over 600 million throat infections annually and shows high genomic plasticity due to prophage integration and horizontal gene transfer.
- This is the first genome report of S. pyogenes from India available in a public database.
- The GAS genomic data, generated by whole genome shotgun sequencing, will serve as a base for further research on the genomic attributes of virulence, antimicrobial resistance and clonal association.
1. Data
Streptococcus pyogenes, the group A streptococcus (GAS), is a predominant human pathogen that causes over 600 million infections annually. GAS throat infections are common in children between 4 and 7 years of age and pose several clinical and public health challenges [1]. The prevalence of pharyngitis caused by S. pyogenes is difficult to determine because the organism also colonizes the throat, but some studies report it as 10–15% [2]. GAS pharyngitis often goes undiagnosed because it is self-limiting and because most pharyngitis cases are of viral etiology [3]. M proteins, pili, leukocidins, streptolysins (O and S), complement-inhibiting proteins, immunoglobulin-degrading enzymes and superantigens are genome-encoded virulence factors that have been well characterized in S. pyogenes [4], [5]; efflux pumps and leukocyte evasion strategies are also integral factors. S. pyogenes shows high genomic plasticity due to prophage integration and horizontal gene transfer [6].
The post-streptococcal sequelae of GAS pharyngitis are non-suppurative: rheumatic fever, followed by rheumatic heart disease (RHD). In India, the overall prevalence of RHD is estimated at 1.5–2 per 1000 across all age groups, which, in a total population of about 1.3 billion, suggests 2.0 to 2.5 million patients with RHD in the country [4]. Given the high burden of GAS infections in India, preventive strategies such as vaccination are urgently needed.
Furthermore, the lack of genomic data on GAS from India limits the understanding of its virulence and antimicrobial resistance determinants. This study reports whole genome sequence data of S. pyogenes from India for the first time. The GAS genomic data, generated by whole genome shotgun sequencing, will serve as a base for further research on the genomic attributes of virulence, antimicrobial resistance and clonal association.
2. Experimental design, materials and methods
2.1. Study isolates
During March–May 2017, children up to 18 years of age with acute pharyngitis were screened for GAS infection at Dr. Yewale Multispeciality Hospital for Children, Navi Mumbai, using a cutoff score of 3 on the Modified Centor criteria.
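As an illustration of the screening step, the following minimal sketch computes the Modified Centor (McIsaac) score from its standard published criteria. The function and parameter names are hypothetical, the treatment of children under 3 years (no age point) and the reading of the cutoff as "score of 3 or more" are assumptions, and the study does not describe how the score was actually recorded.

```python
def modified_centor_score(age_years, fever_over_38c, cough_absent,
                          tender_anterior_cervical_nodes,
                          tonsillar_swelling_or_exudate):
    """Return the Modified Centor (McIsaac) score for one patient."""
    # One point for each clinical criterion that is present.
    score = sum([bool(fever_over_38c), bool(cough_absent),
                 bool(tender_anterior_cervical_nodes),
                 bool(tonsillar_swelling_or_exudate)])
    # Age adjustment: +1 for 3-14 years, 0 for 15-44 years, -1 for 45 or older.
    # Assumption: children under 3 years receive no age point.
    if 3 <= age_years <= 14:
        score += 1
    elif age_years >= 45:
        score -= 1
    return score


def meets_cutoff(score, cutoff=3):
    """Assumption: the cutoff of 3 is applied as 'score >= 3'."""
    return score >= cutoff


# Hypothetical patient: 6 years old, fever, no cough, tender nodes, no exudate.
example_score = modified_centor_score(6, True, True, True, False)
print(example_score, meets_cutoff(example_score))
```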
2.2. DNA extraction and genome sequencing
Total DNA was extracted from 12 culture-confirmed S. pyogenes isolates using the QIAamp DNA Mini Kit (Qiagen, Germany). Whole genome shotgun sequencing was performed on the Ion Torrent PGM platform (Life Technologies) with 400 bp chemistry.
2.3. De novo assembly and annotation
The raw reads were assembled using AssemblerSPAdes v.5.0.0.0 embedded in Torrent Suite server v.5.0.5. The genomes were annotated using the PATRIC database (the bacterial bioinformatics database and analysis resource) (http://www.patricbrc.org) [7] and the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP) (http://www.ncbi.nlm.nih.gov/genomes/static/Pipeline.html). Further genome analysis was performed with the genomic tools available at the Center for Genomic Epidemiology (CGE) server (http://www.cbs.dtu.dk/services) and the PATRIC database. The annotated S. pyogenes genomes ranged in size from ~1.69 to ~1.85 Mb, with coverage of 38× to 189× (Table 1). The number of coding DNA sequences (CDS) per genome ranged between 1725 and 2042. The draft genome sequences have been deposited in DDBJ/ENA/GenBank under the accession numbers provided in Table 1. The version described in this manuscript is version 1.
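Summary figures of the kind reported here (contig count, total size, mean coverage) can be derived from a draft assembly with a few lines of code. The sketch below is not the pipeline used in this study; it assumes a contig file in FASTA format and a separately known total number of sequenced bases, and all names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_summary(lines, total_read_bases=None):
    """Return contig count, total size and GC content of a draft assembly.

    If total_read_bases is given, mean coverage is estimated as
    sequenced bases divided by assembly size.
    """
    lengths, gc = [], 0
    for _, seq in read_fasta(lines):
        lengths.append(len(seq))
        gc += seq.count("G") + seq.count("C")
    total = sum(lengths)
    summary = {
        "contigs": len(lengths),
        "total_size_bp": total,
        "gc_percent": round(100.0 * gc / total, 2) if total else 0.0,
    }
    if total_read_bases is not None and total:
        summary["mean_coverage"] = round(total_read_bases / total, 1)
    return summary


# Toy input standing in for a real contig file (made-up sequences).
demo = [">contig_1", "ATGCGC", "GGAT", ">contig_2", "TTAACC"]
demo_summary = assembly_summary(demo, total_read_bases=1600)
print(demo_summary)
```

For a real assembly, the same function can be called on an open file handle, for example `assembly_summary(open("contigs.fasta"))`, where the file name is a placeholder.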
Table 1.
Isolate ID | Age in years/Gender | Resistance: Penicillin/Amoxicillin | Resistance: Clindamycin | Resistance: Macrolide | Fever defervescence | Compliance to total duration of antibiotic | Recurrence | Sequence type | emm type | Total size (bp) | Coverage (×) | CDS | Contigs | AMR genes | Plasmids | Accession |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MUMCMC2276 | 7.6/F | No | Yes | Yes | 2 | Yes | Yes | 36 | emm12.0 (emm-cluster A-C4) | 1727473 | 184 | 1754 | 50 | msr(D), mef(A) | – | NGQI00000000 |
MUMCMC661 | 6.5/M | No | No | Yes | NA | NA | NA | 36 | emm12.4 (emm-cluster A-C4) | 1852181 | 174 | 1967 | 62 | msr(D), mef(A) | – | NGQK00000000 |
MUMCMC650 | 2.4/F | No | Yes | Yes | NA | NA | NA | 36 | emm12.0 (emm-cluster A-C4) | 1691843 | 164 | 1725 | 49 | – | – | NGQL00000000 |
MUMCMC317 | 5/F | No | No | Yes | 4 | Yes | No | 36 | emm12.0 (emm-cluster A-C4) | 1750987 | 189 | 1776 | 62 | msr(D), mef(A) | – | NGQN00000000 |
MUMCMC1953 | 3.5/F | No | No | No | 2 | Yes | No | 36 | emm12.0 (emm-cluster A-C4) | 1840495 | 115 | 1886 | 49 | msr(D), mef(A) | – | NIYX00000000 |
MUMCMC2034 | 2.5/M | No | Yes | No | 4 | Yes | No | 36 | emm12.0 (emm-cluster A-C4) | 1747918 | 136 | 1762 | 43 | msr(D), mef(A) | – | NJPV00000000 |
MUMCMC261 | 2/M | No | No | No | 2 | Yes | No | 36 | emm12.0 (emm-cluster A-C4) | 1732451 | 129 | 1752 | 53 | msr(D), mef(A) | – | NIYZ00000000 |
MUMCMC616 | 6/M | No | No | Yes | 2 | Yes | No | 28 | emm1.0 (emm-cluster A-C3) | 1856054 | 38 | 2042 | 66 | aph(3')-III, ant(6)-Ia, erm(B), tet(M) | – | NGQM00000000 |
MUMCMC662 | 5/M | No | No | No | 1 | Yes | No | 28 | emm1.0 (emm-cluster A-C3) | 1849506 | 88 | 1966 | 38 | aph(3')-III, ant(6)-Ia, erm(B), tet(M) | – | NGQJ00000000 |
MUMCMC51 | 5/M | No | No | Yes | 1 | Yes | No | 28 | emm1.0 (emm-cluster A-C3) | 1849373 | 134 | 1912 | 39 | aph(3')-III, ant(6)-Ia, erm(B), tet(M) | – | NGQO00000000 |
MUMCMC13 | 6/F | No | No | Yes | 2 | Partial (7 days) | Yes | 28 | emm1.0 (emm-cluster A-C3) | 1852166 | 169 | 1917 | 51 | aph(3')-III, ant(6)-Ia, erm(B), tet(M) | – | NGQP00000000 |
MUMCMC433 | 5.5/F | No | No | No | 2 | No antibiotic prescribed | No | 55 | emm2.0 (emm-cluster E4) | 1863902 | 121 | 1921 | 33 | msr(D), mef(A) | – | NIYY00000000 |
NA: not available (the patient could not be followed up).
Antimicrobial resistance (AMR) genes and plasmids were screened with the ResFinder 2.1 and PlasmidFinder 1.3 tools [8], [9]. Seven of the 12 isolates carried msr(D) and mef(A), and four isolates carried the erm(B) gene for macrolide resistance. Isolates MUMCMC616, MUMCMC662, MUMCMC51 and MUMCMC13 also carried aph(3')-III and ant(6)-Ia for aminoglycoside resistance and tet(M) for tetracycline resistance (Table 1). In addition, PATRIC analysis revealed genes for an ABC transporter membrane-spanning permease, the multidrug resistance efflux pump pmrA and a multi antimicrobial extrusion (MATE) family transporter, which are associated with macrolide and multidrug resistance, in all isolates.
Multiple virulence determinants in the GAS genomes were identified using the annotated data from PATRIC (Table 2). All the genomes harboured streptolysins O and S and streptococcal pyrogenic exotoxins C and G. Clustered regularly interspaced short palindromic repeats (CRISPR) and spacer sequences in the genomes were identified using CRISPRFinder (http://crispr.u-psud.fr/Server/) [10]. All isolates carried the 1,2,3,4,5d CRISPR type, with varying numbers of repeats, spacers and arrays (Table 3).
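The gene counts in this paragraph can be re-derived from the "AMR genes" column of Table 1. The short sketch below transcribes that column by hand and tallies it; the variable names are illustrative only, and the code is a convenience check rather than part of the original analysis.

```python
from collections import Counter

# "AMR genes" column of Table 1, transcribed by hand.
amr_by_isolate = {
    "MUMCMC2276": ["msr(D)", "mef(A)"],
    "MUMCMC661": ["msr(D)", "mef(A)"],
    "MUMCMC650": [],
    "MUMCMC317": ["msr(D)", "mef(A)"],
    "MUMCMC1953": ["msr(D)", "mef(A)"],
    "MUMCMC2034": ["msr(D)", "mef(A)"],
    "MUMCMC261": ["msr(D)", "mef(A)"],
    "MUMCMC616": ["aph(3')-III", "ant(6)-Ia", "erm(B)", "tet(M)"],
    "MUMCMC662": ["aph(3')-III", "ant(6)-Ia", "erm(B)", "tet(M)"],
    "MUMCMC51": ["aph(3')-III", "ant(6)-Ia", "erm(B)", "tet(M)"],
    "MUMCMC13": ["aph(3')-III", "ant(6)-Ia", "erm(B)", "tet(M)"],
    "MUMCMC433": ["msr(D)", "mef(A)"],
}

# Number of isolates carrying each gene.
gene_counts = Counter(
    gene for genes in amr_by_isolate.values() for gene in genes
)

# Isolates in which ResFinder reported no acquired resistance gene.
no_acquired_genes = [iso for iso, genes in amr_by_isolate.items() if not genes]

for gene, count in gene_counts.most_common():
    print(f"{gene}: {count} of {len(amr_by_isolate)} isolates")
print("No acquired AMR genes:", no_acquired_genes)
```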
Table 2.
Virulence trait | MUMCMC2276 | MUMCMC661 | MUMCMC650 | MUMCMC317 | MUMCMC1953 | MUMCMC2034 | MUMCMC261 | MUMCMC616 | MUMCMC662 | MUMCMC51 | MUMCMC13 | MUMCMC433 | Gene(s) with potential for conferring virulence traits |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Antiphagocytic M protein | + | + | + | + | + | + | + | + | + | + | + | + | emm, ennX, fbp, igaR |
Streptokinase | + | + | + | + | + | + | + | + | + | + | + | + | ska |
CAMP factor | + | + | + | + | + | + | + | + | + | + | + | + | cfa |
Streptolysin O | + | + | + | + | + | + | + | + | + | + | + | + | slo |
Streptolysin S | + | + | + | + | + | + | + | + | + | + | + | + | sagB, C, D, E, F, H, I, asn-ORF, ABC transporter |
Putative peptidoglycan hydrolase | + | + | + | + | + | + | + | + | + | + | + | + | GbpB/SagA/PcsB |
Hyaluronate lyase precursor | + | + | + | + | + | + | + | + | + | + | + | + | hyl |
Hyaluronan synthase | – | + | + | + | + | + | + | + | + | + | + | + | hasA |
Exotoxin* | + | + | + | + | + | + | + | – | – | – | – | – | Scarlet fever |
Streptococcal pyrogenic exotoxin A* | – | – | – | – | + | – | – | + | + | + | + | – | speA |
Cysteine Protease B* | + | + | + | + | + | + | + | + | + | + | + | + | speB |
Streptococcal pyrogenic exotoxin C* | + | + | + | + | + | + | + | + | + | + | + | + | speC |
Streptococcal pyrogenic exotoxin G | + | + | + | + | + | + | + | + | + | + | + | + | speG |
Streptococcal pyrogenic exotoxin H* | – | + | – | – | + | – | – | – | – | – | – | – | speH |
Streptococcal pyrogenic exotoxin I* | – | + | – | – | – | – | – | – | – | – | – | – | speI |
Streptococcal pyrogenic exotoxin J | – | – | – | – | + | – | + | + | + | + | + | – | speJ |
Streptococcal pyrogenic exotoxin K* | – | – | – | – | – | – | – | – | – | – | – | + | speK |
Streptococcal pyrogenic exotoxin L* | – | – | – | – | – | – | – | – | – | – | – | – | speL |
Streptococcal pyrogenic exotoxin M* | – | – | – | – | – | – | – | – | – | – | – | – | speM |
Streptococcal mitogenic exotoxin Z | + | + | + | + | + | + | + | + | + | + | + | – | smeZ |
C5a peptidase | + | + | + | + | + | + | + | + | + | + | + | + | scpA |
Secreted endo-beta-N-acetylglucosaminidase | + | + | + | + | + | + | + | + | + | + | + | + | ndoS |
Streptococcal inhibitor of complement | – | – | – | – | – | – | – | + | + | + | + | + | sic |
Exotoxin nucleases | – | – | – | – | – | – | – | – | – | – | – | – | spd1, 2, 3, 4, sda |
Immunoglobulin-binding protease | + | + | + | + | + | + | + | + | + | + | + | + | ideS |
Collagen-like surface proteins | + | + | + | + | + | + | + | + | + | + | + | + | sclA, B |
Table 3.
Isolate | CRISPR/CAS type | CRISPR Repeat | CRISPR Spacer | CRISPR array |
---|---|---|---|---|
MUMCMC2276 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC662 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC661 | 1,2,3,4,5d | 4 | 3 | 1 |
MUMCMC650 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC616 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC317 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC51 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC13 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC1953 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC433 | 1,2,3,4,5d | 7 | 5 | 2 |
MUMCMC2034 | 1,2,3,4,5d | 9 | 7 | 2 |
MUMCMC261 | 1,2,3,4,5d | 9 | 7 | 2 |
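The counts in Table 3 can be checked for internal consistency. In a CRISPR array, n spacers are flanked by n + 1 repeats, so the total number of repeats in a genome should equal the number of spacers plus the number of arrays. The sketch below applies that check to the values transcribed from Table 3; it is a sanity check on the tabulated numbers, not a re-analysis of the genomes.

```python
# (repeats, spacers, arrays) per isolate, transcribed from Table 3.
crispr_counts = {
    "MUMCMC2276": (9, 7, 2),
    "MUMCMC662": (9, 7, 2),
    "MUMCMC661": (4, 3, 1),
    "MUMCMC650": (9, 7, 2),
    "MUMCMC616": (9, 7, 2),
    "MUMCMC317": (9, 7, 2),
    "MUMCMC51": (9, 7, 2),
    "MUMCMC13": (9, 7, 2),
    "MUMCMC1953": (9, 7, 2),
    "MUMCMC433": (7, 5, 2),
    "MUMCMC2034": (9, 7, 2),
    "MUMCMC261": (9, 7, 2),
}

# Each array contributes one more repeat than it has spacers.
inconsistent = [
    isolate
    for isolate, (repeats, spacers, arrays) in crispr_counts.items()
    if repeats != spacers + arrays
]

print("Isolates with inconsistent counts:", inconsistent)
```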
Multi-locus sequence typing (MLST) of the GAS isolates was interpreted against the standard references available at the MLST 1.8 database (https://cge.cbs.dtu.dk//services/MLST/). Four isolates belonged to ST28, seven to ST36 and one to ST55. M protein typing was done using the Blast 2.0 server provided by the Centers for Disease Control and Prevention (CDC), Biotechnology Core Facility Computing Laboratory, and emm types were assigned. ST28 isolates corresponded to emm1.0 (emm-cluster A-C3), ST36 isolates to emm12.0 (emm-cluster A-C4), with one isolate typed as emm12.4, and the ST55 isolate to emm2.0 (emm-cluster E4) (Table 1).
The phages and phage-associated elements in the GAS genomes were identified using the PHAge Search Tool Enhanced Release (PHASTER) [11] (Table 4). Strept 315.2 phage was present in all ST36 isolates; Clostr phiCT453A, Clostr phiCT453B, Strept P9, Strept phiARI0131, Lactoc PLgT and Strept phiARI0462 were the other phages seen in this group. All ST28 isolates consistently harboured PHAGE Strept T12, PHAGE Lactoc 28201, PHAGE Strept 315.3, PHAGE Pseudo phi3 and PHAGE Strept 315.2. PHAGE Strept 315.4, PHAGE Strept T12 and Clostr phiCT453B were seen in the ST55 isolate.
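The ST counts and the correspondence between sequence type and emm type can be re-derived from Table 1. The following sketch transcribes the "Sequence type" and "emm type" columns and cross-tabulates them; the variable names are illustrative only.

```python
from collections import Counter, defaultdict

# (sequence type, emm type) per isolate, transcribed from Table 1.
typing = {
    "MUMCMC2276": (36, "emm12.0"),
    "MUMCMC661": (36, "emm12.4"),
    "MUMCMC650": (36, "emm12.0"),
    "MUMCMC317": (36, "emm12.0"),
    "MUMCMC1953": (36, "emm12.0"),
    "MUMCMC2034": (36, "emm12.0"),
    "MUMCMC261": (36, "emm12.0"),
    "MUMCMC616": (28, "emm1.0"),
    "MUMCMC662": (28, "emm1.0"),
    "MUMCMC51": (28, "emm1.0"),
    "MUMCMC13": (28, "emm1.0"),
    "MUMCMC433": (55, "emm2.0"),
}

# Number of isolates per sequence type.
st_counts = Counter(st for st, _ in typing.values())

# Set of emm types observed within each sequence type.
emm_by_st = defaultdict(set)
for st, emm in typing.values():
    emm_by_st[st].add(emm)

for st in sorted(st_counts):
    print(f"ST{st}: {st_counts[st]} isolate(s), emm types {sorted(emm_by_st[st])}")
```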
Table 4.
Isolate | Phage Name | Size | GC % | CDS |
---|---|---|---|---|
MUMCMC2276 | PHAGE_Strept_315.2_NC_004585 | 24.3Kb | 37.65 | 15 |
MUMCMC2276 | PHAGE_Clostr_phiCT453B_NC_029004 | 49.8Kb | 39.51 | 47 |
MUMCMC661 | PHAGE_Strept_315.2_NC_004585 | 38Kb | 37.69 | 47 |
MUMCMC661 | PHAGE_Lactoc_PLgT_1_NC_031016 | 63.1Kb | 39.14 | 66 |
MUMCMC661 | PHAGE_Strept_P9_NC_009819 | 33.2Kb | 39.73 | 42 |
MUMCMC661 | PHAGE_Strept_phiARI0131_2_NC_031941 | 26.2Kb | 38.91 | 36 |
MUMCMC650 | PHAGE_Strept_315.2_NC_004585 | 21.7Kb | 36.94 | 16 |
MUMCMC317 | PHAGE_Clostr_phiCT453A_NC_028991 | 39.2Kb | 40.66 | 45 |
MUMCMC317 | PHAGE_Strept_315.2_NC_004585 | 21.2Kb | 37.01 | 16 |
MUMCMC317 | PHAGE_Strept_P9_NC_009819 | 16Kb | 39.11 | 24 |
MUMCMC1953 | PHAGE_Strept_phiARI0462_NC_031942(6) | 25.1Kb | 37.29 | 25 |
MUMCMC1953 | PHAGE_Clostr_phiCT453A_NC_028991(12) | 39.2Kb | 40.66 | 45 |
MUMCMC1953 | PHAGE_Strept_P9_NC_009819(30) | 32.6Kb | 39.84 | 41 |
MUMCMC1953 | PHAGE_Strept_phiARI0131_2_NC_031941(8) | 29Kb | 38.73 | 40 |
MUMCMC1953 | PHAGE_Strept_315.2_NC_004585(17) | 11.7Kb | 37.58 | 21 |
MUMCMC2034 | PHAGE_Clostr_phiCT453A_NC_028991(12) | 39.2Kb | 40.66 | 45 |
MUMCMC2034 | PHAGE_Strept_315.2_NC_004585(7) | 21Kb | 36.96 | 16 |
MUMCMC261 | PHAGE_Clostr_phiCT453A_NC_028991(12) | 39.2Kb | 40.66 | 45 |
MUMCMC261 | PHAGE_Strept_315.2_NC_004585(7) | 21Kb | 36.96 | 16 |
MUMCMC616 | PHAGE_Strept_T12 | 28.2Kb | 38.55 | 45 |
MUMCMC616 | PHAGE_Lactoc_28201_NC_031013 | 21.8Kb | 37.58 | 25 |
MUMCMC616 | PHAGE_Strept_315.3_NC_004586 | 15.9Kb | 36.07 | 31 |
MUMCMC616 | PHAGE_Pseudo_phi3_NC_030940 | 20.7Kb | 35.75 | 26 |
MUMCMC616 | PHAGE_Strept_315.3_NC_004586 | 20.9Kb | 38.56 | 35 |
MUMCMC616 | PHAGE_Strept_T12_NC_028700 | 20Kb | 35.94 | 29 |
MUMCMC616 | PHAGE_Strept_315.2_NC_004585 | 21.1Kb | 39.64 | 25 |
MUMCMC662 | PHAGE_Strept_T12_NC_028700 | 28.2Kb | 38.55 | 46 |
MUMCMC662 | PHAGE_Lactoc_28201_NC_031013 | 30Kb | 37.60 | 27 |
MUMCMC662 | PHAGE_Strept_315.2_NC_004585 | 21.1Kb | 39.64 | 26 |
MUMCMC662 | PHAGE_Strept_315.3_NC_004586 | 15.8Kb | 36.07 | 32 |
MUMCMC662 | PHAGE_Pseudo_phi3_NC_030940 | 20.7Kb | 35.76 | 26 |
MUMCMC662 | PHAGE_Strept_315.3_NC_004586 | 20.9Kb | 38.58 | 32 |
MUMCMC662 | PHAGE_Strept_T12_NC_028700 | 20Kb | 35.94 | 29 |
MUMCMC51 | PHAGE_Strept_315.2_NC_004585 | 20.9Kb | 39.68 | 27 |
MUMCMC51 | PHAGE_Strept_T12_NC_028700 | 28.4Kb | 38.54 | 43 |
MUMCMC51 | PHAGE_Lactoc_28201_NC_031013 | 30Kb | 37.60 | 26 |
MUMCMC51 | PHAGE_Strept_315.3_NC_004586 | 15.6Kb | 36.09 | 31 |
MUMCMC51 | PHAGE_Pseudo_phi3_NC_030940 | 20.6Kb | 35.77 | 26 |
MUMCMC51 | PHAGE_Strept_315.3_NC_004586 | 20.7Kb | 38.61 | 31 |
MUMCMC51 | PHAGE_Strept_T12_NC_028700 | 19.7Kb | 35.97 | 27 |
MUMCMC13 | PHAGE_Strept_T12_NC_028700 | 28.1Kb | 38.56 | 43 |
MUMCMC13 | PHAGE_Lactoc_28201_NC_031013 | 30Kb | 37.60 | 26 |
MUMCMC13 | PHAGE_Pseudo_phi3_NC_030940 | 22.1Kb | 35.81 | 26 |
MUMCMC13 | PHAGE_Strept_315.3_NC_004586 | 15.9Kb | 36.07 | 32 |
MUMCMC13 | PHAGE_Strept_315.3_NC_004586 | 20.8Kb | 38.58 | 33 |
MUMCMC13 | PHAGE_Strept_T12_NC_028700 | 20Kb | 35.95 | 28 |
MUMCMC13 | PHAGE_Strept_315.2_NC_004585 | 21Kb | 39.64 | 26 |
MUMCMC433 | PHAGE_Strept_T12_NC_028700(23) | 22.4Kb | 38.89 | 34 |
MUMCMC433 | PHAGE_Clostr_phiCT453B_NC_029004(11) | 49.8Kb | 39.51 | 47 |
MUMCMC433 | PHAGE_Strept_315.4_NC_004587(17) | 22.3Kb | 37.81 | 21 |
Acknowledgement
We thank the paediatricians in Navi Mumbai who referred their patients to the study centre: Dr. P. Moralwar, Dr. Ranpise, Dr. P. Weekay, Dr. S. Shahane, Dr. C. Kulkarni, Dr. Shrikant, Dr. P. Gaikwad, Dr. U. Shrivastav and Dr. M. Shirodkar.
Footnotes
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.dib.2018.03.129.
References
- 1. Twisselmann B. Epidemiology, treatment, and control of infection with Streptococcus pyogenes in Germany. Eur. Surveill. 2000;4 (pii=1490).
- 2. Sanyahumbi A.S., Colquhoun S., Wyber R., Carapetis J.R. Global Disease Burden of Group A Streptococcus, 2016 Feb 10. In: Ferretti J.J., Stevens D.L., Fischetti V.A., editors. Streptococcus pyogenes: Basic Biology to Clinical Manifestations [Internet]. University of Oklahoma Health Sciences Center; Oklahoma City (OK): 2016. https://www.ncbi.nlm.nih.gov/books/NBK333415/
- 3. Brahmadathan N.K. Molecular biology of Group A Streptococcus and its implications in vaccine strategies. Indian J. Med. Microbiol. 2017;35:176–183. doi: 10.4103/ijmm.IJMM_17_16.
- 4. Cunningham M.W. Pathogenesis of group A streptococcal infections. Clin. Microbiol. Rev. 2000;13:470–511. doi: 10.1128/cmr.13.3.470-511.2000.
- 5. Walker M.J., Barnett T.C., McArthur J.D., Cole J.N., Gillen C.M., Henningham A., Sriprakash K.S., Sanderson-Smith M.L., Nizet V. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev. 2014;27:264–301. doi: 10.1128/CMR.00101-13.
- 6. Wong S., Yuen K.Y. Streptococcus pyogenes and reemergence of scarlet fever as a public health concern. Emerg. Microbes Infect. 2012;1:e2. doi: 10.1038/emi.2012.9.
- 7. Wattam A.R., Abraham D., Dalay O., Disz T.L., Driscoll T., Gabbard J.L., Gillespie J.J., Gough R., Hix D., Kenyon R., Machi D., Mao C., Nordberg E.K., Olson R., Overbeek R., Pusch G.D., Shukla M., Schulman J., Stevens R.L., Sullivan D.E., Vonstein V., Warren A., Will R., Wilson M.J., Yoo H.S., Zhang C., Zhang Y., Sobral B.W. PATRIC, the bacterial bioinformatics database and analysis resource. Nucleic Acids Res. 2014;42:D581–D591. doi: 10.1093/nar/gkt1099.
- 8. Zankari E., Hasman H., Cosentino S., Vestergaard M., Rasmussen S., Lund O., Aarestrup F.M., Larsen M.V. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 2012. doi: 10.1093/jac/dks261.
- 9. Carattoli A., Zankari E., Garcia-Fernandez A., Voldby Larsen M., Lund O., Villa L., Møller Aarestrup F., Hasman H. PlasmidFinder and pMLST: in silico detection and typing of plasmids. Antimicrob. Agents Chemother. 2014;58:3895–3903. doi: 10.1128/AAC.02412-14.
- 10. Grissa I., Vergnaud G., Pourcel C. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 2007;35:W52–W57. doi: 10.1093/nar/gkm360.
- 11. Arndt D., Grant J., Marcu A., Sajed T., Pon A., Liang Y., Wishart D.S. PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res. 2016;44:W16–W21. doi: 10.1093/nar/gkw387.