mSystems. 2025 Oct 2;10(10):e00935-25. doi: 10.1128/msystems.00935-25

Characterizing Staphylococcus aureus genomic epidemiology with multilevel genome typing

Michael Payne 1,#, Liam Cheney 1,#, Sandeep Kaur 1, Genevieve McKew 2,3, Ruiting Lan 1
Editor: Saheed Imam 4
PMCID: PMC12542621  PMID: 41036854

ABSTRACT

Staphylococcus aureus is a major source of both hospital- and community-acquired infections worldwide. Advances in whole-genome sequencing (WGS) technologies have recently generated large volumes of S. aureus WGS data. The timely classification of S. aureus WGS data using genomic typing technologies has the potential to describe detailed genomic epidemiology at large and small scales. In this study, a multilevel genome typing (MGT) scheme, consisting of eight levels of multilocus sequence typing (MLST) schemes of increasing resolution, was developed for S. aureus and was used to analyze 50,481 publicly available genomes. The application of MGT to S. aureus epidemiology was shown in three case studies. First, the population structure of the globally disseminated MLST sequence type 8 (ST8) was described by MGT2 and compared with Spa typing. Second, MGT was used to characterize MLST ST8-USA300 isolates that colonized multiple body sites in the same patient. Finally, the MGT was used to describe the transmission of MLST ST239-SCCmec III throughout a single hospital. MGT STs were able to describe both isolates that had spread between wards and those that had colonized different reservoirs within a ward. S. aureus MGT describes S. aureus genomic epidemiology at multiple resolutions ranging from the global spread to local/individual scale using stable and standardized ST assignments. The S. aureus MGT database (https://mgtdb.unsw.edu.au/staphylococcus) is capable of tracking new and existing clones to facilitate the design of new strategies to reduce the global health burden of S. aureus infections.

IMPORTANCE

Staphylococcus aureus causes both hospital- and community-acquired infections worldwide. Methicillin-resistant S. aureus is best known and has spread across the globe. Whole-genome sequencing (WGS) can type strains at the highest resolution. To enable best use of WGS data for surveillance of S. aureus, this study developed a multilevel genome typing (MGT) scheme that provides a publicly available, standardized, flexible, and easily communicated system to describe S. aureus strains. MGT has eight typing levels that provide progressively higher resolution. Each of these levels allows subtypes to be accurately identified and tracked. We show that MGT can be used to track well-known S. aureus strains at low resolution while simultaneously being able to track outbreaks in hospital settings at high resolution. The S. aureus MGT will facilitate the use of genomic data for surveillance without the need for bioinformatic expertise, improving efforts to control this important pathogen and prevent infections.

KEYWORDS: Staphylococcus aureus, multilevel genome typing, epidemiology, genomics, genomic nomenclature, public health surveillance, database

INTRODUCTION

Staphylococcus aureus causes infections in both hospitalized individuals and those without clinical associations who are otherwise considered “healthy” (1). In 2017, S. aureus caused over 120,000 bloodstream infections and 20,000 deaths in the USA alone (2). A hallmark of S. aureus is its ability to acquire new mechanisms of antimicrobial resistance (AMR). The first report of methicillin-resistant S. aureus (MRSA) was in 1961, followed by a series of epidemic waves, wherein each acquired additional AMR (3). The spread of these epidemic waves was predominantly reported in individuals associated with hospitals, and the isolates were referred to as hospital-associated MRSA (HA-MRSA) (4). In the 1990s, there were reports of individuals not associated with clinical settings carrying MRSA. These cases were defined as community-associated MRSA (CA-MRSA). Recent epidemiological surveillance has reported the spread of CA-MRSA into hospital settings, with CA-MRSA predicted to displace HA-MRSA in most countries as the leading cause of MRSA infections (5, 6).

Early applications of multilocus sequence typing (MLST) retrospectively studied the global spread of HA-MRSA. One such example is sequence type 239 (ST239), which spread globally and caused multiple epidemics of HA-MRSA (7). MLST has also been used to characterize the emergence and spread of CA-MRSA. A comparative study between HA-MRSA and CA-MRSA has shown that CA-MRSA is more diverse in sequence types (STs) and more geographically restricted (8). MLST is informative for broad-resolution epidemiological analysis (9), providing utility in studying the origin and evolution of S. aureus (8). Other frequently used broad-resolution typing methods, commonly used in conjunction with MLST, include typing of the hypervariable Staphylococcal protein A (spa typing) and staphylococcal cassette chromosome mec typing (SCCmec typing) (8, 10, 11). SCCmec is a mobile genetic element and a determinant for broad-spectrum β-lactam resistance. SCCmec types are commonly used in conjunction with MLST types. So far, 13 different types of SCCmec elements have been discovered, which are further divided into subtypes based on the differences in their joining regions (8). SCCmec gene variants are discovered regularly. S. aureus strains are universally characterized using both MLST and SCCmec types. Spa typing, on the other hand, is often used for local or hospital outbreaks. The main source of variation in spa types is alterations (such as duplications, deletions, or mutations) in the repeat units of the polymorphic X region of the gene. Within closely related strains, spa types remain relatively stable. However, strain lineages cannot be reconstructed by direct sequence comparisons based on duplications/deletions of repeats (8). Furthermore, occasionally, recombination and/or homoplasy can lead to misclassification of types (12).

S. aureus whole-genome sequencing (WGS) data, however, are most often analyzed through phylogenomic reconstruction, especially when high-resolution strain differentiation is required. Comparison of geographically diverse isolates sampled over long periods has uncovered the origins of globally disseminated clones (13–16). For example, a study of ST8 isolates sampled across the globe identified its European origin and transmission into North America in the 2000s (13). Once ST8 was established in North America, phylogenetic analyses confirmed subsequent parallel epidemics of ST8-USA300 and ST8-USA300-LV CA-MRSA in North and South America, respectively (17).

While investigating S. aureus genomic epidemiology using phylogenetic approaches has provided high-resolution comparisons, phylogenetics without further implementation of genotyping algorithms cannot easily assign standardized names to strains (i.e., classify isolates into “types”). In 2014, an S. aureus core genome MLST (cgMLST) scheme was developed that offered high-resolution typing and standardized nomenclature (18). cgMLST has been sporadically used in small-scale studies, such as outbreaks in neonatal wards and household transmission (19–21). cgMLST for each of these investigations distinguished epidemiologically indistinguishable isolates and allowed clustering of cgMLST sequence types (cgSTs) to describe the population structure. However, cgMLST for species-wide classification is hampered by the lack of flexibility in typing resolution. Existing cgMLST applications require setting allele thresholds to cluster isolates. Selecting different allele thresholds between investigations would prevent establishing a standardized nomenclature; to date, no such allele thresholds have been rigorously investigated and defined. Furthermore, the ability to vary the typing resolution is essential for describing large-scale epidemiology, such as the global dissemination of ST8, and small-scale epidemiology, such as ST8-USA300 outbreaks within a hospital setting (13, 20).

We previously developed a new method called multilevel genome typing (MGT) that has been applied to several organisms (22–25). The MGT comprises a series of MLST schemes of increasing sizes, thus providing higher resolutions with each higher level. The advantage of using a series of MLST schemes within a single system is that at each level, an ST is assigned, which is a stable “type” assigned to an isolate without relying on grouping isolates based on multiple allele thresholds, which inherently lack this stability. For a species, MGT is typically organized such that MGT1 is the same as traditional seven-gene MLST, while MGT8 is the species cgMLST. MGT2 to MGT7 have an increasing number of loci as shown in Fig. 1 schematically. The lower-resolution levels can be used for longer-term epidemiology, while the higher-resolution levels can be used for shorter-term epidemiology. Following the well-established MLST terminology, an ST is assigned at each MGT level. Therefore, a strain can be typed at each level with specific STs at a given level, or the STs can be concatenated and referred to as a genome type, providing a standardized nomenclature for epidemiological typing.
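To make the naming scheme concrete, the following is a minimal Python sketch of how per-level STs might be combined into a genome type and compared between two isolates. The function names, the hyphen separator, and the ST values are assumptions made for illustration; they are not taken from the MGT software.

```python
from typing import List, Sequence

N_LEVELS = 8  # MGT1 (seven-gene MLST) through MGT8 (species cgMLST)


def genome_type(sts: Sequence[int], sep: str = "-") -> str:
    """Concatenate the STs for MGT1..MGT8 into one genome type string.

    The separator is an assumption made for this example.
    """
    if len(sts) != N_LEVELS:
        raise ValueError(f"expected {N_LEVELS} STs, got {len(sts)}")
    return sep.join(str(st) for st in sts)


def shared_levels(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Return the MGT levels (1-based) at which two isolates carry the same ST."""
    return [level for level, (x, y) in enumerate(zip(a, b), start=1) if x == y]


# Hypothetical ST values, chosen only for illustration.
isolate_a = (8, 1, 7, 594, 900, 1840, 2001, 5758)
isolate_b = (8, 1, 7, 594, 901, 1841, 2002, 5743)

print(genome_type(isolate_a))
print(shared_levels(isolate_a, isolate_b))
```

In this toy case the two isolates share an ST at MGT1 to MGT4 and differ from MGT5 upward, which is the kind of relationship the lower- and higher-resolution levels are intended to expose.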

Fig 1.

Hierarchical diagram of S. aureus multilevel genome typing system with 8 resolution levels from MLST (MGT1, 7 loci, ST age 111 years) to species cgMLST (MGT8, 1713 loci, ST age 0.2 years) using mutually exclusive cgMLST subsets.

Schematics of the MGT system. The S. aureus MGT scheme, as shown, consisted of eight levels with increasing resolution. The lowest-resolution level, MGT1, is the classic S. aureus MLST scheme, while the highest-resolution level, MGT8, is the species core genome MLST scheme. Levels 2 to 7 are composed of mutually exclusive subsets of cgMLST loci. The age of an ST is defined as the average time for a new allele to emerge to give rise to a new ST at a given MGT level.

This study developed an S. aureus MGT that (i) offered flexibility in typing resolution to describe large- and small-scale S. aureus genomic epidemiology and (ii) established a standardized nomenclature for unambiguous communication of S. aureus types between investigations.

MATERIALS AND METHODS

Data set curation

Paired-end short-read data sets for 66,238 genomes that were sequenced using the Illumina Genome Analyzer, HiSeq, MiSeq, NextSeq, and NovaSeq platforms were downloaded on 15 July 2019 from the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). Additionally, publicly available “year of isolation” metadata, where available, were downloaded from NCBI BioSample. Read sets were screened for contamination with Kraken (v.1.0.0), and read sets with more than 20% non-S. aureus reads were removed (26). Assemblies were generated with the MGTdb pipeline, which trimmed reads with Trimmomatic (v.0.39.0), performed de novo assembly with Shovill (v.1.0.9) and SKESA (v.2.3.0), and calculated assembly quality metrics with Quast (v.5.2.0) (27–30). The assemblies were quality-filtered using the thresholds shown in Table S1. The quality-filtered species data set contained 50,481 genomes (Table S2).

Core genome validation

A data set representative of all diversity within S. aureus was selected, which comprised assemblies from each MLST ST. This data set was termed the representative data set. The number of assemblies per ST included in the representative data set was proportionate to the frequency of that ST. All assemblies assigned a singleton ST were included.

An S. aureus core genome was previously defined with 1,861 core loci (18). These core loci were verified using the representative data set (Supplementary Methods 1.7). We used two metrics to measure the quality of each locus: first, the number of assemblies of the representative data set in which the locus was present (or absent), and second, the percentage of assemblies in which additional processing was required for the locus to be called (problematic locus count). Additional processing scripts handled loci with missing sequence. A locus was reported as absent when more than 20% of its sequence was missing and as problematic when additional processing was required but at least 80% of its DNA sequence was present (22). Core loci that were either absent or problematic in more than 1% of the representative data set were removed. The validated core genome had 1,713 core genes.
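The filtering rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' validation script: the `LocusCall` record, its field names, and the reading that the absent rate and the problematic rate are each compared with the 1% cutoff are all assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LocusCall:
    """Result of calling one locus in one assembly (hypothetical record)."""
    fraction_present: float   # share of the locus sequence recovered, 0..1
    needed_processing: bool   # True if extra processing was needed to call it


def classify(call: LocusCall) -> str:
    """Label a call as 'absent', 'problematic', or 'ok'."""
    if call.fraction_present < 0.80:      # more than 20% of the sequence missing
        return "absent"
    if call.needed_processing:            # callable, but only after extra processing
        return "problematic"
    return "ok"


def keep_locus(calls: Sequence[LocusCall], max_rate: float = 0.01) -> bool:
    """Keep a locus only if neither failure rate exceeds `max_rate`."""
    n = len(calls)
    absent = sum(classify(c) == "absent" for c in calls)
    problematic = sum(classify(c) == "problematic" for c in calls)
    return absent / n <= max_rate and problematic / n <= max_rate


# Toy data: 200 assemblies, one of which is missing most of the locus (0.5%).
good = [LocusCall(1.0, False)] * 199
print(keep_locus(good + [LocusCall(0.5, False)]))
```

Applying `keep_locus` to every locus of the 1,861-locus starting set would give a validated subset in the spirit of the 1,713 genes reported here, although the exact outcome depends on the real call data.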

MGT design

MGT1 loci were the same as the seven-gene MLST for S. aureus (31). MGT8 loci were the validated core genome genes. The MGT2 to MGT7 loci were selected from the validated core genome. MGT2–MGT7 loci were selected based on previously published methods (22–24). The MGT design methodology included four stages: calculating the size of each level, calculating the selection criteria for each core locus, separating the core loci into preferences, and selecting core loci to fill the levels (Supplementary Methods, Supplementary Data sets, Supplementary Scripts). Loci within MGT levels 2–7 were mutually exclusive, allowing the independent assignment of strain relationships. This independence leads to hierarchical inconsistency, defined as strains assigned to the same ST at a higher level of resolution but split into multiple STs at a lower-resolution level, which occurs for a small proportion of the strains (see Supplementary Methods 1.6).

MGT typing of the S. aureus species data set

The species data set (n = 50,481) was processed using the MGT pipeline (available on GitHub at: https://github.com/LanLab/MGT_reads2alleles) (22). First, as MGT1 is the well-established seven-gene MLST for S. aureus (31), all MGT1 STs are identical to the MLST STs, and hence their assignment is made using the mlst (https://github.com/tseemann/mlst) program with the S. aureus database hosted on PubMLST (32). The MGT2 to MGT8 STs are assigned using the MGT allele-calling pipeline. Core locus reference alleles were selected using the available complete reference genome S. aureus subsp. aureus COL (NCBI assembly accession: GCF_000012045). Both the reference alleles and their corresponding MGT STs, based on the allelic profiles for each MGT level, were initially assigned the integer identifier “1.” For subsequent isolates, the MGT allele-calling pipeline applied the following thresholds: a maximum of 16 single nucleotide polymorphisms (SNPs) within a 40-base sliding window, 80% BLAST nucleotide identity, and 80% BLAST high-scoring segment pair. The 16 SNPs within a 40-base sliding window were implemented to account for potential misalignments during allele calling. New alleles identified for any locus were assigned the next available integer identifier. Similarly, newly observed allelic profiles at each MGT level were assigned the next available integer for their corresponding MGT ST. These integer identifiers are arbitrary, i.e., differences between the identifiers do not imply genetic relatedness between isolates.
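Two of the steps described above lend themselves to a short illustration: the sliding-window SNP density check and the assignment of the next available integer to a newly observed allelic profile. The Python sketch below is a simplified, hypothetical rendering (the names `exceeds_snp_density` and `StRegistry` are invented here); the actual MGT pipeline also handles missing data and BLAST-based filters that are not modeled.

```python
from typing import Dict, Iterable, Sequence, Tuple


def exceeds_snp_density(snp_positions: Iterable[int],
                        window: int = 40, max_snps: int = 16) -> bool:
    """Return True if any `window`-base stretch holds more than `max_snps` SNPs."""
    pos = sorted(snp_positions)
    start = 0
    for end in range(len(pos)):
        # Advance the left edge until both SNPs fit inside one window.
        while pos[end] - pos[start] >= window:
            start += 1
        if end - start + 1 > max_snps:
            return True
    return False


class StRegistry:
    """Hand out integer ST identifiers for allelic profiles at one MGT level."""

    def __init__(self, reference_profile: Sequence[int]) -> None:
        # The reference profile receives identifier 1.
        self._ids: Dict[Tuple[int, ...], int] = {tuple(reference_profile): 1}

    def assign(self, profile: Sequence[int]) -> int:
        key = tuple(profile)
        if key not in self._ids:
            # Identifiers are consecutive from 1, so the next free one is size + 1.
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


registry = StRegistry(reference_profile=[1, 1, 1])
print(registry.assign([1, 2, 1]))   # a new profile receives the next integer
print(registry.assign([1, 2, 1]))   # a known profile keeps its identifier
```

Because identifiers are issued in order of first observation, two STs with adjacent numbers need not be closely related, which matches the statement that the integers are arbitrary.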

Allele-based phylogenetic construction

Allele-based phylogenies were constructed for MGT1 ST8 isolates. Allele profiles from MGT8 (equivalent to cgMLST) were used to generate a phylogenetic tree using GrapeTree (v.1.0.0) with the rapid neighbor-joining algorithm (33). The MLST ST8 phylogeny was compared with the MGT classification of the MLST ST8 isolates. The isolates in the phylogeny generated by GrapeTree are colored by their assigned MGT2 ST.

SNP-based phylogenetic construction

Two SNP-based phylogenies were generated, one each for populations of ST239-SCCmec III and ST8-USA300 (34, 35). Except for the chosen reference genome, SNP-based phylogenies were generated for each population using the same process. The reference genome S. aureus JKD6008 (NCBI assembly accession: GCA_000145595) was used to generate the ST239-SCCmec III phylogeny, and the reference genome S. aureus TCH1516 (NCBI assembly accession: GCF_000017085) was used to generate the ST8-USA300 phylogeny. To generate both phylogenies, SNPs were called against the reference genome and an SNP alignment was generated using default Snippy-Core settings (v.4.6.0) (36). SNPs under the influence of recombination were predicted and removed from the SNP alignment using RecDetect (v.6.1) (37). RecDetect predicted recombination SNPs using a strict recombination prediction model. IQ-TREE (v.2.0.4) processed the SNP alignment to create a maximum likelihood phylogeny using 1,000 bootstraps and automatic selection of model parameters (38). Additional metadata were visualized on the phylogeny using iTOL (v.6) (39).

In silico Spa and SCCmec typing

All spa type comparisons used spa types predicted in silico by SpaTyper (v.0.3.3) with default settings (40). SCCmec types were predicted using staphopia-sccmec (v.1.0.0) with default settings (41). It should be noted that the publicly available version of Staphopia used in this study can only identify SCCmec types I–VII, and recently described types (VIII–XIV) were not identifiable (42).

RESULTS

The S. aureus MGT consists of eight levels

S. aureus MGT consisted of eight levels with increasing numbers of loci (Fig. 1; Table 1). MGT1 was the traditional seven-gene S. aureus MLST scheme, and MGT8 was the species cgMLST, which had 1,713 core loci. The intermediate MGT levels of MGT2, 3, 4, 5, 6, and 7 had 21, 38, 75, 183, 367, and 748 loci, respectively (Fig. 1; Table 1; Data set S9). From the global data set (n = 50,481), 98.73% (49,838/50,481) of the isolates were assigned an ST at all MGT levels. The remaining 1.27% (643/50,481) of the isolates were not assigned an ST at one or more of the eight MGT levels.

Epidemiology of the S. aureus species described using MGT

The division of the S. aureus population structure by MGT was examined using the species data set (50,481 isolates; Table S2). Of this data set, 26,416 (52.3%) had year metadata, 27,046 (53.6%) had country metadata, and 25,596 (50.7%) had both metadata types. The average number of years that major STs were sampled showed a clear trend of shorter timespan in higher-resolution MGT levels (Table 1; Fig. S1), demonstrating the potential epidemiological usefulness of each level in describing short- and long-lived clones.

TABLE 1.

Overview of the S. aureus MGT levels

| MGT level | No. of loci | No. of major STs (a) | Isolates in major STs | % of isolates in major STs | Average year span of major STs | % of isolates in continent-specific major STs (b) |
|---|---|---|---|---|---|---|
| MGT1 | 7 | 144 | 46,621 | 96.14 | 17.29 | 30.22 |
| MGT2 | 21 | 338 | 33,983 | 70.95 | 11.36 | 50.22 |
| MGT3 | 38 | 450 | 26,234 | 54.78 | 8.6 | 84.27 |
| MGT4 | 75 | 429 | 17,788 | 37.15 | 6.84 | 92.98 |
| MGT5 | 183 | 339 | 9,721 | 20.30 | 4.07 | 96.48 |
| MGT6 | 367 | 243 | 6,265 | 13.08 | 1.75 | 96.17 |
| MGT7 | 748 | 173 | 4,095 | 8.55 | 1.07 | 99.4 |
| MGT8 | 1,713 | 113 | 2,436 | 5.08 | 0.3 | 100 |

(a) Major STs, STs with >10 isolates assigned to them.

(b) Continent-specific STs, >80% of the assigned isolates from one continent.

To determine the level that would best describe geographical trends, we identified major STs (STs with more than 10 isolates assigned to them) with over 80% of their isolates from one continent. We then identified the lowest-resolution level, where more than 90% of the isolates were within these continent-specific STs (Table 1). The value of MGT4 was 92.98%; therefore, this level was selected to examine the distribution of continent-specific STs. The 100 largest MGT4 STs contained 7,952 isolates, 4,343 of which included continental metadata (Fig. 2A). These STs showed distinct distributions in both continent and country (Fig. 2B). Of the largest 100 STs, 26 contained no continent metadata, and the remaining 74 were continent specific. Of these, 63 were specific to one country, whereas 11 were found in more than one country on the same continent.
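The level-selection logic in this paragraph can be illustrated with a short Python sketch. It is not the authors' code: the function names are hypothetical, and it assumes that ST size is counted over all isolates while the continent shares are computed only over isolates with continent metadata. The per-level percentages in the example are the values reported in Table 1.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def continent_specific_share(isolates: Iterable[Tuple[int, Optional[str]]],
                             min_isolates: int = 10,
                             min_fraction: float = 0.80) -> float:
    """Percentage of major-ST isolates that fall in continent-specific STs.

    `isolates` yields (ST, continent) pairs; continent is None when unknown.
    """
    by_st = defaultdict(list)
    for st, continent in isolates:
        by_st[st].append(continent)

    in_major = 0
    in_specific = 0
    for continents in by_st.values():
        if len(continents) <= min_isolates:      # major ST: more than 10 isolates
            continue
        known = [c for c in continents if c is not None]
        if not known:
            continue
        in_major += len(known)
        top_count = Counter(known).most_common(1)[0][1]
        if top_count / len(known) > min_fraction:  # >80% from one continent
            in_specific += len(known)
    return 100.0 * in_specific / in_major if in_major else 0.0


def lowest_level_above(shares: Dict[int, float],
                       threshold: float = 90.0) -> Optional[int]:
    """Return the lowest MGT level whose share exceeds `threshold`, if any."""
    for level in sorted(shares):
        if shares[level] > threshold:
            return level
    return None


# Percentages of isolates in continent-specific major STs, from Table 1.
table1_shares = {1: 30.22, 2: 50.22, 3: 84.27, 4: 92.98,
                 5: 96.48, 6: 96.17, 7: 99.4, 8: 100.0}
print(lowest_level_above(table1_shares))  # 4, i.e., MGT4 as selected in the text
```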

Fig 2.

Stacked bar charts showing S. aureus genetic diversity across MGT4 sequence types with geographic distribution and SCCmec patterns. Charts reveal hierarchical typing relationships and antimicrobial resistance profiles across global isolates.

Geographic and genetic diversity of S. aureus as described by MGT. The figure plots the 100 largest MGT4 STs with continent and country metadata in panels A and B and carriage of mecA and mecC in panel C to show MSSA and MRSA. The y-axis is isolate count in each ST. MGT1, MGT2, and MGT3 STs are shown below each MGT4 ST and grouped to allow examination of STs at these levels. (A) The proportion by continent in each MGT4 ST is represented by column colors. (B) The proportion by country in each MGT4 ST is represented by column colors. Countries with fewer than 200 isolates were grouped into “other continent” categories for clarity. (C) SCCmec types assigned in each MGT4 ST. Several instances of hierarchical inconsistency can be observed in this figure. This occurs when one ST at a higher resolution (e.g., MGT3 ST338) is found in multiple STs at lower resolutions (e.g., MGT2 ST1 and ST199). This is an expected outcome of the MGT scheme design used in this study (see Supplementary Methods 1.6).

SCCmec types were also assigned to all isolates, and their distribution among the top 100 MGT4 STs is shown (Fig. 2C). In many cases, all isolates within an MGT1 ST were MRSA (or at least contained mecA) and were of a single SCCmec type (e.g., MGT1 ST22 and SCCmec IV). In other cases, MGT2 or MGT3 STs were required to describe groups of isolates that were either exclusively methicillin-sensitive S. aureus (MSSA) or MRSA and contained a single SCCmec type (e.g., MGT3 ST7: type IV, MGT3 ST338: MSSA, and MGT3 ST2141: type “V or VII”). At MGT4, only ST3404 contained two SCCmec types (IV and “V or VII”).

Penicillin-sensitive S. aureus (PSSA), which lacks blaZ and mecA/mecC and is therefore susceptible to penicillin, has recently emerged as a growing cause of infection (43, 44). We identified 9,318 (18.5% of the total data set) putative PSSA isolates (lacking blaZ, mecA, and mecC) and identified PSSA STs in the 100 largest MGT4 STs as described above. Six MGT4 STs were putative PSSA and were classified into four MGT1 STs (Table 2). These MGT1 STs were assigned to two clonal complexes (CCs) (ST5 and ST6 to CC5 and ST8 and ST254 to CC8).

TABLE 2.

PSSA MGT4 STs in top 100 largest STs

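| Clonal complex | MGT1 ST | MGT4 ST | Isolates |
|---|---|---|---|
| 5 | 5 | 14,648 | 68 |
| 5 | 6 | 9,209 | 53 |
| 8 | 8 | 594 | 230 |
| 8 | 8 | 885 | 141 |
| 8 | 8 | 1,114 | 125 |
| 8 | 254 | 10,735 | 30 |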

Using higher-level MGT to describe MGT1 ST8 isolates

We selected MGT1 ST8 (traditional MLST ST8, a well-known CA-MRSA clone) to demonstrate the application of MGT in S. aureus epidemiology. Of the 50,481 isolates typed, 4,388 were MGT1 ST8 and had associated collection year metadata. As the MGT levels increased in resolution, the number of STs at each level increased, while the size of STs decreased (Fig. S2). The largest MGT2 ST (ST1) was assigned to 37.58% (1,649/4,388) of the isolates, whereas the largest MGT5 ST (ST900) was assigned to 5.38% (236/4,388).

In MGT2, there were 24 STs with >10 isolates and these isolates were sampled from 2008 to 2019. These 24 MGT2 STs varied in frequency over time (Fig. 3) with some STs persisting over multiple years and others sampled only in a single year. MGT2 ST1 was the only MGT2 ST sampled in all years and was by far the largest. In eight of these years (2008–2011 and 2016–2019), MGT2 ST1 was assigned to more than 70% of the isolates. The second-largest type, MGT2 ST199, was sampled only in 2012 and 2013. The third largest ST, MGT2 ST114, was similar to MGT2 ST1 and was sampled in all years except for 2014 and 2017. There was a considerable difference between MGT2 ST114 and MGT2 ST1 based on the frequency of isolates over time. MGT2 ST114 had 45.17% (262/580) of isolates sampled in 2015, and an average of 2.25 isolates for the remaining eight years (2008–2013, 2016, and 2018).

Fig 3.

Stacked bar chart showing yearly distribution of MGT1 ST8 isolates (2008-2019). Peak in 2009 (~720 isolates), followed by 2012 and 2013. MGT2 ST1 predominates across years. Largest MGT2 ST diversity appears in 2015, while 2019 shows fewest isolates.

Distribution of MGT1 ST8 isolates by year and MGT2 STs. The temporal distribution of 3,478 MLST ST8 isolates colored by MGT2 ST. The size of each bar represents the number of isolates assigned to each MGT2 ST. MGT2 STs in the figure legend are organized in descending order of frequency.

MGT typing was compared with spa typing of MGT1 ST8 isolates. The spa types of the MGT1 ST8 isolates were predicted in silico. A spa type was predicted in 99.91% (3,475/3,478) of the isolates, and there were 121 unique spa types (Fig. 4A). A large number of predicted spa types were small. Of the 121 spa types, 104 were assigned to fewer than 10 isolates, which cumulatively represented 6.15% (214/3,478) of the MGT1 ST8 data set. The majority of the 104 spa types (54.81%, 57/104) were assigned to a single isolate (Table S2).

Fig 4.

Phylogenetic tree comparing typing methods within MGT1 ST8. Left diagram shows spa types with t008 predominant (2297 isolates). Right shows MGT2 types with type 1 most common (2233 isolates). The tree reveals genetic relationships among 3488 isolates.

Comparing spa and MGT2 typing within MGT1 ST8. The MGT1 ST8 phylogeny (n = 3,488) was compared with MGT2 and spa typing. The phylogeny was generated with GrapeTree (v.1.0.0), which uses the rapid neighbor-joining algorithm to process MGT8 (cgMLST) allele profiles (33). Each node was an MGT8 ST, and the nodes were colored by spa type (A) and MGT2 STs (B). Only spa types and MGT2 STs assigned to 10 or more isolates are shown. The frequencies of each type are shown in square brackets.

The spa types assigned to 10 or more isolates were selected for comparison with MGT2 STs. Seventeen spa types were selected that together typed 93.85% (3,264/3,478) of the MGT1 ST8 isolates (Fig. 4B). The majority of the isolates were assigned to spa type t008 (66.04%, 2,297/3,478). The next two largest spa types were t211 and t064, which included 312 and 309 isolates, respectively. The remaining 14 spa types were assigned to fewer than 100 isolates.

The division of the MGT1 ST8 data set into MGT2 STs and spa types was compared (Fig. 4; Fig. S3). Eight MGT2 STs had a single spa type. MGT2 ST283, ST300, and ST340 isolates were spa type t008, MGT2 ST126 and ST216 were spa type t064, MGT2 ST341 was spa type t190, MGT2 ST339 was spa type t723, and MGT2 ST397 was spa type t622. Other MGT2 STs had a predominant spa type with one or more other spa types.

MGT investigation of MGT1 ST8 asymptomatic S. aureus colonization

We used WGS data from a study that tracked S. aureus colonization of different body sites of the same patients (34) to demonstrate the usefulness of the multiple resolution levels of MGT. The 82 S. aureus MGT1 ST8-USA300 isolates from a cohort of 29 patients, 24 of whom had more than one isolate, were typed by MGT. A maximum likelihood phylogeny was constructed using core genome SNPs (Fig. 5).

Fig 5.

Phylogenetic tree of 82 MGT1 ST8-USA300 isolates showing core SNP variation. Patient-specific isolates grouped by bolded sequence types with MGT1-MGT8 profiles displayed. Bootstrap values color-code branches (70-100).

MGT1–MGT8 classification of MGT1 ST8-USA300 isolates from the same patient. A collection of MGT1 ST8-USA300 isolates (n = 82) was sampled from 29 patients and classified using MGT. The MGT1–MGT8 STs and anonymized patient identifiers for each isolate were aligned next to the phylogeny. STs that were selected to group isolates from the same patient are bolded. The STs in gray were not used to describe multiple isolates from the same patient. A phylogenetic tree was generated using maximum likelihood based on variations in core SNPs. Branches are colored based on bootstrap support values per color legend. Branches with bootstraps <70 were not colored and remained black.

Using the levels of MGT2 to MGT6, a total of 34 STs at different MGT levels were identified that were specific to one of the 24 patients. In 18 patients (1, 2, 5–10, 12–15, 17–18, 20–22, and 24–29), all isolates were assigned a specific MGT ST (Fig. 5). The remaining six patients (3, 4, 11, 16, 19, and 23) each had isolates typed by two specific STs (Fig. 5). Only Patient 11 required STs from two different MGT levels to describe all isolates, with three isolates each of MGT4 ST1112 and MGT6 ST1840.

We further identified the number of patient-specific STs at each MGT level and the percentage of the total isolates assigned to them. The STs from MGT1 to MGT5 were unable to assign all isolates to STs that were specific to one patient (Table 3). MGT6 was the first level containing only patient-specific STs. At MGT6, 13 patients required two or three STs to group all isolates. Seven patients (4, 20, 22–24, and 27–28) each had two MGT6 STs, and six patients (1, 11, 13, 16, and 18–19) each had three MGT6 STs, which grouped all isolates.

TABLE 3.

Determination of MGT level to group all isolates of the same patient

| Level | Patients described | Patient-specific STs | Isolates in patient-specific STs (% of data set) (a) |
|---|---|---|---|
| MGT1 | 0 | 0 | 0 |
| MGT2 | 5 | 6 | 9.76 |
| MGT3 | 11 | 12 | 36.59 |
| MGT4 | 17 | 21 | 53.66 |
| MGT5 | 27 | 35 | 86.59 |
| MGT6 | 29 | 49 | 100 |
| MGT7 | 29 | 61 | 100 |
| MGT8 | 29 | 69 | 100 |

(a) For MGT1 to MGT8, the number of patients in which all isolates were assigned patient-specific STs was counted. An ST was patient-specific when 100% of the isolates assigned that ST were sampled from a single patient. At each MGT level, the total number of isolates in patient-specific STs was reported as a percentage of the ST8-USA300 data set (n = 82).
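The three quantities tabulated for each level can be computed as in the following Python sketch. It is a hedged illustration using a hypothetical input layout (a list of patient identifier and ST profile pairs) and invented toy data; it is not the code used to produce Table 3.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Isolate = Tuple[str, Sequence[int]]  # (patient id, STs for MGT1..MGTn)


def patient_specific_summary(isolates: List[Isolate], level: int) -> Dict[str, float]:
    """Summarize patient-specific STs at one MGT level (1-based)."""
    patients_by_st = defaultdict(set)
    for patient, profile in isolates:
        patients_by_st[profile[level - 1]].add(patient)

    # An ST is patient-specific when all of its isolates come from one patient.
    specific = {st for st, pts in patients_by_st.items() if len(pts) == 1}

    in_specific = sum(profile[level - 1] in specific for _, profile in isolates)
    all_patients = {patient for patient, _ in isolates}
    # A patient is "described" only if every one of their isolates is in a specific ST.
    not_described = {patient for patient, profile in isolates
                     if profile[level - 1] not in specific}
    return {
        "patients_described": len(all_patients - not_described),
        "patient_specific_sts": len(specific),
        "pct_isolates": 100.0 * in_specific / len(isolates),
    }


# Toy data with three levels and two patients (values are invented).
toy = [("P1", (8, 1, 5)), ("P1", (8, 1, 5)),
       ("P2", (8, 1, 6)), ("P2", (8, 2, 7))]
for lvl in (1, 2, 3):
    print(lvl, patient_specific_summary(toy, lvl))
```

In the toy data, only the third level separates both patients completely, mirroring how MGT6 was the first level at which all 29 patients were described.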

MGT application to investigation of MGT1 ST239 transmission in a hospital setting

In a published study that characterized the spread of MRSA throughout Concord Hospital (Sydney, NSW, Australia) between July 2012 and November 2014 (35), 238 MRSA isolates were collected and typed using multiplex PCR-reverse line blot binary typing (referred to as binary typing). Eighteen isolates were typed as (MGT1) ST239-SCCmec III and divided into three binary types (BTs): 280841, 280973, and 281997. When these isolates were typed using MGT, four MGT STs from MGT2 to MGT5 described the same sets of isolates as those described by BT (Fig. 6). MGT2 ST1294 and MGT3 ST2063 were equivalent to BT281997 and BT280973, respectively. MGT5 separated BT280973 into two MGT5 STs (ST4115 and ST4123), which also represented isolates acquired from different hospital areas. MGT5 ST4115 was acquired within the burns operating theater and ward, whereas MGT5 ST4123 was acquired in the intensive care unit and general hospital area. Within the burns ward, BT281997 (MGT2 ST1294) was divided into MGT8 ST5758, which was patient-acquired, and MGT8 ST5743, which was from the hospital environment. Thus, MGT8 STs could distinguish patient isolates from those sampled from the environment. BT280841 was the only BT that could not be described by an ST at a single MGT level. BT280841 was separated into MGT4 ST3013 and MGT5 ST4153, with the latter containing only burns ward environment isolates. MGT4 ST3013 contained isolates sampled from patients and the environment, with the patient isolates distinguished by MGT5 ST4144.

Fig 6.

Phylogenetic tree of ST239-SCCmec III bacterial spread in Concord Hospital. Table shows MGT sequence types across eight MGT levels with binary types from different hospital units, marking patient and environmental isolates to track transmission.

MGT classification of MGT1 ST239-SCCmec III spread within Concord Hospital. The spread of MGT1 ST239-SCCmec III through Concord Hospital (Sydney, Australia) was described by MGT and binary typing. The three binary types are colored orange, green, and purple. The same coloring scheme was used to mark MGT STs that were selected to group isolates as binary types. MGT STs that divided binary types in concordance with the phylogenetic structure are outlined with a gray box. Patient-sampled isolates are marked with an asterisk, and isolates sampled from the environment are marked with a double asterisk.

DISCUSSION

The development of genomic classification technologies that characterize S. aureus population structure and transmission is essential for designing and implementing control and prevention strategies. In this study, an MGT scheme was developed to classify all publicly available S. aureus WGS data (n = 50,481 as of 15 July 2019). The S. aureus MGT database is updated daily and publicly available for community use at https://mgtdb.unsw.edu.au/staphylococcus/. The public MGT database has minimal requirements for bioinformatics expertise. Sequencing reads can be directly uploaded to the server, which processes the data within a few hours and automatically assigns MGT STs at each of the eight MGT levels. These assignments and other data (such as alleles and allelic profiles) are made available to the user for further analysis. A comprehensive description of the usability and features of MGTdb, and its software architecture for local deployment, can be found in Kaur et al. (27). MGT was used to investigate both large- and small-scale S. aureus genomic epidemiology using published data as case studies.

MGT for the standardized genomics-based classification of S. aureus

In 2007, the European Society of Clinical Microbiology and Infectious Diseases released guidelines for the development of novel classification technologies (45). These guidelines emphasize the importance of creating new typing technologies that define a standardized nomenclature, offer flexibility in typing resolution, and assign types that are interpretable and easily communicable. S. aureus MGT has eight levels filled with loci from the species core genome. The differing number of loci at different MGT levels offers flexibility in typing resolution, while maintaining standardized ST nomenclature at each level. These standardized MGT STs are easily interpretable to encourage standardized communication regarding S. aureus genomic epidemiology.

The range of typing resolutions makes the MGT a useful epidemiological tracing tool. At lower-resolution levels, STs tend to be larger, longer-lived clones and more widely distributed worldwide. As the resolution increases, STs become smaller, more short-lived, and continent- or country-specific. At the highest resolution, MGT8 (cgMLST) has the power to uncover chains of transmission and outbreak origins (20, 46, 47). The benefit of MGT classification is that when multiple levels are considered together, the higher-resolution level progressively divides the lower-resolution level STs to hierarchically reveal the relationship of the isolates across MGT levels. However, hierarchical inconsistencies may be present for a small proportion of the isolates across MGT1 to MGT7 because of the mutually exclusive loci sets that make up the MGT levels (22). Hierarchical inconsistency is illustrated in Fig. S4. A random mutation in a gene belonging to a lower-resolution level will assign an isolate to a different ST from its closest relative at that level, while at a higher-resolution level, the two isolates will share the same ST if no mutations have occurred in the genes of that level. This inconsistency can be resolved by examining the levels above and below to identify the true relationships of the isolates. Further examination of clonal complexing of the involved STs can also help to resolve the inconsistencies.
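The hierarchical inconsistency described above can be checked programmatically. The following is a minimal Python sketch, not the MGTdb implementation. It assumes each isolate is stored as a hypothetical mapping from MGT level to ST number, and it flags pairs of isolates that differ at a lower-resolution level but share an ST at a higher-resolution level. The isolate names and ST numbers are invented for illustration.

```python
from itertools import combinations


def find_hierarchical_inconsistencies(assignments, levels):
    """Return (isolate_a, isolate_b, lower_level, higher_level) tuples where
    two isolates differ at lower_level but share an ST at higher_level.

    assignments: dict mapping isolate name -> dict mapping level -> ST.
    levels: level names ordered from lowest to highest resolution.
    """
    found = []
    for a, b in combinations(sorted(assignments), 2):
        for i, lower in enumerate(levels):
            for higher in levels[i + 1:]:
                differ_low = assignments[a][lower] != assignments[b][lower]
                same_high = assignments[a][higher] == assignments[b][higher]
                if differ_low and same_high:
                    found.append((a, b, lower, higher))
    return found


# Hypothetical data: isolate_2 carries a mutation in an MGT3 locus only.
example = {
    "isolate_1": {"MGT2": 1, "MGT3": 10, "MGT4": 100},
    "isolate_2": {"MGT2": 1, "MGT3": 11, "MGT4": 100},
    "isolate_3": {"MGT2": 1, "MGT3": 10, "MGT4": 101},
}
inconsistencies = find_hierarchical_inconsistencies(
    example, ["MGT2", "MGT3", "MGT4"]
)
print(inconsistencies)
```

In this toy data set, only isolate_1 and isolate_2 are flagged: they differ at MGT3 but share the same MGT2 and MGT4 STs, which is the pattern that examining the levels above and below is meant to resolve.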

We used several MGT levels above MGT1 (seven-gene MLST) to showcase the flexibility and usefulness of MGT to examine S. aureus population structure and epidemiological surveillance. In general, the lower-resolution levels such as MGT2–MGT5 can be used for longer-term epidemiology, while the higher-resolution levels such as MGT6–MGT8 can be used for shorter-term epidemiology. The estimates of the time of emergence of a new ST at a given MGT level are based on the average nucleotide substitution rate of 2.83 × 10⁻⁶ (48–51) (Fig. 1), which is also consistent with the average year span of the STs at different MGT levels from the global data set (Table 1; Fig. S1). Thus, the different MGT levels give an indication of their appropriateness for temporal epidemiological analysis ranging from >110 years when using MGT1 to 2 months when using MGT8. However, the selection of a specific MGT level or multiple levels will depend on the data and its epidemiological objectives.
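The relationship between the number of nucleotides typed at a level and the expected time until a new ST emerges can be illustrated with a short calculation. This is a sketch under stated assumptions, not necessarily the authors' exact procedure. It assumes the substitution rate quoted above is expressed per site per year and that substitutions arise as a Poisson process, so the expected waiting time for the first substitution in L bp is 1/(rate × L). The two locus lengths are hypothetical values chosen for illustration and are not the published sizes of any MGT level.

```python
# Rate quoted in the text; the unit (per site per year) is an assumption here.
SUBSTITUTION_RATE = 2.83e-6


def expected_years_to_new_st(total_locus_bp, rate=SUBSTITUTION_RATE):
    """Expected waiting time (years) until one substitution arises anywhere
    in the loci of a typing level, i.e. until a new ST is expected."""
    if total_locus_bp <= 0:
        raise ValueError("total_locus_bp must be positive")
    return 1.0 / (rate * total_locus_bp)


# Hypothetical total locus lengths in bp (NOT the actual MGT level sizes).
illustrative_lengths = {"small_scheme": 3_200, "large_scheme": 2_100_000}

for name, bp in illustrative_lengths.items():
    years = expected_years_to_new_st(bp)
    print(f"{name}: {years:.2f} years ({years * 12:.1f} months)")
```

Under these assumed lengths, the small scheme gives a little over 110 years and the large scheme roughly 2 months, which is the same order as the range quoted above for MGT1 and MGT8.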

We have shown that MGT offers high resolution across diverse data sets and could serve as a valuable tool for longitudinal studies that track the evolution and persistence of specific clones. Different MGT levels may be suited to examine S. aureus epidemiology in local, national, or international levels and to identify emerging clones in both hospital and community settings.

The highest level, MGT8, is species core genome MLST. Further typing resolution can be achieved by including core intergenic regions or constructing clone level core genome MLST. For Salmonella serovars Typhimurium and Enteritidis, a serovar level MGT9 increases resolution by 6%–18% based on the number of STs typed (22, 23).

Multiple MGT levels describe global distribution of SCCmec types and PSSA isolates

The utility of describing populations at multiple levels using MGT was demonstrated using the global distribution of MRSA SCCmec types and PSSA strains. At MGT1, many STs are already composed of only one SCCmec type (ST22, ST105, and ST239); therefore, no higher resolution is required to describe their distribution. However, some globally prevalent MRSA STs can be divided into higher MGT-level STs that either have different SCCmec types or are MSSA. Within MGT1 ST5, MRSA types can be distinguished by MGT2 STs: SCCmec II by MGT2 ST1943, ST6485, and ST6650; SCCmec IV by MGT2 ST3627; and MSSA by MGT2 ST1159. For MGT1 ST8, MSSA subtypes can be distinguished from MRSA isolates using MGT2 (ST199 and ST3854) and MGT3 (ST338).

Among the largest 100 MGT4 STs, PSSA is more sporadic and requires a higher resolution to separate it from non-PSSA isolates. Of the six MGT4 STs, two could also be uniquely identified by the lowest-resolution level at MGT1, two could be identified using MGT2, and the remaining two required MGT4 to uniquely identify them. Four of the PSSA MGT4 STs (ST594, ST885, ST1114, ST10735) were mostly composed of isolates from artificial evolution experiments (52, 53) and were all within CC8, whereas ST14648 and ST9209 were from studies on avian S. aureus and general S. aureus diversity, respectively, and were from CC5 (54, 55).

MGT provides better description of the large-scale population structure of globally disseminated clones

MGT1 ST8 is a global CA-MRSA ST that has spread in the Americas and Europe, but has also been reported in Africa and Asia (13, 56–58). Using MGT2, we can describe country-specific STs for the USA (MGT2 ST114, ST199, and ST443) and the UK (MGT2 ST146). In contrast, we can also identify STs that are still globally distributed, even at a significantly higher resolution of MGT4 (MGT4 ST6 and ST14). Within MGT1 ST8, the USA300 clone is of particular interest because of its hypervirulence and its association with spa type t008 and SCCmec type IV. MGT2 ST1 was the dominant MGT2 ST within MGT1 ST8 and was dominated by SCCmec type IV and spa type t008 isolates. MGT2 STs also describe other large spa types, such as t064 (MGT2 ST114), which is associated with Africa (59), and t211 (MGT2 ST199). MGT provides simpler descriptions of many MGT1 ST8 clades than spa types. Several spa types that are polyphyletic are divided into smaller but more phylogenetically congruent MGT2 STs, such as spa t064, which is divided into MGT2 ST114 and ST184. In addition, many small spa types can be more easily grouped into larger types using MGT2 STs. MGT can dissect the spatial population structure of global clones and provides a better description of the large-scale population structure than SCCmec typing and spa typing for MGT1 ST8.

MGT STs can uniquely describe isolates colonizing the same patient within ST8-USA300

The presence of S. aureus is an important risk factor when predicting MRSA onset, and 50%–80% of MRSA infections are caused by isolates already carried by the host (60). The ability of WGS-based classification to type isolates sampled from the same patient can describe the genomic epidemiology of persistently colonizing S. aureus. The flexibility in MGT typing resolution allows STs from multiple levels to describe the persistent colonization of patients. As shown by the MGT1 ST8-USA300 isolates, STs from MGT2 to MGT6 were able to uniquely group MGT1 ST8-USA300 isolates of most patients by using a specific MGT typing resolution, with the majority of patients (24/29) having one unique MGT ST identifier. A combination of two or more levels would allow description of the origin and diversity within a patient.
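The idea of a single MGT ST acting as a unique patient identifier can be sketched as a search across levels. The Python sketch below is illustrative only and is not the analysis used in the study. It assumes a hypothetical list of (patient, {level: ST}) records and, for each patient, looks for the lowest-resolution level at which one ST contains all of that patient's isolates and none from any other patient. All patient labels and ST numbers are invented.

```python
def unique_st_per_patient(records, levels):
    """For each patient, return (level, ST) for the lowest-resolution level
    at which one ST holds all of that patient's isolates and nobody else's,
    or None if no single level provides such an identifier.

    records: list of (patient, {level: ST}) tuples, one per isolate.
    levels: level names ordered from lowest to highest resolution.
    """
    patients = sorted({patient for patient, _ in records})
    result = {}
    for patient in patients:
        result[patient] = None
        for level in levels:
            own = {sts[level] for p, sts in records if p == patient}
            others = {sts[level] for p, sts in records if p != patient}
            if len(own) == 1 and not (own & others):
                result[patient] = (level, next(iter(own)))
                break
    return result


# Hypothetical isolates from three patients.
records = [
    ("P1", {"MGT2": 1, "MGT3": 5, "MGT4": 20}),
    ("P1", {"MGT2": 1, "MGT3": 5, "MGT4": 20}),
    ("P2", {"MGT2": 1, "MGT3": 6, "MGT4": 21}),
    ("P2", {"MGT2": 1, "MGT3": 6, "MGT4": 22}),
    ("P3", {"MGT2": 1, "MGT3": 7, "MGT4": 23}),
    ("P3", {"MGT2": 1, "MGT3": 8, "MGT4": 24}),
]
identifiers = unique_st_per_patient(records, ["MGT2", "MGT3", "MGT4"])
print(identifiers)
```

In this toy example, P1 and P2 each receive a single MGT3 ST, whereas P3 has no single-level identifier, which corresponds to the case where a combination of two or more levels is needed to describe within-patient diversity.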

MGT STs described MGT1 ST239-SCCmec III spread throughout Concord Hospital, Sydney

MGT was used to describe the spread of MGT1 ST239-SCCmec III isolates in a hospital in a previous study (35). The multiple resolutions were able to describe the spread between wards and to identify isolates that colonized patients or environments within a ward. The STs from higher MGT levels separated isolates that colonized the hospital environment from those that colonized the patients admitted to that ward, showing that higher MGT resolutions can separate MGT1 ST239-SCCmec III isolates independently evolving within a hospital ward. Compared with binary typing, the flexibility in MGT typing resolution divides binary types into STs that better describe S. aureus acquisition and transmission. The application of MGT to describe MGT1 ST239-SCCmec III spread within a hospital acts as a proof-of-concept investigation, and the same approach can be applied to S. aureus infection control in any healthcare setting.

Limitations of this study and MGT classification of S. aureus

Investigating S. aureus genomic epidemiology in this study used publicly available S. aureus WGS data and metadata. The availability of accurate metadata is essential for interpreting S. aureus WGS analysis. Almost all temporal metadata for the investigation of MGT1 ST8 large-scale population structure were sourced from NCBI BioSamples. Interpretation of S. aureus WGS data for temporal or spatial epidemiology is also hampered when metadata is not available. Only 50% of the data set had both year and country metadata. Improving metadata availability in the future will greatly enhance the genome data for a detailed description of species-wide S. aureus genomic epidemiology and global surveillance. The agreement on international standards for epidemiological metadata, such as country of origin, year of isolation, and host and disease-causing status, would enable better use of MGT to describe the genomic epidemiology of S. aureus.

A technical challenge when using MGT to describe S. aureus genomic epidemiology is the definition of singleton STs. The division of S. aureus genomes into singletons prevents a description of the genetic relationship between the isolates. Higher MGT levels that offer higher resolutions are expected to increasingly separate isolates into singletons. For example, the MGT2 classification of MGT1 ST8 defined 9.80% (430/4,388) of the isolates as singleton STs, which were excluded from further investigation. It is possible that the isolates in these singleton STs were closely related to those in one of the other large MGT2 STs. The MGT2 level had 21 core loci, and an isolate assigned to a singleton ST needed to carry only a unique allele for one of the 21 loci. These singleton STs, likely representing novel variants, have previously been shown to arise more frequently in bacterial species with higher SNP mutation rates such as S. aureus (61). In traditional MLST schemes, clonal complexes (CCs) are used to group singleton STs with related STs that generally differ by one or two alleles (31, 62), an approach that is widely used in S. aureus MLST (63). In the S. aureus MGT online database, clonal complexes (defined by one allele difference between member STs) are automatically generated at each MGT level to enable users to view the clonal complex of singleton STs at each MGT level. Additionally, in general, singleton STs do not limit resolution in transmission inference during outbreak tracking, as they are often incorporated into outbreak clusters. Outbreak delineation typically relies on genomic distance cutoffs (SNPs or alleles) and contextual information such as timeframe, geography, or within-host diversity (64, 65). Thus, if a singleton ST differs from outbreak-associated strains by only a single SNP/allele, or by a number of SNPs/alleles below the defined threshold, then that ST is generally included within the outbreak cluster.
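The grouping of singleton STs into clonal complexes by one allele difference can be sketched as single-linkage clustering of allelic profiles. The Python example below is a minimal illustration and not the MGTdb code. The function names, the four-locus profiles, and the ST numbers are hypothetical, and the `max_diff` parameter is an assumed stand-in for whatever allele cutoff is applied.

```python
def allele_distance(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def clonal_complexes(profiles, max_diff=1):
    """Single-linkage grouping: STs whose profiles differ at no more than
    max_diff loci are linked, and all linked STs share one complex.

    profiles: dict mapping ST -> tuple of allele numbers (same locus order).
    Returns a sorted list of sorted ST lists.
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent links to the group representative (path halving).
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    sts = sorted(profiles)
    for i, a in enumerate(sts):
        for b in sts[i + 1:]:
            if allele_distance(profiles[a], profiles[b]) <= max_diff:
                parent[find(a)] = find(b)

    groups = {}
    for st in sts:
        groups.setdefault(find(st), []).append(st)
    return sorted(groups.values())


# Hypothetical four-locus profiles; ST2 and ST3 are one-allele variants.
profiles = {
    1: (1, 1, 1, 1),
    2: (1, 1, 1, 2),
    3: (1, 1, 2, 2),
    4: (5, 6, 7, 8),
}
complexes = clonal_complexes(profiles)
print(complexes)
```

Here ST1, ST2, and ST3 fall into one complex because each is linked to a neighbor by a single allele difference, while ST4 remains on its own. Raising `max_diff` turns the same function into a simple threshold-based cluster definition of the kind described for outbreak delineation.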

Future directions and public health implications

We show in this study that MGT can be used for standardized genomic surveillance of S. aureus. It has the potential for real-time public health surveillance with or without integration into jurisdiction-specific public health surveillance systems. We have developed basic surveillance reports that are generic to all MGT schemes (27) and can be further developed to cater to specific organisms. A major barrier to the application of genome data to public health is bioinformatic analysis, which often involves phylogenetic analysis and arbitrary SNP cutoffs (64). The turnaround time for bioinformatic analysis in an infection control application study was 6 days on average (65). By contrast, MGT typing is largely automated and has a short turnaround (generally approximately 4 to 6 hours). This is particularly useful in settings with limited bioinformatics resources. We previously developed an outbreak detection algorithm for foodborne pathogens (66). It can be further developed for S. aureus and integrated into the S. aureus MGT system. It may further be used in conjunction with specialist infection control tools such as HAIviz (67), which has the capability for geotagging within a facility. Therefore, MGT can be used to monitor potential outbreaks in healthcare settings and support outbreak response, for instance, by guiding interventions within specific hospital wards or among specific patient groups.

S. aureus MGT can be further integrated with antimicrobial resistance gene profiling and virulence gene profiling to detect and track existing or emerging antimicrobial-resistant or virulent clones (68). S. aureus MGT can also be integrated with artificial intelligence to predict transmission pathways based on genome data in the future. Lastly, future studies can also aim to integrate genomic data with phenotypic data, such as antimicrobial resistance, virulence, and environmental persistence traits, as this will strengthen the translational relevance of genomic surveillance.

Conclusion

In this study, we developed an MGT scheme for flexible, stable, and standardized genomic classification of S. aureus. The MGT consists of eight levels, providing typing resolution flexibility. MGT describes the large-scale population structure of the species, including global, continent-specific, and country-specific STs, and their relationship to MRSA lineages and SCCmec subtypes. Within the globally disseminated MGT1 ST8, MGT2 was able to match or improve upon the commonly used spa typing method. Using a combination of higher-resolution MGT levels, MGT STs were able to precisely describe isolates from each patient from a study of persistent colonization. Finally, in a hospital outbreak, MGT STs from lower levels grouped isolates that had spread between hospital wards, whereas higher MGT levels assigned STs that distinguished isolates within the same ward. The MGT was able to describe the genomic epidemiology of S. aureus from large, long-lived, global STs to small, short-lived STs found only in a single patient. The S. aureus MGT is publicly available (https://mgtdb.unsw.edu.au/staphylococcus/), updated daily, and allows both public and private user submission. S. aureus MGT can assist in the tracking of existing and new S. aureus clones, which is essential when designing prevention and control strategies to reduce the disease burden of this important pathogen.

ACKNOWLEDGMENTS

We acknowledge Robin Heron for technical assistance.

This study was supported in part by a grant from the National Health and Medical Research Council of Australia. Liam Cheney was supported by an Australian Government Research Training Program Scholarship.

Contributor Information

Ruiting Lan, Email: r.lan@unsw.edu.au.

Saheed Imam, LifeMine Therapeutics, Cambridge, Massachusetts, USA.

DATA AVAILABILITY

All data used in this study are publicly available, and all MGT data are available at mgtdb.unsw.edu.au/Staphylococcus.

SUPPLEMENTAL MATERIAL

The following material is available online at https://doi.org/10.1128/msystems.00935-25.

Figure S1. msystems.00935-25-s0001.pdf.

Distribution of year spans of major STs at each MGT level.

DOI: 10.1128/msystems.00935-25.SuF1

Figure S2. msystems.00935-25-s0002.pdf.

The size distribution of MGT2 to MGT8 STs assigned to MGT1 ST8 isolates.

DOI: 10.1128/msystems.00935-25.SuF2

Figure S3. msystems.00935-25-s0003.pdf.

Comparing MLST ST8 classification between MGT and Spa typing.

DOI: 10.1128/msystems.00935-25.SuF3

Figure S4. msystems.00935-25-s0004.pdf.

Illustration of hierarchical inconsistency at MGT3 in six isolates.

DOI: 10.1128/msystems.00935-25.SuF4

Supplemental text. msystems.00935-25-s0005.docx.

Supplemental methods.

DOI: 10.1128/msystems.00935-25.SuF5

Supplemental data sets. msystems.00935-25-s0006.xlsx.

Data sets S1 to S9.

DOI: 10.1128/msystems.00935-25.SuF6

Table S1. msystems.00935-25-s0007.xlsx.

Criteria for quality-filtering assemblies.

DOI: 10.1128/msystems.00935-25.SuF7

Table S2. msystems.00935-25-s0008.txt.

The quality-filtered species data set with MGT assignment and available metadata.

DOI: 10.1128/msystems.00935-25.SuF8

ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.

REFERENCES

  • 1. Sakr A, Brégeon F, Mège J-L, Rolain J-M, Blin O. 2018. Staphylococcus aureus nasal colonization: an update on mechanisms, epidemiology, risk factors, and subsequent infections. Front Microbiol 9:2419. doi: 10.3389/fmicb.2018.02419
  • 2. Kourtis AP, Hatfield K, Baggs J, Mu Y, See I, Epson E, Nadle J, Kainer MA, Dumyati G, Petit S, Ray SM, Ham D, Capers C, Ewing H, Coffin N, McDonald LC, Jernigan J, Cardo D, Emerging Infections Program MRSA author group. 2019. Vital signs: epidemiology and recent trends in methicillin-resistant and in methicillin-susceptible Staphylococcus aureus bloodstream infections — United States. MMWR Morb Mortal Wkly Rep 68:214–219. doi: 10.15585/mmwr.mm6809e1
  • 3. Chambers HF, Deleo FR. 2009. Waves of resistance: Staphylococcus aureus in the antibiotic era. Nat Rev Microbiol 7:629–641. doi: 10.1038/nrmicro2200
  • 4. Lindsay JA. 2013. Hospital-associated MRSA and antibiotic resistance—what have we learned from genomics? Int J Med Microbiol 303:318–323. doi: 10.1016/j.ijmm.2013.02.005
  • 5. Skov RL, Jensen KS. 2009. Community-associated meticillin-resistant Staphylococcus aureus as a cause of hospital-acquired infections. J Hosp Infect 73:364–370. doi: 10.1016/j.jhin.2009.07.004
  • 6. D’Agata EMC, Webb GF, Horn MA, Moellering RC Jr, Ruan S. 2009. Modeling the invasion of community-acquired methicillin-resistant Staphylococcus aureus into hospitals. Clin Infect Dis 48:274–284. doi: 10.1086/595844
  • 7. Monecke S, Slickers P, Gawlik D, Müller E, Reissig A, Ruppelt-Lorz A, Akpaka PE, Bandt D, Bes M, Boswihi SS, et al. 2018. Molecular typing of ST239-MRSA-III from diverse geographic locations and the evolution of the SCCmec III element during its intercontinental spread. Front Microbiol 9:1436. doi: 10.3389/fmicb.2018.01436
  • 8. Lakhundi S, Zhang K. 2018. Methicillin-resistant Staphylococcus aureus: molecular characterization, evolution, and epidemiology. Clin Microbiol Rev 31:e00020-18. doi: 10.1128/CMR.00020-18
  • 9. Park K-H, Greenwood-Quaintance KE, Uhl JR, Cunningham SA, Chia N, Jeraldo PR, Sampathkumar P, Nelson H, Patel R. 2017. Molecular epidemiology of Staphylococcus aureus bacteremia in a single large Minnesota medical center in 2015 as assessed using MLST, core genome MLST and spa typing. PLoS One 12:e0179003. doi: 10.1371/journal.pone.0179003
  • 10. David MZ, Taylor A, Lynfield R, Boxrud DJ, Short G, Zychowski D, Boyle-Vavra S, Daum RS. 2013. Comparing pulsed-field gel electrophoresis with multilocus sequence typing, spa typing, staphylococcal cassette chromosome mec (SCCmec) typing, and PCR for panton-valentine leukocidin, arcA, and opp3 in methicillin-resistant Staphylococcus aureus isolates at a U.S. Medical Center. J Clin Microbiol 51:814–819. doi: 10.1128/JCM.02429-12
  • 11. Vossenkuhl B, Brandt J, Fetsch A, Käsbohrer A, Kraushaar B, Alt K, Tenhagen B-A. 2014. Comparison of spa types, SCCmec types and antimicrobial resistance profiles of MRSA isolated from turkeys at farm, slaughter and from retail meat indicates transmission along the production chain. PLoS One 9:e96308. doi: 10.1371/journal.pone.0096308
  • 12. Sabat AJ, Budimir A, Nashev D, Sá-Leão R, van Dijl JM, Laurent F, Grundmann H, Friedrich AW, ESCMID Study Group of Epidemiological Markers (ESGEM). 2013. Overview of molecular typing methods for outbreak detection and epidemiological surveillance. Euro Surveill 18:20380. doi: 10.2807/ese.18.04.20380-en
  • 13. Strauß L, Stegger M, Akpaka PE, Alabi A, Breurec S, Coombs G, Egyir B, Larsen AR, Laurent F, Monecke S, Peters G, Skov R, Strommenger B, Vandenesch F, Schaumburg F, Mellmann A. 2017. Origin, evolution, and global transmission of community-acquired Staphylococcus aureus ST8. Proc Natl Acad Sci USA 114:E10596–E10604. doi: 10.1073/pnas.1702472114
  • 14. Challagundla L, Reyes J, Rafiqullah I, Sordelli DO, Echaniz-Aviles G, Velazquez-Meza ME, Castillo-Ramírez S, Fittipaldi N, Feldgarden M, Chapman SB, Calderwood MS, Carvajal LP, Rincon S, Hanson B, Planet PJ, Arias CA, Diaz L, Robinson DA. 2018. Phylogenomic classification and the evolution of clonal complex 5 methicillin-resistant Staphylococcus aureus in the Western Hemisphere. Front Microbiol 9:1901. doi: 10.3389/fmicb.2018.01901
  • 15. Gill JL, Hedge J, Wilson DJ, MacLean RC. 2021. Evolutionary processes driving the rise and fall of Staphylococcus aureus ST239, a dominant hybrid pathogen. mBio 12:e0216821. doi: 10.1128/mBio.02168-21
  • 16. Baines SL, Jensen SO, Firth N, Gonçalves da Silva A, Seemann T, Carter GP, Williamson DA, Howden BP, Stinear TP. 2019. Remodeling of pSK1 family plasmids and enhanced chlorhexidine tolerance in a dominant hospital lineage of methicillin-resistant Staphylococcus aureus. Antimicrob Agents Chemother 63:e02356-18. doi: 10.1128/AAC.02356-18
  • 17. Planet PJ, Diaz L, Kolokotronis S-O, Narechania A, Reyes J, Xing G, Rincon S, Smith H, Panesso D, Ryan C, Smith DP, Guzman M, Zurita J, Sebra R, Deikus G, Nolan RL, Tenover FC, Weinstock GM, Robinson DA, Arias CA. 2015. Parallel epidemics of community-associated methicillin-resistant Staphylococcus aureus USA300 infection in North and South America. J Infect Dis 212:1874–1882. doi: 10.1093/infdis/jiv320
  • 18. Leopold SR, Goering RV, Witten A, Harmsen D, Mellmann A. 2014. Bacterial whole-genome sequencing revisited: portable, scalable, and standardized analysis for typing and detection of virulence and antibiotic resistance genes. J Clin Microbiol 52:2365–2370. doi: 10.1128/JCM.00262-14
  • 19. Cunningham SA, Chia N, Jeraldo PR, Quest DJ, Johnson JA, Boxrud DJ, Taylor AJ, Chen J, Jenkins GD, Drucker TM, Nelson H, Patel R. 2017. Comparison of whole-genome sequencing methods for analysis of three methicillin-resistant Staphylococcus aureus outbreaks. J Clin Microbiol 55:1946–1953. doi: 10.1128/JCM.00029-17
  • 20. Slingerland B, Vos MC, Bras W, Kornelisse RF, De Coninck D, van Belkum A, Reiss IKM, Goessens WHF, Klaassen CHW, Verkaik NJ. 2020. Whole-genome sequencing to explore nosocomial transmission and virulence in neonatal methicillin-susceptible Staphylococcus aureus bacteremia. Antimicrob Resist Infect Control 9:39. doi: 10.1186/s13756-020-0699-8
  • 21. Zhu F, Zhuang H, Ji S, Xu E, Di L, Wang Z, Jiang S, Wang H, Sun L, Shen P, Yu Y, Chen Y. 2021. Household transmission of community-associated methicillin-resistant Staphylococcus aureus. Front Public Health 9:658638. doi: 10.3389/fpubh.2021.658638
  • 22. Payne M, Kaur S, Wang Q, Hennessy D, Luo L, Octavia S, Tanaka MM, Sintchenko V, Lan R. 2020. Multilevel genome typing: genomics-guided scalable resolution typing of microbial pathogens. Euro Surveill 25:1900519. doi: 10.2807/1560-7917.ES.2020.25.20.1900519
  • 23. Luo L, Payne M, Kaur S, Hu D, Cheney L, Octavia S, Wang Q, Tanaka MM, Sintchenko V, Lan R. 2021. Elucidation of global and national genomic epidemiology of Salmonella enterica serovar Enteritidis through multilevel genome typing. Microb Genom 7:000605. doi: 10.1099/mgen.0.000605
  • 24. Cheney L, Payne M, Kaur S, Lan R. 2021. Multilevel genome typing describes short- and long-term Vibrio cholerae molecular epidemiology. mSystems 6:e0013421. doi: 10.1128/mSystems.00134-21
  • 25. Payne M, Xu Z, Hu D, Kaur S, Octavia S, Sintchenko V, Lan R. 2023. Genomic epidemiology and multilevel genome typing of Bordetella pertussis. Emerg Microbes Infect 12:2239945. doi: 10.1080/22221751.2023.2239945
  • 26. Wood DE, Salzberg SL. 2014. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol 15:R46. doi: 10.1186/gb-2014-15-3-r46
  • 27. Kaur S, Payne M, Luo L, Octavia S, Tanaka MM, Sintchenko V, Lan R. 2022. MGTdb: a web service and database for studying the global and local genomic epidemiology of bacterial pathogens. Database (Oxford) 2022:baac094. doi: 10.1093/database/baac094
  • 28. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi: 10.1093/bioinformatics/btt086
  • 29. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170
  • 30. Souvorov A, Agarwala R, Lipman DJ. 2018. SKESA: strategic k-mer extension for scrupulous assemblies. Genome Biol 19:153. doi: 10.1186/s13059-018-1540-z
  • 31. Enright MC, Day NP, Davies CE, Peacock SJ, Spratt BG. 2000. Multilocus sequence typing for characterization of methicillin-resistant and methicillin-susceptible clones of Staphylococcus aureus. J Clin Microbiol 38:1008–1015. doi: 10.1128/JCM.38.3.1008-1015.2000
  • 32. Jolley KA, Bray JE, Maiden MCJ. 2018. Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications. Wellcome Open Res 3:124. doi: 10.12688/wellcomeopenres.14826.1
  • 33. Zhou Z, Alikhan N-F, Sergeant MJ, Luhmann N, Vaz C, Francisco AP, Carriço JA, Achtman M. 2018. GrapeTree: visualization of core genomic relationships among 100,000 bacterial pathogens. Genome Res 28:1395–1404. doi: 10.1101/gr.232397.117
  • 34. Read TD, Petit RA 3rd, Yin Z, Montgomery T, McNulty MC, David MZ. 2018. USA300 Staphylococcus aureus persists on multiple body sites following an infection. BMC Microbiol 18:206. doi: 10.1186/s12866-018-1336-z
  • 35. McKew G, Ramsperger M, Cheong E, Gottlieb T, Sintchenko V, O’Sullivan M. 2020. Hospital MRSA outbreaks: multiplex PCR-reverse line blot binary typing as a screening method for WGS, and the role of the environment in transmission. Infect Dis Health 25:268–276. doi: 10.1016/j.idh.2020.05.007
  • 36. Seemann T. 2015. Snippy: rapid haploid variant calling and core genome alignment. https://github.com/tseemann/snippy.
  • 37. Hu D, Liu B, Wang L, Reeves PR. 2020. Living Trees: high-quality reproducible and reusable construction of bacterial phylogenetic trees. Mol Biol Evol 37:563–575. doi: 10.1093/molbev/msz241
  • 38. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300
  • 39. Letunic I, Bork P. 2021. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 49:W293–W296. doi: 10.1093/nar/gkab301
  • 40. High Content Genomics & Bioinformatics. 2022. SpaTyper: in silico prediction of S. aureus spa types. https://github.com/HCGB-IGTP/spaTyper.
  • 41. Petit RA 3rd, Read TD. 2018. Staphylococcus aureus viewed from the perspective of 40,000+ genomes. PeerJ 6:e5261. doi: 10.7717/peerj.5261
  • 42. Cheney L, Payne M, Kaur S, Lan R. 2024. SaLTy: a novel Staphylococcus aureus lineage typer. Microb Genom 10:001250. doi: 10.1099/mgen.0.001250
  • 43. Butler-Laporte G, Lee TC, Cheng MP. 2018. Increasing rates of penicillin sensitivity in Staphylococcus aureus. Antimicrob Agents Chemother 62:e00680-18. doi: 10.1128/AAC.00680-18
  • 44. Davido B, Lawrence C, Dinh A, Bouchand F. 2018. Back to the future with the use of penicillin in penicillin-susceptible Staphylococcus aureus (PSSA) bacteremia. Am J Med 131:e155. doi: 10.1016/j.amjmed.2017.10.032
  • 45. van Belkum A, Tassios PT, Dijkshoorn L, Haeggman S, Cookson B, Fry NK, Fussing V, Green J, Feil E, Gerner-Smidt P, Brisse S, Struelens M, European Society of Clinical Microbiology and Infectious Diseases (ESCMID) Study Group on Epidemiological Markers (ESGEM). 2007. Guidelines for the validation and application of typing methods for use in bacterial epidemiology. Clin Microbiol Infect 13:1–46. doi: 10.1111/j.1469-0691.2007.01786.x
  • 46. Lagos AC, Sundqvist M, Dyrkell F, Stegger M, Söderquist B, Mölling P. 2022. Evaluation of within-host evolution of methicillin-resistant Staphylococcus aureus (MRSA) by comparing cgMLST and SNP analysis approaches. Sci Rep 12:10541. doi: 10.1038/s41598-022-14640-w
  • 47. Chen Y, Sun L, Wu D, Wang H, Ji S, Yu Y. 2018. Using core-genome multilocus sequence typing to monitor the changing epidemiology of methicillin-resistant Staphylococcus aureus in a teaching hospital. Clin Infect Dis 67:S241–S248. doi: 10.1093/cid/ciy644
  • 48. Nübel U, Dordel J, Kurt K, Strommenger B, Westh H, Shukla SK, Zemlicková H, Leblois R, Wirth T, Jombart T, Balloux F, Witte W. 2010. A timescale for evolution, population expansion, and spatial spread of an emerging clone of methicillin-resistant Staphylococcus aureus. PLoS Pathog 6:e1000855. doi: 10.1371/journal.ppat.1000855
  • 49. Young BC, Golubchik T, Batty EM, Fung R, Larner-Svensson H, Votintseva AA, Miller RR, Godwin H, Knox K, Everitt RG, Iqbal Z, Rimmer AJ, Cule M, Ip CLC, Didelot X, Harding RM, Donnelly P, Peto TE, Crook DW, Bowden R, Wilson DJ. 2012. Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease. Proc Natl Acad Sci USA 109:4550–4555. doi: 10.1073/pnas.1113219109
  • 50. Smyth DS, McDougal LK, Gran FW, Manoharan A, Enright MC, Song J-H, de Lencastre H, Robinson DA. 2010. Population structure of a hybrid clonal group of methicillin-resistant Staphylococcus aureus, ST239-MRSA-III. PLoS One 5:e8582. doi: 10.1371/journal.pone.0008582
  • 51. Harris SR, Feil EJ, Holden MTG, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ, Bentley SD. 2010. Evolution of MRSA during hospital transmission and intercontinental spread. Science 327:469–474. doi: 10.1126/science.1182395
  • 52. Kim S, Lieberman TD, Kishony R. 2014. Alternating antibiotic treatments constrain evolutionary paths to multidrug resistance. Proc Natl Acad Sci USA 111:14494–14499. doi: 10.1073/pnas.1409800111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Bacigalupe R, Tormo-Mas MÁ, Penadés JR, Fitzgerald JR. 2019. A multihost bacterial pathogen overcomes continuous population bottlenecks to adapt to new host species. Sci Adv 5:eaax0063. doi: 10.1126/sciadv.aax0063 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Donker T, Reuter S, Scriberras J, Reynolds R, Brown NM, Török ME, James R, Network E, Aanensen DM, Bentley SD, Holden MTG, Parkhill J, Spratt BG, Peacock SJ, Feil EJ, Grundmann H. 2017. Population genetic structuring of methicillin-resistant Staphylococcus aureus clone EMRSA-15 within UK reflects patient referral patterns. Microb Genom 3:e000113. doi: 10.1099/mgen.0.000113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Murray S, Pascoe B, Méric G, Mageiros L, Yahara K, Hitchings MD, Friedmann Y, Wilkinson TS, Gormley FJ, Mack D, Bray JE, Lamble S, Bowden R, Jolley KA, Maiden MCJ, Wendlandt S, Schwarz S, Corander J, Fitzgerald JR, Sheppard SK. 2017. Recombination-mediated host adaptation by avian Staphylococcus aureus. Genome Biol Evol 9:830–842. doi: 10.1093/gbe/evx037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Wang X, Zhao H, Wang B, Zhou Y, Xu Y, Rao L, Ai W, Guo Y, Wu X, Yu J, Hu L, Han L, Chen S, Chen L, Yu F. 2022. Identification of methicillin-resistant Staphylococcus aureus ST8 isolates in China with potential high virulence. Emerg Microbes Infect 11:507–518. doi: 10.1080/22221751.2022.2031310 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Vandenesch F, Naimi T, Enright MC, Lina G, Nimmo GR, Heffernan H, Liassine N, Bes M, Greenland T, Reverdy M-E, Etienne J. 2003. Community-acquired methicillin-resistant Staphylococcus aureus carrying Panton-Valentine leukocidin genes: worldwide emergence. Emerg Infect Dis 9:978–984. doi: 10.3201/eid0908.030089 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Thwala T, Madoroba E, Basson A, Butaye P. 2021. Prevalence and characteristics of Staphylococcus aureus associated with meat and meat products in African countries: a review. Antibiotics (Basel) 10:1108. doi: 10.3390/antibiotics10091108 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Asadollahi P, Farahani NN, Mirzaii M, Khoramrooz SS, van Belkum A, Asadollahi K, Dadashi M, Darban-Sarokhalil D. 2018. Distribution of the most prevalent spa types among clinical isolates of methicillin-resistant and -susceptible Staphylococcus aureus around the world: a review. Front Microbiol 9:163. doi: 10.3389/fmicb.2018.00163 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Turner NA, Sharma-Kuinkel BK, Maskarinec SA, Eichenberger EM, Shah PP, Carugati M, Holland TL, Fowler VG Jr. 2019. Methicillin-resistant Staphylococcus aureus: an overview of basic and clinical research. Nat Rev Microbiol 17:203–218. doi: 10.1038/s41579-018-0147-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Feil EJ, Cooper JE, Grundmann H, Robinson DA, Enright MC, Berendt T, Peacock SJ, Smith JM, Murphy M, Spratt BG, Moore CE, Day NPJ. 2003. How clonal is Staphylococcus aureus? J Bacteriol 185:3307–3316. doi: 10.1128/JB.185.11.3307-3316.2003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Maiden MCJ, Jansen van Rensburg MJ, Bray JE, Earle SG, Ford SA, Jolley KA, McCarthy ND. 2013. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat Rev Microbiol 11:728–736. doi: 10.1038/nrmicro3093 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Planet PJ, Narechania A, Chen L, Mathema B, Boundy S, Archer G, Kreiswirth B. 2017. Architecture of a species: phylogenomics of Staphylococcus aureus. Trends Microbiol 25:153–166. doi: 10.1016/j.tim.2016.09.009 [DOI] [PubMed] [Google Scholar]
  • 64. Coll F, Raven KE, Knight GM, Blane B, Harrison EM, Leek D, Enoch DA, Brown NM, Parkhill J, Peacock SJ. 2020. Definition of a genetic relatedness cutoff to exclude recent transmission of meticillin-resistant Staphylococcus aureus: a genomic epidemiology analysis. Lancet Microbe 1:e328–e335. doi: 10.1016/S2666-5247(20)30149-X [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Forde BM, Bergh H, Cuddihy T, Hajkowicz K, Hurst T, Playford EG, Henderson BC, Runnegar N, Clark J, Jennison AV, Moss S, Hume A, Leroux H, Beatson SA, Paterson DL, Harris PNA. 2023. Clinical implementation of routine whole-genome sequencing for hospital infection control of multi-drug resistant pathogens. Clin Infect Dis 76:e1277–e1284. doi: 10.1093/cid/ciac726 [DOI] [PubMed] [Google Scholar]
  • 66. Payne M, Hu D, Wang Q, Sullivan G, Graham RM, Rathnayake IU, Jennison AV, Sintchenko V, Lan R. 2024. DODGE: automated point source bacterial outbreak detection using cumulative long term genomic surveillance. Bioinformatics 40:btae427. doi: 10.1093/bioinformatics/btae427 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Permana B, Harris PNA, Roberts LW, Cuddihy T, Paterson DL, Beatson SA, Forde BM. 2024. HAIviz: an interactive dashboard for visualising and integrating healthcare-associated genomic epidemiological data. Microb Genom 10:001200. doi: 10.1099/mgen.0.001200 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Kaur S, Payne M, Partridge SR, Sintchenko V, Lan R. 2024. Global genomic dissection of antimicrobial resistance in Salmonella Typhimurium. bioRxiv. doi: 10.1101/2024.05.12.593721 [DOI]

Associated Data

Supplementary Materials

Figure S1. msystems.00935-25-s0001.pdf.
Distribution of year spans of major STs at each MGT level.
DOI: 10.1128/msystems.00935-25.SuF1

Figure S2. msystems.00935-25-s0002.pdf.
The size distribution of MGT2 to MGT8 STs assigned to MGT1 ST8 isolates.
DOI: 10.1128/msystems.00935-25.SuF2

Figure S3. msystems.00935-25-s0003.pdf.
Comparing MLST ST8 classification between MGT and Spa typing.
DOI: 10.1128/msystems.00935-25.SuF3

Figure S4. msystems.00935-25-s0004.pdf.
Illustration of hierarchical inconsistency at MGT3 in six isolates.
DOI: 10.1128/msystems.00935-25.SuF4

Supplemental text. msystems.00935-25-s0005.docx.
Supplemental methods.
DOI: 10.1128/msystems.00935-25.SuF5

Supplemental data sets. msystems.00935-25-s0006.xlsx.
Data sets S1 to S9.
DOI: 10.1128/msystems.00935-25.SuF6

Table S1. msystems.00935-25-s0007.xlsx.
Criteria for quality-filtering assemblies.
DOI: 10.1128/msystems.00935-25.SuF7

Table S2. msystems.00935-25-s0008.txt.
The quality-filtered species data set with MGT assignment and available metadata.
DOI: 10.1128/msystems.00935-25.SuF8

Data Availability Statement

All data used in this study are publicly available, and all MGT data are available at mgtdb.unsw.edu.au/Staphylococcus.


Articles from mSystems are provided here courtesy of American Society for Microbiology (ASM)