Journal of Clinical Microbiology. 2021 Nov 18;59(12):e00135-21. doi: 10.1128/JCM.00135-21

Framing Bacterial Genomics for Public Health (Care)

Alison Laufer Halpin, L Clifford McDonald, Christopher A Elkins
Editor: Romney M Humphries
PMCID: PMC8601216  PMID: 34076468

ABSTRACT

Advancements in comparative genomics have generated significant interest in defining applications for health care-associated pathogens. Clinical microbiology, however, relies on increasingly automated platforms to quickly identify pathogens, resistance mechanisms, and therapy options within Clinical Laboratory Improvement Amendments (CLIA)- and FDA-approved frameworks. Additionally, and most notably, health care-associated pathogens, especially those that are resistant to antibiotics, represent a diverse spectrum of genera harboring complex genetic targets, including antibiotic, biocide, and virulence determinants that can be highly transmissible and, at least for antibiotic resistance, serve as potential targets for containment efforts. U.S. public health investments have focused on rapidly detecting outbreaks and emerging resistance in health care-associated pathogens using reference, culture-based, and molecular methods that are distributed, for example, across national laboratory network infrastructures. Herein we describe the public health applications of genomic science that are built from the top-down for broad surveillance, as well as the bottom-up, starting with identification of infections and infectious clusters. For health care-associated, including antimicrobial-resistant, pathogens, we propose a combination of top-down and bottom-up genomic approaches leveraged across the public health spectrum, from local infection control, to regional and national containment efforts, to national surveillance for understanding emerging strain ecology and fitness of health care pathogens.

KEYWORDS: DNA sequencing, antibiotic resistance, drug resistance mechanisms, health care, pathogens, public health

INTRODUCTION

Prevalence survey data estimate that hospitalized patients in the United States endured 687,200 health care-associated infections (HAIs) in 2015 and that ∼11% of affected patients died during their hospital stay (1). A diverse range of pathogens cause HAIs; many harbor intrinsic and/or acquired multidrug resistance genes of public health import (2). Antibiotic resistance is a critical global health threat, and health care settings serve as amplifiers not only of antibiotic resistance but also of transmission of multidrug-resistant pathogens. Given the number of HAIs each year and the pathogen diversity, rapid and affordable laboratory methods are integral not only to directing clinical care but also to supporting public health in detecting and investigating outbreaks, tracking epidemiological trends, and informing local, regional, and national containment and prevention efforts.

Though next-generation sequencing still presents challenges, it has repeatedly demonstrated capacity for improving public health; for example, in prediction of tuberculosis drug susceptibility patterns (3). Beyond tuberculosis, much interest and large-scale infrastructure for sequencing bacterial pathogens in the United States stemmed from the foodborne arena as CDC’s PulseNet transitioned from pulsed-field gel electrophoresis (PFGE) to whole-genome sequencing (WGS), and FDA-developed GenomeTrakr (4, 5). These approaches focused on moving real-time molecular typing to a replete data set for the purpose of flagging outbreaks caused by a defined set of pathogens distributed over space and time and linking these outbreaks to an increasingly global food supply with many potential environmental reservoirs. WGS is a significant advancement over previous molecular typing schemes, including PFGE and multilocus sequence typing (MLST). These earlier schemes all have significant caveats for phylogenetic inferences and various degrees of discriminatory power, often without the ability for direct cross-method comparisons. In contrast, WGS provides a convergent, single data stream that can assume any number of reductionist approaches for the genetic determinants of interest (e.g., strain typing for lineage determination, specific gene characteristics for phenotypes like antibiotic resistance, single nucleotide variants [SNVs] for transmission analysis).

The molecular testing spectrum, which WGS is poised to replace, reflects a progression of successive tools created and commercialized to meet specific ends. Given the often-diverse goals of these tools, the question becomes how much molecular depth is necessary to serve public health, particularly given the scale and scope of HAIs. The answer depends on the intended use of the resultant information, such as informing public health surveillance (and associated measures) or establishing epidemiological relatedness that can be actionable for mitigating an outbreak, limiting transmission, and/or containing emerging threats. The implications for the use of molecular tools are expectedly different and highly dependent on setting (Fig. 1). For example, a novel resistance or virulence phenotype, or its genetic underpinning (as detected by PCR/reverse transcription-PCR [RT-PCR]), may be useful for surveillance at a global or regional level but is most significant (and timely) for clinical decision-making at the individual level, usually in directing treatment or containment efforts. However, such genetically crude measures have little relevance for inferring epidemiologic relatedness, except in such instances where multiple temporally related cases first appear in a population of dissimilar (using the same crude measure) cases. Interestingly, legacy molecular typing methods, such as MLST or PFGE, were developed to provide subspecies stratification and identification, with broad value across the surveillance landscape for monitoring movement and spread. However, for establishing relatedness, these methods offer only contextual or corroborating evidence, not definitive evidence.
In contrast, WGS-derived stratifications such as core genome MLST (cgMLST) and whole genome MLST (wgMLST) straddle broad evolutionary relatedness, beyond that of traditional MLST, and yet provide highly discriminatory data to infer epidemiological relationships within (and between) cases and facilities. Considering that these two approaches are fundamentally WGS derived, as an extension, a fully discrete and replete data set from WGS can be distilled to any level of interrogation needed for surveillance or epidemiological pursuits. However, selecting appropriate sequencing and analysis methods is critical and requires consideration of sensitivity, specificity, timeliness, and cost, each bearing different weight depending on the public health question (Fig. 1).

FIG 1.

Relationships of various clinical laboratory methodologies as a function of setting, measured by relative intensity for surveillance efforts and epidemiological utility. Health care-associated identification of and response to emerging and novel public health threats incorporate lines of investigation that serve inherently different purposes. Laboratory methods with different resolving powers have traditionally complemented and bolstered epidemiology with the potential to improve public health and lead epidemiological associations. Phenotypic and molecular genetic tools are employed at all levels of granularity but are fit-for-purpose depending on the context of the relevant public health and epidemiological question(s). Ultimately, investment in WGS is envisioned to drive actionable metrics for establishing a risk framework for prioritizing sequence-type fitness, local and regional tracking, clonal dissemination, and emerging resistance. Advancing whole-genome DNA sequencing as a single data source will benefit infection prevention and containment.

Major challenges stem from the diversity of HAI pathogens, health care settings, and patient populations. HAI pathogen diversity is extensive, covering hundreds of genera, including novel species and strains, with various levels of available sequence data for comparisons. Health care settings are diverse (e.g., acute care, long-term care, urgent care, outpatient clinics, home health), each with unique facility design factors. Finally, patient populations have different HAI risks and exposures; each patient undergoes essentially a personalized treatment plan and procedures.

Most HAI outbreaks are local (e.g., single facility) or regional (e.g., multiple interconnected facilities), and not often national or international except for widely distributed contaminated products or devices (e.g., global outbreak of infections from Mycobacterium chimaera contamination of heater-cooler devices [6]; multistate outbreak of Burkholderia cepacia infections from contaminated docusate [7]). Local and regional outbreaks require direct contextual comparisons to elucidate potential transmission pathways and importation events, whereas broad descriptive molecular epidemiology of HAI pathogens is essential for public health agencies to understand emerging pathogens and their genomic, geographic, and temporal trends (Fig. 1). Surveillance and response must track transmission within patient-movement networks on all scales (e.g., local, regional, national, global). Finally, across all levels of analysis, an understanding of antimicrobial resistance and other virulence mechanisms that transcend bacterial genera and strains remains a key component for HAI containment and response.

Given these considerations, there exists no one-size-fits-all application of WGS for public health. Thus, for effective public health action for HAIs, two broad functions of WGS emerge: (i) describing HAI pathogens at a global level for public health surveillance and molecular epidemiology (i.e., top-down); and (ii) high-resolution characterization of HAI pathogens for elucidating whole-genome relatedness as well as resolving mobile genetic elements in health care-associated outbreaks and endemic facility transmission (e.g., Clostridioides difficile, methicillin-resistant Staphylococcus aureus [MRSA]) (i.e., bottom-up). Combining approaches allows us to leverage a local-global paradigm for public health response and action, strategic planning, and tool development through the identification of and response to novel and emerging public health threats (Fig. 1).

FUNCTIONS OF BACTERIAL GENOMICS

Top-down genomic profiling and national surveillance.

Isolates unlinked to an outbreak represent the bulk of HAIs reported (8). Public health institutions, like CDC, have a responsibility to provide landscape, high-level analyses to support molecular epidemiology efforts, as well as to detect and investigate new and emerging clusters and threats (e.g., novel carbapenemases [9], hypervirulence [10], and linkage of seemingly unrelated events between facilities and regions [11]) at regional, national, and even international levels. By leveraging top-down approaches, such as whole-genome or core genome MLST tools, over more discriminatory analyses (e.g., high-quality SNV [hqSNV] calling), we can easily and rapidly evaluate endemic strains in specific regions, which may have been long circulating (Fig. 1).

Surveillance conjures images of comprehensive testing. However, given the burden, diversity, and often novel nature of HAI pathogens, prospective genomic surveillance of all HAI pathogen isolates is not a rational allocation of limited resources. General considerations for genomic surveillance decisions include incorporating local epidemiology to prioritize where and when to routinely apply WGS such that the information generated can be used to prevent infections and deaths (see the “Science-Public Health Interface” section below). Additionally, nationally representative isolates from preexisting surveillance infrastructure can provide important context when a novel or emerging threat is identified at a local or regional level. The U.S. Centers for Disease Control and Prevention (CDC) Division of Healthcare Quality Promotion (DHQP) has built or participates in several efforts that can support response to emerging threats and/or serve as stable, longitudinal data sources: the Antibiotic Resistance Laboratory Network (AR Lab Network; https://www.cdc.gov/drugresistance/solutions-initiative/ar-lab-network.html), DHQP Sentinel Surveillance, and the health care-associated infection component of CDC’s Emerging Infections Program (EIP; https://www.cdc.gov/hai/eip/index.html).

The AR Lab Network expands national capacity to rapidly detect and respond to antibiotic resistance (12), and it serves as a valuable, rapid infrastructure for performing special studies in response to emerging threat signals detected via any source. Starting in 2019, AR Lab Network Regional Laboratories and a small number of other state public health laboratories are being supported to incorporate HAI pathogen WGS into their standard workflows to further our capacity to detect novel and emerging HAI and antibiotic resistance threats. This includes revealing nonendemic but known carbapenemases not detected by existing nucleic acid amplification diagnostic test platforms, as well as actively searching for putative novel metallo-beta-lactamase genes [e.g., HMB-2 (9)] when comparing an isolate’s sequence data to antimicrobial resistance gene databases.

Complementing AR Lab Network’s focus on containing emerging antibiotic resistance threats, the DHQP Sentinel Surveillance system obtains isolates of both susceptible and resistant phenotypes among relevant HAI pathogens. The DHQP Sentinel Surveillance platform consists of a group of hospitals submitting isolates of specified species throughout the year and without regard to outbreak status. Such systematic sampling of HAI pathogens enhances understanding of the microbial ecology and molecular epidemiology of strain types within health care settings and the association of specific strains with emerging resistance phenotypes.

Although less agile, the health care-associated infection-community interface component of the EIP collects additional information, including links to epidemiological and clinical data, as well as tracking incidence in relatively stable catchment populations over time (13). Given the detailed epidemiological data collected by “boots on the ground” EIP surveillance officers conducting population-based surveillance, WGS of the associated isolates provides useful information, including changes in strain incidence by epidemiological category (i.e., hospital onset, community onset but health care facility associated, and community associated) over time, as well as geographic genomic diversity snapshots, and detection of novel or emerging resistance determinants (9, 11).

Bottom-up approaches for outbreak response.

Rapid response is crucial for containing antimicrobial-resistant and other HAI threats. Recognized HAI outbreaks represent the minority of HAIs (8). HAI clusters and outbreaks are often difficult to define, and thus, isolates go unrecognized as being part of a cluster or outbreak. Current efforts on automated detection of hospital outbreaks improve our ability to detect these events (8). Expanding and developing implementation of WGS for HAI pathogens will likely detect an increasing number of previously unrecognized clusters by providing highly discriminatory genomic evidence through bottom-up approaches (e.g., hqSNVs, wgMLST), indicating cases are related over larger geographic and/or temporal spans than would typically be detected through other methods (e.g., laboratory record review) (Fig. 1). The primary objective of sequencing in outbreak investigations is often to identify local transmission, versus multiple unrelated importations or a combination of the two (Fig. 1). Additional value is added by discriminatory sequencing analyses for assessing presence and transmission of antimicrobial resistance gene variants and other virulence genes. Sequencing has demonstrated its value in outbreak response when applied in conjunction with the clinical, epidemiology, and infection control data, providing detailed information on relatedness between cases and suspected sources (e.g., medical device, contaminated product or environmental reservoir) (see Table S1 in the supplemental material).

A challenge of highly discriminatory outbreak data, for which we do not know a priori the complete transmission pathway, is that given the diversity of HAI pathogens, health care settings, and patient populations, defining standard high-quality SNV cutoffs for clusters remains aspirational. Identifying hqSNVs is highly dependent on bioinformatics methods and reference selected for analyses. Efforts exist to establish a rate of expected SNV accumulation (i.e., the molecular clock) for different bacterial species in the laboratory setting; for example, sequencing of serial passaging in the laboratory. However, the utility and reliability are unclear, and almost certainly do not recapitulate real world evolution, especially as it involves transmission between hosts and, sometimes, the patient care environment with various ecologic pressures (e.g., antibiotics, biocides). Moreover, we are frequently the blind men describing the elephant in outbreak investigations. Isolates available for sequencing from an outbreak investigation represent only a fraction of all isolates from animate and inanimate sources involved in all transmission pathways, with key missing links from health care personnel, potential environmental reservoirs (14), and asymptomatically colonized individuals (15). Moreover, the time elapsed between source (human or environmental) and a patient presenting with infection is generally unknown. A spore-forming anaerobe, such as C. difficile, a common cause of HAIs responsible for 223,900 infections and 12,800 deaths annually in the United States (2), can further confound the association of elapsed time between transmission events and evolutionary distance as spores do not evolve while dormant for weeks or months. 
Furthermore, information is not readily available regarding the effect of underlying host status (including host genetic background and colonization status) and other environmental pressures (e.g., antibiotic exposures within the patient, disinfectants, and desiccation) on SNV accumulation in the pathogen.
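
As a rough illustration of the molecular clock reasoning above, assume a strict clock with a constant substitution rate. The symbols $\mu$ (substitutions per site per year), $G$ (number of genome positions compared), $\Delta t$ (years between sampling of a source isolate and a recipient isolate), and $t_{\mathrm{dormant}}$ (time spent as a non-replicating spore) are introduced here for illustration only and do not come from the cited studies.

```latex
\mathbb{E}[d_{\mathrm{SNV}}] \approx \mu \, G \, (\Delta t - t_{\mathrm{dormant}}),
\qquad 0 \le t_{\mathrm{dormant}} \le \Delta t
```

Under this simplification, any unobserved dormancy lowers the expected SNV count for a given elapsed time, so a small distance does not by itself imply recent transmission. The single constant $\mu$ also ignores the host and environmental pressures discussed above, which is one reason fixed SNV cutoffs remain difficult to justify.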

Finally, the most common approach to characterizing the bacteria from an infected or colonized patient is to consider the individual to be harboring a single clone. However, genomics has demonstrated there is often within-host evolution resulting in intrahost genetic diversity—a “cloud” of a pathogen harbored within a single individual (16). Pathogens evolve and change independently and variably, but sequencing approaches attempt to evaluate these in a single analysis as if events occur simultaneously and/or in direct succession. All these factors should be incorporated into any application of sequencing to HAI pathogen outbreaks.

SCIENCE-PUBLIC HEALTH INTERFACE

Resolving HAI outbreaks.

WGS provides a high-resolution view of the microevolution of pathogens. The role of public health is translating this into data for action through fit-for-purpose approaches. Likely there is no single correct bioinformatics method for all outbreak investigations. However, transparency is key. Therefore, we consistently follow certain vital criteria in all outbreak investigation-related WGS analyses, including thresholds for data quality (e.g., genome coverage minimums, ceiling for number of contigs comprising the assembly), and we communicate standard metrics (e.g., hqSNVs, core genome size) in all publications and partner communications (6, 7, 11, 17).
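
The data quality thresholds mentioned above can be sketched as a simple filter. This is a minimal Python illustration, not the authors' pipeline: the threshold values, the `AssemblyQC` record, and the function names are hypothetical placeholders, since the article does not state specific cutoffs.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the article gives no values.
MIN_COVERAGE = 40.0  # minimum mean read depth across the genome
MAX_CONTIGS = 500    # ceiling on the number of contigs in the assembly


@dataclass
class AssemblyQC:
    isolate_id: str
    mean_coverage: float
    n_contigs: int


def passes_qc(rec, min_coverage=MIN_COVERAGE, max_contigs=MAX_CONTIGS):
    """True when the assembly meets both the coverage floor and contig ceiling."""
    return rec.mean_coverage >= min_coverage and rec.n_contigs <= max_contigs


def split_by_qc(records):
    """Return (passed_ids, failed_ids), preserving input order."""
    passed = [r.isolate_id for r in records if passes_qc(r)]
    failed = [r.isolate_id for r in records if not passes_qc(r)]
    return passed, failed


# Made-up example records.
records = [
    AssemblyQC("iso1", mean_coverage=85.0, n_contigs=120),
    AssemblyQC("iso2", mean_coverage=22.0, n_contigs=140),  # too shallow
    AssemblyQC("iso3", mean_coverage=90.0, n_contigs=900),  # too fragmented
]
passed, failed = split_by_qc(records)
```

Keeping the thresholds as named parameters makes it straightforward to report them alongside results, in line with the emphasis on transparency.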

To elucidate relatedness and possible transmission events, we first use phylogenetically informative hqSNVs from pairwise distances between sequenced isolates identified by read mapping to a selected outbreak isolate genome that serves as the "reference" (Fig. 2). The reference is the centroid genome among all outbreak-related genomes, as identified by mash distances (18). By using an isolate from the outbreak as the reference for reference-based hqSNV calling, we avoid the challenges of reference-free approaches (e.g., computational demand, turnaround time). Second, we incorporate a "core genome" metric; this represents the portion of the outbreak reference genome used for calculating pairwise distances (Fig. 2). The size of the core genome relative to the reference genome informs more accurate transmission analyses and improves public health response. We always report the core genome metric with pairwise distances (i.e., hqSNVs). Most phylogenetic tools only output hqSNVs, limiting inferences that can be made; however, tools like SNVPhyl calculate the core genome size (19). Looking for only low hqSNV counts, without consideration of core genome size, can incorrectly indicate closely related isolates if the isolate set core genome size is relatively small (e.g., <90%). Applying standardized approaches will improve public health actions, communications, collaborations, and response time.
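
The reference selection and core genome reporting described above might be sketched as follows. This is an illustrative Python example under stated assumptions: "centroid" is taken here to mean the isolate with the smallest summed distance to all others, the distances are assumed to have been computed elsewhere (e.g., by Mash), and the function names and example values are hypothetical. The 0.90 default mirrors the "<90%" example in the text and is not a validated cutoff.

```python
def pick_centroid(ids, dist):
    """Return the isolate with the smallest summed distance to all others.

    dist[a][b] is a symmetric pairwise distance computed elsewhere.
    Ties go to the isolate listed first in ids.
    """
    return min(ids, key=lambda a: sum(dist[a][b] for b in ids if b != a))


def report_pair(hqsnvs, core_positions, reference_length, min_core_fraction=0.90):
    """Report an hqSNV distance together with the core genome fraction.

    low_core_warning marks comparisons where a low hqSNV count could be
    misleading because little of the reference was actually compared.
    """
    core_fraction = core_positions / reference_length
    return {
        "hqsnvs": hqsnvs,
        "core_fraction": core_fraction,
        "low_core_warning": core_fraction < min_core_fraction,
    }


# Made-up distances among three presumptive outbreak isolates.
example_dist = {
    "iso1": {"iso1": 0.0, "iso2": 0.001, "iso3": 0.004},
    "iso2": {"iso1": 0.001, "iso2": 0.0, "iso3": 0.002},
    "iso3": {"iso1": 0.004, "iso2": 0.002, "iso3": 0.0},
}
centroid = pick_centroid(["iso1", "iso2", "iso3"], example_dist)

# Few hqSNVs, but only 82% of the reference was compared: flag it.
report = report_pair(hqsnvs=3, core_positions=4_100_000, reference_length=5_000_000)
```

The sketch deliberately applies no SNV cutoff; it only pairs each distance with the fraction of the reference it was computed over, so that a low count on a small core is not read as evidence of relatedness.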

FIG 2.

(A) For reference-based hqSNV calling to explore phylogenetic relationships in outbreak investigations, we use the mash distances among all presumptive outbreak-related isolates to identify the centroid isolate which subsequently serves as the reference for pairwise distance calculations. (B) Visual representation of the core genome metric. In outbreak investigations, a low core genome, as depicted here, even in the context of low hqSNV counts, suggests one or more presumptive outbreak-related isolates are unrelated.

Core genome MLST (cgMLST) schemes, which evaluate only coding regions and often exclude half the genome or more, do not always provide sufficient discriminatory power to illuminate potential transmission pathways or potential common sources. However, “turning the dial back” from hqSNVs to these slightly less discriminatory analyses provides an understanding of the genome landscape and informs triage of additional hqSNV analyses so as to better resolve suspected isolates for their empirical relationship to the outbreak, especially in investigations with multiple, distinct strains circulating (11). In addition, rapid WGS analyses that provide top-down global perspective (e.g., mash distance trees), have worked well to demonstrate uniqueness in outbreaks caused by completely novel strains or species (Fig. 1 and Table S1).

Across HAI organisms.

Since 2014, DHQP has performed WGS and/or bioinformatics analyses for more than 100 outbreaks, including more than 40 different bacterial species from more than 20 genera. Experience and literature yield lessons learned for sequencing HAI pathogens. The increasing application of next-generation sequencing to bacterial species landscapes has implications in defining and redefining systematics but also has revealed significant differences relative to and within species. Most importantly, all sequencing analyses with public health objectives need to be performed in the context of available clinical and epidemiological data. See Table 1 for an overview of the challenges and opportunities presented by HAI pathogens.

TABLE 1.

HAI and antimicrobial-resistant pathogen landscaping assessment for directing public health genomic investments (a)

Pathogen or pathogen group | Legacy molecular typing methods | Current state | Desired future state | Application | Barriers
Overall Varies by organism Pathogen diversity
Technically challenging workflows and analysis
WGS triage driven by organism ID, available epidemiology, phenotype
Lack of uniform inter- and intraspecies typing schemes and nomenclature
Connect WGS data with laboratory, epidemiology, and clinical data
Relevant prospective data over time and space for emerging threats
Surpass basic species identification to better direct resources
Reveal transmission pathways
Define molecular epidemiology at global vs local levels
Detect emerging resistance phenotypes, and virulence with potential to inform treatment
Describe changing ecological fitness, One Health, and infection control drivers
wgMLST or cgMLST schemes in driving top-down and bottom-up phylogenies
Burden precludes real-time sequencing using current technology
Lack of convergence on available platforms
Limited bioinformatics capacity in public health laboratories
For antibiotic resistance, AST platforms are fit-for-purpose (rapid, cost-effective, interpretive, clinically relevant), especially as predictions of phenotype from genotype remain in relative infancy
S. aureus (including MRSA) PFGE
MLST
Clonal complexes
SCCmec type (MRSA)
No standardized approach for WGS relatedness (SNV cutoff consensus)
Apply WGS when initial infection control efforts fail (outbreak setting)
Directed WGS subtyping of relevant public health/clinical isolates
Harmonize subtyping methods for epidemiology and surveillance under WGS umbrella
Bottom-up transmission dynamic is primary objective
Epidemiology drivers (IV drug users, animal contact, etc.)
Large burden in health care
Asymptomatic carriage
C. difficile Ribotyping
PFGE
MLST
Clonal complexes
MLVA
Cross-walking ribotype-MLST-cgMLST Identifying resistance markers
Harmonize subtyping methods for epidemiology and surveillance under WGS umbrella
Top-down national landscape remains primary objective
Associate epidemiological risk factors with sequence types
Health care burden
Asymptomatic carriage
Environment and spores hinder determination of transmission/relatedness
Lack universal, efficient method for typing and resistance testing of isolates
Carbapenem-resistant Enterobacteriaceae MLST
PFGE
Big three-centric (Klebsiella pneumoniae, Escherichia coli, Enterobacter spp.)
Sequencing primarily driven by carbapenem resistance phenotype
Rapid molecular subtyping for targeting WGS coverage of regional and local trends
Evaluation and identification of new and existing carbapenemase variants
Top-down and bottom-up pursuits for surveillance and containment of key antibiotic resistance
Identify novel and emerging resistance
Determine public health response to hypervirulence
Identification of high-risk sequence types that are more transmissible and/or more virulent
Confounding inter- and intraspecies plasmid transmission
Current resistance phenotype drivers lack depth for tailoring WGS efforts
Carbapenem-resistant Pseudomonas aeruginosa MLST Sequencing primarily driven by carbapenemase production Rapid molecular subtyping for targeting WGS coverage of regional and local trends Top-down and bottom-up pursuits for surveillance and containment of key antibiotic resistance
Identify novel and emerging resistance
Identification of high-risk sequence types that are more transmissible and/or more virulent
Low prevalence of carbapenemase producers
Can be difficult to sequence
Carbapenem-resistant Acinetobacter baumannii MLST
PFGE
WGS first and only strategy for typing Stratified landscape analyses Top-down and bottom-up pursuits for surveillance and containment of key antibiotic resistance
Control of OXA-type resistance
Identification of high-risk sequence types as highly fit vehicles for transmission
Competing MLST schemes (Oxford and Institut Pasteur) and granularity
A. baumannii is highly recombinant and confounds typing schemes
Extended-spectrum beta-lactamase-producing Enterobacteriaceae NA – antimicrobial susceptibility profile Limited efforts to date
E. coli centric
Stratified landscape analyses for community-dissemination/drivers/sources and One Health context
Understand sequence type diversity/distribution and AST profile stratified by health care vs environmental reservoirs
Improved knowledge base for directing efforts at control
Determine public health response to hypervirulence
Identification of high-risk sequence types that are more transmissible and/or more virulent
High burden
Nontuberculous mycobacterium 16S
rpoB
Diversity
Publicly available genomic depth lacking
“Sequence first” approach to combine strain typing, transmission analysis, and species
AST from genotypic data
Rapid genomic subtyping for identification all in one and characterization
Reduce turnaround for slower-growing species (i.e., sequence from primary growth in liquid media)
Metagenomic application – direct detection from environment/clinical specimen
Very few reference genomes with wide species diversity
Difficult to culture (slows response time in outbreaks)
Many are slow growing
Vancomycin-resistant enterococci (VRE) Clonal complexes
MLST
MLVA
PFGE
Limited efforts Stratified landscape analyses
AST from genotypic data
Identify novel and emerging resistance
Determine public health response to hypervirulence
Identification of high-risk sequence types that are more transmissible and/or more virulent
Extreme genome fluidity E. faecalis and E. faecium often grouped together but are different in attributes (clonality, AR presentation, health care presentation)
New and emerging threats NA Surveillance-driven Directed real-time approaches for WGS
Prospective and historical screening for new virulence factors (including AR genes)
Detection and containment Difficult to detect unknown genetic targets
Limited resources for universal real-time WGS
Timeliness of bioinformatic and functional analyses
Plasmids NA Nomenclature unclear and based loosely on available public database submissions Ability to resolve plasmids with rapid, inexpensive methods
Track plasmids of similar bacterial MLSTs of public health import
Identifying and characterizing conserved plasmids circulating
Inter- and intraspecies transmission Short-read technologies unable to resolve plasmids
Lacking high fidelity long-read sequencing technology with sufficient confidence to call high-quality SNVs
(a) This table is not an exhaustive review of all HAI pathogens, targets of interest, methods, barriers, etc. Abbreviations: ID, identifier; AST, antimicrobial susceptibility testing; SCCmec, staphylococcal cassette chromosome mec element; IV, intravenous; MLVA, multiple locus variable number of tandem repeat analysis; AR, antimicrobial resistance; NA, not available.

Methicillin-resistant Staphylococcus aureus (MRSA) is a highly conserved stable organism responsible for 323,700 hospitalizations and 10,600 deaths annually in the United States (2). Outbreaks of MRSA benefit from WGS over traditional typing methods, especially in high-risk settings like neonatal intensive care units (NICUs). Given its natural carriage/colonization in and on humans, MRSA creates significant infection control issues in health care settings. Hence, informative and reliable genomic thresholds for this species with public health utility are needed. Even still, S. aureus is a particularly intractable organism for such analysis given its highly clonal structure. Literature surveys suggest a broad sliding scale of SNV thresholds for relatedness, ranging from <10 SNVs to >70 reported in outbreak investigations or from individuals sampled over time and/or from different body sites (20-22). Since hqSNV cutoffs may confound the desire to reach outbreak resolution and transmission-level directionality, it may be necessary (and fitting) to scale analyses to general “diversity indices” with interpretable criteria suggestive of a need for enhanced infection control. Such an approach may argue for prospective or periodic sampling and sequencing of a given NICU patient population to establish baseline (expected) strain diversity (i.e., diversity indices) and determine informative and actionable metrics for diversity skewing and/or convergence under outbreak-type scenarios indicative of ongoing transmission. Most importantly, the primary objective is infection prevention. The first efforts wherever an MRSA outbreak is suspected should concentrate on improving infection control, particularly hand hygiene and possibly barrier precautions, to stop transmission that infrequently involves an environmental reservoir.
When accounting for limited resources and current turnaround time, WGS is a secondary tool to rapid methods for species identification and resistance detection. If cases continue, sequencing can be valuable to understand transmission pathways and the origin of infections (i.e., multiple importations, person-to-person transmission, persistently colonized health care personnel, or combination thereof).
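As a rough illustration of how an SNV threshold and a diversity index relate, the following minimal Python sketch counts pairwise SNV differences between aligned sequences, groups isolates by single linkage at a chosen threshold, and summarizes the groups with Simpson's index of diversity. Everything here is an assumption made for illustration: the toy sequences, the isolate names, the threshold of 1 SNV, and the choice of Simpson's index are not taken from the studies cited above, and a real analysis would start from a validated hqSNV pipeline.

```python
from collections import Counter
from itertools import combinations


def snv_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences, ignoring gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def single_linkage_clusters(seqs: dict, threshold: int) -> dict:
    """Map each isolate to a cluster label; isolates within `threshold` SNVs are linked."""
    parent = {name: name for name in seqs}

    def find(n):
        # Follow parent pointers to the cluster root, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if snv_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)
    return {name: find(name) for name in seqs}


def simpson_diversity(labels) -> float:
    """Simpson's index of diversity: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    counts = Counter(labels)
    n = sum(counts.values())
    if n < 2:
        return 0.0
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical toy alignment (10 core-genome positions per isolate).
isolates = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTACGAAT",
    "D": "TTGTACCTGC",
}
clusters = single_linkage_clusters(isolates, threshold=1)  # threshold is arbitrary here
diversity = simpson_diversity(clusters.values())
print(len(set(clusters.values())), diversity)
```

Tracking such an index over periodic sampling rounds, rather than applying a single fixed cutoff, is one way the baseline-versus-skew comparison described above could be made concrete; a falling value would indicate convergence toward fewer strains.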

Certain pathogens, such as nontuberculous mycobacteria (NTM), might be candidates for routine genomic surveillance to identify species and detect emerging clusters or outbreaks. NTM are a group of environmental bacteria known to form biofilms and contaminate medications, products, and medical devices. NTM infections are exceedingly difficult to treat and require lengthy treatment. Currently, a small number of states in the United States have notifiable reporting of NTM. Recent outbreaks from contaminated medical devices and the anecdotal experience in the United Kingdom and globally (6, 23) have demonstrated the potential benefit of prospective NTM sequencing for early detection of and rapid response to locally or nationally disseminated outbreaks.

REAL-TIME CLINICAL APPLICATIONS

Genotype-phenotype.

Real-time PCR-based assays exist for detection of genes encoding a variety of resistance mechanisms (e.g., common carbapenemases in Enterobacterales, methicillin resistance in S. aureus). Although we recognize the appeal in a sequence-first approach and recognize groups are moving in this direction, there are several considerations that make this a risky transition in the health care setting. Unlike tuberculosis, which is highly conserved and evolves slowly, pathogens in the health care setting are diverse and rapidly evolving in response to the stresses encountered (e.g., antibiotics, environmental disinfectants, or desiccation); mobile genetic elements allow for rapid horizontal sharing of genetic material, including antimicrobial resistance genes. Moreover, mechanisms of resistance to some classes of drugs are not clearly defined. For example, while acquired carbapenemases in Pseudomonas aeruginosa are clearly significant in conferring resistance to carbapenems, there are additional mechanisms that complicate the picture. Chromosomal changes mediating various combinations of porin, efflux pump, and AmpC expression (24) are often the drivers. In other instances, the mechanism may not be detectable by WGS (e.g., inducible resistance mechanisms) or the genetic mechanism is completely unknown. Traditional phenotypic antimicrobial susceptibility testing is available in clinical laboratories, with faster turnaround times than current sequencing efforts (25). We could not only waste precious time with incorrect antimicrobial susceptibility predictions in highly compromised patients but also miss emerging mechanisms and phenotypes altogether by relying primarily on genomic data. Use cases where early investments in genotype-phenotype predictions might be most beneficial are for those organisms in which traditional antimicrobial susceptibility testing is technically more challenging or involves slow-growing organisms (e.g., C. difficile or some slow-growing NTMs).

Metagenomics and diagnostics.

In 2017, CDC launched the Containment Strategy, which focuses on health departments and health care facilities working together to aggressively respond to and control transmissible antibiotic resistance threats (12). A central element is identifying colonized contacts who might serve as a silent reservoir for ongoing transmission. When a target of public health import is identified from a clinical culture, screening is conducted to identify contacts of the index patient who are asymptomatically colonized. A major hurdle to this approach is the need for rapid, validated, approved diagnostic tests for colonization screening, which do not exist for all known or yet to be identified antibiotic resistance genes. Clinical metagenomic applications, as well as highly multiplexed amplicon sequencing panels (more sensitive, but less flexible than metagenomic approaches), hold promise for addressing this need. Applications for infectious diseases have been documented in cases of severe disease or in retrospective studies, most with confirmatory testing by other methods (26). In addition, recent validation of a cell-free DNA sequencing test for clinically relevant pathogens in plasma has been reported in a Clinical Laboratory Improvement Amendments (CLIA) laboratory for a cohort of patients with a sepsis alert (27). However, in these examples, the targets are less dynamic than emerging antibiotic resistance determinants, with specimen sources of relatively low microbial complexity (i.e., blood or cerebrospinal fluid [CSF] versus stool or wound). Without a flexible, approved, culture-independent, highly multiplexed amplicon panel or metagenomics approach that can accommodate new and changing antibiotic resistance targets to detect asymptomatically colonized contacts, infection control interventions are incomplete and less effective, and ongoing transmission can occur. This may require a different approach to the CLIA framework for diagnostic development and validation, including the use of documented, approved database query processes and dynamic databasing with CLIA-approved methods for improving databases in real time. Use cases involving specimens of relatively high microbial complexity, such as stool, could be optimized if combined with other metagenomically derived data such as a microbiome complexity or dominance index that predicts risk of infection with a given pathogen (28). Ultimately, the desired end state of employing such cutting-edge technology is to avoid delays in response that can result in missed opportunities for instituting effective prevention precautions in health care facilities and, more importantly, to realize gains in overall containment efforts where time is a critical factor.
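To make the idea of a complexity or dominance index concrete, the following Python sketch computes two common ecological summaries from per-taxon read counts: the relative abundance of the single most abundant taxon (a Berger-Parker style dominance measure) and Shannon diversity. These are stand-ins chosen for illustration and are not necessarily the indices proposed in reference 28; the taxon counts are invented, and no risk threshold is implied.

```python
import math


def relative_abundances(counts: dict) -> dict:
    """Convert raw per-taxon read counts to fractions of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    return {taxon: c / total for taxon, c in counts.items()}


def dominance_index(counts: dict) -> tuple:
    """Return the most abundant taxon and its relative abundance (Berger-Parker style)."""
    abundances = relative_abundances(counts)
    taxon = max(abundances, key=abundances.get)
    return taxon, abundances[taxon]


def shannon_diversity(counts: dict) -> float:
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero abundance."""
    return -sum(p * math.log(p) for p in relative_abundances(counts).values() if p > 0)


# Hypothetical read counts for one stool specimen (illustrative only).
stool_sample = {
    "Enterococcus": 800,
    "Bacteroides": 100,
    "Escherichia": 50,
    "Clostridioides": 50,
}
taxon, fraction = dominance_index(stool_sample)
print(taxon, fraction, shannon_diversity(stool_sample))
```

A specimen dominated by one genus yields a high dominance fraction and low Shannon diversity, the kind of signal that could in principle be reported alongside resistance-gene detection from the same sequencing run.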

PERSPECTIVES MOVING FORWARD

Antimicrobial resistance may represent the single largest public health threat of our time, and health care is a major driver. A major concern with antibiotic resistance genes involves their potential for rapid horizontal transfer facilitated by mobile genetic elements. Their movement has been documented using next-generation technologies and observed to exhibit intra- and interspecies transfer in outbreak investigations (17). Although short-read sequencing platforms do not currently make it easy to determine whether an antibiotic resistance gene is harbored on a mobile genetic element, we propose here several potential solutions and opportunities for innovation. Long-read sequencing provides data to allow for resolution of plasmids. However, given the current error rates of long-read technologies, generating high-quality closed genomes and understanding both the mobile genetic elements and the relatedness of isolates in an outbreak investigation also require additional methods that provide hqSNV-level information. Thus, we often combine high-fidelity short-read technology (e.g., Illumina) with long-read sequencing technology (e.g., Oxford Nanopore Technologies MinION, PacBio Sequel Systems) to generate hybrid assemblies (10). However, concurrent sequencing on multiple platforms can be expensive and inefficient. Thus, we have had anecdotal success with publicly available tools designed to reconstruct plasmids from short-read sequencing data alone (e.g., PlasFlow [29]). These tools provide a potential solution when long-read sequencing is unavailable or may have a longer turnaround time. As next-generation technologies continue to mature, we see the promise for resolving mobile elements rapidly and in a cost-effective manner to support public health response and decision-making.

Cross-walking historical typing methods with sequence-based methods poses another major challenge. Anecdotal evidence demonstrates there exists no single “Rosetta stone,” even within species. However, the added benefit of WGS data is the ability to easily standardize, transmit, and communicate data, thus removing the art form of interpretation or the constant curation and validation required for some methods, such as C. difficile ribotyping. C. difficile, a health care-associated pathogen frequently implicated in outbreaks involving emergent epidemic strains, has high endemic rates. Thus, the ability to cross-walk ribotyping results with genomic subtyping is crucial and DHQP/CDC and partners (e.g., EIP) are already transitioning away from traditional approaches to WGS-based typing methods for HAI pathogens (e.g., C. difficile) (30).
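One simple way to examine a cross-walk between a historical typing scheme and a sequence-based one is to tabulate paired results and ask how often each historical type maps to a single sequence-based type. The Python sketch below does this for invented paired labels; the type names (RT-A, ST-1, and so on), the counts, and the "most common partner" definition of concordance are all assumptions for illustration, not results or methods from reference 30.

```python
from collections import Counter, defaultdict


def crosswalk(pairs):
    """Build a table: historical type -> Counter of sequence-based types seen with it."""
    table = defaultdict(Counter)
    for historical, sequence_based in pairs:
        table[historical][sequence_based] += 1
    return table


def concordance(table) -> float:
    """Fraction of isolates whose sequence-based type is the most common partner
    of their historical type."""
    total = sum(sum(counter.values()) for counter in table.values())
    agree = sum(counter.most_common(1)[0][1] for counter in table.values())
    return agree / total


def ambiguous_types(table) -> list:
    """Historical types that map to more than one sequence-based type."""
    return sorted(h for h, counter in table.items() if len(counter) > 1)


# Hypothetical paired typing results for 10 isolates (placeholder labels).
pairs = (
    [("RT-A", "ST-1")] * 5
    + [("RT-B", "ST-2")] * 3
    + [("RT-B", "ST-3")]
    + [("RT-C", "ST-2")]
)
table = crosswalk(pairs)
print(concordance(table), ambiguous_types(table))
```

In this toy table one historical type splits across two sequence types and one sequence type is shared by two historical types, which is the many-to-many pattern that makes a single “Rosetta stone” unlikely.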

As automation in wet laboratory workflows allows microbiologists to address more-advanced problems, easy-to-use, accessible tools for routine bioinformatics analyses can yield a similar impact for bioinformaticians, as well as raise capacity in settings historically lacking bioinformatics expertise. DHQP has developed a single pipeline, QuAISAR-H, which automates routine evaluation and initial processing and analyses of WGS data. This is publicly available for groups that are able to set up their own dependencies and databases (https://github.com/DHQP), as well as in a containerized format. To facilitate data sharing and accelerate the public health utility of the sequence data generated, DHQP has also established and hosts an NCBI umbrella BioProject, “CDC HAI-Seq” (https://www.ncbi.nlm.nih.gov/bioproject/531911), intended to house any HAI-related sequence data (genome, plasmid, amplicon, shotgun) generated by CDC, CDC-funded laboratories, and other public health laboratories.

In recent years, efforts to develop and implement automated instruments or robotics in clinical and public health laboratories have continued; this is expected to keep expanding. Given the portable nature of sequence data, instances in which both public health and clinical microbiology laboratories are sequencing represent an additional area for collaboration to reduce redundancy or duplication and support cost-effectiveness. Serving as two sides of the same coin, clinical laboratories stand at the frontline, working tirelessly to provide data relevant for clinical decision-making and often astutely detecting early signals in the data, while public health laboratories focus on reference and surveillance testing, among many other functions. Implementation of sequencing for patient safety would not be possible without the significant convergence of time, energy, and resources dedicated by all involved to health care-associated infections, outbreaks, and surveillance.

CONCLUSIONS

Overinflated expectations of the promise of next-generation sequencing lead to a mounting urge for direct conversion to and use of this new technology as a front-end initial public health response. More realistically, next-generation sequencing may be strategically positioned and employed as a rational complement to current phenotypic and molecular data streams that serve as the frontline sentinels. We recognize that, undoubtedly, technologies will continue to advance and improve in cost, timeliness, and utility. By framing sequencing within health care safety to be adaptable and responsive to ever evolving and emerging threats with applied tools that are fit-for-purpose, we anticipate this will allow us to respond rapidly to the unpredictable landscape of health care and HAI pathogens and advance HAI containment and prevention.

ACKNOWLEDGMENTS

We thank the many individuals at CDC who contribute to the progress of health care-associated infection and antibiotic-resistant pathogen sequencing, including Brandi Limbago, Gillian McAllister, Lindsay Parnell, Allison Perry-Dow, Allison C. Brown, Michael Craig, Stephanie Gumbis, and CDC’s Office of Advanced Molecular Detection. We are also grateful to Kamile Rasheed for careful review of the paper and Denise Cardo for her leadership and thoughtful review of the paper.

This paper received no specific funding from any agency or from the commercial or not-for-profit sectors.

The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention (CDC). Use of trade names and commercial sources is for identification only and does not imply endorsement.

Footnotes

Supplemental material is available online only.

Supplemental file 1
Table S1. jcm.00135-21-s0001.pdf (PDF file, 0.1 MB)

Contributor Information

Alison Laufer Halpin, Email: alaufer@cdc.gov.

Romney M. Humphries, Vanderbilt University Medical Center

REFERENCES

1. Magill SS, O’Leary E, Janelle SJ, Thompson DL, Dumyati G, Nadle J, Wilson LE, Kainer MA, Lynfield R, Greissman S, Ray SM, Beldavs Z, Gross C, Bamberg W, Sievers M, Concannon C, Buhr N, Warnke L, Maloney M, Ocampo V, Brooks J, Oyewumi T, Sharmin S, Richards K, Rainbow J, Samper M, Hancock EB, Leaptrot D, Scalise E, Badrun F, Phelps R, Edwards JR, Emerging Infections Program Hospital Prevalence Survey Team. 2018. Changes in prevalence of health care-associated infections in U.S. hospitals. N Engl J Med 379:1732–1744. 10.1056/NEJMoa1801550.
2. Centers for Disease Control and Prevention. 2019. Antibiotic resistance threats in the United States, 2019. Centers for Disease Control and Prevention, Atlanta, GA. https://www.cdc.gov/drugresistance/pdf/threats-report/2019-ar-threats-report-508.pdf.
3. Walker TM, Kohl TA, Omar SV, Hedge J, Del Ojo Elias C, Bradley P, Iqbal Z, Feuerriegel S, Niehaus KE, Wilson DJ, Clifton DA, Kapatai G, Ip CLC, Bowden R, Drobniewski FA, Allix-Béguec C, Gaudin C, Parkhill J, Diel R, Supply P, Crook DW, Smith EG, Walker AS, Ismail N, Niemann S, Peto TEA, Modernizing Medical Microbiology (MMM) Informatics Group. 2015. Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: a retrospective cohort study. Lancet Infect Dis 15:1193–1202. 10.1016/S1473-3099(15)00062-6.
4. Ribot EM, Freeman M, Hise KB, Gerner-Smidt P. 2019. PulseNet: entering the age of next-generation sequencing. Foodborne Pathog Dis 16:451–456. 10.1089/fpd.2019.2634.
5. Brown E, Dessai U, McGarry S, Gerner-Smidt P. 2019. Use of whole-genome sequencing for food safety and public health in the United States. Foodborne Pathog Dis 16:441–450. 10.1089/fpd.2019.2662.
6. Hasan NA, Epperson LE, Lawsin A, Rodger RR, Perkins KM, Halpin AL, Perry KA, Moulton-Meissner H, Diekema DJ, Crist MB, Perz JF, Salfinger M, Daley CL, Strong M. 2019. Genomic analysis of cardiac surgery-associated Mycobacterium chimaera infections, United States. Emerg Infect Dis 25:559–563. 10.3201/eid2503.181282.
7. Glowicz J, Crist M, Gould C, Moulton-Meissner H, Noble-Wang J, de Man TJB, Perry KA, Miller Z, Yang WC, Langille S, Ross J, Garcia B, Kim J, Epson E, Black S, Pacilli M, LiPuma JJ, Fagan R, B. cepacia Investigation Workgroup. 2018. A multistate investigation of health care-associated Burkholderia cepacia complex infections related to liquid docusate sodium contamination, January-October 2016. Am J Infect Control 46:649–655. 10.1016/j.ajic.2017.11.018.
8. Huang SS, Yokoe DS, Stelling J, Placzek H, Kulldorff M, Kleinman K, O’Brien TF, Calderwood MS, Vostok J, Dunn J, Platt R. 2010. Automated detection of infectious disease outbreaks in hospitals: a retrospective cohort study. PLoS Med 7:e1000238. 10.1371/journal.pmed.1000238.
9. Zhu W, McAllister GA, Machado MJ, Campbell D, Karlsson M, Rasheed J, Kainer MA, Muleta D, Walters MS, Grass JE, Halpin AL, Stanton RA. 2019. Identification and characterization of HMB-2, a novel metallo-β-lactamase in a Pseudomonas aeruginosa isolate. Open Forum Infect Dis 6:S283–S284. 10.1093/ofid/ofz360.675.
10. Karlsson M, Stanton RA, Ansari U, McAllister G, Chan MY, Sula E, Grass JE, Duffy N, Anacker ML, Witwer ML, Rasheed JK, Elkins CA, Halpin AL. 2019. Identification of a carbapenemase-producing hypervirulent Klebsiella pneumoniae isolate in the United States. Antimicrob Agents Chemother 63:e00519-19. 10.1128/AAC.00519-19.
11. Stanton RA, McAllister G, Daniels JB, Breaker E, Vlachos N, Gable P, Moulton-Meissner H, Halpin AL. 2020. Development and application of a core genome multilocus sequence typing scheme for the healthcare-associated pathogen Pseudomonas aeruginosa. J Clin Microbiol 58:e00214-20. 10.1128/JCM.00214-20.
12. Woodworth KR, Walters MS, Weiner LM, Edwards J, Brown AC, Huang JY, Malik S, Slayton RB, Paul P, Capers C, Kainer MA, Wilde N, Shugart A, Mahon G, Kallen AJ, Patel J, McDonald LC, Srinivasan A, Craig M, Cardo DM. 2018. Vital signs: containment of novel multidrug-resistant organisms and resistance mechanisms - United States, 2006–2017. MMWR Morb Mortal Wkly Rep 67:396–401. 10.15585/mmwr.mm6713e1.
13. Magill SS, Dumyati G, Ray SM, Fridkin SK. 2015. Evaluating epidemiology and improving surveillance of infections associated with health care, United States. Emerg Infect Dis 21:1537–1542. 10.3201/eid2109.150508.
14. Suleyman G, Alangaden G, Bardossy AC. 2018. The role of environmental contamination in the transmission of nosocomial pathogens and healthcare-associated infections. Curr Infect Dis Rep 20:12. 10.1007/s11908-018-0620-2.
15. Popovich KJ, Snitkin ES. 2017. Whole genome sequencing—implications for infection prevention and outbreak investigations. Curr Infect Dis Rep 19:15. 10.1007/s11908-017-0570-0.
16. Didelot X, Walker AS, Peto TE, Crook DW, Wilson DJ. 2016. Within-host evolution of bacterial pathogens. Nat Rev Microbiol 14:150–162. 10.1038/nrmicro.2015.13.
17. de Man TJB, Yaffee AQ, Zhu W, Batra D, Alyanak E, Rowe LA, McAllister G, Moulton-Meissner H, Boyd S, Flinchum A, Slayton RB, Hancock S, Spalding WM, Laufer HA, Rasheed JK, Noble-Wang J, Kallen AJ, Limbago BM. 2020. Multispecies outbreak of Verona integron-encoded metallo-β-lactamase-producing multidrug resistant bacteria driven by a promiscuous incompatibility group A/C2. Clin Infect Dis 72:414–420. 10.1093/cid/ciaa049.
18. Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. 2016. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol 17:132. 10.1186/s13059-016-0997-x.
19. Petkau A, Mabon P, Sieffert C, Knox NC, Cabral J, Iskander M, Iskander M, Weedmark K, Zaheer R, Katz LS, Nadon C, Reimer A, Taboada E, Beiko RG, Hsiao W, Brinkman F, Graham M, Van Domselaar G. 2017. SNVPhyl: a single nucleotide variant phylogenomics pipeline for microbial genomic epidemiology. Microb Genom 3:e000116. 10.1099/mgen.0.000116.
20. Ankrum A, Hall BG. 2017. Population dynamics of Staphylococcus aureus in cystic fibrosis patients to determine transmission events by use of whole-genome sequencing. J Clin Microbiol 55:2143–2152. 10.1128/JCM.00164-17.
21. Harrison EM, Ludden C, Brodrick HJ, Blane B, Brennan G, Morris D, Coll F, Reuter S, Brown NM, Holmes MA, O’Connell B, Parkhill J, Torok ME, Cormican M, Peacock SJ. 2016. Transmission of methicillin-resistant Staphylococcus aureus in long-term care facilities and their related healthcare networks. Genome Med 8:102. 10.1186/s13073-016-0353-5.
22. Senn L, Clerc O, Zanetti G, Basset P, Prod’hom G, Gordon NC, Sheppard AE, Crook DW, James R, Thorpe HA, Feil EJ, Blanc DS. 2016. The stealthy superbug: the role of asymptomatic enteric carriage in maintaining a long-term hospital outbreak of ST228 methicillin-resistant Staphylococcus aureus. mBio 7:e02039-15. 10.1128/mBio.02039-15.
23. Hedge J, Lamagni T, Moore G, Walker J, Crook D, Chand M. 2017. Mycobacterium chimaera isolates from heater-cooler units, United Kingdom. Emerg Infect Dis 23:1227. 10.3201/eid2307.170442.
24. Lister PD, Wolter DJ, Hanson ND. 2009. Antibacterial-resistant Pseudomonas aeruginosa: clinical impact and complex regulation of chromosomally encoded resistance mechanisms. Clin Microbiol Rev 22:582–610. 10.1128/CMR.00040-09.
25. Mitchell SL, Simner P. 2019. Next-generation sequencing in clinical microbiology: are we there yet? Clin Lab Med 39:405–418. 10.1016/j.cll.2019.05.003.
26. Simner PJ, Miller S, Carroll KC. 2018. Understanding the promises and hurdles of metagenomic next-generation sequencing as a diagnostic tool for infectious diseases. Clin Infect Dis 66:778–788. 10.1093/cid/cix881.
27. Blauwkamp TA, Thair S, Rosen MJ, Blair L, Lindner MS, Vilfan ID, Kawli T, Christians FC, Venkatasubrahmanyam S, Wall GD, Cheung A, Rogers ZN, Meshulam-Simon G, Huijse L, Balakrishnan S, Quinn JV, Hollemon D, Hong DK, Vaughn ML, Kertesz M, Bercovici S, Wilber JC, Yang S. 2019. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat Microbiol 4:663–674. 10.1038/s41564-018-0349-6.
28. Halpin AL, de Man TJ, Kraft CS, Perry KA, Chan AW, Lieu S, Mikell J, Limbago BM, McDonald LC. 2016. Intestinal microbiome disruption in patients in a long-term acute care hospital: a case for development of microbiome disruption indices to improve infection prevention. Am J Infect Control 44:830–836. 10.1016/j.ajic.2016.01.003.
29. Krawczyk PS, Lipinski L, Dziembowski A. 2018. PlasFlow: predicting plasmid sequences in metagenomic data using genome signatures. Nucleic Acids Res 46:e35. 10.1093/nar/gkx1321.
30. Wang X, Holzbauer S, Pung K, Bye M, Adamczyk M, Paulick A, Vlachos N, Guh A, Laufer-Halpin AS, Karlsson MS, Boxrud D, Emerging Infections Program CDI Workgroup. 2018. Molecular typing of Clostridium difficile: concordance between PCR-ribotyping and multilocus sequence typing (MLST). Open Forum Infect Dis 5(Suppl 1):S176. 10.1093/ofid/ofy210.482.


Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)
