Abstract
Campylobacter spp. resistant to fluoroquinolones and macrolides are serious public health threats. Studies aiming to identify risk factors for drug-resistant Campylobacter have narrowly focused on antimicrobial use at the farm level. Using chain graphs, we quantified risk factors for fluoroquinolone and macrolide resistance in Campylobacter coli isolated from two distinctive swine production systems, conventional and antibiotic-free (ABF). The chain graphs were learned using genotypic and phenotypic resistance data from 1082 isolates and host exposures obtained through surveys for 18 cohorts of pigs. The gyrA T86I point mutation alone explained at least 58% of the variance in ciprofloxacin minimum inhibitory concentration (MIC) in ABF farms and 79% in conventional farms. For macrolides, genotype and host exposures explained similar proportions of the variance in azithromycin and erythromycin MIC. Among host exposures, heavy metal exposures were identified as risk factors in both conventional and ABF farms. Chain graph models can generate insights into the complex epidemiology of antimicrobial resistance by characterizing context-specific risk factors and facilitating causal discovery.
Author summary
Antimicrobial resistance is influenced by multiple factors, including exposures to selecting agents, such as antibiotics, antiseptics, or heavy metals, and factors affecting the transmission of resistant pathogens, such as biosecurity and hygiene measures. Understanding which specific factors are associated with resistance in a given context is challenging. We developed an approach based on probabilistic graphical models to investigate context-specific antimicrobial resistance risk factors. We applied the approach to Campylobacter coli isolated from pigs in antibiotic-free and conventional farms. We showed that risk factors for fluoroquinolone resistance were similar across both types of farms, whereas risk factors for macrolide resistance differed between settings.
Introduction
Campylobacter jejuni and C. coli are common causes of human gastroenteritis [1]. In particular, drug-resistant Campylobacter is classified as a serious public health threat in the Centers for Disease Control and Prevention’s (CDC) report on antibiotic resistance threats [2]. Every year in the United States alone, Campylobacter spp. cause an estimated 1.5 million infections, 29% of which have decreased susceptibility to fluoroquinolones or macrolides [2]. These antibiotic classes are used to treat severe Campylobacter infections in older and immunocompromised patients [3]. While C. jejuni is the most frequent source of infection, C. coli tends to have higher levels of drug resistance than C. jejuni [4–6]. Campylobacter is often acquired by consuming contaminated food and water and through direct contact with animals [1, 7]. Food animals are reservoirs for both Campylobacter species; thus, understanding the factors that drive Campylobacter antimicrobial resistance at the farm level is necessary to control and mitigate drug-resistant Campylobacter effectively [8].
Studies aiming to identify risk factors for drug-resistant Campylobacter have narrowly focused on antimicrobial use [9]. While antimicrobial use is a widely established risk factor for antimicrobial resistance among enteric pathogens, such as Campylobacter [10, 11], antimicrobial use only captures a subset of all potential exposures that may drive resistance outcomes [11]. Heavy metals and biocides may also select for antimicrobial resistance [12]. Biosecurity and husbandry practices that modify microbial interactions can also influence resistance dynamics in agricultural settings [13]. Additionally, while antimicrobial exposure imposes selection pressures for resistant microorganisms, reduced or complete absence of antimicrobial exposure often does not correspond to either a reduction in bacterial resistance or reversion to susceptibility [14–16]. For example, resistant bacterial strains are found in organic and antibiotic-free farms [17], and resistance to banned antibiotics for food animals, such as chloramphenicol, persists in farms decades after the ban implementation [18]. An analysis of resistance drivers must include a comprehensive assessment of exposures beyond antimicrobial use.
Recent advances in data collection, including sequencing technologies, have made exposure and outcome data to assess risk factors for antimicrobial resistance more available and accessible than ever before [19]. However, the large number of exposures and outcomes presents challenges for classic analytic methods traditionally used in epidemiology. Regression-based methods are well suited for understanding relationships between a single disease outcome of interest and a relatively small number of potential exposures [20]. However, bacteria can be resistant to multiple antimicrobial drugs, and understanding the relationship between multiple resistant outcomes can yield additional information on their selection [21]. Additionally, ecological, evolutionary, and epidemiological processes at the microbial, host, and environmental levels can directly or indirectly influence the occurrence of resistance among microbial populations in a context-specific manner [11, 17]. Therefore, the number of exposures we may want to consider is large and we may not have enough prior knowledge to select the relevant ones before the analysis.
Probabilistic graphical models are an effective framework for handling complex distributions in high-dimensional space [22]. These statistical models are composed of random variables (represented as nodes) and edges encoding direct dependence relationships among nodes [22]. They are classified based on the type of edges included in the models. Markov networks are a type of probabilistic graphical model with undirected edges, signifying that the directionality of the interaction between the connected variables cannot be ascribed [22]. Bayesian networks have only directed edges, indicating directional (causal) interactions. Chain graphs have both directed and undirected edges. Thus, they can represent a wider range of systems than Markov and Bayesian networks. Data from epidemiological studies of antimicrobial resistance often include phenotypic and/or genotypic resistance and exposure variables. This data structure can be represented in multi-layer chain graphs (Fig. 1). Phenotypic resistance outcomes such as the minimum inhibitory concentrations (MIC) can be represented as a layer with undirected edges among the nodes in the layer, as resistance outcomes can be correlated among themselves [21]. However, exposure and genotypic variables can explain part of the variation among these phenotypic outcomes. Thus, directed edges can link exposure layers (host exposures and/or microbial genotypes) to phenotypic outcomes. Recent advances in graph learning for mixed and multi-layer data are quickly expanding our ability to learn realistic graph representations from data [23–27]. Although probabilistic graphical models have rarely been applied to large epidemiological datasets to date, their use as integration tools for antimicrobial resistance data is promising [28, 29].
Fig 1.
Practical framework of specific chain graph analyses to be learned in this study. Microbial resistance phenotype variables were used as the outcome variables for all chain graphs in this study. A illustrates the chain graph with microbial genotype predictors, B illustrates the chain graph with host exposure predictors, and C represents the chain graph with both microbial genotype and host exposure predictors. A and B represent the chain graphs with single-scale predictors, whereas C represents the chain graph with multi-scale predictors.
We aimed to identify risk factors for macrolide- and fluoroquinolone-resistant Campylobacter coli (C. coli) in two distinctive swine production systems: conventional and antibiotic-free (ABF). Previous research on C. coli populations in these systems found considerable resistance to fluoroquinolone and macrolide antibiotics [30]. Most notably, the prevalence of macrolide resistance was over 20% and not significantly different between bacterial isolates from conventional and ABF production systems. Thus, we hypothesized that risk factors related to agricultural biosecurity practices and non-antimicrobial selection pressures would be significant predictors of macrolide resistance. To investigate the epidemiology of macrolide- and fluoroquinolone-resistant C. coli, we applied chain graph models and learned the model structure using genotypic and phenotypic resistance data from C. coli isolates and host exposure data collected through surveys.
Results
The data used for this study were collected from 18 cohorts of 35 pigs each, longitudinally sampled from birth to death between October 2008 and December 2010 in North Carolina, USA. Pigs were reared under either conventional or ABF production systems. In the conventional farms, animals were reared indoors and slaughtered at 6 months of age, and they received antimicrobial drugs for therapeutic and growth promotion purposes. In ABF farms, animals were reared outdoors, slaughtered at approximately 9 months, and did not receive antimicrobial drugs. During each of the three swine production stages, i.e. farrowing, nursery, and finishing, livestock management questionnaires were administered to producers to ascertain antimicrobial use, biocide use, biosecurity practices, and other animal husbandry practices. Fecal and environmental samples were collected five times over the course of the pigs’ lifespan. C. coli isolated from the samples underwent broth microdilution antimicrobial susceptibility testing (AST) and were whole-genome sequenced. Sequences were screened for antimicrobial resistance genes. Isolates with genotypic and phenotypic resistance data were included in the analysis, along with the exposure data generated from the questionnaires.
In the chain graphs, the outcome layer contains the resistance phenotypes, measured as log2-transformed MICs. The two predictor layers are the resistance genotype and host exposure layers. The lists and explanations of all included predictor variables (host exposure data and genotypic resistance data) for the conventional and ABF chain graphs are given in Supplemental Material (S1 and S2). The learned chain graph yields effect estimates for the predictor variables, or risk factors, significantly associated with the resistance outcomes, represented by directed edges, while simultaneously accounting for the partial correlations among resistance outcomes, represented by undirected edges (Fig. 1).
The densities of the directed and undirected edges and the Bayesian Information Criterion (BIC) for each chain graph are given in Table 1. We investigated penalties on the directed and undirected edges ranging from 0.10 to 0.5. The chain graphs presented in the main text had a penalty of 0.25 for both edge types. The density of directed edges in the genotype predictor chain graphs was higher than in the chain graphs with host exposure predictors. Despite the unequal numbers of predictors considered in conventional versus ABF chain graphs, edge densities were comparable between conventional and ABF chain graphs with single-scale predictors, i.e. genotype or host exposure predictors. The density of undirected edges was relatively high across all chain graphs for both production systems (Table 1), indicating that considerable variation in resistance outcomes was still not accounted for by the predictor variables evaluated.
Table 1.
Global graphical metrics for all six chain graphs learned in this study. Edge densities represent the proportion of all possible edges that are present in the graph. The densities of the directed and undirected edges in each graph are shown, along with the BIC for each graph.
| System | Predictors (N) | Density of directed edges | Density of undirected edges | BIC |
|---|---|---|---|---|
| Conventional | Genotypes (11) | 0.26 | 0.61 | 18.43 |
| Conventional | Host exposures (35) | 0.16 | 0.69 | 16.22 |
| Conventional | Genotypes and host exposures (46) | 0.10 | 0.61 | 16.19 |
| ABF | Genotypes (13) | 0.26 | 0.72 | 18.21 |
| ABF | Host exposures (20) | 0.14 | 0.69 | 14.81 |
| ABF | Genotypes and host exposures (33) | 0.16 | 0.72 | 13.79 |
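The edge density defined in the Table 1 caption can be illustrated with a short Python sketch. This is not the authors' code: the function names and toy matrices are hypothetical, and the sketch assumes that directed edges run only from the p predictors to the q outcomes (p × q possible edges) and that undirected edges connect distinct pairs of outcomes (q(q − 1)/2 possible edges).

```python
def directed_density(coef, tol=0.0):
    """Proportion of nonzero predictor-to-outcome coefficients.

    coef is a p x q nested list (p predictors, q outcomes).
    """
    p, q = len(coef), len(coef[0])
    present = sum(1 for row in coef for b in row if abs(b) > tol)
    return present / (p * q)


def undirected_density(partial, tol=0.0):
    """Proportion of nonzero off-diagonal entries among outcome pairs.

    partial is a symmetric q x q nested list; only the upper triangle is counted.
    """
    q = len(partial)
    present = sum(
        1 for i in range(q) for j in range(i + 1, q) if abs(partial[i][j]) > tol
    )
    return present / (q * (q - 1) / 2)


# Toy example with hypothetical values: 2 predictors, 3 outcomes.
coef = [[6.0, 0.0, 1.2],
        [0.0, 0.0, 0.0]]
partial = [[1.0, 0.4, 0.0],
           [0.4, 1.0, 0.3],
           [0.0, 0.3, 1.0]]

print(directed_density(coef))       # 2 of 6 possible directed edges
print(undirected_density(partial))  # 2 of 3 possible undirected edges
```

Under these assumptions, a high undirected density means that many pairs of MICs remain correlated after the predictors have been accounted for.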
Risk Factors for Fluoroquinolone Resistance
Overall, chain graphs predicting fluoroquinolone phenotypic resistance contained few predictors (Figs. 2 and 3). Genotype predictors explained most of the variance in ciprofloxacin MIC. In conventional systems, the gyrA T86I point mutation alone explained 79% of the variance in ciprofloxacin MIC (Table 2). Its presence corresponded to a 64-fold dilution increase in ciprofloxacin MIC in both conventional and ABF systems. Similarly, the gyrA T86I point mutation was associated with an increase in MIC for the quinolone drug nalidixic acid. Overall, host exposures were poor predictors of fluoroquinolone resistance. In conventional systems, host exposures alone explained only 35% of the variation, and they had no predictive value in ABF systems. No host exposure variables were selected in either system when both genotypes and host exposures were considered (Figs. 2C and 3C).
Fig 2.
Chain subgraph focusing on fluoroquinolone and quinolone resistance phenotype outcomes among C. coli from conventional swine farms. A illustrates the subgraph for the genotype single-scale predictor chain graph. B depicts the host exposure single-scale predictor chain subgraph. C depicts the multi-scale chain subgraph with both genotype and host exposure predictor variables. Host exposure variables are illustrated with teal colored nodes, microbial resistance genotypes as purple nodes, and microbial resistance phenotype as yellow nodes. Only predictors with a coefficient equal or greater than 1 are depicted.
Fig 3.
Chain subgraph focusing on fluoroquinolone and quinolone resistance phenotype outcomes among C. coli from ABF swine farms. A illustrates the subgraph for the genotype single-scale predictor chain graph. B depicts the host exposure single-scale predictor chain subgraph. C depicts the multi-scale chain subgraph with both genotype and host exposure predictor variables. Host exposure variables are illustrated with teal colored nodes, microbial resistance genotypes as purple nodes, and microbial resistance phenotype as yellow nodes. Only predictors with a coefficient equal or greater than 1 are depicted.
Table 2.
Summary of the variance explained by the predictor layers and by the remaining partial correlations for selected microbial resistance phenotypes. The complete list of independent and dependent variables included in each model and the summary for other microbial resistance phenotypes are listed in the Supplemental Material.
| Microbial phenotype | System | Predictor layers | Variance explained by predictor layers (%) | Variance explained by partial correlations (%) | Total variance explained (%) |
|---|---|---|---|---|---|
| Ciprofloxacin | Conventional | Genotypes | 79.7 | 14.2 | 94.0 |
| Ciprofloxacin | Conventional | Host exposures | 37.5 | 53.5 | 91.0 |
| Ciprofloxacin | Conventional | Genotypes and host exposures | 80.3 | 13.8 | 94.2 |
| Ciprofloxacin | ABF | Genotypes | 58.3 | 25.8 | 84.1 |
| Ciprofloxacin | ABF | Host exposures | 6.1 | 68.8 | 74.8 |
| Ciprofloxacin | ABF | Genotypes and host exposures | 11.8 | 53.6 | 65.4 |
| Erythromycin | Conventional | Genotypes | 28.1 | 69.4 | 97.5 |
| Erythromycin | Conventional | Host exposures | 15.3 | 82.5 | 97.7 |
| Erythromycin | Conventional | Genotypes and host exposures | 34.8 | 62.9 | 97.7 |
| Erythromycin | ABF | Genotypes | 34.3 | 63.4 | 97.7 |
| Erythromycin | ABF | Host exposures | 30.7 | 66.6 | 97.3 |
| Erythromycin | ABF | Genotypes and host exposures | 71.5 | 25.9 | 97.4 |
Risk Factors for Macrolide Resistance
The chain graphs for the conventional systems explained a relatively low amount of variability observed in azithromycin and erythromycin MICs. The chain graph with both genotypic and host exposures explained only 35% of the variation (Table 2). Conversely, chain graphs with genotypic factors alone or both genotypic and host exposures as predictors explained a higher proportion of the observed variance in the ABF system, up to 76%. The ABF chain graphs were also more dense than the chain graphs for the conventional system.
Among genotypic predictors, the presence of the A2075G point mutation in the 23S rRNA gene was the most consistent and strongest predictor of macrolide resistance. In the subgraphs for macrolides, we included the edges to clindamycin and telithromycin, as they belong to the drug classes lincosamides and ketolides, respectively. These classes have mechanisms of action similar to those of macrolide drugs and are often grouped with macrolides [31]. The A2075G 23S rRNA mutation was also associated with clindamycin and telithromycin phenotypes, but its effect magnitudes were much smaller (Figs. 4 and 5). None of the chain graphs learned in this study found the A103V amino acid substitution of the L22 protein in the 50S ribosomal subunit (50S L22 A103V) to be associated with the MIC of any macrolide or related drug. In addition to the A2075G 23S rRNA mutation, other resistance genes and mutations were associated with macrolide MICs. However, the strength and direction of the associations were not consistent across systems or chain graphs (Figs. 4 and 5). For example, the blaOXA-193 and blaOXA-489 genes were positively associated with macrolide MIC in conventional farms, but negatively associated in ABF farms.
Fig 4.
Chain subgraph focusing on macrolide, lincosamide and streptogramin resistance phenotype outcomes among C. coli from conventional swine farms. A illustrates the subgraph for the genotype single-scale predictor chain graph. B depicts the host exposure single-scale predictor chain subgraph. C depicts the multi-scale chain subgraph with both genotype and host exposure predictor variables. Host exposure variables are illustrated with teal colored nodes, microbial resistance genotypes as purple nodes, and microbial resistance phenotype as yellow nodes. Only predictors with a coefficient equal or greater than 1 are depicted.
Fig 5.
Chain subgraph focusing on macrolide, lincosamide and streptogramin resistance phenotype outcomes among C. coli from ABF swine farms. A illustrates the subgraph for the genotype single-scale predictor chain graph. B depicts the host exposure single-scale predictor chain subgraph. C depicts the multi-scale chain subgraph with both genotype and host exposure predictor variables. Host exposure variables are illustrated with teal colored nodes, microbial resistance genotypes as purple nodes, and microbial resistance phenotype as yellow nodes. Only predictors with a coefficient equal or greater than 1 are depicted.
Discussion
Epidemiological analytical studies are broadly concerned with understanding the relationship between an outcome and a set of potentially causal predictor variables (e.g., exposures). Antimicrobial resistance outcomes at the isolate level are often multivariate, as MICs are commonly determined for a panel of antibiotics. In epidemiological studies, these outcomes are usually simplified by focusing on one resistance outcome at a time and by dichotomizing MICs using epidemiological cut-offs or clinical breakpoints. Both simplifications come with a loss of information [32]. Chain graphs are a multivariate method that allows us to simultaneously model the joint probability distributions of all resistance outcomes and their predictors [22]. This feature is critical to fully investigate multidrug resistance phenomena because high correlations among MICs are common: numerous processes result in the co-selection of resistances, including genetic linkage of resistance genes on mobile genetic elements, cross-resistance among related drugs, and epistatic interactions between genes [33–35].
Our chain graphs showed that resistance genes and mutations were the most important predictors of MIC changes. The largest effects among all chain graphs learned in this study were associated with the resistance point mutations gyrA T86I and 23S rRNA A2075G as risk factors for fluoroquinolone and macrolide resistance in C. coli, respectively. These results align with previous experimental work indicating that the T86I point mutation in the gyrA gene is the most important fluoroquinolone-resistance genotype among Campylobacter spp. [36], and that the A2075G 23S rRNA mutation confers high levels of macrolide resistance and is the predominant macrolide-resistance-conferring genotype in Campylobacter spp. [37]. None of the chain graphs learned in this study found the A103V amino acid substitution of the L22 protein in the 50S ribosomal subunit (50S L22 A103V) to be associated with the MIC of any macrolide or related drug (Figs. 4 and 5). Experimental studies have found that mutations in the L22 ribosomal protein conferred modest levels of macrolide resistance in some bacterial species [38]. As a result, the 50S L22 A103V mutation has been suspected to confer macrolide resistance in Campylobacter spp. [37]. However, other experimental studies have not identified similar effects in Campylobacter spp. [39], nor have observational studies identified associations between 50S L22 A103V and phenotypic macrolide resistance [40]. Our findings are consistent with these latter studies in Campylobacter spp., suggesting that the 50S L22 A103V mutation has minimal effect on macrolide resistance in this population of C. coli.
The host exposures include risk factors related to biosecurity and to antimicrobial, metal, and disinfectant exposures. These exposures generally had less explanatory power than genotype features (see Table 2 for the graphs with host exposures alone). For fluoroquinolones, host exposures were not selected in the combined graph. Some host exposures remained in the combined graph for macrolides, but they had less explanatory power than genotype features. Given the high levels of macrolide resistance in both conventional and ABF farms, we originally hypothesized that antimicrobial exposure would be identified as a risk factor for fluoroquinolone resistance but not for macrolide resistance. For macrolides, no antimicrobial exposure variables were significantly associated with the MIC of either macrolide drug, supporting our hypothesis. We did not find fluoroquinolone exposure to be a risk factor for fluoroquinolone resistance. However, the presence of laboratory-confirmed respiratory pathogens was positively associated with ciprofloxacin MIC. Swine respiratory disease is the only indication for the use of fluoroquinolones in swine [41]. Other antimicrobial drugs were identified as risk factors. Penicillin-G, ceftiofur, and lincomycin exposures were predictors of fluoroquinolone resistance in the conventional farms’ host exposure chain graph. Ceftiofur and penicillin-G were negatively associated with ciprofloxacin MICs. The most likely explanation is that exposure to injectable ceftiofur or penicillin-G was a marker for not using enrofloxacin, an injectable fluoroquinolone, as these drugs may have similar indications.
Our findings are consistent with other studies that found antimicrobial exposure only partially explained resistance levels in swine farms [42–45]. Despite the obvious role of antimicrobial exposures in selecting for resistance, it is generally difficult to establish epidemiological associations between antimicrobial use and resistance [32]. Antimicrobial use metrics, in our case the presence of the exposure, are often a rough measurement of the selective pressures occurring at the individual level. Pharmacokinetic and pharmacodynamic processes at the gut level modify the relationship between exposure and resistance levels [46, 47]. Additionally, variations in the fitness effects of antimicrobial resistance genes and mutations complicate the relationship between antimicrobial use and resistance [15, 48].
Exposure to heavy metals, including zinc, copper, and arsenic, was associated with higher azithromycin MICs among C. coli. Zinc and copper supplementation was positively associated with macrolide resistance in the ABF farms. Zinc and copper exposures were not included in the conventional chain graphs because all conventional pigs sampled for this study were exposed to zinc and copper, leaving no variation in these exposures. However, exposure to roxarsone, an organoarsenic compound no longer approved for use in swine, was positively associated with azithromycin MIC in conventional farms. Co-selection of antimicrobial resistance by heavy metals is a well-documented phenomenon in agricultural settings [49–51]. In other bacterial species, including C. jejuni, heavy metals can select for the persistence and expression of multidrug efflux pumps that act on a broad range of antimicrobial drugs, including macrolides [51, 52]. Factors related to biosecurity (e.g., presence of other animal species on site or presence of respiratory signs) were related to macrolide resistance outcomes. However, the variables and the strength of their associations varied by farm type and graph. In conventional farms, the presence of ruminants and respiratory signs remained predictors of macrolide resistance outcomes in the combined graphs. Ruminants are another important livestock reservoir of Campylobacter spp. [53, 54].
The method presented here represents a new approach to address the complexity of antimicrobial resistance epidemiology. This data-driven approach aims to overcome some of the limitations of more classic analytical approaches. The classic framework relies heavily on testing a priori hypotheses informed by existing knowledge, which may be unavailable due to the complexity of the processes driving resistance dynamics. These complex data sets also contain many highly correlated variables that may be irrelevant to the research question. Common assumptions of regression analysis, such as linearity or lack of multicollinearity, are often challenging to meet, and the need to run large numbers of hypothesis tests increases the risk of statistical error [55]. Probabilistic graphical models are highly interpretable, which facilitates understanding the identified risk factors and their relation to the modeled outcome [22]. As with other methods in machine learning, however, the estimated directed and undirected edges are sensitive to the penalty hyperparameters chosen for the chain graph analysis. We used BIC and stability selection methods to identify the penalties on the directed and undirected edges needed to produce the most computationally stable models of the data [56].
Probabilistic graphical models have been extensively used for causal discovery [57]. By simultaneously evaluating the effect of multiple resistance genotypes on multiple resistance phenotypes within a single model, we uncovered genotype-phenotype relationships in natural C. coli populations that have not been described before. For example, the effect estimates observed for zinc exposure on macrolide MIC among ABF C. coli highlight promising areas for future experimental, genomic, and epidemiological research to untangle why so much macrolide resistance was observed without antimicrobial drug exposure.
Finally, with the chain graph methods used in this study, we could not account for microbial population structure, which can influence resistance dynamics, especially for Campylobacter spp. Several strains of C. coli exist and have distinct biological features that enable them to survive in specific environments better than others [53]. These strains may survive well because of traits unrelated to resistance, but may still incidentally retain specific resistance genotypes that do not necessarily confer a fitness advantage. In this case, population structure could explain why specific resistance genotypes and phenotypes exist in contexts without strong selective pressures for resistance. We could include sequence type (ST) as an exposure measuring population structure. However, in this study, the large number of unique STs identified among C. coli made this approach impractical, while clonal complexes (CC) cannot be used in C. coli as these all cluster as CC-828 [15].
Conclusion
In this study, we have demonstrated how a chain graph-based analytical framework can generate insights into the complex epidemiology of antimicrobial resistance. Our approach characterizes context-specific risk factors and identifies future research areas to better understand the mechanisms driving resistance dynamics.
Methods
Overview of the Data
The data used for this study were collected from 18 cohorts of 35 pigs longitudinally sampled 5 times from birth to death at 26 weeks between October 2008 and December 2010 in North Carolina, USA [30]. Pigs were reared under either conventional or antibiotic-free (ABF) production systems. These two systems differed considerably in several management practices. In conventional systems, pigs received antimicrobial drugs in feed, water, and injectables and were reared indoors. In ABF systems, pigs did not receive antimicrobial drugs and were reared outdoors. During each of the three swine production stages, i.e. farrowing, nursery, and finishing, livestock management questionnaires were administered to producers to ascertain antimicrobial use, biocide use, biosecurity practices, and other animal husbandry practices. Fecal and environmental samples were collected five times throughout the pigs’ lifespan. Environmental samples included soil, feed, water, and surface swabs. All samples were cultured for Campylobacter spp., and 2898 C. coli isolates were subjected to broth microdilution antimicrobial susceptibility testing (AST) (CAMPY Sensititre panel; Trek Diagnostic Systems, Ohio). For each isolate subjected to AST, the minimum inhibitory concentrations (MIC) were determined for azithromycin (AZI), ciprofloxacin (CIP), erythromycin (ERY), gentamicin (GEN), tetracycline (TET), florfenicol (FFN), nalidixic acid (NAL), telithromycin (TEL), and clindamycin (CLI).
Of the 2,898 C. coli isolates phenotypically characterized via AST, 1,466 were whole-genome sequenced using the Illumina MiSeq platform as part of the US FDA GenomeTrakr project [58]. DNA was extracted from C. coli isolates following manufacturer protocols using Qiagen DNeasy Blood and Tissue Kits. DNA libraries were prepared using Nextera XT DNA Library Preparation Kits, then sequenced using Illumina MiSeq v2 (500 cycles) chemistry kits and flow cells. Once whole-genome sequences were obtained and submitted to NCBI, the associated FASTQ files for C. coli isolates used in this study were downloaded from GenBank using the fasterq-dump function from the NCBI SRA Toolkit (v2.10.8) [59], de novo assembled using the Shovill (v1.0.4) implementation of SPAdes (v3.12) [60], then annotated using Prokka (v1.14.6) [61]. Afterward, genomes were screened for AMR genes and other virulence factors using AMRFinderPlus (v3.10.5) [62]. Resistance genes with >90% coverage and >60% identity were considered present in the genome. Quality scores for all sequenced genomes were ascertained using Quast (v5.0.2) [63]. To be included in the analysis, the genome size had to be 1.4–2.0 Mbp, which is the expected genome size of Campylobacter spp. Additionally, only genomes with ≤200 contigs, N50 ≥25 kb, and L50 ≤50 were included in the analysis [15]. Of the 1466 C. coli that were whole-genome sequenced, 1282 had sufficient quality scores. The Sequence Read Archive identifiers for all C. coli isolates considered for inclusion in this study are listed in Supplemental Material.
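The assembly quality thresholds above can be expressed as a simple filter. The following Python sketch is illustrative only and is not the pipeline used in the study: the class, field names, and example isolates are hypothetical, and it assumes 1 Mbp = 10^6 bp and 1 kb = 10^3 bp.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    """Assembly metrics of the kind reported by QUAST (field names are hypothetical)."""
    genome_size_bp: int
    n_contigs: int
    n50_bp: int
    l50: int


def passes_qc(s: AssemblyStats) -> bool:
    """Apply the inclusion thresholds stated in the text."""
    return (
        1_400_000 <= s.genome_size_bp <= 2_000_000  # genome size 1.4-2.0 Mbp
        and s.n_contigs <= 200                      # at most 200 contigs
        and s.n50_bp >= 25_000                      # N50 of at least 25 kb
        and s.l50 <= 50                             # L50 of at most 50
    )


# Hypothetical isolates for illustration only.
good = AssemblyStats(genome_size_bp=1_700_000, n_contigs=85, n50_bp=60_000, l50=10)
fragmented = AssemblyStats(genome_size_bp=1_700_000, n_contigs=350, n50_bp=9_000, l50=70)

print(passes_qc(good), passes_qc(fragmented))
```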
All sequenced C. coli isolates with complete AST data and sufficient genomic quality scores were further considered for inclusion in the chain graph analyses. The algorithm to learn chain graphs from data requires complete observations [26]. Of the 1282 C. coli with sufficient quality scores, 1082 isolates had complete antimicrobial exposure data and therefore qualified for final study inclusion. Afterward, other host exposure variables with any missing observations or zero variance were excluded from the analysis. Binary variables with very low variance (< 1% prevalence) were also excluded from the study. Collinear variables were identified as any pair with an absolute correlation ≥ 0.7 in a correlation matrix of all host exposure and genotype variables. Variables were removed from the analysis until no variable pairs with an absolute correlation ≥ 0.7 remained. The final dataset of conventional C. coli isolates included 683 observations and 46 predictor variables, and the dataset for ABF C. coli isolates included 399 observations and 33 predictor variables. A list and detailed explanation of all included predictor variables for the conventional and ABF chain graphs are given in Supplemental Material. The table of co-occurrence of resistance genes and mutations is given in Supplemental Material.
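The variable screening steps described above can be sketched as follows. This is a hypothetical illustration rather than the authors' procedure: the text does not say which member of a collinear pair was removed, so the sketch assumes a greedy rule that keeps whichever variable is encountered first. The function and variable names are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_predictors(data, min_prevalence=0.01, max_abs_corr=0.7):
    """Return the names of predictors kept after the screening steps.

    data maps a variable name to a list of values (None marks a missing value).
    """
    kept = []
    for name, values in data.items():
        if any(v is None for v in values):
            continue  # drop variables with missing observations
        if len(set(values)) < 2:
            continue  # drop zero-variance variables
        if set(values) <= {0, 1} and sum(values) / len(values) < min_prevalence:
            continue  # drop binary variables with < 1% prevalence
        # Assumption: keep the variable seen first in each collinear pair.
        if any(abs(pearson(values, data[k])) >= max_abs_corr for k in kept):
            continue
        kept.append(name)
    return kept


# Toy data with hypothetical variable names and values.
toy = {
    "gyrA_T86I": [1, 1, 0, 0, 1, 0, 1, 0],
    "duplicate_marker": [1, 1, 0, 0, 1, 0, 1, 0],  # collinear with gyrA_T86I
    "zinc": [1, 0, 1, 0, 1, 0, 0, 1],
    "constant": [1, 1, 1, 1, 1, 1, 1, 1],          # zero variance
    "incomplete": [1, None, 0, 0, 1, 0, 1, 0],     # missing observation
}

print(filter_predictors(toy))
```

Because the greedy rule depends on variable order, a different ordering could retain a different member of each collinear pair.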
Chain Graph Models
We aimed to identify risk factors at the host exposure and microbial genotype scales for phenotypic macrolide- and fluoroquinolone-resistance among C. coli from swine farms. We accomplished this by learning three distinct chain graphs in which log2-transformed MIC comprised the outcome (phenotypic) data layer and variables from the following layers comprised the predictor data layer: 1) microbial genotype, 2) host exposure, and 3) microbial genotype and host exposure (Fig. 1). A directed edge in each chain graph, running from a predictor to an outcome, represents an association between a predictor variable and a specific log2-transformed MIC. Thus, a directed edge with a value of 1.0 for a binary predictor variable can be interpreted as follows: after accounting for all other predictor variables in the model, the presence of this binary predictor was associated with an observed increase of one 2-fold dilution in MIC for a given antibiotic. On the graphs, we only displayed predictor variables with edge magnitudes ≥1.0 because these edges corresponded to microbiologically detectable changes in MIC. The presence of an undirected edge between two phenotype nodes represents a non-trivial penalized partial correlation between these two MICs after accounting for all outcome variation explained by the predictor variables and other outcome variables. The presence of such an edge, therefore, suggests that variables not included in the model explained this correlation among outcome variables.
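Because the outcomes are log2-transformed MICs, an edge weight converts to a fold change in MIC by exponentiating with base 2. The short Python sketch below illustrates this conversion and the display cutoff; the function names are hypothetical and the weights are illustrative, not estimates from this study.

```python
def fold_change(edge_weight: float) -> float:
    """Fold change in MIC implied by a directed edge on the log2 scale."""
    return 2.0 ** edge_weight


def is_displayed(edge_weight: float, cutoff: float = 1.0) -> bool:
    """Edges are shown only when their magnitude is at least the cutoff."""
    return abs(edge_weight) >= cutoff


for weight in (1.0, 2.5, -1.0, 0.4):
    print(weight, fold_change(weight), is_displayed(weight))
```

A weight of 1.0 corresponds to a doubling of the MIC (one 2-fold dilution step), and a weight of -1.0 to a halving; a weight of 0.4 falls below one dilution step and would not be displayed.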
Several host exposure variables, such as antimicrobial exposure and biosecurity practices, were highly correlated with the swine production system. Therefore, the graphs were stratified by the production system. In total, six unique chain graphs were learned for this study: three learned from C. coli isolated from conventional swine production systems, and three learned from ABF production systems.
The structure and parameters of these chain graphs were learned using a penalized maximum likelihood method developed by Lin et al. [26]. The algorithm to learn the chain graphs is an iterative process with the following steps: 1) regress the outcome data layer on the predictor data layer, 2) identify the conditional (directed) edges connecting nodes in these different data layers, and 3) identify the partial correlations (undirected edges) connecting nodes within the outcome data layer. These steps are repeated until model convergence. The resulting directed-edge adjacency matrix encompasses all effect estimates in which a predictor variable had a significant, conditional association with a resistance phenotype outcome after accounting for any variation explained by other predictor variables in the model. The undirected-edge adjacency matrix represents the non-trivial penalized partial correlations after accounting for conditional associations explained by the directed-edge adjacency matrix.
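For orientation, a penalized likelihood objective of the general kind used for two-layer Gaussian chain graphs can be written as follows. This is a sketch under assumed notation, not a transcription of the estimator in [26]: $X$ ($n \times p$) holds the predictors, $Y$ ($n \times q$) the log2-transformed MICs, $B$ ($p \times q$) the directed-edge coefficients, $\Theta$ ($q \times q$) the precision matrix of the outcome residuals, and $\lambda$ and $\rho$ the lasso and glasso penalties.

```latex
\[
(\hat{B}, \hat{\Theta}) \;=\; \arg\min_{B,\;\Theta \succ 0}
\left\{
  \frac{1}{n}\,\mathrm{tr}\!\left[(Y - XB)\,\Theta\,(Y - XB)^{\top}\right]
  \;-\; \log\det\Theta
  \;+\; \lambda \lVert B \rVert_{1}
  \;+\; \rho \lVert \Theta \rVert_{1,\mathrm{off}}
\right\}
\]
```

Under this formulation, holding $\Theta$ fixed reduces the update of $B$ to a lasso-type regression, and holding $B$ fixed reduces the update of $\Theta$ to a graphical lasso fit on the residuals $Y - XB$, which corresponds to the alternating steps described above.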
Lasso and graphical lasso (glasso) L1 regularization methods were used to induce sparsity in the directed-edge and undirected-edge adjacency matrices, respectively [64, 65]. The lasso and glasso penalties determine which directed and undirected edges, respectively, are included in the final model or shrunk to zero. Using lower penalties leads to the inclusion of more edges. All penalties used in our chain graphs were evaluated using a stability selection procedure [56]. The final penalties were chosen based on model interpretability.
Bootstrapping was used to generate 95% confidence intervals for all directed-edge and undirected-edge effect estimates. For each of the six distinct chain graphs, 200 bootstrapped samples were selected. The chain graphs were learned via the same algorithm applied to the whole dataset, except for the stability selection procedure. Instead of deriving unique lasso and glasso penalties for each bootstrapping iteration, chain graph models for each bootstrapped sample used the same penalties identified from the chain graphs learned using the entire conventional and ABF datasets. The bootstrap sample size was half the available sample for each system (340 isolates for conventional and 198 for ABF). The resulting effect estimates from each bootstrapping iteration were used to generate 95% confidence intervals. All analyses for this study were conducted in R (v3.6.3).
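The bootstrap procedure can be illustrated with a generic Python sketch (the study itself used R). Two details are assumptions not stated in the text: resampling is done with replacement, and the interval is taken from the 2.5th and 97.5th percentiles of the bootstrap estimates. The `estimator` argument is a stand-in for refitting the chain graph with fixed penalties and extracting one edge weight.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile of a sorted list, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def bootstrap_ci(rows, estimator, sample_size, n_boot=200, seed=0):
    """95% percentile interval from n_boot resamples of size sample_size."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator(rng.choices(rows, k=sample_size))  # resample with replacement (assumed)
        for _ in range(n_boot)
    )
    return percentile(estimates, 0.025), percentile(estimates, 0.975)


# Toy example: the "estimator" is a simple mean rather than a chain graph fit.
rows = list(range(100))
low, high = bootstrap_ci(rows, statistics.mean, sample_size=50)
print(low, high)
```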
In addition to the directed-edge and undirected-edge parameter values, several global graphical metrics were also quantified. The edge density represents the number of edges learned in the network out of all possible edges, and can be quantified for both the directed-edge and undirected-edge adjacency matrices. Additionally, because edges with magnitudes ≥1.0 are of greater interest, the density of edges with magnitudes ≥1.0 was also calculated for each chain graph. To better interrogate qualitative differences in risk factors for each resistance phenotype of interest, we also calculated the densities of all edges and of edges with magnitudes ≥1.0 for the subgraph of each resistance phenotype. To assess how well each model fit the data, the Bayesian information criterion (BIC) was also calculated for each chain graph. As with the Akaike information criterion, a lower BIC indicates a better-fitting model. All comparisons were evaluated at the prespecified significance level.
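The density metrics and the BIC can be computed as in the following Python sketch. The function names are hypothetical. For the undirected matrix, only the upper triangle is counted so that each pair of outcomes contributes one possible edge; for the BIC, treating the number of nonzero edges as the parameter count is an assumption rather than a detail given in the text.

```python
import math


def edge_density(matrix, min_magnitude=0.0, undirected=False):
    """Share of possible edges that are present (nonzero and >= min_magnitude)."""
    if undirected:
        size = len(matrix)
        weights = [matrix[i][j] for i in range(size) for j in range(i + 1, size)]
    else:
        weights = [w for row in matrix for w in row]
    present = sum(1 for w in weights if w != 0 and abs(w) >= min_magnitude)
    return present / len(weights)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Illustrative matrices: 3 predictors x 2 outcomes, and 3 outcomes.
directed = [[1.5, 0.0], [0.2, 0.0], [0.0, -1.0]]
partial = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(edge_density(directed))
print(edge_density(directed, min_magnitude=1.0))
print(edge_density(partial, undirected=True))
```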
To evaluate the relative contribution of the directed and undirected portions of the chain graphs to explaining each phenotypic outcome, two coefficients of determination were estimated: one for the predictors and one for the other resistances. The coefficient of determination for the predictors can be estimated directly for each outcome from the linear model defined by the directed-edge coefficients, with the set of predictors as model terms. In contrast, linear coefficients cannot be derived from the undirected-edge matrix, so the coefficient of determination for the other resistances cannot be calculated directly. Instead, the residuals from the former model were fit to linear models with terms selected by the nonzero elements of the undirected-edge matrix. The coefficients of determination for these residual linear models were calculated using, as model terms, the set of variables adjacent to the outcome in the chain graph, with coefficients given by the vector of maximum likelihood estimates for that linear model. However, the residual coefficient of determination only describes the portion of variance not explained by the predictors, so it must be multiplied by one minus the coefficient of determination for the predictors to obtain the coefficient of determination for the other resistances. Subsequently, the total proportion of variance in each phenotypic outcome explained by the model may be calculated as the sum of the coefficients of determination for the predictors and for the other resistances. Together, these relationships determine the coefficients of determination reported for each outcome.
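The decomposition described in this paragraph can be written compactly as below. The symbols are assumed notation introduced here for illustration: $y_{ij}$ is the log2 MIC of outcome $j$ for isolate $i$, $\mathbf{x}_i$ the predictor vector, $\hat{\mathbf{b}}_j$ the directed-edge coefficients for outcome $j$, $N(j)$ the set of outcomes adjacent to $j$ in the undirected portion, and $\hat{\boldsymbol{\gamma}}_j$ the maximum likelihood coefficients of the residual model. Intercepts are omitted for brevity.

```latex
\begin{align}
R^2_{B}(j) &= 1 - \frac{\sum_{i=1}^{n} \left(y_{ij} - \mathbf{x}_i^{\top}\hat{\mathbf{b}}_j\right)^2}
                       {\sum_{i=1}^{n} \left(y_{ij} - \bar{y}_j\right)^2} \\
e_{ij} &= y_{ij} - \mathbf{x}_i^{\top}\hat{\mathbf{b}}_j \\
R^2_{e}(j) &= 1 - \frac{\sum_{i=1}^{n} \left(e_{ij} - \mathbf{e}_{i,N(j)}^{\top}\hat{\boldsymbol{\gamma}}_j\right)^2}
                       {\sum_{i=1}^{n} \left(e_{ij} - \bar{e}_j\right)^2} \\
R^2_{\Theta}(j) &= \left(1 - R^2_{B}(j)\right) R^2_{e}(j) \\
R^2_{\mathrm{total}}(j) &= R^2_{B}(j) + R^2_{\Theta}(j)
\end{align}
```

As a purely hypothetical illustration, if the predictors explained 60% of the variance in an outcome ($R^2_{B} = 0.60$) and the residual model explained 25% of what remained ($R^2_{e} = 0.25$), then $R^2_{\Theta} = 0.40 \times 0.25 = 0.10$ and $R^2_{\mathrm{total}} = 0.70$.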
Acknowledgments
This work was supported by the U.S. National Institutes of Health (NIH) (R35GM134934 and F30OD030022). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Supporting information
S1 File. Accession numbers. Complete list of all sequenced Campylobacter coli isolates from BioProject PRJNA293228 that were included in the analysis.
S2 File. Predictor variables. A list and detailed explanation of all included predictor variables for the chain graphs.
S3 File. Co-occurrence. Resistance genotype co-occurrence tables generated from conventional swine farms and those from ABF swine farms.
S4 File. Input data. Dataset with the input data necessary to run the chain graphs.
S5 File. Code. Zip file with the R code necessary to run the chain graphs.
References
- 1. CDC. Campylobacter (Campylobacteriosis): Diagnosis and Treatment. Available from: https://www.cdc.gov/campylobacter/diagnosis.html.
- 2. CDC. Antibiotic Resistance Threats in the United States, 2019. Atlanta, GA: U.S. Department of Health and Human Services, CDC; 2019.
- 3. Yang Y, Feye KM, Shi Z, Pavlidis HO, Kogut M, J Ashworth A, et al. A historical review on antibiotic resistance of foodborne Campylobacter. Front Microbiol. 2019;10:1509.
- 4. FDA. 2021 NARMS Update: Integrated Report Summary. Available from: https://www.fda.gov/animal-veterinary/national-antimicrobial-resistance-monitoring-system/2021-narms-update-integrated-report-summary.
- 5. Sproston EL, Wimalarathna HM, Sheppard SK. Trends in fluoroquinolone resistance in Campylobacter. Microb Genom. 2018;4(8):e000198.
- 6. Zhang P, Zhang X, Liu Y, Cui Q, Qin X, Niu Y, et al. Genomic insights into the increased occurrence of campylobacteriosis caused by antimicrobial-resistant Campylobacter coli. MBio. 2022;13(6):e02835–22.
- 7. Kaakoush NO, Castaño-Rodríguez N, Mitchell HM, Man SM. Global epidemiology of Campylobacter infection. Clin Microbiol Rev. 2015;28(3):687–720.
- 8. Wagenaar JA, French NP, Havelaar AH. Preventing Campylobacter at the source: why is it so difficult? Clin Infect Dis. 2013;57(11):1600–1606.
- 9. Murphy CP, Carson C, Smith BA, Chapman B, Marrotte J, McCann M, et al. Factors potentially linked with the occurrence of antimicrobial resistance in selected bacteria from cattle, chickens and pigs: A scoping review of publications for use in modelling of antimicrobial resistance (IAM. AMR Project). Zoonoses Public Health. 2018;65(8):957–971.
- 10. Akwar HT, Poppe C, Wilson J, Reid-Smith RJ, Dyck M, Waddington J, et al. Associations of antimicrobial uses with antimicrobial resistance of fecal Escherichia coli from pigs on 47 farrow-to-finish farms in Ontario and British Columbia. Can J Vet Res. 2008;72(2):202.
- 11. Holmes AH, Moore LSP, Sundsfjord A, Steinbakk M, Regmi S, Karkey A, et al. Understanding the mechanisms and drivers of antimicrobial resistance. The Lancet. 2016;387(10014):176–187. doi: 10.1016/s0140-6736(15)00473-0.
- 12. Wales AD, Davies RH. Co-selection of resistance to antibiotics, biocides and heavy metals, and its relevance to foodborne pathogens. Antibiotics. 2015;4(4):567–604.
- 13. Davies R, Wales A. Antimicrobial resistance on farms: a review including biosecurity and the potential role of disinfectants in resistance selection. Compr Rev Food Sci Food Saf. 2019;18(3):753–774.
- 14. Andersson DI, Hughes D. Persistence of antibiotic resistance in bacterial populations. FEMS Microbiol Rev. 2011;35(5):901–911.
- 15. van Vliet AH, Thakur S, Prada JM, Mehat JW, La Ragione RM. Genomic screening of antimicrobial resistance markers in UK and US Campylobacter isolates highlights stability of resistance over an 18-year period. Antimicrob Agents Chemother. 2022;66(5):e01687–21.
- 16. Veltcheva D, Colles FM, Varga M, Maiden MC, Bonsall MB. Emerging patterns of fluoroquinolone resistance in Campylobacter jejuni in the UK [1998–2018]. Microb Genom. 2022;8(9):000875.
- 17. Ager EO, Carvalho T, Silva EM, Ricke SC, Hite JL. Global trends in antimicrobial resistance on organic and conventional farms. Sci Rep. 2023;13(1):22608.
- 18. Bischoff KM, White DG, Hume ME, Poole TL, Nisbet DJ. The chloramphenicol resistance gene cmlA is disseminated on transferable plasmids that confer multiple-drug resistance in swine Escherichia coli. FEMS Microbiol Lett. 2005;243(1):285–291.
- 19. Boolchandani M, D’Souza AW, Dantas G. Sequencing-based methods and resources to study antimicrobial resistance. Nat Rev Genet. 2019;20(6):356–370.
- 20. Dean C, Nielsen JD. Generalized linear mixed models: a review and some extensions. Lifetime Data Anal. 2007;13(4):497–512.
- 21. Love WJ, Zawack KA, Booth JG, Grohn YT, Lanzas C. Markov networks of collateral resistance: national antimicrobial resistance monitoring system surveillance results from Escherichia coli isolates, 2004–2012. PLoS Comput Biol. 2016;12(11):e1005160.
- 22. Koller D, Friedman N. Probabilistic Graphical Models: Principles and Techniques. MIT Press; 2009.
- 23. Drton M, Maathuis MH. Structure Learning in Graphical Modeling. Annu Rev Stat Appl. 2017;4:365–393.
- 24. Sedgewick AJ, Shi I, Donovan RM, Benos PV. Learning mixed graphical models with separate sparsity parameters and stability-based model selection. BMC Bioinform. 2016;17(5):307–318.
- 25. Lee B, Zhang S, Poleksic A, Xie L. Heterogeneous multi-layered network model for omics data integration and analysis. Front Genet. 2020;10:1381.
- 26. Lin J, Basu S, Banerjee M, Michailidis G. Penalized Maximum Likelihood Estimation of Multi-layered Gaussian Graphical Models. J Mach Learn Res. 2016;17:1–51.
- 27. Majumdar S, Michailidis G. Joint estimation and inference for data integration problems based on multiple multi-layered gaussian graphical models. J Mach Learn Res. 2022;23(1):1–53.
- 28. Getoor L, Rhee JT, Koller D, Small P. Understanding tuberculosis epidemiology using structured statistical models. Artif Intell Med. 2004;30(3):233–256.
- 29. Love WJ, Wang CA, Lanzas C. Identifying patient-level risk factors associated with non-β-lactam resistance outcomes in invasive MRSA infections in the United States using chain graphs. JAC-Antimicrob Resist. 2022;4(4):dlac068.
- 30. Quintana-Hayashi MP, Thakur S. Longitudinal study of the persistence of antimicrobial-resistant Campylobacter strains in distinct swine production systems on farms, at slaughter, and in the environment. Appl Environ Microbiol. 2012;78(8):2698–2705.
- 31. Leclercq R. Mechanisms of resistance to macrolides and lincosamides: nature of the resistance elements and their clinical implications. Clin Infect Dis. 2002;34(4):482–492.
- 32. Schechner V, Temkin E, Harbarth S, Carmeli Y, Schwaber MJ. Epidemiological interpretation of studies examining the effect of antibiotic usage on resistance. Clin Microbiol Rev. 2013;26(2):289–307.
- 33. Cantón R, Ruiz-Garbajosa P. Co-resistance: an opportunity for the bacteria and resistance genes. Curr Opin Pharmacol. 2011;11(5):477–485.
- 34. Hughes D, Andersson DI. Environmental and genetic modulation of the phenotypic expression of antibiotic resistance. FEMS Microbiol Rev. 2017;41(3):374–391.
- 35. Durão P, Balbontín R, Gordo I. Evolutionary mechanisms shaping the maintenance of antibiotic resistance. Trends Microbiol. 2018;26(8):677–691.
- 36. Ge B, McDermott PF, White DG, Meng J. Role of efflux pumps and topoisomerase mutations in fluoroquinolone resistance in Campylobacter jejuni and Campylobacter coli. Antimicrob Agents Chemother. 2005;49(8):3347–3354.
- 37. Gibreel A, Taylor DE. Macrolide resistance in Campylobacter jejuni and Campylobacter coli. J Antimicrob Chemother. 2006;58(2):243–255.
- 38. Poehlsgaard J, Douthwaite S. The bacterial ribosome as a target for antibiotics. Nat Rev Microbiol. 2005;3(11):870–881.
- 39. Corcoran D, Quinn T, Cotter L, Fanning S. An investigation of the molecular mechanisms contributing to high-level erythromycin resistance in Campylobacter. Int J Antimicrob Agents. 2006;27(1):40–45.
- 40. Hull DM, Harrell E, van Vliet AH, Correa M, Thakur S. Antimicrobial resistance and interspecies gene transfer in Campylobacter coli and Campylobacter jejuni isolated from food animals, poultry processing, and retail meat in North Carolina, 2018–2019. PLoS One. 2021;16(2):e0246571.
- 41. FDA. Extralabel Use and Antimicrobials. Available from: https://www.fda.gov/animal-veterinary/antimicrobial-resistance/extralabel-use-and-antimicrobials.
- 42. Birkegård AC, Halasa T, Græsbøll K, Clasen J, Folkesson A, Toft N. Association between selected antimicrobial resistance genes and antimicrobial exposure in Danish pig farms. Sci Rep. 2017;7(1):1–8.
- 43. Spronk T, Green A, Vuolo M, Ruesch L, Edler R, Haley C, et al. Antimicrobial use and antimicrobial resistance monitoring in pig production in the United States of America. Sci Tech Rev. 2023;42:52–64.
- 44. Mencía-Ares O, Argüello H, Puente H, Gómez-García M, Álvarez-Ordóñez A, Manzanilla EG, et al. Effect of antimicrobial use and production system on Campylobacter spp., Staphylococcus spp. and Salmonella spp. resistance in Spanish swine: A cross-sectional study. Zoonoses Public Health. 2021;68(1):54–66.
- 45. Yun J, Muurinen J, Nykäsenoja S, Seppä-Lassila L, Sali V, Suomi J, et al. Antimicrobial use, biosecurity, herd characteristics, and antimicrobial resistance in indicator Escherichia coli in ten Finnish pig farms. Prev Vet Med. 2021;193:105408.
- 46. Chandra Deb L, Timsina A, Lenhart S, Foster D, Lanzas C. Quantifying trade-offs between therapeutic efficacy and resistance dissemination for enrofloxacin dose regimens in cattle. Sci Rep. 2024;14(1):20598.
- 47. Stapleton GS, Cazer CL, Gröhn YT. Modeling the effect of tylosin phosphate on macrolide-resistant enterococci in feedlots and reducing resistance transmission. Foodborne Pathog Dis. 2021;18(2):85–96.
- 48. Souque C, González Ojeda I, Baym M. From Petri Dishes to Patients to Populations: Scales and Evolutionary Mechanisms Driving Antibiotic Resistance. Annu Rev Microbiol. 2024;78.
- 49. Mazhar SH, Li X, Rashid A, Su J, Xu J, Brejnrod AD, et al. Co-selection of antibiotic resistance genes, and mobile genetic elements in the presence of heavy metals in poultry farm environments. Sci Total Environ. 2021;755:142702.
- 50. Zhao Y, Su JQ, An XL, Huang FY, Rensing C, Brandt KK, et al. Feed additives shift gut microbiota and enrich antibiotic resistance in swine gut. Sci Total Environ. 2018;621:1224–1232.
- 51. Yu Z, Gunn L, Wall P, Fanning S. Antimicrobial resistance and its association with tolerance to heavy metals in agriculture production. Food Microbiol. 2017;64:23–32.
- 52. Lin J, Michel LO, Zhang Q. CmeABC functions as a multidrug efflux system in Campylobacter jejuni. Antimicrob Agents Chemother. 2002;46(7):2124–2131.
- 53. Sheppard SK, Maiden MC. The evolution of Campylobacter jejuni and Campylobacter coli. Cold Spring Harb Perspect Biol. 2015;7(8):a018119.
- 54. Thépault A, Rose V, Quesne S, Poezevara T, Béven V, Hirchaud E, et al. Ruminant and chicken: important sources of campylobacteriosis in France despite a variation of source attribution in 2009 and 2015. Sci Rep. 2018;8(1):1–10.
- 55. Wiemken TL, Kelley RR. Machine learning in epidemiology and health outcomes research. Annu Rev Public Health. 2020;41(1):21–36.
- 56. Liu H, Roeder K, Wasserman L. Stability approach to regularization selection (stars) for high dimensional graphical models. Adv Neural Inf Process Syst. 2010;23.
- 57. Glymour C, Zhang K, Spirtes P. Review of causal discovery methods based on graphical models. Front Genet. 2019;10:524.
- 58. FDA. GenomeTrakr Network. Available from: https://www.fda.gov/food/whole-genome-sequencing-wgs-program/genometrakr-network.
- 59. Leinonen R, Sugawara H, Shumway M, Collaboration INSD. The sequence read archive. Nucleic Acids Res. 2010;39(suppl 1):D19–D21.
- 60. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–477.
- 61. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinform. 2014;30(14):2068–2069.
- 62. Feldgarden M, Brover V, Gonzalez-Escalona N, Frye JG, Haendiges J, Haft DH, et al. AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Sci Rep. 2021;11(1):1–9.
- 63. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29(8):1072–1075.
- 64. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Methodol. 1996;58(1):267–288.
- 65. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008;9(3):432–441.





