Abstract
Shiga toxin-producing Escherichia coli (STEC) is detected in healthy individuals, who face work restrictions to prevent secondary transmission. To assess the virulence potential, we sequence 495 STEC isolates from healthy food handlers and social welfare workers in 2021 and compare them with 250 isolates from symptomatic patients. Nineteen serotypes (e.g., O156:H25, O174:H21, O105:H7) are significantly associated with asymptomatic carriers (SAAC), while five serotypes (O157:H7, O26:H11, O111:H8, O121:H19, O145:H28) are significantly associated with symptomatic patients (SAPA). SAPA strains frequently carry major virulence factors, including the LEE-encoded type III secretion system (100%) and Shiga toxin 2a (63.2%), which are less common in SAAC strains (38.3% and 3.7%). Among the 495 carrier isolates, 35 (7.1%) are high-risk, 178 (36.0%) moderate-risk, and 282 (57.0%) low-risk based on serotype and virulence markers. These findings suggest many strains in asymptomatic carriers have limited virulence, underscoring the need for risk-based strategies that avoid unnecessary restrictions.
Subject terms: Infection, Bacteriology
Comparative genomics of Shiga toxin-producing E. coli from healthy carriers and patients uncovers key virulence differences and supports risk-based management strategies.
Introduction
Shiga toxin-producing Escherichia coli (STEC) causes diarrhoea, haemorrhagic colitis (HC), and life-threatening complications, including haemolytic uraemic syndrome (HUS). STEC infections are prevalent globally, particularly in Southeast Asia, Europe, and North America. In Japan, there are approximately 3000 cases annually1. The principal virulence factor of STEC strains is Shiga toxin (Stx), an AB5-type toxin that disrupts protein synthesis by cleaving a specific adenine residue in the 28S rRNA within host cells, ultimately leading to apoptosis2. There are two antigenically distinct Stx types, Stx1 and Stx2, which are further categorized into at least three (Stx1a, -c, and -d) and fifteen (Stx2a–o) subtypes, respectively3–7. Stxs are encoded by phages and are transmitted between E. coli and related species8. Epidemiological studies indicate that Stx2a-producing strains are more strongly associated with severe cases, including HUS, than other strains are2,9.
O157 is the predominant STEC serogroup in many countries, and most O157 strains produce Stx2a with or without Stx1a1,10. In 2011, a large-scale outbreak caused by Stx2a-producing O104 occurred in Europe, primarily in Germany11. Various other STEC serogroups are also prevalent globally. Among these non-O157 serotypes, O26, O45, O103, O111, O121, and O145 constitute the majority of clinical infections and are commonly referred to as the “Big 6”12. Many strains within the Big 6 produce Stx1a, Stx2a, or both.
In addition to Shiga toxins, certain STEC strains possess a type III secretion system (T3SS) encoded by the locus of enterocyte effacement (LEE). Effectors secreted by the T3SS inhibit or modify various host functions and enhance adhesion through the formation of attaching-effacing (A/E) lesions in intestinal epithelial cells, significantly contributing to virulence13. Indeed, LEE-positive STEC strains have been shown to be closely associated with the onset of HUS9. Most strains of O157 and the Big 6 non-O157 serotypes are LEE-positive12. In addition to seven effectors in the LEE region, there are non-LEE effectors on phages and other mobile elements14. Additionally, numerous accessory virulence genes, such as those encoding haemolysin and proteases, are also believed to play a role in virulence15.
Ruminant animals, particularly cattle, constitute the primary reservoir for STEC strains, with contaminated food and water serving as the principal sources of STEC infection16. Owing to its high infectivity, secondary person-to-person transmission of STEC strains within households is also frequent in both sporadic cases and outbreaks1,17. Additionally, undiagnosed asymptomatic STEC carriers may facilitate secondary infections. It is estimated that 15% of STEC infections can be prevented by mitigating person-to-person transmission18. Infection prevention strategies, such as the sanitary isolation of STEC patients with typical symptoms, are crucial and indispensable for curbing person-to-person spread. Moreover, the increased use of multipathogen detection systems in the diagnosis of diarrhoeal patients has significantly increased the prevalence of sporadic identification of STEC strains, independent of clinically apparent disease or outbreaks19. In Japan, following a major STEC outbreak in Sakai city in 1996, food safety control measures and a comprehensive STEC surveillance system were established, making STEC infections notifiable in all cases, including asymptomatic carriers20. Furthermore, to prevent the transmission of STEC through food, individuals with STEC infections, regardless of their symptoms, are legally prohibited from working as food handlers in manufacturing, preparation facilities, and restaurants. Consequently, the Ministry of Health, Labour and Welfare requires food handlers to undergo routine stool testing for various infectious pathogens, including STEC21. Additionally, employees in social welfare facilities, such as nursery schools and elderly care facilities, are required to undergo regular faecal testing. Notably, previous research on asymptomatic STEC carriers identified several minor serotypes distinct from those causing severe infections, suggesting they pose a lower risk22–24. However, genomic studies on these strains remain limited. 
We thus analysed nearly 500 strains derived from healthy human adults and elucidated their genomic characteristics and virulence potential.
Results
Classification of serotypes associated with asymptomatic carriers and patients
In stool tests conducted by the Japan Microbiological Laboratory on food handlers and social welfare facility workers across Japan in 2021, the STEC detection rate was 0.035%. A total of 521 strains were isolated from the positive samples, excluding duplicates, and the draft genome sequences of 495 strains were successfully determined (Supplementary Data 1). The predominant serotype of the STEC strains isolated from the asymptomatic carriers was O156:H25 (13.5%), followed by O174:H21 (7.7%) and O105:H7 (6.1%) (Table 1). In contrast, the Infectious Agents Surveillance Report by the National Institute of Infectious Diseases indicates that the most prevalent serotype among STEC strains isolated from symptomatic patients in Japan between 2011 and 2021 was O157:H7(H-) (64.5%), followed by O26:H11(H-) (18.5%) and O111:H8(H-) (3.9%). Based on a statistical comparison of serotype distributions between patient-derived and healthy-carrier-derived isolates, nineteen serotypes were identified as being significantly associated with asymptomatic carriers (SAAC), and 60.6% of the strains isolated from asymptomatic carriers were classified as SAAC strains (Table 1). Five serotypes were identified as being significantly associated with patients (SAPA), and 92.4% of the strains isolated from symptomatic patients were classified as SAPA strains. The genome sequences of 250 strains, comprising 50 randomly selected representatives of each SAPA serotype, were retrieved from the public database for subsequent analysis (Supplementary Data 1).
Table 1.
Summary of the serotypes of strains isolated from the asymptomatic carriers and patients
| Serotypea | No. of strains from AC (%)b | No. of strains from PA (%)b | P-valuec | Classificationd |
|---|---|---|---|---|
| O156:H25 | 67 (13.5) | 13 (0.1) | <0.0001 | SAAC |
| O174:H21 | 38 (7.7) | 3 (0.0) | <0.0001 | SAAC |
| O105:H7 | 30 (6.1) | 1 (0.0) | <0.0001 | SAAC |
| O91:H14 | 29 (5.9) | 37 (0.2) | <0.0001 | SAAC |
| O8:H19 | 25 (5.1) | 6 (0.0) | <0.0001 | SAAC |
| O115:H10 | 15 (3.0) | 36 (0.2) | <0.0001 | SAAC |
| O128:H2 | 12 (2.4) | 16 (0.1) | <0.0001 | SAAC |
| O63:H6 | 11 (2.2) | 5 (0.0) | <0.0001 | SAAC |
| O148:H8 | 11 (2.2) | 0 | <0.0001 | SAAC |
| O8:H9 | 9 (1.8) | 0 | <0.0001 | SAAC |
| O113:H21 | 7 (1.4) | 9 (0.0) | <0.0001 | SAAC |
| O146:H21 | 7 (1.4) | 23 (0.1) | <0.0001 | SAAC |
| O109:H21 | 7 (1.4) | 4 (0.0) | <0.0001 | SAAC |
| O76:H19 | 6 (1.2) | 2 (0.0) | <0.0001 | SAAC |
| O89:H9 | 6 (1.2) | 0 | <0.0001 | SAAC |
| O168:H8 | 5 (1.0) | 8 (0.0) | <0.0001 | SAAC |
| O100:H20 | 5 (1.0) | 1 (0.0) | <0.0001 | SAAC |
| O181:H16 | 5 (1.0) | 0 | <0.0001 | SAAC |
| O55:H12 | 5 (1.0) | 4 (0.0) | <0.0001 | SAAC |
| O157:H7(H-) | 10 (2.0) | 12218 (64.5) | <0.0001 | SAPA |
| O26:H11(H-) | 16 (3.2) | 3507 (18.5) | <0.0001 | SAPA |
| O111:H8(H-) | 3 (0.6) | 741 (3.9) | <0.0001 | SAPA |
| O121:H19(H-) | 1 (0.2) | 588 (3.1) | <0.0001 | SAPA |
| O145:H28(H-) | 1 (0.2) | 458 (2.4) | 0.0001 | SAPA |
| O103:H2 | 20 (4.0) | 554 (2.9) | 0.14 | NS |
| O103:H11 | 2 (0.4) | 104 (0.5) | 1.00 | NS |
| O103:H25 | 1 (0.2) | 44 (0.2) | 1.00 | NS |
| O165:H25 | 1 (0.2) | 80 (0.4) | 1.00 | NS |
| O183:H18 | 2 (0.4) | 16 (0.1) | 0.08 | NS |
| O76:H7 | 1 (0.2) | 13 (0.1) | 0.30 | NS |
| O5:H9 | 0 | 84 (0.4) | 0.28 | NS |
| O186:H2 | 0 | 43 (0.2) | 0.63 | NS |
| O55:H7 | 0 | 11 (0.1) | 1.00 | NS |
| O172:H25 | 0 | 17 (0.1) | 1.00 | NS |
| O103:H8 | 0 | 16 (0.1) | 1.00 | NS |
| O177:H25 | 0 | 22 (0.1) | 1.00 | NS |
| O118:H16 | 0 | 8 (0.0) | 1.00 | NS |
| O69:H11 | 0 | 7 (0.0) | 1.00 | NS |
| O186:H11 | 0 | 6 (0.0) | 1.00 | NS |
| O84:H2 | 0 | 5 (0.0) | 1.00 | NS |
| Others | 137 (27.7) | 236 (1.2) | | MS |
| total | 495 (100) | 18946 (100) | | |
aSerotypes with 5 or fewer strains in either asymptomatic carriers or patients were classified as MS (minor serotype).
bStrains were isolated from the asymptomatic carriers (AC) in 2021 and the patients (PA) between 2011 and 2021 in Japan.
cStatistical analyses were performed using Fisher’s exact test.
dSerotypes significantly associated with AC and PA were classified into SAAC and SAPA, respectively. The remaining serotypes were classified into NS (not significant).
Risk assessment of strains belonging to SAAC and SAPA
Core gene-based phylogenetic analysis revealed that the SAAC strains were phylogenetically indistinguishable from the SAPA strains (Fig. 1A). All the SAPA strains were LEE-positive, whereas 38.3% of the SAAC strains were LEE-positive. The prevalence of stx1 and stx2 exhibited a similar pattern in both SAAC and SAPA strains, with more than half of the strains in each group harbouring stx2 (Fig. 1B). Classification by stx1 subtype revealed that all the stx1 genes in the SAPA strains were stx1a, whereas 80.7% of the stx1 genes in the SAAC strains were stx1a; the remaining genes were stx1c (Fig. 1C). Notably, classification by stx2 subtype revealed that 88.8% of the stx2 genes in the SAPA strains were stx2a, whereas only 3.7% of those in the SAAC strains were stx2a. stx2 in the SAAC strains was classified into nine subtypes, with stx2f being the most prevalent (31.2%), followed by stx2e (22.4%). These stx2 subtypes were disseminated among STEC strains from the asymptomatic carriers in a largely lineage-independent manner (Fig. 1A). More importantly, none of the SAAC strains were simultaneously positive for both LEE and stx2a.
Fig. 1. The phylogenetic relationships and stx subtype distributions in the strains classified as SAAC and SAPA.
A A maximum likelihood (ML) tree based on the core genes depicts the phylogeny of STEC strains classified as SAAC (n = 300) and SAPA (n = 250), highlighting the distribution of LEE and stx subtypes. The ML tree is based on 91,555 SNP sites in 1,818 core genes. The tree was rooted by the cryptic Escherichia clade I strain TW15838. B Toxin types and (C) stx subtypes are summarized. Source values for Fig. 1C can be found in Supplementary Data 1.
We then analysed the distribution of genes encoding non-LEE effectors and accessory virulence factors in the LEE-positive SAAC and SAPA strains. The eae subtypes were consistent with the phylogenetic distribution, whereas the distributions of non-LEE effectors and accessory virulence factors exhibited various patterns (Fig. 2A). Notably, compared with the SAPA strains, the SAAC strains presented a significantly lower prevalence of both non-LEE effectors and accessory virulence factors (Fig. 2B and C). Seventeen effectors and eight accessory virulence factors were more prevalent in the SAPA strains than in the SAAC strains, whereas only one non-LEE effector and five accessory virulence factors were more prevalent in the SAAC strains than in the SAPA strains (Fig. 2D). Specifically, among the non-LEE effectors, espN, espW, espX, and nleL were present in approximately 80% or more of the SAPA strains but in less than 20% of the SAAC strains. Among the accessory virulence factors, efa1, espP, ihaA, and katP were apparently less prevalent in the SAAC strains than in the SAPA strains.
Fig. 2. Distribution of virulence factors among the LEE-positive STEC strains classified as SAAC and SAPA.
A Core-gene-based ML tree of LEE-positive STEC strains classified as SAAC (n = 115) and SAPA (n = 250) and the distribution of genes encoding non-LEE effectors and other E. coli virulence genes in these strains. The ML tree is based on 96,481 SNPs located in 2,099 core genes. B The total counts of non-LEE effector genes and (C) other virulence genes are summarized. D Summary of the prevalence of each non-LEE effector gene and other virulence genes. The numbers of LEE-positive SAAC and SAPA strains were 115 and 250, respectively. Statistical analyses were performed via the Wilcoxon rank sum test (B and C) and Fisher’s exact test (D). ***P < 0.0001; **P < 0.01. Source values for B, C, and D can be found in Supplementary Data 1.
Risk group classification of STEC strains from asymptomatic carriers
We classified STEC strains isolated from asymptomatic carriers into risk groups. Strains belonging to SAPA serotypes (O157:H7(H-), O26:H11(H-), O111:H8(H-), O121:H19(H-), O145:H28(H-)) and strains of other serotypes harbouring both LEE and stx2a were classified as high-risk (Table 2). Strains of other serotypes harbouring either LEE or stx2a were classified as moderate-risk, and those harbouring neither LEE nor stx2a were classified as low-risk. Of the 495 STEC isolates from the asymptomatic carriers, 10, 16, 3, 1, and 1 strains were identified as belonging to the SAPA serotypes O157:H7, O26:H11, O111:H8, O121:H19, and O145:H-, respectively, constituting 31 strains in total (Table 3). All of these strains were LEE-positive, and four O157:H7 strains along with one O111:H8 strain harboured stx2a. The SAPA strains from asymptomatic carriers also exhibited high conservation of various effectors and other virulence factors, including espN (96.8%), espW (90.3%), espX (96.8%), efa1 (74.2%), espP (74.2%), ihaA (71.0%), and katP (77.4%), although nleL was absent in all strains (Supplementary Data 1). We considered these 31 strains to be high-risk STEC based on their serotypes (Table 3).
Table 2.
Risk group classification of STEC strains from asymptomatic carriers
| Risk group | Serotype | presence of stx2a and LEE | No. of strains |
|---|---|---|---|
| high-risk | O157:H7(H-), O26:H11(H-), O111:H8(H-), O121:H19(H-), O145:H28(H-) | regardless | 31 |
| high-risk | others | both positive | 4 |
| moderate-risk | others | either positive | 178 |
| low-risk | others | both negative | 282 |
| Total | | | 495 |
Table 3.
Prevalence of LEE and stx subtypes in STEC from the asymptomatic carriers classified as SAPA
| Serotype | LEE | stx subtypes | No. of strains |
|---|---|---|---|
| O157:H7 | + | stx2c | 5 |
| O157:H7 | + | stx1a, stx2a | 4 |
| O157:H7 | + | stx1a, stx2c | 1 |
| O26:H11 | + | stx1a | 16 |
| O111:H8 | + | stx1a | 2 |
| O111:H8 | + | stx1a, stx2a | 1 |
| O121:H19 | + | stx1a | 1 |
| O145:H- | + | stx1a | 1 |
| total | | | 31 |
Among the STEC isolates from the asymptomatic carriers, 164 were neither classified under SAAC nor SAPA serotypes and were instead categorized as Not Significant (NS) or Minor Serotypes (MS) with five or fewer isolates per serotype (Table 1). These isolates were distributed across 152 distinct serotypes, with 12 strains remaining untypeable by in silico O-serotyping. O103:H2 was the most frequently identified serotype (20 strains). Of the NS and MS strains, 36 were LEE-positive and 28 carried stx2a (Table 4). Four isolates contained both LEE and stx2a, corresponding to the serotypes O103:H11, O108:H25, O150:H2, and O165:H25. These four strains possessed a range of non-LEE effectors, including espH, espJ, espL, espO, ibe, nleA, nleB, nleE, and nleH, as well as virulence factors such as pchA, ehxA, and paa (Supplementary Data 1). Additional effectors and virulence factors were also relatively well-conserved across these strains. These four strains were genetically considered as high-risk STEC (Table 2).
Table 4.
Prevalence of LEE and stx2a in STEC strains from the asymptomatic carriers classified as NS or MS
| LEE | stx2a | No. of strains |
|---|---|---|
| + | + | 4 |
| + | - | 32 |
| - | + | 24 |
| - | - | 104 |
| Total | 164 |
In addition to the 35 high-risk STEC strains, the 495 STEC isolates from asymptomatic carriers included 147 strains that were LEE-positive but stx2a-negative and 31 strains that were stx2a-positive but LEE-negative. These 178 strains (36.0%) were classified as moderate-risk STEC, while the remaining 282 strains (57.0%) were designated as low-risk STEC (Table 2).
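The classification rule summarized in Table 2 can be expressed as a short decision function. The following Python sketch is illustrative only and is not part of the study's pipeline: the function names, the boolean inputs for LEE and stx2a, the placeholder serotype "other", and the matching of non-motile (H-) strains by O serogroup alone are assumptions made here.

```python
from collections import Counter

# SAPA serotypes listed in Table 2.
SAPA_SEROTYPES = {"O157:H7", "O26:H11", "O111:H8", "O121:H19", "O145:H28"}
SAPA_O_GROUPS = {s.split(":")[0] for s in SAPA_SEROTYPES}


def is_sapa(serotype: str) -> bool:
    """Return True if the serotype is one of the five SAPA serotypes.

    Assumption: a non-motile strain (H type written as "H-") is matched
    on its O serogroup alone, mirroring the "(H-)" notation in Table 2.
    """
    o_group, _, h_type = serotype.partition(":")
    if h_type == "H-":
        return o_group in SAPA_O_GROUPS
    return serotype in SAPA_SEROTYPES


def risk_group(serotype: str, lee: bool, stx2a: bool) -> str:
    """Assign the high/moderate/low risk group described in Table 2."""
    if is_sapa(serotype) or (lee and stx2a):
        return "high"
    if lee or stx2a:
        return "moderate"
    return "low"


# Tally the NS/MS strains from the (LEE, stx2a) counts in Table 4.
# "other" is a placeholder standing for any non-SAPA serotype.
table4 = {(True, True): 4, (True, False): 32, (False, True): 24, (False, False): 104}
tally = Counter()
for (lee, stx2a), n in table4.items():
    tally[risk_group("other", lee, stx2a)] += n
print(dict(tally))
```

Applied to the Table 4 counts, the rule places the four LEE- and stx2a-positive NS/MS strains in the high-risk group, consistent with Table 2.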
Genetic and phenotypic virulence potential of high-risk STEC strains from asymptomatic carriers
Further evaluation was conducted to assess the virulence potential of STEC isolates of SAPA serotypes obtained from healthy carriers. O157:H7 strains have been classified into at least nine clades, among which clades 6 and 8 are significantly associated with HUS, whereas clade 7 is predominantly associated with healthy carriers25,26. Of the ten O157:H7 strains isolated from healthy carriers in this study, six belonged to clade 7 and four to clade 2 (Supplementary Table 1).
Subsequently, we performed long-read sequencing on one isolate each of the serotypes O157:H7, O111:H8, O121:H19, O145:H-, and O26:H11 among the STEC strains isolated from healthy carriers, in order to compare the structures of their Stx phages with those of clinical isolates obtained from publicly available genome data (Supplementary Fig. 1). Except for the Stx1a phage of the O111:H8 strain STEC-AC330, the Stx phages of the healthy carrier-derived strains exhibited no major deletions and showed high structural similarity to those of the clinical isolates. Although the Stx1a phage of the O111:H8 strain STEC-AC330 also displayed high overall sequence similarity to the Stx1a phage of a clinical isolate, it appeared to be a defective phage.
Finally, the stx2 expression levels were compared among high-risk, moderate-risk, and low-risk STEC strains. The results showed that stx2 expression was significantly higher in high-risk STEC than in the other groups (Fig. 3).
Fig. 3. Comparison of stx2 expression levels among risk groups of isolates from asymptomatic carriers.

stx2 expression was measured following induction with mitomycin C in high-risk STEC (strain STEC_AC400, O157:H7, stx1a- and stx2a-positive), moderate-risk STEC (strain STEC_AC089, O105:H7, stx2e- and stx2f-positive) and low-risk STEC (strain STEC_AC142, O174:H21, stx2c-positive). Expression levels were normalized to the housekeeping gene gapA. Data represent the means of three independent experiments, and error bars indicate standard deviations. Statistically significant differences are indicated by bars, with corresponding P values shown. Source values for Fig. 3 can be found in Supplementary Data 1.
Discussion
In this study, 0.035% of healthy Japanese adults were positive for STEC. Asymptomatic STEC carriers have been reported at a prevalence of 1% among healthy children in France, 0.5% among healthy children in Germany, and 0.08% among healthy adults in Japan23,27,28. Many countries impose strict regulations on asymptomatic STEC carriers, even though their role in outbreaks remains debated19. However, restricting asymptomatic carriers from work can lead to economic hardship, psychosocial stress, and social stigma. In Japan, individuals identified as STEC carriers are legally prohibited from working as food handlers or in social welfare facilities21. Consequently, food handlers and social welfare facility workers who shed STEC for extended periods are often decolonized with antibiotics. However, the use of antibiotics in clinically asymptomatic individuals requires proper justification19. Moreover, there are concerns that certain antibiotics may promote Stx production; thus, antibiotic therapy for STEC infection remains controversial29.
The 495 STEC strains from asymptomatic carriers were classified into more than 100 serotypes, exhibiting extensive phylogenetic and genetic diversity. STEC represents a genetically diverse group that has evolved through horizontal gene transfer of virulence factors such as Stx, resulting in wide variation in virulence potential30–33. A previous study revealed that many STEC isolates from healthy adults belong to O serogroups which are rarely found in STEC isolates from symptomatic patients23. However, only two small-scale genomic studies of STEC from asymptomatic carriers have been conducted to date, comprising 27 and 10 strains, respectively22,24. In these studies, in addition to stx2a, various other stx subtypes, including stx2d and stx2e but not stx2f, were detected in STEC from healthy carriers, with a high prevalence of LEE-negative strains. Our extensive genomic analysis of STEC strains from the healthy carriers revealed 19 serotypes that were significantly associated with asymptomatic carriers, designated SAAC (Table 1). Among these serotypes, the prevalence rates for LEE and stx2a, which are risk factors for severe disease, were 38.3% and 3.7%, respectively, with no strains positive for both factors. Conversely, among strains of the five serotypes identified as being significantly associated with symptomatic patients, designated SAPA, 88.8% were positive for both LEE and stx2a. Among the stx2 subtypes in the SAAC strains, stx2f and stx2e predominated, accounting for 31.2% and 22.4%, respectively (Fig. 1). The pathogenicity of Stx2e to humans remains unclear, and no outbreaks with severe infection caused by Stx2f-producing strains have been reported. Additionally, stx2 expression levels are significantly lower in stx2c-, stx2d-, and stx2e-positive strains than in stx2a-positive strains24. These results indicate that a significant proportion of SAAC strains pose a low risk for the development of severe disease.
In the analysis of LEE-positive strains, the SAAC strains presented a lower prevalence of certain non-LEE effectors and accessory virulence factors than did the SAPA strains (Fig. 2). Specifically, the prevalence of espN, espW, espX, and nleL was markedly lower in the SAAC strains. While the function of EspN remains undefined, EspX and NleL are recognized as ubiquitin ligases, and EspW is involved in cytoskeletal remodelling (Supplementary Table 2). In terms of accessory virulence factors, the frequencies of espP, ihaA, and katP were significantly lower in the SAAC strains than in the SAPA strains. These genes are believed to play a role in colonization and survival within the host (Supplementary Table 2). The limited presence of non-LEE effectors and accessory virulence factors in SAAC strains may diminish their virulence potential.
In this study, among the strains isolated from food handlers and social welfare facility workers, 31 (6.3%) were identified as belonging to the five serotypes categorized as SAPA, including O157:H7, while four strains (0.8%) were not classified as SAPA but harboured both LEE and stx2a (Tables 2 and 3). None of the ten O157:H7 strains isolated from asymptomatic carriers belonged to clades 6 or 8 (Supplementary Table 1), which are known to be strongly associated with severe infections and outbreaks25,26. However, although only a limited number of strains were examined, Stx prophages in these high-risk STEC were mostly intact, and their stx2 expression levels were higher than those of moderate- and low-risk STEC (Fig. 3 and Supplementary Fig. 1). Although the carriers were asymptomatic, these high-risk STEC strains likely retain considerable virulence potential and may act as reservoirs of virulence genes through horizontal transfer, emphasizing the need for rigorous monitoring and control of their carriers. On the other hand, carriers of high-risk STEC and all other carriers should not be managed by uniform criteria; individualized strategies based on risk stratification for each strain are needed. Future studies should include larger-scale analyses to identify low-risk serotypes typical of asymptomatic carriers and improve risk assessment using STEC detection, serotyping, and stx subtyping. Of the 19 SAAC serotypes identified in this study, nine were untypeable by routine serotyping with antisera. Establishing an antiserum set or PCR system is imperative for the accurate identification of SAAC strains.
A limitation of this study is that we could not assess the pathogenicity of SAAC strains using cell culture or animal models. Although infection models for STEC are not fully established, there are various animal models that mimic parts of the disease process34–36. Furthermore, future research should include not only larger-scale analyses of strains derived from healthy carriers in Japan but also STEC strains from healthy carriers globally.
In conclusion, this study identified serotypes characteristic of STEC from healthy carriers. Strains of these serotypes have a low prevalence of risk factors for severe infection and of other virulence factors, suggesting a low virulence potential. Of the 495 STEC strains from asymptomatic carriers, 7.1%, 36.0%, and 57.0% were classified into the high-, moderate-, and low-risk STEC groups, respectively. In asymptomatic STEC carriers, an individualized, risk-adapted approach is essential and would necessitate stringent hygiene measures exclusively for high- and moderate-risk strains while limiting unnecessary precautions against low-risk STEC. Return-to-work policies tailored to the virulence of STEC strains may mitigate the personal and socioeconomic burden in cases of asymptomatic prolonged shedding of low-risk STEC.
Methods
Strains used in this study
At the Japan Institute of Microbiology, we have a bank of STEC strains isolated from STEC-positive samples via routine stool tests of food handlers and social welfare facility workers. In 2021, a total of 521 STEC strains were isolated from 1,494,662 stool samples. Stool samples were initially screened by PCR to identify stx-positive specimens. The PCR-positive samples were subsequently plated directly onto selective media including CHROMagar STEC medium (Kanto Chemical Co., Inc., Tokyo, Japan), and colonies obtained were further analysed by PCR to confirm and isolate STEC strains. The samples were collected from healthy individuals without any symptoms of intestinal infection, and multiple isolates obtained from the same individual were excluded from the genomic analysis. Additionally, in Japan, STEC infections are classified as notifiable diseases, requiring comprehensive reporting, including information on serotypes, toxin types, and patient symptoms, which is collected by the National Institute of Infectious Diseases and published annually. For the period from 2011 to 2021, we gathered data on the serotypes and toxin types of strains from patients exhibiting symptoms such as diarrhoea, abdominal pain, bloody stool, or haemolytic uraemic syndrome (HUS). We obtained genomic data for 250 strains from public databases, comprising 50 strains for each of the five SAPA serotypes, and incorporated them into the analysis (Supplementary Data 1).
In Japan, routine stool screening for STEC is legally mandated for food handlers and workers in social welfare facilities to prevent secondary transmission. The present study used only bacterial isolates obtained and stored from these routine surveillance tests. No personal identifiers or sensitive data of the individuals were included. Therefore, separate ethical approval was not required.
Genome sequencing
Genomic DNA was extracted from 1 ml of overnight culture of each STEC strain via the DNeasy Blood and Tissue Kit (Qiagen). Genomic DNA libraries were constructed via the xGen DNA Library Prep EZ Kit (Integrated DNA Technologies) in combination with NEBNext Multiplex Oligos for Illumina (96 Unique Dual Index Primer Pairs) (New England BioLabs). Sequencing was performed on the Illumina HiSeq Ⅹ Ten platform, generating paired-end reads of 151 base pairs.
Bioinformatical analysis
Both the raw sequencing data generated in this study and those retrieved from public databases were assembled using Platanus_B (v1.3.2)37. In silico serotyping was conducted using ECTyper or SRST238,39. Subtyping of stx and eae was performed using SRST2 and ABRicate (https://github.com/tseemann/abricate), respectively, with a threshold of ≥99% sequence identity. The presence or absence of non-LEE effectors was assessed by TBLASTN analysis, which uses the amino acid sequence of the effector as the query and the assembled sequence as the database, with criteria of 50% or greater homology to the query sequence and 50% or greater coverage. The presence or absence of other virulence factors was determined using ABRicate under the default settings.
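The TBLASTN presence/absence criteria described above can be illustrated with a small filter over tabular BLAST output. This is a minimal sketch under stated assumptions rather than the script used in the study: it assumes the default 12-column tabular layout (as produced by `-outfmt 6`), treats the percent identity column as the "homology" measure, computes query coverage from a single hit without merging overlapping hits, and uses hypothetical hit lines and query lengths.

```python
def effectors_present(blast_lines, query_lengths, min_identity=50.0, min_coverage=50.0):
    """Return the set of query (effector) names with at least one passing hit.

    blast_lines: iterable of tab-separated lines in the default 12-column
        layout (qseqid, sseqid, pident, length, mismatch, gapopen,
        qstart, qend, sstart, send, evalue, bitscore).
    query_lengths: dict mapping each query name to its protein length (aa).
    """
    present = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query = fields[0]
        identity = float(fields[2])
        q_start, q_end = int(fields[6]), int(fields[7])
        # Query coverage of this single hit, as a percentage of the query length.
        coverage = 100.0 * (q_end - q_start + 1) / query_lengths[query]
        if identity >= min_identity and coverage >= min_coverage:
            present.add(query)
    return present


# Hypothetical example: both the hit lines and the protein lengths are invented.
hits = [
    "espN\tcontig1\t85.0\t200\t30\t0\t1\t200\t5000\t5599\t1e-100\t400",
    "nleL\tcontig2\t95.0\t100\t5\t0\t1\t100\t100\t399\t1e-50\t200",
]
lengths = {"espN": 250, "nleL": 780}
found = effectors_present(hits, lengths)
print(sorted(found))
```

In this invented example the espN hit covers 80% of the query and passes both thresholds, whereas the nleL hit covers only about 13% of the query and is rejected despite its high identity.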
To construct a core gene-based phylogenetic tree, pangenomic analysis was carried out using Roary v3.13.3 with a 90% sequence identity threshold40. SNP sites were extracted from the core gene alignment using SNP-sites v2.5.1 with the -c option41. A maximum likelihood phylogenomic tree was subsequently generated using RAxML-NG ver. 1.0.1 (parameters: --all, --bs-trees 100, and --model GTR+G4)42. The ML tree was visualized and annotated using iTOL (v6.6)43. Clade classification of O157:H7 was performed based on the SNP typing method described previously25,26,44.
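For orientation, the three command-line steps described above can be assembled as follows. This Python sketch only builds the argument lists and does not run the tools. The identity threshold, the -c option, and the RAxML-NG parameters are taken from the text; the Roary -e option (to write a core gene alignment), the -f output directory, and all file names are assumptions introduced here for illustration.

```python
def phylogeny_commands(gff_files, roary_dir="roary_out", snp_alignment="core_snps.aln"):
    """Build argument lists for the Roary, SNP-sites, and RAxML-NG steps.

    Assumptions (not stated in the text): Roary is asked to write a core
    gene alignment (-e) into roary_dir (-f), and the file names are
    placeholders.
    """
    core_alignment = f"{roary_dir}/core_gene_alignment.aln"
    return [
        # Pangenome analysis with a 90% identity threshold.
        ["roary", "-e", "-i", "90", "-f", roary_dir, *gff_files],
        # Extract SNP sites; -c keeps only columns made up of A, C, G, and T.
        ["snp-sites", "-c", "-o", snp_alignment, core_alignment],
        # ML tree search with 100 bootstrap replicates under GTR+G4.
        ["raxml-ng", "--all", "--msa", snp_alignment,
         "--model", "GTR+G4", "--bs-trees", "100"],
    ]


# Hypothetical annotation files; each command could be passed to subprocess.run.
commands = phylogeny_commands(["strain1.gff", "strain2.gff"])
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as data makes the parameters easy to inspect before any of the tools is executed.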
Quantitative PCR
Overnight cultures of each STEC strain were diluted in LB medium to an optical density at 600 nm (OD₆₀₀) of 0.1 and incubated at 37 °C with shaking at 200 rpm for 110 min. Mitomycin C was then added to a final concentration of 0.5 µg/mL, and the cultures were further incubated for an additional 3 h under the same conditions. Total RNA was extracted using the RNeasy Protect Bacteria Mini Kit (Qiagen), and complementary DNA (cDNA) was synthesized with the PrimeScript RT Reagent Kit (Takara Bio). qPCR was performed using the THUNDERBIRD SYBR qPCR Mix (TOYOBO) with the synthesized cDNA as a template. The primer sequences used were as follows: stx1a forward 5′-GTGGCATTAATACTGAATTGTCATCA-3′ and reverse 5′-GCGTAATCCCACGGACTCTTC-3′45; stx2a, stx2c, and stx2e forward 5′-TCCATGACAACGGACAGCAG-3′ and reverse 5′-ACGCCAGATATGATGAAACCAG-3′24; stx2f forward 5′-AGAGGAGAGGAAGGGGTAAG-3′ and reverse 5′-TCACGGAACGAACTGAATAAC-3′46; and gapA forward 5′-TATGACTGGTCCGTCTAAAGACAA-3′ and reverse 5′-GGTTTTCTGAGTAGCGGTAGTAGC-3′24. Gene expression levels were normalized to the gapA gene and are presented as relative expression values.
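The normalization step can be illustrated with a short calculation. The text does not state the exact formula, so this sketch assumes the common 2^(-ΔCt) approach with equal amplification efficiencies for the target and gapA; the function names and the Ct values are hypothetical.

```python
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of the target relative to the reference gene, 2^(-dCt).

    Assumption: both primer pairs amplify with 100% efficiency.
    """
    return 2.0 ** -(ct_target - ct_reference)


def summarise(replicates):
    """Mean and sample standard deviation over independent experiments.

    replicates: list of (Ct of stx2, Ct of gapA) pairs, one per experiment.
    """
    values = [relative_expression(target, ref) for target, ref in replicates]
    return mean(values), stdev(values)


# Hypothetical Ct values from three independent experiments.
avg, sd = summarise([(18.0, 20.0), (19.0, 20.0), (18.0, 19.0)])
print(avg, sd)
```

A target that crosses the threshold two cycles earlier than gapA is therefore reported as four times the gapA level under this assumption.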
Statistics and reproducibility
To identify serotypes significantly associated with asymptomatic carriers (SAAC) or symptomatic patients (SAPA), the frequencies of each serotype among strains from asymptomatic carriers and patients were compared using Fisher’s exact test (two-sided) in JMP Pro version 17 (SAS Institute, Cary, NC, USA). Serotypes that were significantly more prevalent (p < 0.05) in strains from asymptomatic carriers were classified as SAAC, whereas those significantly more prevalent in strains from patients were classified as SAPA. Serotypes without significant differences were categorized as Not Significant (NS). Serotypes with five or fewer strains in either asymptomatic carriers or patients were not analysed and were classified as minor serotypes (MS).
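The serotype comparison can be sketched in a few lines of Python. The study used JMP Pro, so the following is an illustrative reimplementation under stated assumptions, not the authors' code: the two-sided P value is taken as the sum of the probabilities of all tables that are no more likely than the observed one, the direction of association is decided by comparing the two proportions, and the minor-serotype (MS) filter is omitted. Function names are hypothetical.

```python
from math import exp, lgamma


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return exp(_log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denom)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed table;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)


def classify_serotype(ac_count, ac_total, pa_count, pa_total, alpha=0.05):
    """Label a serotype as SAAC, SAPA, or NS from carrier and patient counts."""
    p = fisher_exact_two_sided(ac_count, ac_total - ac_count,
                               pa_count, pa_total - pa_count)
    if p >= alpha:
        return "NS", p
    more_in_carriers = ac_count / ac_total > pa_count / pa_total
    return ("SAAC" if more_in_carriers else "SAPA"), p


# Counts taken from Table 1 (495 carrier strains, 18946 patient strains).
for name, ac, pa in [("O156:H25", 67, 13), ("O157:H7(H-)", 10, 12218), ("O103:H2", 20, 554)]:
    label, p = classify_serotype(ac, 495, pa, 18946)
    print(name, label, p)
```

Working with log-gamma values keeps the hypergeometric probabilities within floating-point range even for the large patient totals in Table 1.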
The significance of differences in the presence or absence of each non-LEE effector and other virulence genes between the SAAC and SAPA groups was also assessed using Fisher’s exact test (two-sided) in JMP Pro version 17. In addition, the total numbers of non-LEE effector genes and other virulence genes per strain were compared between SAAC and SAPA strains using the Wilcoxon rank-sum test in JMP Pro version 17. Differences in stx2 expression among strains were analysed using one-way ANOVA, followed by the Tukey–Kramer HSD test for multiple comparisons in JMP Pro version 18.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We thank Ayaka Wakakuwa for providing technical assistance.
Author contributions
Y.I., K.L., T.S., S.I., and Y.O. conceptualized the study. Y.I., M.O., Y.H., T.Y., and Y.O. curated the data. Y.I., Y.H., H.K., A.N., T.Y., H.N.I., and T.K. performed the investigation. T.S. and Y.O. administered the project. Y.I. and Y.O. wrote the original draft. All the authors were responsible for reviewing and editing the manuscript. All the authors had access to the data presented in this study and had final responsibility for the decision to submit for publication.
Peer review
Peer review information
Communications Biology thanks Yanwen Xiong, Ying Hua and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Yu-Wei Wu and Tobias Goris. A peer review file is available.
Data availability
All sequence data generated in this study have been deposited in the NCBI BioProject database under accession number PRJDB18641. The source data underlying all figures and analyses in the manuscript are provided in Supplementary Data 1. Any additional data that support the findings of this study are available from the corresponding author upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s42003-025-09369-x.
References
- 1. Terajima, J., Izumiya, H., Hara-Kudo, Y. & Ohnishi, M. Shiga Toxin (Verotoxin)-producing Escherichia coli and Foodborne Disease: A Review. Food Saf (Tokyo) 5, 35–53 (2017).
- 2. Freedman, S. B., van de Kar, N. & Tarr, P. I. Shiga Toxin-Producing Escherichia coli and the Hemolytic-Uremic Syndrome. N Engl J Med 389, 1402–1414 (2023).
- 3. Bai, X., Scheutz, F., Dahlgren, H. M., Hedenstrom, I. & Jernberg, C. Characterization of Clinical Escherichia coli Strains Producing a Novel Shiga Toxin 2 Subtype in Sweden and Denmark. Microorganisms 9, 2374 (2021).
- 4. Gill, A. et al. Characterization of Atypical Shiga Toxin Gene Sequences and Description of Stx2j, a New Subtype. J Clin Microbiol 60, e0222921 (2022).
- 5. Hughes, A. C. et al. Structural and Functional Characterization of Stx2k, a New Subtype of Shiga Toxin 2. Microorganisms 8, 4 (2019).
- 6. Scheutz, F. et al. Multicenter evaluation of a sequence-based protocol for subtyping Shiga toxins and standardizing Stx nomenclature. J Clin Microbiol 50, 2951–2963 (2012).
- 7. Yang, X. et al. Genomic Characterization of Escherichia coli O8 Strains Producing Shiga Toxin 2l Subtype. Microorganisms 10, 1245 (2022).
- 8. Schmidt, H. Shiga-toxin-converting bacteriophages. Res Microbiol 152, 687–695 (2001).
- 9. De Rauw, K., Buyl, R., Jacquinet, S. & Pierard, D. Risk determinants for the development of typical haemolytic uremic syndrome in Belgium and proposition of a new virulence typing algorithm for Shiga toxin-producing Escherichia coli. Epidemiol Infect 147, e6 (2018).
- 10. Tack, D. M. et al. Shiga Toxin-Producing Escherichia coli Outbreaks in the United States, 2010–2017. Microorganisms 9, 1529 (2021).
- 11. Buchholz, U. et al. German outbreak of Escherichia coli O104:H4 associated with sprouts. N Engl J Med 365, 1763–1770 (2011).
- 12. Brooks, J. T. et al. Non-O157 Shiga toxin-producing Escherichia coli infections in the United States, 1983-2002. J Infect Dis 192, 1422–1429 (2005).
- 13. Schmidt, M. A. LEEways: tales of EPEC, ATEC and EHEC. Cell Microbiol 12, 1544–1552 (2010).
- 14. Tobe, T. et al. An extensive repertoire of type III secretion effectors in Escherichia coli O157 and the role of lambdoid phages in their dissemination. Proc Natl Acad Sci USA 103, 14941–14946 (2006).
- 15. Hayashi, T. et al. Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res 8, 11–22 (2001).
- 16. Armstrong, G. L., Hollingsworth, J. & Morris, J. G. Jr. Emerging foodborne pathogens: Escherichia coli O157:H7 as a model of entry of a new pathogen into the food supply of the developed world. Epidemiol Rev 18, 29–51 (1996).
- 17. Karmali, M. A. Infection by verocytotoxin-producing Escherichia coli. Clin Microbiol Rev 2, 15–38 (1989).
- 18. Kintz, E., Brainard, J., Hooper, L. & Hunter, P. Transmission pathways for sporadic Shiga-toxin producing E. coli infections: A systematic review and meta-analysis. Int J Hyg Environ Health 220, 57–67 (2017).
- 19. Sayk, F., Hauswaldt, S., Knobloch, J. K., Rupp, J. & Nitschke, M. Do asymptomatic STEC-long-term carriers need to be isolated or decolonized? New evidence from a community case study and concepts in favor of an individualized strategy. Front Public Health 12, 1364664 (2024).
- 20. Terajima, J., Iyoda, S., Ohnishi, M. & Watanabe, H. Shiga Toxin (Verotoxin)-Producing Escherichia coli in Japan. Microbiol Spectr 2, (2014).
- 21. Harada, T. et al. Laboratory investigation of an Escherichia coli O157:H7 strain possessing a vtx2c gene with an IS1203 variant insertion sequence isolated from an asymptomatic food handler in Japan. Diagn Microbiol Infect Dis 77, 176–178 (2013).
- 22. Baba, H. et al. Genomic analysis of Shiga toxin-producing Escherichia coli from patients and asymptomatic food handlers in Japan. PLoS One 14, e0225340 (2019).
- 23. Morita-Ishihara, T., Iyoda, S., Iguchi, A. & Ohnishi, M. Secondary Shiga Toxin-Producing Escherichia coli Infection, Japan, 2010-2012. Emerg Infect Dis 22, 2181–2184 (2016).
- 24. Sui, X. et al. Characteristics of Shiga Toxin-Producing Escherichia coli Circulating in Asymptomatic Food Handlers. Toxins (Basel) 15, 640 (2023).
- 25. Iyoda, S. et al. Phylogenetic Clades 6 and 8 of Enterohemorrhagic Escherichia coli O157:H7 With Particular stx Subtypes are More Frequently Found in Isolates From Hemolytic Uremic Syndrome Patients Than From Asymptomatic Carriers. Open Forum Infect Dis 1, ofu061 (2014).
- 26. Manning, S. D. et al. Variation in virulence among clades of Escherichia coli O157:H7 associated with disease outbreaks. Proc Natl Acad Sci USA 105, 4868–4873 (2008).
- 27. Bizot, E. et al. Shiga toxin-producing Escherichia coli carriage in 959 healthy French infants. Arch Dis Child 106, 1239–1240 (2021).
- 28. Harries, M., Dreesman, J., Rettenbacher-Riefler, S. & Mertens, E. Faecal carriage of extended-spectrum beta-lactamase-producing Enterobacteriaceae and Shiga toxin-producing Escherichia coli in asymptomatic nursery children in Lower Saxony (Germany), 2014. Epidemiol Infect 144, 3540–3548 (2016).
- 29. Tarr, P. I. & Freedman, S. B. Why antibiotics should not be used to treat Shiga toxin-producing Escherichia coli infections. Curr Opin Gastroenterol 38, 30–38 (2022).
- 30. Bai, X. et al. Comparative Genomics of Shiga Toxin-Producing Escherichia coli Strains Isolated from Pediatric Patients with and without Hemolytic Uremic Syndrome from 2000 to 2016 in Finland. Microbiol Spectr 10, e0066022 (2022).
- 31. Blankenship, H. M. et al. Population structure and genetic diversity of non-O157 Shiga toxin-producing Escherichia coli (STEC) clinical isolates from Michigan. Sci Rep 11, 4461 (2021).
- 32. Kalalah, A. A., Koenig, S. S. K., Bono, J. L., Bosilevac, J. M. & Eppinger, M. Pathogenomes and virulence profiles of representative big six non-O157 serogroup Shiga toxin-producing Escherichia coli. Front Microbiol 15, 1364026 (2024).
- 33. Karmali, M. A. et al. Association of genomic O island 122 of Escherichia coli EDL 933 with verocytotoxin-producing Escherichia coli seropathotypes that are linked to epidemic and/or serious disease. J Clin Microbiol 41, 4930–4940 (2003).
- 34. Melton-Celsa, A. R. & O’Brien, A. D. Animal models for STEC-mediated disease. Methods Mol Med 73, 291–305 (2003).
- 35. Ritchie, J. M. Infant Rabbit Model for Studying Shiga Toxin-Producing Escherichia coli. Methods Mol Biol 2291, 365–379 (2021).
- 36. Thorpe, C. M., Pulsifer, A. R., Osburne, M. S., Vanaja, S. K. & Leong, J. M. Citrobacter rodentium(ϕStx2dact), a murine infection model for enterohemorrhagic Escherichia coli. Curr Opin Microbiol 65, 183–190 (2022).
- 37. Kajitani, R. et al. Platanus_B: an accurate de novo assembler for bacterial genomes using an iterative error-removal process. DNA Res 27, dsaa014 (2020).
- 38. Bessonov, K. et al. ECTyper: in silico Escherichia coli serotype and species prediction from raw and assembled whole-genome sequence data. Microb Genom 7, 000728 (2021).
- 39. Inouye, M. et al. SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. Genome Med 6, 90 (2014).
- 40. Page, A. J. et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31, 3691–3693 (2015).
- 41. Page, A. J. et al. SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. Microb Genom 2, e000056 (2016).
- 42. Kozlov, A. M., Darriba, D., Flouri, T., Morel, B. & Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).
- 43. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 49, W293–W296 (2021).
- 44. Riordan, J. T., Viswanath, S. B., Manning, S. D. & Whittam, T. S. Genetic differentiation of Escherichia coli O157:H7 clades associated with human disease by real-time PCR. J Clin Microbiol 46, 2070–2073 (2008).
- 45. Berger, M., Aijaz, I., Berger, P., Dobrindt, U. & Koudelka, G. Transcriptional and Translational Inhibitors Block SOS Response and Shiga Toxin Expression in Enterohemorrhagic Escherichia coli. Sci Rep 9, 18777 (2019).
- 46. Yang, X. et al. Characterization of Escherichia coli strains producing Shiga Toxin 2f subtype from domestic Pigeon. Sci Rep 14, 24481 (2024).


