Clin Transl Med. 2024 Mar 11;14(3):e1615. doi: 10.1002/ctm2.1615

One step forward for nanopore protein sequencing

Ziyi Li 1,2, Yakun Yi 1,2, Lei Liu 3, Hai‐Chen Wu 1,2
PMCID: PMC10928323  PMID: 38468491

With the successful completion of the Human Genome Project, decoding entire human genomes has become a reality. According to the central dogma of molecular biology, genetic information flows from nucleic acid sequences to proteins, so the amino acid sequence of a protein can be inferred from its gene sequence. However, additional information remains concealed in downstream processes such as post‐translational modification, protein splicing and degradation. Uncovering this hidden information requires high‐throughput, high‐sensitivity proteomics research.

Since the introduction of the first protein sequencing method, Edman degradation, 1 in 1949, researchers have devoted considerable effort to decoding proteins. While mass spectrometry remains the gold standard in protein sequencing, it faces limitations in dynamic range, sequencing length and accuracy. 2 Recent years have witnessed the emergence of new methods, including fluorosequencing, 3 single‐molecule peptide fingerprinting, 4 tunnelling current 5 and nanopore‐based protein sequencing. 6, 7, 8, 9, 10, 11

While the success of nanopore DNA sequencing has driven the exploration of nanopore technology for protein sequencing, proteins present unique challenges. First, there are 20 distinct proteinogenic amino acids, in contrast to the four nucleotides in DNA, so the number of possible sequence combinations, and with it the data volume, grows exponentially. Second, peptides carry heterogeneous charges, unlike the uniformly negatively charged DNA molecule, which complicates directional translocation through the nanopore.
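To make the scale of the alphabet problem concrete, the following minimal sketch counts how many distinct signal states a reader would have to tell apart if k adjacent residues contribute to each current level. The window size k and the function name `distinct_states` are illustrative assumptions; the text above does not specify how many residues influence a single nanopore reading.

```python
def distinct_states(alphabet_size: int, k: int) -> int:
    """Number of distinct k-residue combinations for a given alphabet size.

    Assumption: each current level depends on a window of k adjacent
    residues, so the reader must separate alphabet_size ** k states.
    """
    return alphabet_size ** k


# Compare DNA (4 nucleotides) with proteins (20 amino acids).
for k in (1, 2, 3, 4):
    dna = distinct_states(4, k)
    protein = distinct_states(20, k)
    print(f"k={k}: DNA {dna} states, protein {protein} states")
```

Even for a short window, the protein case grows far faster than the DNA case, which is one way to read the "exponential growth" mentioned above.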

Inspired by DNA sequencing, nanopore protein sequencing has evolved along two broad lines: full‐length chain sequencing and enzyme‐assisted sequencing (Figure 1). The full‐length chain sequencing strategy involves directing the unfolded protein chain through the nanopore. Three groups independently employed a DNA motor to ratchet a DNA‐peptide conjugate through a nanopore, allowing the peptide sequence to be read. 6, 10, 11 Other approaches rely on electroosmotic force, generated either by adding guanidinium chloride to the buffer solution 8 or by mutating the protein pore. 12 Enzyme‐assisted sequencing, on the other hand, entails enzymatic cleavage of proteins into individual amino acids, which are then identified one after another. In a recent study, Huang et al. engineered the MspA nanopore with nitrilotriacetic acid and then loaded it with Ni2+, which made it possible to distinguish all 20 proteinogenic amino acids. 9 While these advances have pushed the boundaries of nanopore protein sequencing, none of them has yet been able to reveal sequence information. Recently, we proposed an alternative strategy for peptide sequencing based on a combination of enzymatic cleavage and host–guest interaction‐assisted nanopore sensing, aiming to achieve comprehensive peptide sequencing. 7

FIGURE 1.

Strategies in nanopore protein sequencing. In full‐length chain sequencing (left), the protein is first unfolded and purified to obtain individual peptides, which are then passed through a nanopore to generate a full‐length read: a series of current signals that encodes the sequence information. In enzyme‐assisted sequencing (right), proteases sequentially digest the peptides into amino acids, which then pass through the nanopore individually and generate characteristic signals.

In our previous research, we found that the host‐guest interaction between phenylalanine (F) and cucurbit[7]uril (CB[7]) significantly improves nanopore recognition, resulting in consistent and prolonged current events. 13 In our experiments, the FGXD8 model peptide performed best at discriminating X, where ‘X’ represents any of the 20 proteinogenic amino acids (Figure 2A). The negatively charged polyaspartic acid (D8) chain drives translocation in the electric field, while the strong interaction between FG and CB[7] stabilizes the peptide⊂CB[7] complex at the constriction of wildtype α‐hemolysin (WT αHL), providing the best resolution. Using this probe with WT αHL and a specific mutant, (M113F)7, allowed all 20 proteinogenic amino acids to be discriminated at position X of the FGXD8 sequence (Figure 2B).
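As a rough illustration of how readings from two pores could be combined to separate residues, the sketch below assigns an observed pair of I/I0 values to the residue whose calibration means are closest, measured in units of standard deviation. Every number in the table is a made‐up placeholder and not a measured value; the table layout, the scoring rule and the function name `classify` are likewise assumptions for illustration and are not the analysis used in the original study.

```python
from math import inf

# HYPOTHETICAL calibration table: pore -> residue -> (mean I/I0, s.d.).
# These numbers are placeholders for illustration only, not measured data.
CALIBRATION = {
    "WT": {"G": (0.30, 0.01), "A": (0.32, 0.01), "W": (0.20, 0.01)},
    "M113F": {"G": (0.40, 0.01), "A": (0.35, 0.01), "W": (0.22, 0.01)},
}


def classify(readings: dict[str, float]) -> str:
    """Return the residue with the smallest summed squared z-score.

    `readings` maps a pore name to the observed mean I/I0 in that pore.
    """
    residues = CALIBRATION["WT"].keys()
    best_residue, best_score = "", inf
    for residue in residues:
        score = 0.0
        for pore, observed in readings.items():
            mean, sd = CALIBRATION[pore][residue]
            score += ((observed - mean) / sd) ** 2
        if score < best_score:
            best_residue, best_score = residue, score
    return best_residue


# A WT reading of 0.31 lies midway between the placeholder G and A means,
# so it is the second pore that decides the call in this toy example.
print(classify({"WT": 0.31, "M113F": 0.39}))
```

The point of the sketch is only that a second pore with a different response pattern can resolve residues that overlap in the first one.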

FIGURE 2.

Peptide sequencing based on a combination of enzymatic cleavage and host–guest interaction‐assisted nanopore sensing. (A) Schematic of the experimental setup and peptide constructs used to discriminate the 20 amino acids. (B) Experimentally determined mean I/I0 values and their s.d., in ascending order, generated by the translocation of FGXD8⊂CB[7] through wildtype α‐hemolysin (WT αHL) (left). Heat map of the mean I/I0 produced by FGXD8⊂CB[7] through WT and (M113F)7 αHL (right). (C) Schematic illustration of the process for peptide sequencing using the probe peptide FGGCD8 and the mixed‐enzyme digestion strategy. The model peptide was digested with a mixture of carboxypeptidase A (CPA) and CPB. (D) Stepwise enzymatic digestion strategy. The peptide was digested by either dilute CPA or CPB during each step. The sequencing was completed in six steps, with each step revealing the identities of one or two amino acids.

Motivated by these outcomes, we adapted the methodology to detect free amino acids. Conjugating the amino group of a free amino acid with the sulfhydryl group on FGGCD8 creates the FGGC(X)D8 probe, which efficiently captures nearly all proteinogenic amino acids (Figure 2C). By combining nanopore mutants with adjusted experimental conditions, we achieved distinct and reliable differentiation of the 20 proteinogenic amino acids.

Identifying free amino acids is only a preliminary step towards protein sequencing; the real challenge lies in obtaining the amino acids in sequence order. We expected proteases to digest peptides sequentially, so that each released amino acid could be identified with the FGGC(X)D8 probe. When we used carboxypeptidases A and B (CPA and CPB) to cleave peptides from the C‐terminus, however, digestion proved too rapid to isolate single amino acids. Fortunately, experiments revealed a correlation between the abundance of each released amino acid and its position within the peptide chain (Figure 2C). Despite the rapid cleavage, this position‐dependent abundance offered a way to deduce peptide sequences from relative amino acid quantities. The D8 segment of the probe, with its strong negative charge, plays a vital role in evening out the charge differences among the 20 natural amino acids and in enhancing sequencing efficiency.
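The abundance‐to‐position idea can be sketched in a few lines. The sketch assumes that residues nearer the C‐terminus are released earlier and therefore accumulate in greater amounts, and that each residue type occurs only once in the peptide; both are simplifying assumptions made here for illustration, as are the function name `order_from_abundance` and the toy abundance values.

```python
def order_from_abundance(abundance: dict[str, float]) -> str:
    """Infer an N-to-C sequence from relative amounts of released residues.

    Assumptions (illustrative only): digestion proceeds from the
    C-terminus, so the most abundant residue is taken to be the most
    C-terminal one, and every residue type appears exactly once.
    """
    # Most abundant first gives the order of release (C-terminus first).
    c_to_n = sorted(abundance, key=abundance.get, reverse=True)
    # Reverse to report the sequence in the conventional N-to-C direction.
    return "".join(reversed(c_to_n))


# Toy relative abundances (placeholders, not measurements).
released = {"L": 0.9, "A": 0.2, "V": 0.5}
print(order_from_abundance(released))
```

Repeated residues and noisy abundance estimates would break this naive ranking, so it should be read as a statement of the principle rather than as a working decoder.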

While effective for accurately sequencing short peptides, our strategy suffered from dephasing beyond 8–10 digested amino acids. To address this, we devised a stepwise enzyme digestion approach (Figure 2D). Exploiting the distinct specificities of CPA and CPB, we first applied CPA to determine the sequence preceding an arginine (R) residue and then CPB to specifically cleave the R residue. Repeating these cycles allowed us to accurately sequence much longer peptides.
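The alternating CPA/CPB cycle can be illustrated with a toy simulation. It uses a deliberately simplified specificity model, assumed here for illustration: CPA removes C‐terminal residues one by one until an arginine is exposed, and CPB removes only that C‐terminal arginine. Real carboxypeptidase specificities are broader than this, and the function name `stepwise_digest` and the example peptide are hypothetical.

```python
def stepwise_digest(peptide: str, basic: str = "R") -> list[tuple[str, str]]:
    """Simulate alternating CPA/CPB digestion from the C-terminus.

    Simplified model (assumption): CPA stalls when a residue in `basic`
    is at the C-terminus; CPB removes exactly that residue.
    Returns (enzyme, released residues in order of release) per step.
    """
    steps = []
    remaining = peptide
    while remaining:
        if remaining[-1] in basic:
            # CPB step: cut the single C-terminal basic residue.
            steps.append(("CPB", remaining[-1]))
            remaining = remaining[:-1]
        else:
            # CPA step: digest back to the next basic residue (or the start).
            cut = len(remaining)
            while cut > 0 and remaining[cut - 1] not in basic:
                cut -= 1
            released = remaining[cut:][::-1]  # C-terminal residue comes off first
            steps.append(("CPA", released))
            remaining = remaining[:cut]
    return steps


# Hypothetical peptide written N-to-C; each R splits the read into short blocks.
for enzyme, released in stepwise_digest("GARLVRFA"):
    print(enzyme, released)
```

Because each CPA step covers only the short stretch between two arginines, the number of residues digested in one go stays small, which is the property the stepwise strategy relies on to limit dephasing.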

In conclusion, our innovative peptide sequencing strategies mark a promising frontier in proteomics. Future optimization includes an integrated nanopore chip for high throughput, artificially modified enzymes to control the digestion speed, and improved probe design for reduced sequencing time. Integration of advanced data analysis algorithms and machine learning is crucial for handling the exponential growth in data volume from complex protein sequencing. Beyond basic research, in the clinical realm, nanopore protein sequencing has the potential to revolutionize diagnostics, enabling rapid and precise identification of disease biomarkers for earlier detection and personalized treatment plans, leading to improved patient outcomes. Despite the challenges, nanopore protein sequencing shows a bright future, promising to unlock the full narrative encoded within proteins and providing unprecedented insights into the molecular machinery of life, ushering in a new era of biomedical breakthroughs.

AUTHOR CONTRIBUTIONS

Ziyi Li, Yakun Yi and Hai‐Chen Wu conceived the manuscript and composed the figures. Ziyi Li, Lei Liu and Hai‐Chen Wu wrote the manuscript and approved the final draft.

CONFLICT OF INTEREST STATEMENT

Hai‐Chen Wu, Yakun Yi and Ziyi Li have filed patents describing the strategy for the nanopore‐based peptide sequencing. Lei Liu declares no conflict of interest.

ETHICS STATEMENT

This article does not contain any research involving humans or animals.

ACKNOWLEDGEMENTS

This project was funded by the National Natural Science Foundation of China (no. 22025407) and the Institute of Chemistry, Chinese Academy of Sciences.

Li Z, Yi Y, Liu L, Wu H‐C. One step forward for nanopore protein sequencing. Clin Transl Med. 2024;14:e1615. doi: 10.1002/ctm2.1615

REFERENCES

1. Edman P. A method for the determination of the amino acid sequence in peptides. Arch Biochem. 1949;22:475‐476.
2. Alfaro JA, Bohländer P, Dai M, et al. The emerging landscape of single‐molecule protein sequencing technologies. Nat Methods. 2021;18:604‐617.
3. Reed BD, Meyer MJ, Abramzon V, et al. Real‐time dynamic single‐molecule protein sequencing on an integrated semiconductor device. Science. 2022;378:186‐192.
4. van Ginkel J, Filius M, Szczepaniak M, Tulinski P, Meyer AS, Joo C. Single‐molecule peptide fingerprinting. Proc Natl Acad Sci USA. 2018;115:3338‐3343.
5. Ohshiro T, Tsutsui M, Yokota K, Furuhashi M, Taniguchi M, Kawai T. Detection of post‐translational modifications in single peptides using electron tunnelling currents. Nat Nanotechnol. 2014;9:835‐840.
6. Brinkerhoff H, Kang ASW, Liu JQ, Aksimentiev A, Dekker C. Multiple rereads of single proteins at single‐amino acid resolution using nanopores. Science. 2021;374:1509‐1513.
7. Zhang Y, Yi Y, Li Z, Zhou K, Liu L, Wu HC. Peptide sequencing based on host‐guest interaction‐assisted nanopore sensing. Nat Methods. 2024;21(1):102‐109.
8. Yu L, Kang X, Li F, et al. Unidirectional single‐file transport of full‐length proteins through a nanopore. Nat Biotechnol. 2023;41:1130‐1139.
9. Wang KF, Zhang S, Zhou X, et al. Unambiguous discrimination of all 20 proteinogenic amino acids and their modifications by nanopore. Nat Methods. 2023.
10. Yan SH, Zhang J, Wang Y, et al. Single molecule ratcheting motion of peptides in a Porin A (MspA) nanopore. Nano Lett. 2021;21:6703‐6710.
11. Chen ZJ, Wang Z, Xu Y, Zhang X, Tian B, Bai J. Controlled movement of ssDNA conjugated peptide through porin A (MspA) nanopore by a helicase motor for peptide sequencing application. Chem Sci. 2021;12:15750‐15756.
12. Sauciuc A, Morozzo della Rocca B, Tadema MJ, Chinappi M, Maglia G. Translocation of linearized full‐length proteins through an engineered nanopore under opposing electrophoretic force. Nat Biotechnol. 2023.
13. Liu L, You Y, Zhou K, et al. A Dual‐Response DNA probe for simultaneously monitoring enzymatic activity and environmental pH using a nanopore. Angew Chem Int Ed. 2019;58:14929‐14934.

Articles from Clinical and Translational Medicine are provided here courtesy of John Wiley & Sons Australia, Ltd on behalf of Shanghai Institute of Clinical Bioinformatics
