primerJinn – a tool for rationally designing multiplex PCR primer sets and in silico PCR

Jason D Limberis; John Z Metcalfe

doi:10.21203/rs.3.rs-3025970/v1

This is a preprint.

It has not yet been peer reviewed by a journal.


[Preprint]. 2023 Jun 26:rs.3.rs-3025970. [Version 1] doi: 10.21203/rs.3.rs-3025970/v1


PMCID: PMC10350116 PMID: 37461503

Abstract

Background

Results

We developed primerJinn, a tool that designs a set of multiplex primers and allows for the in silico PCR evaluation of primer sets against numerous input genomes. We used primerJinn to create a multiplex PCR for the sequencing of drug resistance-conferring gene regions from Mycobacterium tuberculosis, which were then successfully sequenced.

Conclusions

primerJinn provides a user-friendly, efficient, and accurate method for designing multiplex PCR primers and performing in silico PCR. It can be used for various applications in molecular biology and bioinformatics research, including the design of assays for amplifying and sequencing drug-resistance-conferring regions in important pathogens.

Introduction

Multiplex PCR amplifies multiple targets in a single-tube reaction and is essential in molecular biology and clinical diagnostics. One of its most important applications is the targeted sequencing of pathogens. By using multiple primer pairs to amplify specific target regions, multiplex PCR allows the simultaneous detection and identification of multiple pathogens or drug resistance-conferring regions in a single sample, making it a valuable tool for diagnostic and epidemiological studies. Despite this importance, few tools are available for designing multiplex PCR primers,^1–6 and some of these are no longer accessible. Designing primers for multiple targets simultaneously is challenging because it requires careful consideration of several factors, such as primer specificity, amplicon length, and primer interactions (dimer formation), under the specific reaction conditions. These parameters are critical for designing efficient primer sets for use on clinical samples containing low amounts of DNA.

We developed primerJinn, a tool that designs a set of multiplex PCR primers and performs in silico PCR to evaluate them against numerous input genomes. primerJinn uses primer3⁷ to create primers and a clustering method to select the best primer set based on amplicon size, melting temperature, and primer interactions. The in silico PCR function uses blast⁸ to identify primer pairs that amplify a DNA sequence up to a user-specified maximum length at a given annealing temperature and salt concentration, and it provides detailed information about the primers and amplicons. Our tool also incorporates approximations for melting temperatures in Q5 Hot Start High-Fidelity Polymerase buffers (NEB, USA), which differ significantly from those of most other polymerases. primerJinn provides an efficient and accurate method for designing multiplex PCR primers and performing in silico PCR, and it can be a valuable resource for researchers in the field of targeted sequencing of pathogens.

Implementation

primerJinn uses primer3 to design primers for each target range in an input fasta file. By default, these primers amplify a region of 400–800 nucleotides, have an optimal length of 20 nucleotides (range 10, 40) and an optimal Tm of 65°C (range 62°C, 68°C), and are specific to the input template if a mispriming fasta is provided. In addition, the portion of the input fasta file that is not used to design a particular primer pair is appended to the mispriming library. Since high-fidelity polymerase buffers tend to increase the Tm of primers, we have included an approximation for the highest-fidelity polymerase, Q5 from NEB. We also use the NEB Tm calculator API to output the final Tm for the selected primer set. After one hundred primer pairs (the default value) have been designed for each region, a matrix of Tm, amplicon size, and heterodimer formation probability (based on the Gibbs free energy) is constructed. This matrix is used to generate clusters with a Euclidean metric and the Ward linkage criterion. The cluster covering the most regions is selected, and primers for any missing regions are added from the next closest cluster. The output is written to an Excel file.
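The cluster selection described above can be illustrated with a minimal pure-Python sketch. It is not primerJinn's own code: the candidate values, the z-score scaling of the three features, the choice of three clusters, and all function names are assumptions made for illustration. The merge rule is a naive version of the Ward criterion on Euclidean distances.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical candidate primer pairs: (region, Tm in °C, amplicon size in nt,
# heterodimer ΔG in kcal/mol). Values are illustrative only.
CANDIDATES = [
    ("rpoB", 65.0, 620, -3.1),
    ("rpoB", 67.5, 780, -7.9),
    ("katG", 64.8, 640, -2.8),
    ("katG", 62.3, 410, -6.5),
    ("pncA", 65.3, 600, -3.4),
    ("pncA", 67.9, 790, -8.2),
]


def zscore_columns(rows):
    """Scale each feature to zero mean and unit variance (an assumption here)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def centroid(points, members):
    """Mean position of the points whose indices are in members."""
    return [mean(points[i][d] for i in members) for d in range(len(points[0]))]


def ward_cost(points, a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(points, a), centroid(points, b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the cheapest pair."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(points, *pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


def best_cluster(candidates, n_clusters=3):
    """Return the cluster whose members cover the most distinct regions."""
    points = zscore_columns([c[1:] for c in candidates])
    clusters = ward_clusters(points, n_clusters)
    return max(clusters, key=lambda c: len({candidates[i][0] for i in c}))


chosen = sorted(best_cluster(CANDIDATES))
for i in chosen:
    print(CANDIDATES[i])
```

With these toy values the selected cluster contains one candidate per region with similar Tm, amplicon size, and ΔG. The sketch omits the step of filling in missing regions from the next closest cluster.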

primerJinn also performs in silico PCR evaluation of primers. It takes a reference fasta file and primer sequences as input and returns the binding position (located using blastn-short) and product length of any pair of primers that generates a product at or below the input Tm (default 70°C) and within the maximum amplicon size (default 2000 nucleotides). Options include annealing temperature, salt concentration, maximum product length, and whether two or more bases at the 5′ end of the primer are required to bind. The output is written to an Excel file.
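The pairing logic of an in silico PCR can be sketched as follows. This is a simplified illustration under stated assumptions, not primerJinn's implementation: the `Hit` record, the `find_products` function, and the coordinates are hypothetical, binding sites are assumed to have already been located (for example from blastn-short output), and the Tm and 5′-end filters are left out.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    primer: str   # primer name
    start: int    # 1-based leftmost reference coordinate of the binding site
    end: int      # 1-based rightmost reference coordinate of the binding site
    strand: str   # "+" if the primer matches the forward strand, "-" otherwise


def find_products(hits, max_product=2000):
    """Pair each forward-strand hit with downstream reverse-strand hits.

    Returns (forward primer, reverse primer, product start, product length)
    for every pair whose product is no longer than max_product.
    """
    forward = [h for h in hits if h.strand == "+"]
    reverse = [h for h in hits if h.strand == "-"]
    products = []
    for f in forward:
        for r in reverse:
            length = r.end - f.start + 1
            # The reverse site must lie downstream of the forward site.
            if r.start > f.end and length <= max_product:
                products.append((f.primer, r.primer, f.start, length))
    return sorted(products, key=lambda p: p[2])


# Hypothetical binding sites; coordinates are illustrative, not real BLAST output.
hits = [
    Hit("rpoB_F", 760800, 760819, "+"),
    Hit("rpoB_R", 761558, 761577, "-"),
    Hit("katG_F", 2154500, 2154519, "+"),
    Hit("katG_R", 2155057, 2155076, "-"),
]
products = find_products(hits)
for product in products:
    print(product)
```

Because every forward hit is compared with every reverse hit, unintended products formed by primers from different pairs are reported alongside the expected ones, which is the purpose of evaluating a multiplex set in silico.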

Results

To evaluate primerJinn, we selected eight drug resistance-conferring gene regions from Mycobacterium tuberculosis, the etiological agent of tuberculosis. We passed the 4.4 Mb, high-GC genome fasta (NC_000962.3) to primerJinn with the regions listed in Table 1. primerJinn output one primer pair for each region (Table 2), with a mean primer Tm of 65°C (range 64°C, 67°C) and a mean amplicon size of 665 nucleotides (range 454, 791). We then used these primers as input for the primerJinn in silico PCR function, which appropriately returned only the eight expected amplicons. Finally, we synthesized the 16 primers and performed singleplex and multiplex PCR on M. tuberculosis H37Rv genomic DNA using NEB Q5 HotStart DNA polymerase MasterMix for 35 cycles of denaturation for 10 s at 98°C, annealing for 20 s at 65°C, and extension for 30 s at 72°C. Gel electrophoresis showed the expected bands (Fig. 1A) and no mispriming to the human genome (lane 10). We then sequenced the amplicon pool and saw only the expected sequences (Fig. 1B). We also evaluated our Q5 Tm approximation settings against 10000 random DNA sequences from 15 to 25 nucleotides long (1000 in each group). Our Tm approximations differed from the NEB Tm by a mean of −0.21°C (SD 0.71°C).
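The comparison of approximate and reference Tm values reduces to a mean signed difference and its standard deviation. A minimal sketch follows, assuming paired Tm values are already available; the numbers are hypothetical and are not the data from this evaluation.

```python
from statistics import mean, stdev

# Hypothetical paired Tm values in °C: (approximation, reference calculator).
paired_tm = [
    (64.8, 65.0),
    (65.9, 66.0),
    (63.5, 64.5),
    (66.4, 66.0),
    (65.0, 65.2),
]

# A negative difference means the approximation is lower than the reference.
differences = [approx - ref for approx, ref in paired_tm]
mean_diff = mean(differences)
sd_diff = stdev(differences)
print(f"mean difference {mean_diff:+.2f} °C (SD {sd_diff:.2f} °C)")
```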

Table 1.

Genomic targets for primerJinn multiplex design of drug resistance-conferring regions on M. tuberculosis H37Rv (NC_000962.3).

Gene	Drug	Genomic position start	Genomic position end
gyrB	Fluoroquinolones	6578	7250
fgd1	Delamanid	490900	491416
rpoB	Rifampicin	761007	761277
rv0678	Bedaquiline	778990	779488
fbiC	Linezolid	1303831	1303911
atpE	Bedaquiline	1461045	1461291
inhA	Isoniazid	1674182	1674222
katG	Isoniazid	2154831	2154873
pncA	Pyrazinamide	2288681	2289242


Table 2. primerJinn output for eight drug resistance-conferring gene regions of M. tuberculosis H37Rv (NC_000962.3).

Melting temperatures (Tms) are in degrees Celsius and have been rounded for illustration purposes; product sizes are in nucleotides.

Forward Primer	Reverse Primer	Forward Tm	Reverse Tm	Product Size	Name	Forward Primer Tm NEB	Reverse Primer Tm NEB
GAGGAACACCACTAGTACCG	CTCGATGACTTTACGGCCAT	65	64	675	Target 1461045–1461291	64	64
CATGGGATATGGAGCGATCG	GGGGTCGTAGGAGATCTTGA	66	66	791	Target 490900–491416	65	65
CCGGTTGTCCATTCCGTTTA	CTGTACGTATTTGGGTTGCG	64	66	454	Target 1303831–1303911	65	64
GGATGCGAGCTATATCTCCG	AATACGCCGAGATGTGGATG	65	65	458	Target 1674182–1674222	65	65
CAACAGTTCATCCCGGTTCG	GACGGATTTGTCGCTCACTA	65	67	759	Target 2288681–2289242	66	64
GCCACCATCGAATATCTGGT	GCTCCAGGAAGGGAATCATC	66	65	778	Target 761007–761277	64	65
GCATACCGAACGTCACAGAT	ACGGTCACCTACAAAAACGG	66	65	665	Target 778990–779488	65	65
GCTCTTAAGGCTGGCAATCT	CGGTCACACTTTCGGTAAGA	65	66	577	Target 2154831–2154873	65	64


Figure 1. A) Electrophoresis gel showing the expected amplicon sizes and specificity for individual and pooled primer pairs, and B) sequencing coverage of the pooled multiplex PCR from lane 8.

Conclusion

Funding

NIAID grant award R01AI153213

Footnotes

Competing interests

All authors declare no competing interests

Declarations

Availability and requirements

Project name: primerJinn

Project home page: https://github.com/SemiQuant/PrimerJinn

Operating system(s): Platform independent

Programming language: Python

Other requirements: Blast+

License: GNU GPL

Any restrictions to use by non-academics: License needed

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

Code and raw data are available at https://github.com/SemiQuant/PrimerJinn.

References

1. Kechin A, Borobova V, Boyarskikh U, Khrapov E, Subbotin S, Filipenko M. NGS-PrimerPlex: High-throughput primer design for multiplex polymerase chain reactions. PLoS Comput Biol. 2020;16(12):e1008468. doi: 10.1371/JOURNAL.PCBI.1008468
2. Yuan J, Yi J, Zhan M, et al. The web-based multiplex PCR primer design software Ultiplex and the associated experimental workflow: up to 100-plex multiplicity. BMC Genomics. 2021;22(1):1–17. doi: 10.1186/S12864-021-08149-1
3. Xie NG, Wang MX, Song P, et al. Designing highly multiplex PCR primer sets with Simulated Annealing Design using Dimer Likelihood Estimation (SADDLE). Nature Communications. 2022;13(1):1–10. doi: 10.1038/s41467-022-29500-4
4. Francis F, Dumas MD, Wisser RJ. ThermoAlign: a genome-aware primer design tool for tiled amplicon resequencing. Scientific Reports. 2017;7(1):1–15. doi: 10.1038/srep44437
5. Brown SS, Chen YW, Wang M, Clipson A, Ochoa E, Du MQ. PrimerPooler: automated primer pooling to prepare library for targeted sequencing. Biol Methods Protoc. 2017;2(1). doi: 10.1093/BIOMETHODS/BPX006
6. Hammet F, Mahmood K, Green TR, et al. Hi-Plex2: A simple and robust approach to targeted sequencing-based genetic screening. Biotechniques. 2019;67(3):118–122. doi: 10.2144/BTN-2019-0026
7. Untergasser A, Cutcutache I, Koressaar T, et al. Primer3—new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115. doi: 10.1093/NAR/GKS596
8. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2
