Long-read sequencing on the SMRT platform enables efficient haplotype linkage analysis in preimplantation genetic testing for β-thalassemia

Haitao Wu; Dongjia Chen; Qiang Zhao; Xiaoting Shen; Yongbin Liao; Ping Li; Philip C N Chiu; Canquan Zhou

2022 Feb 9;39(3):739–746. doi: 10.1007/s10815-022-02415-1

PMCID: PMC8995213 PMID: 35141813

Abstract

Purpose

This study aimed to evaluate the value of long-read sequencing for preimplantation haplotype linkage analysis.

Methods

Genomic DNA from three β-thalassemia mutation carrier couples was sequenced by single-molecule real-time sequencing across a 7.4-kb region containing the HBB gene and a 7.7-kb region that partially overlapped with it, to detect 17 HBB gene mutations common in the Chinese population and the haplotypes formed by the consecutive single-nucleotide polymorphisms linked to these mutations. The same method was applied to multiple displacement amplification products of embryos from the three families, and comparison with the parental results revealed whether the embryos carried disease-causing mutations without the need for a proband.

Results

The HBB gene mutations of the three couples were accurately detected, and the haplotype linked to the pathogenic site was successfully obtained without the need for a proband. A total of 68.75% (22/32) of embryos from the three families successfully underwent haplotype linkage analysis, and the results were consistent with the results of NGS-based mutation site detection.

Conclusion

This study supports long-read sequencing as a potential tool for preimplantation haplotype linkage analysis.

Supplementary Information

The online version contains supplementary material available at 10.1007/s10815-022-02415-1.

Keywords: Long-read sequencing, Preimplantation haplotype linkage analysis, Preimplantation genetic testing for monogenic diseases, Single-nucleotide polymorphisms, Thalassemia

Introduction

β-Thalassemia is a chronic hemolytic disease that is prevalent worldwide. In China, β-thalassemia is more common in southern regions, and its prevalence rate in Guangxi is as high as 4.91% [1]. β-Thalassemia is an autosomal recessive genetic disorder that is mainly caused by point mutations in the β-globin gene (HBB) on chromosome 11, with a small number of cases caused by deletion of a large fragment of the β-globin gene cluster [2]. Patients with thalassemia minor or intermedia can attain long-term survival if properly treated. However, without standard long-term blood transfusion and iron chelation therapy, children with thalassemia major will die before 5 years of age from serious complications such as heart failure and liver failure, making this disease a serious economic and psychological burden for families [3].

To achieve healthy pregnancies, couples carrying single-gene disease mutations can turn to preimplantation genetic testing for monogenic diseases (PGT-M), which is a less traumatic and more ethical option than prenatal diagnosis. The greatest difficulty of PGT-M is that the amount of initial DNA template is extremely limited (usually 6 to 8 trophectoderm cells). Allele drop-out (ADO) is therefore prone to occur during DNA amplification, leaving some mutant alleles undetected and resulting in failed diagnosis or even misdiagnosis. To improve diagnostic accuracy, PGT-M currently generally requires simultaneous haplotype linkage analysis. A haplotype is the arrangement of genetic variants along one chromosome of a homologous pair in the diploid genome, and it can be used to identify the parental chromosome on which a specific disease-causing allele is located. The transmission of genetic diseases can be effectively prevented by avoiding the transfer of embryos that carry the haplotype including the disease-causing allele.

Short tandem repeat (STR)–based haplotyping has obvious drawbacks, such as low detection throughput, a long detection cycle, and a limited number of STRs [4]. Therefore, preimplantation haplotype linkage analysis currently relies mainly on the abundant single-nucleotide polymorphisms (SNPs) in the human genome as genetic markers [4]. The platforms used for SNP detection are next-generation sequencing (NGS) and SNP arrays, which can only interrogate DNA fragments shorter than 300 bases. However, the average interval between SNP sites in the human genome is approximately 1000 bases [5]. As a result, NGS and SNP arrays can detect no more than a single SNP from each DNA fragment, and these platforms cannot provide direct information on serially arranged SNPs. This leads to two obvious problems. First, a proband is necessary to determine which haplotype carries the disease-causing gene, yet most couples seeking preimplantation diagnosis have no proband in their families or cannot provide genetic material from the proband for testing. Second, a single SNP provides much less information than a combination of consecutive SNPs (for example, a single biallelic SNP can distinguish 2 haplotypes, whereas 5 consecutive SNPs can distinguish up to 2^5 = 32). Therefore, NGS- and SNP array-based methods often need to detect a large number of informative SNPs to establish haplotypes accurately. However, the HBB gene is located on the short arm of chromosome 11, close to the telomere, which limits the number of nearby SNPs that are detectable. For some families, haplotypes cannot be constructed because there are not enough informative SNPs. Thus, there is an urgent need to explore a more efficient SNP-based haplotyping method to perform PGT for β-thalassemia.
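The information argument in the preceding paragraph can be illustrated with a short sketch. It assumes that every SNP is biallelic and ignores linkage disequilibrium, so it gives an upper bound rather than the haplotype diversity actually seen in a population. The function name is illustrative only.

```python
from itertools import product


def possible_haplotypes(n_snps: int) -> list[str]:
    """Enumerate every allele string for n biallelic SNPs (0 = ref, 1 = alt)."""
    return ["".join(alleles) for alleles in product("01", repeat=n_snps)]


single = possible_haplotypes(1)  # what one isolated SNP can distinguish
five = possible_haplotypes(5)    # what five phased, consecutive SNPs can distinguish
print(len(single), len(five))
```

The count doubles with each additional phased SNP, which is why reading several SNPs on one molecule is more informative than reading the same SNPs on separate short fragments.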

In recent years, long-read sequencing technologies, represented by single-molecule real-time (SMRT) sequencing (Pacific Biosciences) and nanopore sequencing (Oxford Nanopore Technologies), have become available for DNA sequencing applications. The read length of NGS is less than 300 bp, while the read length of long-read sequencing can reach 10 kb to several megabases [6]. Using SMRT sequencing, we established a haplotyping method that directly sequences the mutation sites involved in β-thalassemia together with multiple consecutive SNP markers nearby. This method not only ensures the accuracy and efficiency of haplotype construction but also avoids the need for a proband. It is expected to help β-thalassemia families in which haplotype construction through NGS is difficult to undergo PGT-M and obtain healthy offspring in the future.

Methods

DNA source and ethical approval

This study was conducted for three couples who had undergone routine PGT-M at the First Affiliated Hospital of Sun Yat-Sen University. The female patient and the male patient of family 1 carried the β-90 (c.-140C>T) and βCD43 (c.130G>T) mutations of the HBB gene (NM_000518.4), respectively. The couple in family 2 carried the same point mutation, βIVS-II-654 (c.316-197C>T). The female patient of family 3 carried a βIVS-II-654 (c.316-197C>T) mutation, and her husband was a carrier of the a-gamma-delta-beta0 deletion (βδβ). We performed routine ovulation induction, oocyte retrieval, intracytoplasmic sperm injection (ICSI), blastocyst culture, and trophectoderm biopsy for all three families [7]. The biopsies were amplified by multiple displacement amplification (MDA) (REPLI-g Single Cell Kit, Qiagen) and then analyzed for β-thalassemia mutations by singleplex fluorescent PCR combined with reverse dot blot hybridization. The detailed PGT process for β-thalassemia used in our center is described in a previous publication [7]. Genomic DNA (gDNA) extracted from blood collected in EDTA tubes and MDA products from the embryo biopsies were used for the analyses in this study.

Pathogenic mutation detection and haplotype linkage analysis in the HBB core region

We designed a pair of primers (P1) to cover the whole HBB gene (forward primer: AGGTTTTCCAAAGGGGTTAT; reverse primer: GCATTTATGAGGTCAGCGTAG) (Table 1). The amplicon is approximately 7.4 kb in length and covers the 17 common pathogenic β-thalassemia mutations in the Chinese population (green line in Fig. 1). It can also be used to identify all rare missense mutations and indels in this region. We designed a second pair of primers (P2) whose amplicon partially overlaps the P1 amplicon (forward primer: CTCAACCCTAAGACATAGCCTC; reverse primer: CTAAGCCCAGTCCTTCCAA). This amplicon is approximately 7.7 kb in length and is used for haplotype construction (orange line in Fig. 1). The genomic region covered by the P1 and P2 amplicons was used as the core region for haplotype construction. This region is approximately 10 kb and contains a total of 26 SNP sites. The P1 and P2 amplicons overlap by approximately 4000 bp, and this overlap contains 4 SNP sites through which the haplotypes of the two amplicons can be linked. The 5′ end of each primer carries a barcode (9F × 5R), which is used to identify samples during data analysis.
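The linking step through the overlap SNPs can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes each amplicon haplotype is already available as a mapping from SNP position to allele, and the positions, alleles, and function name are hypothetical.

```python
def link_haplotypes(p1_haps, p2_haps):
    """Join P1 and P2 amplicon haplotypes that agree at every shared SNP.

    Each haplotype is a dict {snp_position: allele}. Returns one merged dict
    per P1 haplotype. Raises ValueError when the overlap is not informative.
    """
    shared = sorted(set(p1_haps[0]) & set(p2_haps[0]))
    # The overlap can only link the amplicons if the two haplotypes differ there.
    if all(p1_haps[0][s] == p1_haps[1][s] for s in shared):
        raise ValueError("no heterozygous SNP in the overlap; cannot link")
    merged = []
    for h1 in p1_haps:
        matches = [h2 for h2 in p2_haps if all(h1[s] == h2[s] for s in shared)]
        if len(matches) != 1:
            raise ValueError("ambiguous or missing match in the overlap")
        merged.append({**matches[0], **h1})
    return merged


# Toy data: SNPs 3-6 lie in the overlap, 1-2 only in P2, 7-8 only in P1.
p2 = [{1: "G", 2: "T", 3: "A", 4: "C", 5: "G", 6: "T"},
      {1: "A", 2: "C", 3: "A", 4: "T", 5: "G", 6: "C"}]
p1 = [{3: "A", 4: "T", 5: "G", 6: "C", 7: "T", 8: "G"},
      {3: "A", 4: "C", 5: "G", 6: "T", 7: "C", 8: "A"}]

linked = link_haplotypes(p1, p2)
for hap in linked:
    print("".join(hap[pos] for pos in sorted(hap)))
```

In this toy example only two of the four overlap SNPs are heterozygous, which is already enough to pair each P1 haplotype with exactly one P2 haplotype. If all overlap SNPs were homozygous in a sample, the two amplicons could not be phased against each other by this approach.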

Table 1.

Primers designed for this study

Primer pair	Direction	Primer sequence 5′–3′	Length of amplicon
P1	Forward:	AGGTTTTCCAAAGGGGTTAT	7.4 kb
P1	Reverse:	GCATTTATGAGGTCAGCGTAG	7.4 kb
P2	Forward:	CTCAACCCTAAGACATAGCCTC	7.7 kb
P2	Reverse:	CTAAGCCCAGTCCTTCCAA	7.7 kb
P3	Forward:	TTCCCCAGACCAAATGGAGC	4978 bp
P3	Reverse:	ATTTCCGTGACTCGCCCTTT	4978 bp
P4	Forward:	TGACTGCCTCATTGTGTGCT	4836 bp
P4	Reverse:	CCGGGAGAGCACTAAGAACC	4836 bp
P5	Forward:	TCTCCCATGACCTACCATATCC	4977 bp
P5	Reverse:	TCACTGAACCTACGCCCCAT	4977 bp
P6	Forward:	GCTTCCCCATCCCTTCTCAC	5066 bp
P6	Reverse:	GGCTTTGGGGAGACTGGTAG	5066 bp

Fig. 1. A map of the primers designed for this study. Primer pair P1 (green line) amplifies the whole HBB gene. The amplicon is approximately 7.4 kb in length, covers the 17 common pathogenic β-thalassemia mutations in the Chinese population, and spans all rare missense mutations and indels in this region. Primer pair P2 (orange line) yields an amplicon of approximately 7.7 kb that partially overlaps the P1 amplicon and is used to construct haplotypes. The genomic region covered by the P1 and P2 amplicons was used as the core region for haplotype construction. This region is approximately 10 kb and contains a total of 26 SNP sites. The P1 and P2 amplicons overlap by approximately 4000 bp, and this overlap contains 4 SNP sites through which the haplotypes of the two amplicons can be linked. Four further primer pairs, P3, P4, P5, and P6 (blue lines), were designed in the regions approximately 1 Mb and 1.5 Mb upstream and downstream of the HBB core region. They yield four additional amplicons of approximately 5 kb each (Table 1) for haplotype construction in the peripheral region. By analyzing the four peripheral regions in the parental and embryo samples, a haplotype linking the four regions was constructed, which could serve to determine the mutation carrier status of embryos in which all SNPs exhibited ADO during haplotype construction in the HBB core region.

We used the gDNA of the three couples as templates and PrimeSTAR® GXL DNA Polymerase (Takara) to perform long-fragment PCR with a two-step protocol. The amplicons were processed strictly according to the SMRTbell Template Prep Kit 1.0 (Pacific Biosciences) protocol to obtain the SMRTbell library. AMPure PB beads (Pacific Biosciences) were used to remove excess unbound polymerase from the library. A Qubit fluorometer was used to determine the DNA concentration of the library, and an Agilent 2100 Bioanalyzer was used to assess the size of the library fragments. The library was sequenced on the PacBio Sequel System using Sequel Sequencing Kit 3.0 chemistry (Pacific Biosciences).

CCS software version 3.0.0 (https://github.com/pacificbiosciences/unanimity) was used to generate circular consensus sequencing (CCS) reads. The CCS reads were aligned to the reference sequence (hg38) with pbmm2 version 0.10.0 (https://github.com/PacificBiosciences/pbmm2), and the HBB mutations and SNP sites were detected from the alignments. From these results, we were able to directly construct haplotypes of the 26 consecutively arranged SNPs in the HBB core region for the couples and thus determine on which haplotype each mutation is located. In the same way, we performed pathogenic mutation detection and haplotype construction in the core region for the embryos of the three couples using the embryo MDA products as templates. By comparing the haplotypes of the embryos with those of the parents, it could be revealed whether the embryos carried disease-causing mutations without the need for a proband (Fig. 2).
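The comparison of embryo and parental haplotypes can be sketched as below. This is an illustrative sketch rather than the authors' pipeline: it assumes that each parent's mutation-linked and wild-type haplotypes are already phased and written as allele strings over the same SNPs, that all four parental haplotypes differ, and that both parents carry a mutation on exactly one haplotype. The function name and toy haplotypes are hypothetical.

```python
def classify_embryo(embryo_haps, mother, father):
    """Infer embryo status from the parental haplotypes it carries.

    mother / father: {"mutant": haplotype, "normal": haplotype}.
    embryo_haps: list of haplotypes observed in the embryo MDA product.
    """
    def inherited(parent):
        return [label for label, hap in parent.items() if hap in embryo_haps]

    from_mother, from_father = inherited(mother), inherited(father)
    # Exactly one haplotype per parent must be seen; otherwise ADO is possible.
    if len(from_mother) != 1 or len(from_father) != 1:
        return "inconclusive (possible ADO)"
    n_mutant = (from_mother[0] == "mutant") + (from_father[0] == "mutant")
    return ["unaffected", "carrier", "affected"][n_mutant]


mother = {"mutant": "ACGT", "normal": "ATGC"}
father = {"mutant": "GCGA", "normal": "GTTA"}

print(classify_embryo(["ACGT", "GTTA"], mother, father))
print(classify_embryo(["ATGC"], mother, father))
```

An embryo in which only one parental haplotype is seen is reported as inconclusive rather than guessed, which mirrors how ADO embryos are handled in the core-region analysis.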

Fig. 2. The result of haplotype linkage analysis based on the HBB core region for family 1

Haplotype linkage analysis with SNP peripheral to the HBB core region

Considering the possibility of haplotype construction failure due to ADO of all genetic markers in the HBB core region, we designed four pairs of primers, P3, P4, P5, and P6 (blue lines in Fig. 1; Table 1), in the regions approximately 1 Mb and 1.5 Mb upstream and downstream of the HBB core region. In this way, four additional amplicons of approximately 5 kb each (Table 1) were obtained for haplotype construction in the peripheral region. The DNA samples were amplified with primers P3, P4, P5, and P6 by long-fragment PCR using PrimeSTAR GXL DNA Polymerase (Takara), and the amplified products were sequenced by the same method as for the core region. Because of the large distance between the four amplicons and the HBB core region, we could only obtain these four haplotypes independently, i.e., the haplotypes were not directly linked to the HBB core region. However, by analyzing the four regions in the parental and embryo samples, a haplotype linking the four regions was constructed. The protocol was performed on four samples for each family, that is, the gDNA of each parent and the MDA products from one embryo with successful haplotype construction in the core region and from one embryo that exhibited ADO. By comparing the haplotypes of the parents and of the embryo with successfully detected mutation sites, the linkage between the peripheral haplotypes and the pathogenic mutations can be determined, which helps to determine the pathogenic mutation carrier status of the embryo that exhibited ADO during haplotype construction in the core region.
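The pedigree-based inference described above can be sketched in two steps: a reference embryo with a known core-region result is used to label a parent's peripheral haplotypes, and the labels are then applied to an embryo that showed ADO in the core region. The sketch assumes no recombination between the peripheral regions and HBB, which is a simplification given distances of about 1 to 1.5 Mb. All names and haplotype strings are hypothetical.

```python
def phase_peripheral(parent_haps, ref_embryo_haps, ref_inherited_mutation):
    """Label a parent's two (distinct) peripheral haplotypes.

    ref_inherited_mutation: True if the core-region analysis showed that the
    reference embryo inherited this parent's mutation.
    """
    transmitted = [h for h in parent_haps if h in ref_embryo_haps]
    if len(transmitted) != 1:
        raise ValueError("reference embryo is not informative for this parent")
    other = next(h for h in parent_haps if h != transmitted[0])
    if ref_inherited_mutation:
        return {transmitted[0]: "mutant", other: "normal"}
    return {transmitted[0]: "normal", other: "mutant"}


def infer_from_peripheral(labels, embryo_haps):
    """Return 'mutant' or 'normal' for the haplotype inherited from this parent."""
    found = {labels[h] for h in embryo_haps if h in labels}
    return found.pop() if len(found) == 1 else "inconclusive"


# The reference embryo inherited this parent's mutation together with "AAGT".
labels = phase_peripheral(["AAGT", "CCTA"], ["AAGT", "GGGG"], True)
# An embryo with core-region ADO carries this parent's other peripheral haplotype.
print(infer_from_peripheral(labels, ["CCTA", "TTTT"]))
```

The same two calls would be repeated for the other parent. When the two peripheral haplotypes of a parent cannot be told apart, or neither is seen in the embryo, the sketch returns an inconclusive result or raises an error instead of a status.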

Results

Pathogenic mutation detection and haplotype linkage analysis in the HBB core region

A total of 38 samples from the three families were tested, including 6 parental blood samples and 32 embryonic MDA products (10 from family 1, 8 from family 2, and 14 from family 3). The average sequencing depth of each amplicon was > 1500×, and the sequencing depth of the P1–P2 overlap region exceeded 3000×. The mutation sites in the three couples were accurately detected. The haplotypes of both partners in family 1 and family 2 were successfully constructed. For family 3, the haplotype of the female patient was successfully constructed; the haplotype analysis of the male patient showed a homozygous wild-type result, which was consistent with his genotype (i.e., a carrier of the a-gamma-delta-beta0 large deletion). Overall, 19, 20, and 24 informative SNP loci were detected in the three families, respectively.

The detailed results of pathogenic mutation detection and haplotype linkage analysis in the HBB core region for the three families are displayed in Figs. 2, 3, and 4, respectively. A total of 68.75% (22/32) of embryos from the three families underwent successful haplotype linkage analysis, and the results were consistent with those of NGS-based mutation site detection. ADO (i.e., detection of only one haplotype) occurred in 10% (1/10) of family 1 embryos and 25% (2/8) of family 2 embryos. For family 3, 50% (7/14) of embryos appeared to have experienced ADO, among which 6 embryos (see E-4, E-5, E-6, E-10, E-14 in Fig. 4) had only the haplotypes of maternal origin detected; therefore, it was not possible to determine whether these embryos had inherited the haplotype carrying the father’s a-gamma-delta-beta0 deletion or whether all the SNPs in the paternal haplotype had experienced ADO.
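As a quick consistency check, the overall success rate can be tallied from the per-family embryo and ADO counts given in this section (10, 8, and 14 embryos with 1, 2, and 7 ADO events). The variable names are illustrative.

```python
# (embryos tested, embryos with ADO in the core region) per family
embryos = {"family 1": (10, 1), "family 2": (8, 2), "family 3": (14, 7)}

tested = sum(n for n, _ in embryos.values())
ado = sum(a for _, a in embryos.values())
success = tested - ado
rate = 100 * success / tested

print(f"{success}/{tested} = {rate:.2f}%")
```

The tally gives 22 of 32 embryos, the figure reported in the Abstract and the Discussion.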

Fig. 3. The result of haplotype linkage analysis based on the HBB core region for family 2

Fig. 4. The result of haplotype linkage analysis based on the HBB core region for family 3. E-4, E-5, E-6, E-10, and E-14 had only the haplotypes of maternal origin detected; therefore, it was not possible to determine whether these embryos had inherited the haplotype carrying the father’s a-gamma-delta-beta0 deletion or whether all the SNPs in the paternal haplotype had experienced ADO

Haplotype linkage analysis with SNP peripheral to the HBB core region

We successfully performed peripheral haplotype construction for the couples and embryos of family 1 and family 3 (Supplementary Table 1). The linkage between the peripheral haplotypes and the pathogenic mutations was determined by comparing the haplotypes of the parents with those of the embryos in which the mutation sites had been detected. Thus, haplotype linkage analysis was successfully performed on the embryos of family 1 and family 3 with ADO in the core region, and their pathogenic mutation carrier status was accurately inferred. However, peripheral haplotype construction failed in family 2 because too few informative SNPs were detected in the four peripheral regions (Supplementary Table 1).

Discussion

Optimizing the PGT-M method for β-thalassemia is of great significance for promoting the genetic health of the population in thalassemia-prone areas of southern China. Singleplex fluorescent PCR combined with reverse dot blot analysis is the typical β-thalassemia PGT-M detection method used in our center, and it can detect up to 17 β-thalassemia mutations that are common in China [7]. We have also combined singleplex fluorescent PCR with PCR-based STR haplotyping to improve the accuracy of β-thalassemia diagnosis [8, 9]. However, because the number of STR sites is limited, PCR-based mutation detection and STR haplotyping cannot completely avoid the impact of PCR-related ADO and of recombination between genetic markers on the diagnostic results [4, 10]. High-throughput, highly sensitive, and highly automated NGS technologies have been considered promising options for PGT in recent years. We have combined NGS-based mutation detection with the detection of SNPs closely linked to the mutation sites, and the efficiency and accuracy of preimplantation haplotype linkage analysis for β-thalassemia improved greatly as the number of detected genetic markers increased [11, 12]. However, the short read length of NGS allows SNP sites to be detected only individually, which not only greatly reduces the amount of information obtained but also requires a proband for haplotype linkage analysis, making PGT more difficult for some families.

Long-read sequencing technology has opened a new era of sequencing. SMRT sequencing from Pacific Biosciences is a single-molecule sequencing technology based on light signals that maximizes the activity and continuity of the polymerase, enabling ultralong read lengths (> 10 kb) and high accuracy (> 99%) [13]. SMRT can accurately detect more than 95% of single-nucleotide variants, insertions and deletions < 50 bp, and structural variants in the human genome [14]. In addition, long-read SMRT technology can be used to directly obtain haplotypes containing a continuous arrangement of SNPs. Moreover, since the mutation site can be detected together with the closely linked SNP sites up- and downstream of the mutation, SMRT can also be used to analyze the linkage between the pathogenic site and the nearby SNPs without the need for material from additional family members. In this study, the genetic material of the three β-thalassemia mutation carrier couples was sequenced using SMRT to detect the presence of 17 common HBB gene mutations in the Chinese population and the haplotypes formed by the continuous array of SNPs linked to these mutations. The results showed that the HBB gene mutations of the three couples were accurately detected, and the haplotype linked to the pathogenic site was successfully obtained without the need for a proband. A total of 68.75% (22/32) of embryos from the three families successfully underwent haplotype linkage analysis, and the results were consistent with the results of NGS-based mutation site detection. Our research supports the value and feasibility of the application of long-read sequencing in the PGT for β-thalassemia, especially for the families who cannot achieve haplotype analysis through NGS-based PGT.

Since thalassemia is the most common monogenic disease in southern China (where our reproductive center is located), and β-thalassemia is more often caused by point mutations than α-thalassemia, validating our method in β-thalassemia carriers was our first choice. Although our method overcomes several obstacles to PGT for β-thalassemia, it remains costly and is subject to a relatively high sequencing error rate. This work would be more significant if the method also proved advantageous for PGT of other monogenic diseases. Theoretically, the method in this study can be applied to linkage analysis of other monogenic diseases if there are enough informative SNPs upstream and downstream of the target mutations. The study by Wenger et al. showed that SMRT can be used to perform haplotype linkage analysis for 99.64% of human genome mutations, suggesting that SMRT may help to further improve the accuracy of mutation detection for a variety of diseases [14]. In addition, long-read sequencing was used by Wilbe et al. for haplotype construction in two families with gonadal mosaicism. Haplotype construction succeeded in one family but failed in the other because no informative SNP was detected [15]. This failure may have been due to the short amplicons sequenced in that study (3.8 kb and 3.9 kb, respectively). Here, we not only amplified a 7.4-kb region containing the HBB gene to ensure the detection of the mutation site but also amplified a 7.7-kb region that partially overlapped with it, extending the detection region to 10 kb. The SNP sites detected in the overlapping 4-kb region can be used to link the haplotypes detected in the two regions, making it possible to obtain more consecutively arranged SNP sites and increase the success rate of haplotype construction. More research is needed to verify the feasibility of long-read sequencing in PGT for other monogenic diseases, but our study provides an important reference for future explorations.

Although the probability of ADO occurring at all SNP sites in an 11-kb region is very low, we found that it still occurred in some embryos, suggesting that we need to continue to optimize our method, such as by changing the primers or polymerase used, to rule out problems during the long fragment PCR process. If there is no problem with the long fragment PCR process, the sample size needs to be further expanded, since only three pedigrees have been tested to date. In addition, to avoid failures in the haplotype construction of the HBB core region due to problems such as ADO, we also amplified the 4 peripheral regions approximately 1 Mb and 1.5 Mb upstream and downstream of the core region for haplotype construction. The constructed haplotype can be used for haplotype linkage analysis through pedigree analysis between parents and embryos with successful mutation detection and can help to infer the pathogenic mutation carrier status of embryos that experienced ADO of all the genetic markers in the HBB core region. Using this method, we successfully predicted the mutation carrier status of two embryos for which we could not construct haplotypes in the core region, and the results were consistent with those of traditional NGS-based mutation detection, indicating that this method greatly improved the detection ability of our protocol.

Overall, although long-read sequencing is not a new technique, this study is, to our knowledge, the first to show that it can be used to perform PGT-M for β-thalassemia without the need for a proband, which overcomes a major limitation of NGS-based PGT-M and makes our preliminary results meaningful. Our study not only provides a reference for the application of long-read sequencing in PGT for other monogenic diseases but also offers a solution for situations in which haplotype construction by long-read sequencing fails, as in the study of Wilbe et al. [15].

The biggest limitation of this study is the extremely limited sample size, which makes it difficult to draw strong conclusions. A new diagnostic protocol must be validated on many samples. Owing to financial constraints and the difficulty of finding couples willing to participate, this study could only be conducted in three couples. However, the preliminary results from these three couples show that the technique has great potential, and further study is warranted to verify the value of long-read sequencing in PGT for β-thalassemia and other monogenic diseases.

Moreover, we found that SMRT sequencing was stable and reliable in terms of both sequencing and typing results when gDNA was used as the template, but it was less effective with embryonic MDA products. Sequencing data from embryonic MDA products showed that the ratio between the two alleles at heterozygous sites is prone to imbalance. For example, the ratio of a certain allele/haplotype is 7–25%, while the ratio of another is as high as 90–100%. This may be caused by uneven amplification during MDA. Whole-genome amplification (WGA) is a necessary step for high-throughput analysis of DNA samples derived from a small number of embryonic cells. However, the amplification bias generated by WGA inevitably distorts the original DNA template to some degree, resulting in detection errors, although such errors can be partially compensated by the use of a sufficient number of genetic markers [16]. This suggests that our PGT protocol should be applied with caution. Nevertheless, our long-read sequencing-based approach can increase the application value of PGT. At present, the protocol needs to be further improved, for example by optimizing or changing the amplification method used for embryonic DNA samples, increasing the number of SNP sites available for analysis, developing more accurate data analysis and correction methods, and verifying the feasibility of the protocol in a larger sample.
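The allelic imbalance described above can be screened for with a simple read-count check. The sketch below is illustrative only: it assumes that CCS reads have already been assigned to one of two haplotypes at a heterozygous locus, and the 25% cutoff is a hypothetical threshold, not a value taken from this study.

```python
def allele_balance(reads_a: int, reads_b: int, skew_cutoff: float = 0.25):
    """Return (minor haplotype fraction, label) from per-haplotype read counts.

    skew_cutoff is a hypothetical threshold chosen for illustration.
    """
    total = reads_a + reads_b
    if total == 0:
        raise ValueError("no reads covering this locus")
    minor = min(reads_a, reads_b) / total
    if minor == 0:
        label = "single haplotype (possible ADO)"
    elif minor < skew_cutoff:
        label = "imbalanced"
    else:
        label = "balanced"
    return minor, label


for counts in [(760, 740), (1400, 100), (1500, 0)]:
    print(counts, allele_balance(*counts))
```

A gDNA sample would be expected to fall in the balanced class, whereas an MDA product with preferential amplification would be flagged before its haplotypes are interpreted.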

In addition, there are still some important factors restricting the wide application of long-read sequencing, such as a high base error rate, high cost, and insufficient bioinformatics software [17]. Although the base error rate of long-read sequencing is much higher than that of first-generation and second-generation sequencing, the error rate of each nucleotide sequenced can be effectively reduced from 20% to less than 1% at sufficient depth (30× or higher) [18]. However, the need for higher depth also makes long-read sequencing much more expensive than second-generation sequencing. Another important research cost is data calculation, and processing long-read sequencing data requires a large amount of data storage and calculation costs [17]. Therefore, more algorithmic and systematic studies are needed to improve the speed and accuracy of analysis while reducing its costs. With the further development and optimization of third-generation sequencing technology, the detection costs will be further reduced so that its advantages in sequencing will be more obvious and it will be increasingly used by researchers and clinicians.
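The effect of depth on consensus accuracy can be illustrated with a simple binomial model. This is a rough sketch under strong assumptions that are not taken from reference [18]: errors are treated as independent between reads, and the consensus is counted as wrong whenever at least half of the reads covering a base are erroneous, which overstates the risk because erroneous reads rarely agree with each other. Real sequencing errors also have systematic components that depth alone does not remove.

```python
from math import comb


def consensus_error(per_read_error: float, depth: int) -> float:
    """P(at least half of `depth` independent reads are wrong at one base)."""
    k_min = (depth + 1) // 2  # smallest number of wrong reads that can tie or win
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(k_min, depth + 1)
    )


for depth in (1, 5, 10, 30):
    print(depth, consensus_error(0.20, depth))
```

Under these assumptions, a 20% per-read error rate falls well below 1% at 30× depth, in line with the trend described above.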

Conclusions

This study developed a method for the preimplantation genetic testing of embryos from β-thalassemia families based on long-read sequencing, and the results showed that the method could achieve SNP-based preimplantation haplotype linkage analysis without relying on samples from additional family members. In addition, we constructed haplotypes in regions peripheral to the HBB core region to achieve haplotype linkage analysis for embryos that exhibit ADO of all genetic markers in the core region, which increased the robustness of our protocol. This study supports long-read sequencing as a promising tool for preimplantation haplotype linkage analysis. Long-read sequencing may make PGT-M accessible to more families for whom conventional NGS-based SNP haplotyping is not feasible and help them obtain healthy offspring.

Supplementary Information

ESM 1 (PDF 115 kb)

Author contribution

H.W., X.S., and C.Z. designed and performed the experiments, collected and analyzed data, and wrote the manuscript. D.C., H.W., Y.L., P.L., and Q.Z. conducted the experiment. D.C. contributed to the interpretation of the results. P.C.C. and C.Z. supervised the experiments, and revised the manuscript. All authors have read and approved the final manuscript.

Funding

This study received funding from sources as follows: the Natural Science Foundation of Guangdong Province, China (2018A030310050); National Key R&D Program of China (2016YFC1000205); Guangdong Provincial Key Laboratory of Reproductive Medicine (2012A061400003).

Declarations

Ethics approval

This study was approved by the ethics committee of the Affiliated Jiangmen Hospital of Sun Yat-Sen University. We obtained informed consent from all the couples before this study.

Conflict of interest

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Haitao Wu and Dongjia Chen contributed equally to this work.

Contributor Information

Philip C. N. Chiu, Email: pchiucn@hku.hk

Canquan Zhou, Email: zhoucanquan@mail.sysu.edu.cn

References

1. Lai K, Huang G, Su L, He Y. The prevalence of thalassemia in mainland China: evidence from epidemiological surveys. Sci Rep. 2017;7:920. doi: 10.1038/s41598-017-00967-2.
2. Mettananda S, Higgs DR. Molecular basis and genetic modifiers of thalassemia. Hematol Oncol Clin North Am. 2018;32:177–191. doi: 10.1016/j.hoc.2017.11.003.
3. Origa R. β-Thalassemia. Genet Med. 2017;19:609–619. doi: 10.1038/gim.2016.173.
4. Natesan SA, Bladon AJ, Coskun S, Qubbaj W, Prates R, Munne S, et al. Genome-wide karyomapping accurately identifies the inheritance of single-gene defects in human preimplantation embryos in vitro. Genet Med. 2014;16:838–845. doi: 10.1038/gim.2014.45.
5. Sachidanandam R, Weissman D, Schmidt SC, Kakol JM, Stein LD, Marth G, et al. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature. 2001;409:928–933. doi: 10.1038/35057149.
6. Logsdon GA, Vollger MR, Eichler EE. Long-read human genome sequencing and its applications. Nat Rev Genet. 2020;21:597–614. doi: 10.1038/s41576-020-0236-x.
7. Fu Y, Shen X, Chen D, Wang Z, Zhou C. Multiple displacement amplification as the first step can increase the diagnostic efficiency of preimplantation genetic testing for monogenic disease for beta-thalassemia. J Obstet Gynaecol Res. 2019;45:1515–1521. doi: 10.1111/jog.14003.
8. Shen X, Xu Y, Zhong Y, Zhou C, Zeng Y, Zhuang G, et al. Preimplantation genetic diagnosis for α- and β-double thalassemia. J Assist Reprod Genet. 2011;28:957–964. doi: 10.1007/s10815-011-9598-5.
9. Shen XT, Xu YW, Zhong YP, Zeng YH, Wang J, Ding CH, et al. Combination of multiple displacement amplification with short tandem repeat polymorphism in preimplantation genetic diagnosis. Beijing Da Xue Xue Bao Yi Xue Ban. 2013;45:852–858.
10. Gueye NA, Jalas C, Tao X, Taylor D, Scott RT Jr, Treff NR. Improved sensitivity to detect recombination using qPCR for Dyskeratosis Congenita PGD. J Assist Reprod Genet. 2014;31:1227–1230. doi: 10.1007/s10815-014-0298-9.
11. Chen D, Shen X, Wu C, Xu Y, Ding C, Zhang G, et al. Eleven healthy live births: a result of simultaneous preimplantation genetic testing of alpha- and beta-double thalassemia and aneuploidy screening. J Assist Reprod Genet. 2020;37:549–557. doi: 10.1007/s10815-020-01732-7.
12. Chen D, Shen X, Xu Y, Ding C, Ye Q, Zhong Y, et al. Successful four-factor preimplantation genetic testing: alpha- and beta-thalassemia, human leukocyte antigen typing, and aneuploidy screening. Syst Biol Reprod Med. 2021:1–9.
13. Ardui S, Ameur A, Vermeesch JR, Hestand MS. Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics. Nucleic Acids Res. 2018;46:2159–2168. doi: 10.1093/nar/gky066.
14. Wenger AM, Peluso P, Rowell WJ, Chang PC, Hall RJ, Concepcion GT, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37:1155–1162. doi: 10.1038/s41587-019-0217-9.
15. Wilbe M, Gudmundsson S, Johansson J, Ameur A, Stattin EL, Anneren G, et al. A novel approach using long-read sequencing and ddPCR to investigate gonadal mosaicism and estimate recurrence risk in two families with developmental disorders. Prenat Diagn. 2017;37:1146–1154. doi: 10.1002/pd.5156.
16. Pinard R, de Winter A, Sarkis GJ, Gerstein MB, Tartaro KR, Plant RN, et al. Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing. BMC Genomics. 2006;7:216. doi: 10.1186/1471-2164-7-216.
17. Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 2020;21:30. doi: 10.1186/s13059-020-1935-5.
18. Sedlazeck FJ, Lee H, Darby CA, Schatz MC. Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nat Rev Genet. 2018;19:329–346. doi: 10.1038/s41576-018-0003-4.

Associated Data

Supplementary Materials

ESM 1 (PDF, 115.4 KB)
