PLOS ONE. 2021 Apr 15;16(4):e0249695. doi: 10.1371/journal.pone.0249695

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with Single Circulating Trophoblast (SCT) as one form of cell-based NIPT

Xinming Zhuo 1,*, Qun Wang 1, Liesbeth Vossaert 1, Roseen Salman 1, Adriel Kim 2, Ignatia Van den Veyver 1,3, Amy Breman 4, Arthur Beaudet 1,*
Editor: Osman El-Maarri 5
PMCID: PMC8049273  PMID: 33857205

Abstract

A major challenge for cell-based non-invasive prenatal testing (NIPT) is to distinguish individual presumptive fetal cells from maternal cells in female pregnancies. We have sought a rapid, robust, versatile, and low-cost next-generation sequencing method to facilitate this process. Toward this goal, single isolated cells underwent whole genome amplification prior to genotyping. Multiple highly polymorphic genomic regions (including HLA-A and HLA-B) with 10–20 very informative single nucleotide polymorphisms (SNPs) within a 200 bp interval were amplified with a modified method based on other publications. To enhance the power of cell identification, approximately 40 Human Identification SNP (Applied Biosystems) test amplicons were also utilized. Using SNP results to compare to sex chromosome data from NGS as a reliable standard, the true positive rate for genotyping was 83.4%, true negative 6.6%, false positive 3.3%, and false negative 6.6%. These results would not be sufficient for clinical diagnosis, but they demonstrate the general validity of the approach and suggest that deeper genotyping of single cells could be completely reliable. A paternal DNA sample is not required using this method. The assay also successfully detected pathogenic variants causing Tay-Sachs disease, cystic fibrosis, and hemoglobinopathies in single lymphoblastoid cells, and disease-causing variants in three cell-based NIPT cases. This method could be applicable for any monogenic diagnosis.

Introduction

Since 2015, several influential professional societies, including the International Society for Prenatal Diagnosis (ISPD) and the American College of Obstetricians and Gynecologists (ACOG), have stated that noninvasive prenatal testing (NIPT) is an available screening option for all pregnant women. Current NIPT is based on analysis of cell-free fetal (cff) DNA, and it has become widely available since its introduction to clinical practice in 2011. In contrast, cell-based NIPT, which relies on the isolation of circulating fetal cells in maternal blood, has been a long-sought alternative to cell-free NIPT and is now approaching commercialization. Currently, the cell-free NIPT approach has the advantage of a faster turnaround time and lower cost. However, the accuracy of cell-free NIPT is impacted by the large amount of maternal DNA in plasma (more than 80% of all circulating DNA) and the highly fragmented nature of this genetic material. Thus, many professional societies recommend it only for detection of the common fetal aneuploidies [1, 2]. These drawbacks can be addressed by cell-based NIPT, but it is not yet available as a clinical test. Cell-free NIPT, trophoblast-based NIPT, and CVS can all detect placental mosaicism, while fetal nucleated red blood cell (fnRBC)-based NIPT, fetal blood sampling, and amniocentesis can help to clarify whether mosaicism involves the fetus or is confined to the placenta. Cell-based NIPT would potentially have a higher positive predictive value compared to cell-free NIPT, since the DNA source is purely fetal or placental in origin without any maternal contamination [3–5]. Limitations of cost and throughput would need to be overcome for cell-based NIPT to be a routine alternative. Recently, multiple groups have reported successful cases of cell-based NIPT via capturing trophoblast cells [3, 6, 7].

The critical step for cell-based NIPT is the recovery of rare fetal cells, such as trophoblasts. As described previously [3, 4], 30–40 mL of blood is collected at 10–16 weeks’ gestation, followed by density fractionation or magnetic activated cell sorting (MACS) with anti-trophoblast antibodies to enrich the nucleated cells. Then, the nucleated cells are immunostained to identify trophoblasts that are cytokeratin positive and leukocyte common antigen (CD45) negative. The stained cells are picked individually under fluorescence microscopy with an automatic instrument described previously [3, 4] and subjected to whole genome amplification (WGA), which allows downstream genotyping, and copy number analysis using array Comparative Genomic Hybridization (CGH) or next generation sequencing (NGS).

Genotyping is an essential step after isolating the putative fetal cells. Typically, a successful cell-based NIPT would isolate 5–10 cells per 30 mL maternal blood sample. Since the nucleated cell recovery is a complicated multiple-step procedure, and several antibodies are used, there is a chance of picking a maternal cell (~10% in our experience). Whole genome shotgun (WGS) sequencing at low coverage (5–10 million reads per cell) provides good copy number data, but it does not readily distinguish fetal and maternal cells if the fetus is female. Previously, we used short tandem repeat analysis, SNP arrays, or Y-chromosome targeted qPCR to confirm the fetal origin of single cells. However, there are various disadvantages to these approaches, such as inefficiency, ambiguity, high cost, or limited application.

In this work, we developed a fast, low-cost, and reliable genotyping assay with amplicon sequencing. We sequenced approximately 90 highly polymorphic SNPs within about 40 amplicons. Among these amplicons, four contain multiple common SNPs (see Materials and Methods), which allow for haplotyping of the WGA DNA product. Together, SNP typing and haplotyping allow for the effective differentiation of cells containing the fetal genome from cells of maternal origin in most cases. This genotyping uses a small aliquot of the WGA product and does not interfere with downstream analysis. The method could also be easily expanded to detect additional disease-associated variants, which would have clinical utility for pregnancies with increased risk for monogenic disorders.

Materials and methods

Sample collection and preparation

Blood samples were collected from pregnant women from multiple centers under a protocol approved by the Baylor College of Medicine or Columbia University Medical Center Institutional Review Boards utilizing written informed consent. Approximately 30 mL of blood was collected into anticoagulant EDTA Vacutainer tubes (BD). Fetal cells were enriched with methods described in Breman et al., 2016. Both cytokeratin (CK)-positive putative fetal cells, and CK-negative maternal white blood cells were picked from maternal blood using the CytePicker® equipment (Rarecyte). There were 154 usable blood samples from 152 pregnancies; two women had two samples collected during one pregnancy.

Overall strategy

The typical cell-based NIPT workflow yields 3–10 singlet or doublet cells per patient, individually captured in a PCR tube for downstream WGA. The WGA DNA products of those cells must be checked for quality and confirmed as nonmaternal cells before finalizing the interpretation. Thus, a fast, low-cost, and high-throughput genotyping assay is necessary. To meet this need, we designed a single-cell genotyping assay using a modified amplicon sequencing approach (Fig 1). The first step is conventional PCR of a pool of amplicons with bridging adaptors, which contain partial sequences of the Illumina i5 and i7 adaptors. The second step adds a dual index with a sequencing adaptor to the products of the first PCR. This concludes the library construction for the Illumina machine. DNA samples with different indexes were balanced and pooled for sequencing on the Illumina MiSeq. The sequencing result was demultiplexed with Illumina BaseSpace. The demultiplexed reads were mapped with conventional BWA-MEM. The mapped reads were used for SNP typing and amplicon haplotyping, the details of which are discussed below.

Fig 1. The workflow for amplicon-based genotyping.

Amplicon design

Three groups of amplicons were used to carry out amplicon-seq. The first group consisted of amplicons with multiple common SNPs (>5% prevalence) as suggested in Debeljak et al. [8], including regions in HLA-A, HLA-B, chromosome 7q11 and chromosome 11q22, which have good sequencing coverage in single-cell WGA; all of these amplicons have at least eight common SNPs. The information from these common SNPs can be used for effective haplotyping and identifying the origin of isolated cells. The second group, which consisted of 37 amplicons, contained common SNPs selected from the Human Identification panel (ABI), which covers most chromosomes, including the Y chromosome. They were selected according to sequencing coverage in the single-cell WGA product, which has a high tendency for dropout. The third group consisted of amplicons designed to detect single nucleotide variants associated with certain inherited disease genes of interest, including hemoglobin subunit beta (HBB), hexosaminidase A (HEXA), and cystic fibrosis transmembrane conductance regulator (CFTR) (Table 1). For studies of trophoblasts from specific at-risk cases, we also prepared amplicons for DHCR7 and RAPSN (Table 1). The primers of these amplicons were prepared with adaptors compatible with Illumina TruSeq HT i5 and i7 adaptors.

Table 1. List of amplicons.

Amplicon_ID Chr Amplicon_Start Amplicon_Stop Size Annotation
rs1490413 chr1 4367241 4367415 175
rs4847034 chr1 105717572 105717689 118
rs3780962 chr10 17193291 17193386 96
rs964681 chr10 132698373 132698467 95
rs1498553 chr11 5708942 5709103 162
rs901398 chr11 11096160 11096278 119
rs2269355 chr12 6945833 6946005 173
rs4530059 chr14 104769098 104769197 100
rs2016276 chr15 24571747 24571845 99
rs2342747 chr16 5868655 5868769 115
rs2292972 chr17 80765753 80765827 75
rs938283 chr17 77468418 77468592 175
rs9905977 chr17 2919336 2919452 117
rs1024116 chr18 75432299 75432467 169
rs576261 chr19 39559774 39559848 75
rs12997453 chr2 182413125 182413299 175
rs1005533 chr20 39487029 39487201 173
rs445251 chr20 15124851 15125009 159
rs221956 chr21 43606946 43607048 103
rs2830795 chr21 28608067 28608212 146
rs733164 chr22 27816739 27816833 95
rs1355366 chr3 190806053 190806219 167
rs4364205 chr3 32417580 32417720 141
rs6444724 chr3 193207331 193207425 95
rs1979255 chr4 190318032 190318131 100
rs159606 chr5 17374818 17374977 160
rs338882 chr5 178690682 178690774 93
rs7704770 chr5 159487871 159488033 163
rs13218440 chr6 12059906 12060004 99
rs6955448 chr7 4310289 4310423 135
rs1360288 chr9 128967996 128968115 120
rs1463729 chr9 126881396 126881493 98
rs7041158 chr9 27985851 27986020 170
P256 chrY 8685171 8685289 119
rs17250845 chrY 8418867 8418960 94
rs35284970 chrY 2734829 2734921 93
rs3911 chrY 21733328 21733502 175
HLA-A chr6 2911156 2911140 245 Haplotype
HLA-B chr6 31319491 31319646 156 Haplotype
Chr7q11 chr7 64895160 64895374 215 Haplotype
Chr11q22 chr11 99491336 99491527 192 Haplotype
rs334&rs33930165 chr11 5226980 5227053 74 HBB
rs11393960 chr7 117559481 117559645 165 CFTR
rs75527207&rs74597325 chr7 117587771 117587870 100 CFTR
rs121907954 chr15 72350490 72350617 128 HEXA
rs387906309 chr15 72346528 72346691 164 HEXA
rs147324677 chr15 72346182 72346293 112 HEXA
rs138659167 chr11 71146795 71146921 127 DHCR7
rs104894299&rs761584017 chr11 47469333 47469720 388 RAPSN

The coordinates in this table include the first and last base of the amplicon so that subtraction of one number from the other gives a number one less than the product size.

Library construction for Illumina

WGA was performed using the PicoPLEX kits (Rubicon/Takara) or, in some cases, the Ampli1 WGA kit (Silicon Biosystems), according to the manufacturer’s protocol. Briefly, 100 ng of WGA DNA product was used for amplicon-seq library construction with a two-step PCR. The first step included 20–25 cycles of PCR with NEB Q5HS to amplify the amplicons of interest and add the designated adaptors. The second step used the adaptor sequences added in the first step as primer binding sites to add Illumina i5 and i7 dual-index adaptors with 10–15 cycles of PCR. The two-step PCR products were purified with the standard AMPure protocol (Beckman) for 100–300 bp products. PCR products were visualized by gel electrophoresis followed by the Bioanalyzer (Agilent) to check the quality and then quantified using a KAPA Library Quantification Kit (Kapa Biosystems).

Sequencing with MiSeq

The barcoded DNA libraries were diluted to 2 nM and pooled for denaturing according to the Illumina protocol. The denatured library, diluted to 8–10 pM, was mixed with 5–10% PhiX control. The mixed library was loaded onto the Illumina MiSeq with 150-cycle v3 kits (Illumina) and sequenced with 2x76 reads and dual index.

Demultiplexing, alignment and variant calling

Sequencing results were demultiplexed by Illumina BaseSpace. The reads were mapped and processed with a shell script (S3 File). In summary, the fastq.gz raw files were aligned with customized reference files (S1 File) by BWA-MEM [9]. Samtools [10, 11] and Bam-readcount followed by a customized R script (S4 File) were used for calling variants within selected intervals (S2 File). The cutoff for calling a variant is at least ten reads of the less frequent allele and 5% of all reads.
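The calling cutoff described above can be illustrated with a minimal Python sketch. It is not the authors' R script (S4 File); the function name `call_alleles` and the input format (a dictionary of per-base read counts at one SNP site) are assumptions made for illustration, and the sketch assumes the same two thresholds are applied to every allele at the site.

```python
from typing import Dict, List

MIN_ALLELE_READS = 10       # at least ten reads supporting the allele
MIN_ALLELE_FRACTION = 0.05  # and at least 5% of all reads at the site


def call_alleles(counts: Dict[str, int]) -> List[str]:
    """Return up to two alleles that pass both cutoffs at one SNP site.

    `counts` maps a base (e.g. "A") to its read count. This input format is
    hypothetical; the published pipeline derives counts from bam-readcount.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    # Rank alleles from most to least frequent.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    called = [
        base
        for base, n in ranked
        if n >= MIN_ALLELE_READS and n / total >= MIN_ALLELE_FRACTION
    ]
    # A diploid site has at most two alleles; keep the two best supported.
    return sorted(called[:2])


if __name__ == "__main__":
    print(call_alleles({"A": 180, "G": 20}))  # both alleles pass
    print(call_alleles({"A": 195, "G": 5}))   # minor allele has too few reads
```

Under these assumptions, a minor allele with 12 of 412 reads would pass the read-count cutoff but fail the 5% fraction cutoff, so the site would be called homozygous.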

Variants from each sample were summarized and compared with their paired control with an R script (S6 File). Typically, a cell in which at least 2 SNPs, and at least 6% of comparable SNPs, differ from its maternal gDNA control is scored as a fetal cell. Otherwise, it is classified as an uninformative cell. Throughout this manuscript, a cell or a SNP is referred to as informative if the putative fetal cell has an allele not carried by the mother (e.g., mother is AA and the putative fetal cell is AB or _B). An uninformative cell has no alleles that are not carried by the mother and may be a maternal cell or a fetal cell with inadequate genotyping data. This scoring may underestimate the number of fetal cells, especially when one but not two SNPs support fetal origin.
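The scoring rule can be sketched in Python as follows. This is an illustration under stated assumptions rather than the S6 File script: genotypes are represented as sets of called alleles keyed by a SNP identifier, a SNP counts as comparable when both the cell and the maternal gDNA have a call, and a SNP counts as informative when the cell carries an allele absent from the mother. The function name `score_cell` is hypothetical.

```python
from typing import Dict, Set, Tuple

Genotype = Set[str]  # e.g. {"A"} for AA or A_, {"A", "G"} for AG


def score_cell(
    cell: Dict[str, Genotype],
    mother: Dict[str, Genotype],
    min_snps: int = 2,
    min_fraction: float = 0.06,
) -> Tuple[str, int, int]:
    """Classify one cell as 'fetal' or 'uninformative'.

    Returns (label, informative SNP count, comparable SNP count).
    """
    # Comparable: a call exists in both the cell and the maternal gDNA.
    comparable = [s for s, alleles in cell.items() if alleles and mother.get(s)]
    # Informative: the cell carries an allele the mother does not have.
    informative = [s for s in comparable if cell[s] - mother[s]]
    n_cmp, n_inf = len(comparable), len(informative)
    is_fetal = n_cmp > 0 and n_inf >= min_snps and n_inf / n_cmp >= min_fraction
    return ("fetal" if is_fetal else "uninformative", n_inf, n_cmp)


if __name__ == "__main__":
    mother = {f"snp{i}": {"A"} for i in range(20)}
    cell = dict(mother)
    cell["snp0"] = {"A", "G"}  # heterozygous with a non-maternal allele
    cell["snp1"] = {"G"}       # non-maternal allele, maternal allele dropped out
    print(score_cell(cell, mother))
```

Note that a cell that merely lacks a maternal allele (for example, mother AG and cell A_) is not counted as informative here, consistent with the definition in the text, because allele dropout alone could produce that pattern.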

Haplotype calling

Using the Illumina 2x76 read length necessitated performing stitch overlap for read 1 and read 2 with PEAR (Paired-end read merger) [12] for amplicons shorter than 150 bp. For amplicons larger than 150 bp, we stitched non-overlapping reads with a 15-N padding sequence. Those stitched reads were mapped with BWA-MEM with modified parameters that allow a bigger unmatched gap inside a mapped read. The mapped results were processed with an R script (S5 File) used in the main shell script to extract SNPs and reconstitute a new sequence with CIGAR information. The new concise sequences were tabulated and grouped with sequence similarity according to their Levenshtein distance. We assigned a haplotype to each major group of a concise sequence. Pair-wise haplotype comparison between the maternal gDNA sample and the putative fetal cells were performed with another R script (S7 File). All scripts are hosted and maintained on https://github.com/xmzhuo/NIPT_genotyping.
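As a rough illustration of the padding step for amplicons longer than the combined read length, the following Python sketch joins a non-overlapping read pair with a run of 15 N bases. It assumes that read 2 comes from the opposite strand and is therefore reverse-complemented before joining, which is the usual paired-end convention but is not spelled out in the text; the function names are hypothetical, and the published pipeline uses its own scripts for this step.

```python
PAD = "N" * 15  # 15-N padding sequence, as described in the text

# Translation table for complementing bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def stitch_with_padding(read1: str, read2: str, pad: str = PAD) -> str:
    """Join a non-overlapping read pair into one sequence on the read 1 strand.

    Assumption: read 2 is sequenced from the opposite strand, so it is
    reverse-complemented before being appended after the padding.
    """
    return read1 + pad + reverse_complement(read2)


if __name__ == "__main__":
    print(stitch_with_padding("ACGT", "AAAC"))
```

The padded read can then be aligned with gap-tolerant settings so that the N run absorbs the unsequenced middle of the amplicon.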

Results

Target region coverage

We compared the coverage of our amplicons for SNPs with gDNA and NIPT cell WGA products (Fig 2). For gDNA, most of the samples have very high coverage, which is reflected in the distribution of samples with many scorable SNPs. The WGA product of maternal white blood cells (WBCs) shows fewer scorable SNPs than gDNA, which is likely the result of starting with a single diploid genome target in combination with fixation, staining, and amplification during WGA.

Fig 2. Comparison of the coverage of several genomic DNAs and NIPT single cells.

The count of samples with various scorable SNPs was normalized to the total number of samples of each group (gDNA n = 20, WGA n = 35). The scorable SNP cutoff was at least 5% minor allele frequency (MAF) and ten reads. Data include NIPT case numbers 946, 977, 982, 983, 984, 988, 989, 990, 991, 992, 993, 996, and 998 (clinical data in S8 File).

SNP typing of NIPT WGA products

A typical process of SNP typing after BWA-MEM aligner mapping includes variant calling with Samtools and retrieving the read depth with bam-readcount (Fig 3). This step will produce a table containing variant call information including allele fraction and read depth of all SNPs of interest. Information on indels was masked to avoid confusion in later steps. Then, we performed the pair-wise comparison of WGA products with the maternal gDNA. The script will calculate how many SNPs are different between two DNA samples. In an example case with two cells (Fig 3), three SNPs show a difference between one of the cells (middle panel) and maternal gDNA, which suggests that it is likely to be a fetal cell. The other cell shows identical calls with maternal gDNA, which suggests that it either is a maternal cell or a noninformative fetal cell.

Fig 3. The workflow and an example of SNP typing.

Reads for the variant allele are colored green while reads for the normal allele are colored grey and summed in red. The mother is homozygous for all three SNPs, while cell A is heterozygous for all three SNPs in each case having an allele that the mother does not have. Cell B is homozygous for all SNPs indistinguishable from the mother and is interpreted as maternal or noninformative (fetal with allele drop out for all three SNPs). Data from NIPT case number 1000 (clinical data in S8 File).

The results for 152 cases can be divided into three groups as shown in Table 2. First, there were 7/152 (4.6%) cases where none of the cells passing WGA could be proven to be fetal; second, there were 42/152 (27.6%) cases with only one cell scored as fetal; and third, there were 103/152 (67.8%) cases with two or more cells scored as fetal as defined in Materials and methods. At the single cell level, cells that pass WGA can be scored in only two ways: 1) uninformative, meaning no or inadequate evidence of fetal status, or 2) adequate evidence for fetal status. Some of the “uninformative” cells were certainly maternal, and we cannot distinguish an uninformative fetal cell from a maternal cell. Deeper genotyping would make healthy uninformative fetal cells informative, but some fetal cells are known to be apoptotic with degraded DNA and may or may not pass WGA. False positives and false negatives at a single cell level are discussed below. This method is a work in progress, and we consider one fetal cell as partial success and two or more fetal cells as success from a clinical perspective. Improved cell recovery and improved genotyping would be needed to achieve an optimal test. The relative roles of failure to isolate and amplify cells, uninformative genotyping, and suboptimal numbers of fetal cells can be calculated from Table 2.

Table 2. Tabulation of cases according to outcome of genotyping.

Condition 2 SNP + 10% 2 SNP + 8% 2 SNP + 6%
None of the available cells with adequate WGA were proven fetal. 13 (8.6%) 8 (5.3%) 7 (4.6%)
Only one fetal cell identified. 46 (30.3%) 44 (28.9%) 42 (27.6%)
Two or more fetal cells identified. 93 (61.2%) 100 (65.8%) 103 (67.8%)
Total cases/pregnancies 152

In Table 2, we examined what percent of cases had one or more or two or more cells scored as fetal. Individual cells were scored as fetal if two or more SNPs had at least 10% reads for an allele that was not present in the mother. Cases were then subdivided into those where the informative SNPs indicating fetal status were at least 10%, 8%, or 6% of the scorable SNPs (there were no additional cases where the informative SNPs were less than 6% of the scorable SNPs). There were 60 cells with one informative SNP, suggesting that requiring two SNPs may undercount fetal cells. It is important to distinguish the percent of cases (67.8%) that had two or more fetal cells (103/152) from the percent of cells (78.7%) that had two or more SNPs indicating fetal status (408/518).

If one assumes that the NGS data are 100% reliable for sex based on X and Y data, which we believe is the case, the sensitivity and specificity of the genotyping can be estimated from normal male singleton pregnancies as shown in Table 3. Definitions of the other contingencies are included in the same table. Identifying a fetal cell as fetal by genotyping and confirming that it is male based on NGS is a true positive. The results vary depending on the cutoff for scoring a SNP allele as present. If we accept the 2 SNP + 6% cutoff, based on normal male singleton pregnancies true positives accounted for 82.1% of cells, true negatives 6.6%, false positives 3.3%, and false negatives 7.9%. This is equal to a sensitivity [True Positive Rate = TP/(TP+FN)] of 91.2% (124/(124+12)) and a precision [Positive Predictive Value = TP/(TP+FP)] of 96.1% (124/(124+5)). These results would not be sufficient for clinical diagnosis, but they demonstrate the general validity of the approach and suggest that deeper genotyping of single cells could be completely reliable.

Table 3. Sensitivity and specificity for genotyping.

Normal male singleton
True positive True negative False positive False negative
Cells scored as fetal by genotype and male by NGS Cells scored as not fetal by genotype and female by NGS Cells scored as fetal by genotype but female (maternal) by NGS Cells scored as not fetal by genotype but male by NGS
2 SNP 126 10 5 10
2 SNP + 6% 124 10 5 12
2 SNP + 8% 117 10 5 19
2 SNP + 10% 107 10 5 29
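The percentages quoted in the text can be recomputed from the counts in Table 3. The short Python sketch below does this for each cutoff; the counts are copied from the table, while the dictionary layout and function name are only illustrative.

```python
from typing import Dict, Tuple

# (true positive, true negative, false positive, false negative) from Table 3
TABLE3: Dict[str, Tuple[int, int, int, int]] = {
    "2 SNP": (126, 10, 5, 10),
    "2 SNP + 6%": (124, 10, 5, 12),
    "2 SNP + 8%": (117, 10, 5, 19),
    "2 SNP + 10%": (107, 10, 5, 29),
}


def metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Return outcome shares, sensitivity, and precision as percentages."""
    total = tp + tn + fp + fn
    return {
        "tp_pct": 100 * tp / total,
        "tn_pct": 100 * tn / total,
        "fp_pct": 100 * fp / total,
        "fn_pct": 100 * fn / total,
        "sensitivity": 100 * tp / (tp + fn),  # TP / (TP + FN)
        "precision": 100 * tp / (tp + fp),    # TP / (TP + FP)
    }


if __name__ == "__main__":
    for cutoff, counts in TABLE3.items():
        row = {name: round(value, 1) for name, value in metrics(*counts).items()}
        print(cutoff, row)
```

Each row of Table 3 sums to 151 cells. The "2 SNP + 6%" row gives the figures quoted in the Results (82.1%, 6.6%, 3.3%, 7.9%; sensitivity 91.2%; precision 96.1%), and the "2 SNP" row gives the figures quoted in the Abstract (83.4%, 6.6%, 3.3%, 6.6%).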

In NIPT #1000 (Fig 4), we isolated eight putative fetal cells from the mother’s blood for WGA. Cells G78, G212, G227, and G232 all had at least five SNPs that differed from the maternal gDNA; thus, they were confirmed to be fetal in origin. Three of the remaining cells (G79, G113 and G320) had 0–1 SNPs that differed from the maternal gDNA, and these were considered uninformative and possibly white blood cells accidentally isolated from the mother’s blood. Cell G106 had only two SNP differences among fewer than 20 informative SNPs and was considered a low-quality, uninformative sample.

Fig 4. Example of SNP typing result of one NIPT case.

The X-axis indicates the number of variant sites passing coverage cutoff. The Y-axis is the number of informative SNPs in a cell. Data from NIPT case number 1000.

Haplotyping of NIPT WGA products

The haplotyping of NIPT WGA products potentially has higher power than SNP typing for identifying fetal cells. First, since the WGA product typically goes through 14–16 cycles of amplification from trace amounts of input DNA, there is a small chance of introducing new mutations, which would affect the precision of SNP typing at low read depth. For example, some of the cells from case #1000 (Fig 4) had one low-depth SNP difference from the maternal gDNA. To address this issue, we developed a haplotyping approach for multiple highly polymorphic regions, each of which contains multiple very common SNPs within a 200 bp amplicon. Thus, we can decrease the impact of random mutations introduced during the WGA process on the final interpretation (i.e., one nucleotide change is less likely to change the classification of a major haplotype group, which is comparable to the HLA typing approach). Mutations introducing a random change are not rare, but mutations switching from one allele at a SNP to the other polymorphic allele are much rarer. Second, the haplotypes of an amplicon can be treated as a permutation of a given number of SNPs, which theoretically generates many more haplotypes than SNP types and has higher power for differentiating two cells. Third, we can estimate the point at which a sequence artifact arose based on the fraction of each minor haplotype group in the total reads for a given amplicon. For example, a high fraction indicates that a variant preexisted in the cell, a medium fraction indicates that the variant arose during the WGA step, and a low fraction is consistent with an artifact from the final step of amplicon-seq.

We performed haplotyping with the following steps (Fig 5). After regular alignment with BWA-MEM with the default setting, we joined Read 1 and Read 2 with PEAR [12]. Overlapping Read 1 and Read 2 were merged. If the amplicon was longer than the two reads joined together, we merged the two reads and padded the gap with a tandem repeat of N. The merged reads were remapped with BWA-MEM with a lenient setting to tolerate a larger gap. The remapped reads were processed with an R script to extract the selected SNPs, and each read was reconstituted as a concise sequence while preserving the read ID. The concise reads were then tallied and ranked according to frequency (typically, only the top 10 read types were kept, which usually account for more than 99.99% of all reads). The Levenshtein distances were calculated for these reads, which typically resulted in only one or two major groups representing the haplotypes of the amplicon. To compare the haplotypes of two or more samples, all the top reads of each sample were pooled together, and their distances were calculated to determine whether the samples share the same read group (haplotype). For example, suppose the maternal gDNA carried a mock haplotype 1 (TA) and haplotype 2 (GC). A positive fetal cell should carry at least one new haplotype 3 (GA), which would appear as TA/GA or GA/GA (the latter resulting from dropout of haplotype 1 or 2) (Fig 5).
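The grouping step can be sketched in Python as below. This is an illustration, not the authors' R scripts (S5 and S7 Files): it uses a simple greedy grouping in which each concise read type, taken from most to least frequent, joins the first group whose representative is within a chosen Levenshtein distance. The threshold `max_dist=1` and the function names are assumptions for illustration, and the six-letter concise sequences in the example are made up.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def group_reads(counts: Dict[str, int], max_dist: int = 1) -> List[Tuple[str, int]]:
    """Greedily group concise read types into haplotype groups.

    Returns (representative sequence, pooled read count) per group.
    """
    groups: List[List] = []
    for seq, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        for group in groups:
            if levenshtein(seq, group[0]) <= max_dist:
                group[1] += n  # treat as an artifact of the representative
                break
        else:
            groups.append([seq, n])
    return [(rep, n) for rep, n in groups]


def new_haplotypes(cell: Dict[str, int], mother: Dict[str, int],
                   max_dist: int = 1) -> List[str]:
    """Cell haplotype groups that match no maternal haplotype group."""
    maternal = [rep for rep, _ in group_reads(mother, max_dist)]
    return [rep for rep, _ in group_reads(cell, max_dist)
            if all(levenshtein(rep, m) > max_dist for m in maternal)]


if __name__ == "__main__":
    mother = {"TATATA": 500, "GCGCGC": 480, "TATATC": 6}
    cell = {"TATATA": 300, "GCTATA": 280}
    print(group_reads(mother))
    print(new_haplotypes(cell, mother))
```

In this made-up example the rare read type `TATATC` is absorbed into the `TATATA` group, and the cell's `GCTATA` haplotype is reported as new relative to the mother. A real analysis would also need a minimum read fraction for a group to count as a major haplotype, which is omitted here.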

Fig 5. The workflow for haplotyping.

The paired aligned reads were joined with PEAR and padded with BBMap short read aligner when a gap exists. The newly joined reads were remapped with BWA-MEM with a modified configuration. The highly polymorphic sites were extracted for constructing concise haplotypes. To demonstrate the workflow, we present example short reads with TA and GC haplotypes.

The haplotyping approach can effectively differentiate a candidate cell from maternal gDNA. In the case shown in Fig 6, we have the gDNA from both parents. As described previously, we extracted all 28 SNP sites in the HLA-A amplicon and reconstructed a concise 28 nt sequence for each read. In this case, the top four most frequent read types of potential fetal cells can be grouped into two major groups, with a Levenshtein distance of more than 2 (S1 Fig). The intra-group difference has a distance of less than 0.5, which suggests a difference of only one nucleotide. The difference likely results from artifacts introduced during extensive amplification (WGA then PCR). The same condition was observed in maternal gDNA and paternal gDNA as well. We observed the inheritance pattern of haplotypes when all read types from maternal, paternal, and fetal DNA were plotted together. One fetal haplotype matched with the mother and the other matched with the father. From these haplotype groupings, we concluded that this is a true fetal cell.

Fig 6. Haplotype analysis of a fetal cell compared to both parents.

Each vertical column represents a particular SNP in the haplotype. The left-most four SNPs are informative as the fetal cell has an allele that the mother does not have and indicate that the putative fetal cell is indeed fetal. The 24 SNPs to the right are not informative for the fetal cell as they do not have an allele that the mother does not have. Data from NIPT case number 1000.

We tested the performance of haplotyping in four amplicons with matched gDNA, WBC, and fetal cells (S2 Fig). For gDNA, all four amplicons performed nicely to distinguish one from the other. For WBC and fetal cells, the performance was not as good, largely owing to the dropout events. However, when four amplicons were combined, we can still distinguish about 50% of all the cells (from selected cases with both fetal cells and WBCs) (S3A Fig). With combined power of both SNP typing and haplotyping, we can increase the solving rate of differentiating a WBC from around 60% to more than 70% (from cases independent of fetal cell existence) (S3B Fig).

Genotyping for monogenic disease-causing variants

We also wished to use this method to genotype for monogenic disease-causing variants. Our cell-based NIPT provides pure fetal DNA, which allows us to look at monogenic disease-causing variants. We first evaluated the ability to detect disease variants in single cultured lymphoblasts of known genotypes. Here, we developed amplicons that contain HEXA c.805C>T, HBB c.19T>A, HBB c.20C>T, and CFTR c.1521_1523delTCT. Corresponding cells carrying these variants were obtained from the Coriell Institute, and single lymphoblasts were picked from tissue culture and processed for WGA. The cells were isolated and genotyped by the methods described herein. We successfully detected the known variants in the WGA products of cells carrying these changes (Fig 7). The allele dropout rate was 15% and 8% for unfixed and fixed lymphoblasts, respectively. The higher rate of allele dropout in fetal cells is presumably caused by some combination of DNA degradation during time in the maternal circulation, including apoptosis, and by cell isolation, including fixation and permeabilization steps.

Fig 7. Testing for sequence variants in single lymphoblasts.

From left to right, Tay-Sachs, HEXA c.805C>T het; HBB c.19T>A and c.20C>T compound het; CFTR c.1521_1523delTCT het. The reads are shown in the IGV browser.

Among all of our cell-based NIPT samples studied, there were three families with known pathogenic variants. One was a family where the mother was affected with sickle cell anemia, and the fetus was expected to be heterozygous based on parental information. No amniocentesis or CVS was performed. In a second family, both parents were carriers of the same known pathogenic variant in DHCR7. By amniocentesis, the fetus was heterozygous for the variant carried by both parents. A third family had a previous affected child with congenital myasthenic syndrome type 11 caused by biallelic, compound heterozygous pathogenic variants in RAPSN. By CVS, the fetus carried the paternal but not the maternal pathogenic variant. For the sickle cell family, nine fetal cells were recovered and four were genotyped. Two cells, G54 (Fig 8A) and G532 (Alternative Allele Frequency (AAF) 17%, data not shown), were heterozygous for the pathogenic variant, while one cell (G474) had dropout for the normal allele and one cell (G309) did not pass the coverage cutoff (data not shown). The cell-based NIPT data scored the fetus as heterozygous. For the DHCR7 family, seven fetal cells were recovered and four were genotyped. One cell (G1286) was heterozygous for the pathogenic variant (AAF 80%) (Fig 8B), while two cells (G123 and G4584) had dropout for the normal allele and one cell (G1074) had dropout for the variant allele. Again, the cell-based NIPT data scored the fetus as heterozygous. For the RAPSN family, four fetal cells were recovered and four were genotyped. In Fig 8C, the pathogenic variant in each parent is shown. The G540 cell in Fig 8C shows absence of the maternal pathogenic variant but presence of the paternal pathogenic variant. All four cells showed absence of the maternal pathogenic variant. For the paternal pathogenic variant, two cells (G540 and G2847 (AAF 87%)) were heterozygous, while one cell (G1517) had dropout for the normal allele and one cell (G360) had dropout for the paternal pathogenic variant. There is a small probability that the fetus carries the maternal pathogenic variant but that it dropped out in all four cells; more likely, the fetus does not carry the maternal pathogenic variant, in agreement with the CVS data.
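The reasoning about dropout in several cells can be made concrete with a small calculation. The Python sketch below assumes that dropout of a given allele occurs independently in each cell with the same probability, that a fetus without the variant never shows it, and that the prior probability of inheriting a heterozygous maternal variant is 0.5. The per-cell dropout rate for fetal cells is not reported in the text, so the values used here are purely hypothetical.

```python
def posterior_carrier(ado_rate: float, n_cells: int, prior: float = 0.5) -> float:
    """Probability that the fetus carries a variant that was seen in no cell.

    ado_rate: assumed per-cell probability that a present allele drops out
              (hypothetical; not a value reported in the paper).
    n_cells:  number of genotyped cells, none of which showed the variant.
    prior:    prior probability of inheritance (0.5 for a heterozygous parent).
    """
    if not 0.0 <= ado_rate <= 1.0:
        raise ValueError("ado_rate must be between 0 and 1")
    # P(variant missed in every cell | fetus is a carrier), assuming independence
    miss_if_carrier = ado_rate ** n_cells
    # P(variant missed in every cell | fetus is not a carrier) is taken as 1
    evidence = prior * miss_if_carrier + (1.0 - prior)
    return prior * miss_if_carrier / evidence


if __name__ == "__main__":
    for rate in (0.25, 0.5):  # hypothetical dropout rates
        for n in (1, 2, 4):
            print(rate, n, round(posterior_carrier(rate, n), 4))
```

Under these assumptions the residual probability falls quickly as more cells are genotyped, which is why testing multiple cells strengthens a negative result.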

Fig 8. Genotyping for pathogenic variants in trophoblasts from three cases.

In panel A, the mother is affected and homozygous for the sickle cell anemia variant. Fetal trophoblast G54 is heterozygous for the variant (ClinGen Accession: CA125138). Reads for the mutant allele are colored green while reads for the normal allele are colored grey and summed in red. In panel B, the mother is heterozygous for a DHCR7 pathogenic variant that is also present in the father (CA090917). Fetal trophoblast G1286 is also heterozygous for the pathogenic variant, although there is biased over-representation of the variant allele. Reads for the mutant allele are colored blue while reads for the normal allele are colored grey and summed in brown. In panel C, the mother is heterozygous for the pathogenic N88K variant (CA199511) in the RAPSN gene, and the father is heterozygous for the V165M pathogenic variant (CA5976731). Fetal trophoblast G540 is heterozygous for the paternal V165M pathogenic variant but not for the maternal N88K pathogenic variant. Allele dropout for the N88K variant cannot be ruled out, and multiple cells must be tested to gain statistical evidence that the fetus has not inherited the N88K variant. All results agreed with data from amniocentesis or CVS. Data from NIPT case numbers 1180, 1492, and 1607.

Discussion

This single cell genotyping assay can provide an essential step for confirming the fetal origin of cells obtained from cell-based NIPT workflows. Through genotyping, we can reject cells that are of indeterminate or maternal origin and provide metrics for WGA DNA quality using multiple amplicons. Thus, researchers can focus on a smaller number of cells, which can then be used for more expensive downstream tests or analyses, such as low-depth NGS and microarray for CNV analysis [3, 4, 6]. We can add more amplicons to cover variants of interest for detecting recessive or dominant inherited diseases. The data demonstrate the principle that detailed genotyping of individual cells can distinguish fetal from maternal cells, but the data are limited, and new methods are needed to compare the genotype of individual cells with the genotype of the mother more extensively.

Although there is evidence that some cells from previous pregnancies can persist for decades, especially CD34+ cells [13], there is no evidence that trophoblasts can persist from previous pregnancies, and we expect based on the biology of these cells that they are unlikely to persist. A future method that provides deep genotyping of individual cells could distinguish same sex nonidentical twins, but the method described here would not be sufficient to accomplish such distinction reliably.

Although this assay is promising, there are still some limitations. First, the dropout rate for individual amplicons is significant, which reduces the power of cell identification and affects the detection of disease-causing variants. Second, although the cost is relatively low and the running time is short (<24 h), the assay still takes extra effort to complete and could increase the turnaround time of NIPT. Third, the multiple steps of PCR after WGA are prone to introducing errors that may cause ambiguity in SNP typing, although errors that switch one SNP allele to another, or a mutant genotype to wild-type or vice versa, are very rare.

There are multiple options for reducing the dropout rate to improve this assay. First, we can increase the size of the panel, so that additional amplicons compensate for dropout. Second, we can modify the amplicons to reduce their size or otherwise improve the amplification. Third, improved versions of WGA are being developed that can reduce allele dropout.
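The first option, a larger panel, can be illustrated with a binomial sketch. It assumes that each amplicon independently yields a usable call with probability 1 minus the dropout rate; the function name, the panel sizes, the dropout rate, and the required number of informative amplicons are hypothetical and are not taken from the study.

```python
from math import comb


def prob_at_least_k(n_amplicons: int, k: int, dropout_rate: float) -> float:
    """Probability that at least k of n_amplicons amplicons give a usable call.

    Simplifying assumption: every amplicon drops out independently with the
    same probability dropout_rate (a binomial model).
    """
    if n_amplicons < 0:
        raise ValueError("n_amplicons must be non-negative")
    if not 0.0 <= dropout_rate <= 1.0:
        raise ValueError("dropout_rate must be between 0 and 1")
    p = 1.0 - dropout_rate  # per-amplicon success probability
    k = max(k, 0)           # requiring fewer than zero amplicons is always met
    return sum(comb(n_amplicons, i) * p**i * (1.0 - p) ** (n_amplicons - i)
               for i in range(k, n_amplicons + 1))


if __name__ == "__main__":
    # Hypothetical scenario: 20 informative amplicons wanted, 50% dropout.
    for panel_size in (40, 60, 80):
        print(panel_size, round(prob_at_least_k(panel_size, 20, 0.5), 4))
```

In this model, enlarging the panel raises the chance of reaching a fixed number of informative amplicons even when the per-amplicon dropout rate is unchanged.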

Use of this method for genotyping fetuses at risk for specific pathogenic variants is potentially feasible. Although cell-free NIPT is relatively straightforward for genotyping a paternal pathogenic variant, determining the maternal contribution to the genotype requires more complex analysis [14–16]. Cell-free NIPT has been used to screen for de novo variants in a panel of genes [17]. Confirmation of de novo variants can be performed by specific reanalysis of maternal plasma if the mother is not mosaic for the variant. Single cell analysis of circulating trophoblasts can be used to determine the genotype of fetuses at risk for monogenic disorders, although allele dropout must be ruled out by analysis of multiple cells, or it can be addressed using karyomapping, as has been used in single cell preimplantation testing [18]. Improved methods for recovering fetal cells from the mother’s blood will hopefully reduce failures due to lack of cells, but cell-free NIPT in combination with haplotype analysis or digital PCR, amniocentesis, and CVS would be three alternative strategies if cell-based NIPT fails.

Our genotyping assay has the potential to be used in many clinical research applications. As described above, it can be used to identify a fetal cell in cell-based NIPT and to screen for known disease-causing variants. It also has the potential to distinguish between same-sex dizygotic twins. Furthermore, this technique could also be adapted for other single cell applications, such as circulating tumor cell analysis. A future improved version may reduce the amplicon dropout rate and increase the coverage at regions of interest.

Supporting information

S1 Fig. Grouping multiple amplicon haplotypes for a family trio.

We use HLA-A amplicon haplotypes from the samples presented in Fig 6 to demonstrate how to identify the fetal cell. Haplotype groups are shown for the mother (red), fetal cell (green), and father (black). The Y-axis of the bar graphs indicates the fraction of total reads for each DNA type in the different read groups. The tree cluster shows the distance between read groups, calculated as Levenshtein distance.

(TIF)
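The distance measure named in this legend can be illustrated with a minimal sketch. The function below is a textbook dynamic-programming Levenshtein distance, and `group_reads` is a hypothetical greedy grouping added only for illustration; neither is the authors' R script, and the sequences are made-up examples.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def group_reads(reads, max_distance):
    """Hypothetical greedy grouping: each read joins the first group whose
    first member is within max_distance, otherwise it starts a new group."""
    groups = []
    for read in reads:
        for group in groups:
            if levenshtein(read, group[0]) <= max_distance:
                group.append(read)
                break
        else:
            groups.append([read])
    return groups


if __name__ == "__main__":
    print(group_reads(["ACGT", "ACGA", "TTTT"], max_distance=1))
```

A small distance threshold keeps reads that differ only by sequencing or amplification errors in one group, while reads from a different haplotype fall into a separate group.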

S2 Fig. Using ROC-AUC to estimate the performance of haplotyping with various distance settings.

The Y-axis is the ROC-AUC distance to the diagonal line, from 0 to 1. The X-axis is the distance used for haplotype grouping. Three types of DNA were used for evaluation: fetal (blue), WBC (orange), and gDNA (grey).

(TIF)
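For readers unfamiliar with the metric, the area under the ROC curve can be computed directly from two sets of scores. This is a generic sketch of the standard pairwise definition, not the authors' evaluation script; the function name and the scores in the usage example are hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative.
    Ties count as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Made-up scores: higher means "more likely of non-maternal origin".
    print(roc_auc([0.9, 0.8, 0.4], [0.1, 0.2, 0.5]))
```

An AUC of 0.5 corresponds to the diagonal of the ROC plot (no discrimination), and an AUC of 1.0 corresponds to perfect separation of the two groups.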

S3 Fig. Evaluation of the performance of haplotyping at identifying a DNA with a non-maternal origin.

A. Performance of different haplotyping amplicons at detecting DNA of non-maternal origin. The Y-axis is the detection rate, which estimates the fraction of samples that can be differentiated with a particular amplicon. The X-axis indicates which amplicon was tested. Fetal cells, WBCs, and gDNA were tested from selected cases with both fetal cells and WBCs present. B. Improving the detection rate by combining SNP typing and haplotyping. WBCs from different cases (with or without fetal cells) were analyzed with the individual and combined approaches.

(TIF)

S1 File. Reference sequence fasta file.

The fasta file used for alignment.

(FA)

S2 File. SNVs reference BED.

The BED file gives the positions of the SNVs.

(BED)

S3 File. Linux shell script and parameters for alignment and analysis.

(SH)

S4 File. R script for genotype calling.

(R)

S5 File. R script for haplotype calling.

(R)

S6 File. R script to compare genotype of NIPT cells with parents.

(R)

S7 File. R script to compare haplotype of NIPT cells with parents.

(R)

S8 File. Table of selected clinical sample information.

(XLSX)

Acknowledgments

We thank the participating patients, study coordinators, and genetic counselors at Baylor College of Medicine, Texas Children’s Hospital, and Columbia University Medical Center for recruitment of samples.

Data Availability

All relevant data are within the paper and its Supporting Information files.

Funding Statement

This work was supported by internal institutional funds at Baylor College of Medicine awarded to AB and by the National Institutes of Health in the form of a grant awarded to IVdV (HD055651-12S1) at Baylor College of Medicine.

References

  • 1. Committee Opinion No. 640: Cell-Free DNA Screening For Fetal Aneuploidy. Obstet Gynecol. 2015;126(3):e31–7. doi: 10.1097/AOG.0000000000001051
  • 2. Benn P, Borrell A, Chiu RW, Cuckle H, Dugoff L, Faas B, et al. Position statement from the Chromosome Abnormality Screening Committee on behalf of the Board of the International Society for Prenatal Diagnosis. Prenat Diagn. 2015;35(8):725–34. doi: 10.1002/pd.4608
  • 3. Breman AM, Chow JC, U’Ren L, Normand EA, Qdaisat S, Zhao L, et al. Evidence for feasibility of fetal trophoblastic cell-based noninvasive prenatal testing. Prenat Diagn. 2016;36(11):1009–19. doi: 10.1002/pd.4924
  • 4. Vossaert L, Wang Q, Salman R, Zhuo X, Qu C, Henke D, et al. Reliable detection of subchromosomal deletions and duplications using cell-based noninvasive prenatal testing. Prenat Diagn. 2018;38(13):1069–78. Epub 2018/11/19. doi: 10.1002/pd.5377
  • 5. Vossaert L, Wang Q, Salman R, McCombs AK, Patel V, Qu C, et al. Validation Studies for Single Circulating Trophoblast Genetic Testing as a Form of Noninvasive Prenatal Diagnosis. Am J Hum Genet. 2019;105(6):1262–73. Epub 2019/11/27. doi: 10.1016/j.ajhg.2019.11.004
  • 6. Kolvraa S, Singh R, Normand EA, Qdaisat S, van den Veyver IB, Jackson L, et al. Genome-wide copy number analysis on DNA from fetal cells isolated from the blood of pregnant women. Prenat Diagn. 2016;36(12):1127–34. doi: 10.1002/pd.4948
  • 7. Vestergaard EM, Singh R, Schelde P, Hatt L, Ravn K, Christensen R, et al. On the road to replacing invasive testing with cell-based NIPT: Five clinical cases with aneuploidies, microduplication, unbalanced structural rearrangement, or mosaicism. Prenat Diagn. 2017;37(11):1120–4. doi: 10.1002/pd.5150
  • 8. Debeljak M, Freed DN, Welch JA, Haley L, Beierl K, Iglehart BS, et al. Haplotype counting by next-generation sequencing for ultrasensitive human DNA detection. J Mol Diagn. 2014;16(5):495–503. doi: 10.1016/j.jmoldx.2014.04.003
  • 9. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi: 10.1093/bioinformatics/btp324
  • 10. Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987–93. doi: 10.1093/bioinformatics/btr509
  • 11. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. doi: 10.1093/bioinformatics/btp352
  • 12. Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics. 2014;30(5):614–20. doi: 10.1093/bioinformatics/btt593
  • 13. Bianchi DW, Zickwolf GK, Weil GJ, Sylvester S, DeMaria MA. Male fetal progenitor cells persist in maternal blood for as long as 27 years postpartum. Proc Natl Acad Sci U S A. 1996;93(2):705–8. doi: 10.1073/pnas.93.2.705
  • 14. Chang MY, Ahn S, Kim MY, Han JH, Park HR, Seo HK, et al. One-step noninvasive prenatal testing (NIPT) for autosomal recessive homozygous point mutations using digital PCR. Sci Rep. 2018;8(1):2877. Epub 2018/02/13. doi: 10.1038/s41598-018-21236-w
  • 15. Hudecova I, Chiu RW. Non-invasive prenatal diagnosis of thalassemias using maternal plasma cell free DNA. Best Pract Res Clin Obstet Gynaecol. 2017;39:63–73. Epub 2016/10/26. doi: 10.1016/j.bpobgyn.2016.10.016
  • 16. Vermeulen C, Geeven G, de Wit E, Verstegen MJAM, Jansen RPM, van Kranenburg M, et al. Sensitive Monogenic Noninvasive Prenatal Diagnosis by Targeted Haplotyping. Am J Hum Genet. 2017;101(3):326–39. Epub 2017/08/24. doi: 10.1016/j.ajhg.2017.07.012
  • 17. Zhang J, Li J, Saucier JB, Feng Y, Jiang Y, Sinson J, et al. Non-invasive prenatal sequencing for multiple Mendelian monogenic disorders using circulating cell-free fetal DNA. Nat Med. 2019;25(3):439–47. doi: 10.1038/s41591-018-0334-x
  • 18. Handyside AH, Harton GL, Mariani B, Thornhill AR, Affara N, Shaw MA, et al. Karyomapping: a universal method for genome wide analysis of genetic disease based on mapping crossovers between parental haplotypes. J Med Genet. 2010;47(10):651–8. Epub 2009/10/25. doi: 10.1136/jmg.2009.069971

Decision Letter 0

Osman El-Maarri

6 Aug 2020

PONE-D-20-14518

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) prenatal diagnosis

PLOS ONE

Dear Dr. Zhuo,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

All reviewers expressed major concerns regarding the manuscript. It is required that you address all these concerns so that we may further consider this manuscript.

Please submit your revised manuscript by Sep 20 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Osman El-Maarri, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If the need for consent was waived by the ethics committee, please include this information.

3. We note that Figure(s) [1] in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

a.    You may seek permission from the original copyright holder of Figure(s) [1] to publish the content specifically under the CC BY 4.0 license.

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b.    If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

4. Thank you for including the following competing interests statement; "The Houston authors are faculty and staff at Baylor College of Medicine (BCM), which is a partial owner of a for profit diagnostic company, Baylor Genetics (BG); Houston authors also are employee of or have advisory or lab director roles at BG. ALB is founder and CEO of Luna Genetics, Inc."

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to  PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests).  If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

Reviewer #3: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear editor,

The paper “Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) prenatal diagnosis” by Zhuo et al. is well written and clearly describes a promising procedure to distinguish fetal from maternal cells. However, I have a few remarks that require clarification.

Major point

- There is no discussion on the presence of fetal cells still present from previous pregnancies. Fetal cells have been detected decades after a pregnancy. Can the method described also distinguish if a fetal cell has a different haplotype than other fetal cells? If so, this is a strength of the methods described. If not, this would be a limitation.

- In addition, what would be the effect of twin pregnancies?

- Extra discussion is needed on the effects of the analysis on downstream analyses. Does the procedure cause lower quality results in downstream analysis of selected cells, since extra WGA amplification is needed to perform this test?

Minor points,

- Line 999: Settings used for BWA-MEM. Does ‘conventional BWA-MEM’ mean all default settings?

- Table 1 is rather a list of amplicons containing SNP regions than a list of SNPs.

- In table 1: the start or stop of the HLA-A amplicon seems to be incorrect (Start>stop and not matching amplicon size).

- Lines 321 and 322, perhaps add the Protein accession numbers to the variants

- One method to infer haplotypes in cfDNA was not yet mentioned. https://pubmed.ncbi.nlm.nih.gov/28844486/

- Spelling mistake in fig 4: mother → mother and (informatic → informative?)

- The github code would benefit from some polishing, adding comments and removing commented out codes.

Reviewer #2: In this paper, the authors study the application of WGA followed by SNP and / or haplotype analysis in multiple amplicons to determine the fetal or maternal origin of presumably fetal cells, retrieved from the maternal blood. They do so by using different amplicons, i.e. amplicons from regions such as the HLA regions, with multiple SNPs per amplicon for haplotyping, and Human Identification SNP test amplicons, with one SNP per amplicon. They also studied the possibility of using these methods for genotyping for monogenic traits. Below my comments and questions:

1. My main comment is that, even though the authors do mention the drawbacks of their method, the paper is quite optimistic. The theory behind the methods they describe is solid, but the actual data are still quite preliminary. In their abstract the authors say that the method allowed reliable differentiation between fetal and maternal cells, in the introduction in line 75 they mention “in most cases”, but this was only in cases with sufficient information, which is certainly not 100%. Not all data is presented in a clear way, such as numbers of cells that are informative, but in how many cases? See my comments below. The methods that are described certainly have potential for this application, but need to be improved and the authors do acknowledge this in the discussion. In the discussion, the authors do mention multiple options for improvements, but how feasible are these? If feasible, why didn’t the authors already implement these, or at least show some data of improved versions, as proof-of-principle. The authors show data on fetal genotyping for monogenic traits, but in the discussion they mention themselves that a low failure rate is to be expected due to failure to isolate fetal cells and/or to ADO.

2. Throughout the paper, the authors use the term cell-based NIPT, whereas in the title they use the term single circulating trophoblast prenatal diagnosis. I would suggest to use the same terminology throughout the paper, and would suggest to also use the term cell-based NIPT in the title.

3. Line 40: analysis cell free = analysis of cell-free

4. In the introduction, the authors mention some drawbacks of cf-NIPT. One major drawback, of course, is the fact that cf-NIPT studies cf-DNA from the placenta and not from the fetus itself. This major drawback is not solved by the cell-based technique the authors use, as trophoblast cells are used for this method. The authors should comment on this. Furthermore, even though it is outside the scope of this paper, in the introduction the authors should also (briefly) comment on drawbacks of the cell-based NIPT, such as costs as compared to cf-NIPT and high-throughput possibilities.

5. With SNP typing, DNA from 156 blood samples were compared, with data from 518 cells: 357 cells were informative. Does being informative mean that the cells were fetal, or were there also maternal cells identified? If all fetal, were these cells originally also from 68.9% of the 156 samples, or were cells from some samples overrepresented and others underrepresented? In other words, could informative (fetal?) cells be retrieved from 68.9% of the 156 samples? (This refers to the comment that not all data support the conclusions, questions 1 and 3 of the review).

6. For haplotyping, how many cells from how many samples were tested? Again 156 samples? Also here, were the informative cells distributed evenly over the samples. In how many samples could fetal cells be identified based on haplotyping only? Is 50% of all cells the same as 50% of the cases? (Again, see questions 1 and 3 of the review)

7. The authors state they tested the haplotyping method with matched gDNA, WBC and fetal cells. In Line 278, they only mention WBC, not the fetal cells. What is the differentiating rate for the fetal cells? Please mention this in the text, not only in a supplementary figure, as this is the main subject of this paper. Moreover, according to Figure S3A, this is about 50% for both WBC and fetal cells, and this percentage is determined by one amplicon (HLA-B).

8. Figure S3A shows fetal cells with a detection rate of about 50%. In Figure S3B for HAP this seems to be more than 60%. What is the difference?

9. Lines 302-304: the authors state that 5 cells were genotyped: two were heterozygous and two had ADO. What happened to the fifth one?

10. In line 288, the authors state that in the cultured lymphoblasts, the ADO rate was 15% and 8%. When using true fetal cells from the three families, the ADO is very much higher, about 50%, as in each family some cells show ADO. How do the authors explain this difference?

11. From two of the three families more cells were retrieved than were used for genotyping. Why did the authors not genotype all available cells? Could not all cells be identified as true fetal? If so, that the identification of true fetal cells is lower than 60-70%. How were these cells identified as true fetal? If the other cells were maternal in origin, the chance of picking up a maternal cell is much higher than 10%, as stated in the introduction in line 65.

Reviewer #3: This paper describes the use of SNP genotyping by sequencing in order to identify circulating cells of fetal origin in the context of NIPT applications.

Although the subject matter and the approach are interesting, there are significant issues with the way some of the results are presented and especially with the conclusions reached by the authors.

Because of the major limitations of the proposed methods, highlighted by several results (e.g. the combined performance of SNPs genotyping and haplotyping that gives only a 70% rate of success for identifying fetal cells from WBC (line 278, fig S3 B), a high rate of allele dropout and amplification artefacts contributing to false-negatives and false positives (line 335-341)), a clear and detailed presentation of false positive/negative rates should be provided.

Unfortunately, these rates are not comprehensively and clearly presented in the paper. Table 2 is not clearly explained. For example, I'm puzzled by the apparently negative correlation between true positives (TPV) and % of SNP differences (diff SNP%) in the "Fetal cell" column; shouldn't it be the opposite? More information about the table headers, interpretation, etc. should be given. One of the main focuses of the paper should be to present, analyse and discuss these rates in detail.

This relatively poor performance of the genotyping approach to confirm fetal origin seems to be caused at least in part by the low amount/quality of starting DNA material since the detection rate for genomic DNA is much higher (Fig S3 A). This impact of starting material is also evident from the much lower number of scorable SNPs from WGA products as compared to genomic DNA shown on fig. 2.

These results thus point to serious technical challenges associated with WGA from single cells. Moreover, as the authors acknowledge at lines 335-341, both a high rate of allele dropout and amplification artefacts contribute to false negatives and false positives.

The allele dropout problem mentioned above also seriously affected the genotyping for monogenic diseases, with rates reaching 8-15% (line 288). Such a rate is clearly incompatible with any diagnostic application.

From these results, the realistic take home message is clearly one of skepticism towards the feasibility of using this kind of approach clinically in the short to mid term. However, although the authors acknowledge some of these limitations in the discussion, they nevertheless make statements that I consider misleading with respect to what the data show. For instance, "This single cell genotyping assay can provide an essential step for confirming the fetal origin of cells obtained from cell-based NIPT workflows. Through genotyping, we can reject cells that are of indeterminate or maternal origin and provide metrics for WGA DNA quality using multiple amplicons." (line 328-331). "...this assay is rapid and reliable..." (line 335). "Use of this method for genotyping fetuses at risk for specific monogenic mutations is feasible." (line 352-353).

More details should also be provided about the exact number of amplicons used rather than vague statements such as: “We sequenced approximately 90 highly polymorphic SNPs within about 40 amplicons.” (line 72). Likewise, in the methods section, three groups of amplicons are mentioned but only for the second group, a number is provided (37), while a reference is offered for the first group (Debeljak et al.), no number is given concerning the third group. Table 1 listing the amplicon used contains 49 entries.

Overall, the way the methods and results are presented lacks clarity and details that complicates the evaluation of the technique.

I thus consider that in its current form, this paper suffers from 1) a lack of clarity and a lack of emphasis on false positive/negative rates and other performance statistics, and 2) needs a more sober and realistic interpretation of the results, less focused on optimistic hopes of better future performances and more concerned about current challenges.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Lennart F. Johansson

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Apr 15;16(4):e0249695. doi: 10.1371/journal.pone.0249695.r002

Author response to Decision Letter 0


2 Nov 2020

To Editor’s comments:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

We changed the format of the manuscript and the naming of files to meet PLOS ONE style.

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If the need for consent was waived by the ethics committee, please include this information.

We provided additional details about consent in Materials and Methods: “Blood samples were collected from pregnant women from multiple centers under a protocol approved by the Baylor College of Medicine Institutional Review Boards utilizing written informed consent.” It is now indicated that this was written informed consent.

3. We note that Figure(s) [1] in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution.

We redrew Figure 1 to address the copyright conflict.

4. Thank you for including the following competing interests statement; "The Houston authors are faculty and staff at Baylor College of Medicine (BCM), which is a partial owner of a for profit diagnostic company, Baylor Genetics (BG); Houston authors also are employee of or have advisory or lab director roles at BG. ALB is founder and CEO of Luna Genetics, Inc."

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials." (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

"The Houston authors are or were faculty and staff at Baylor College of Medicine (BCM), which is a partial owner of a for profit diagnostic company, Baylor Genetics (BG); Houston authors are also employees of or have advisory or lab director roles at BG. ALB is founder and CEO of Luna Genetics, Inc. This does not alter our adherence to PLOS ONE policies on sharing data and materials."

To Reviewer #1’s comments:

The paper “Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) prenatal diagnosis” by Zhuo et al. is well written and clearly describes a promising procedure to distinguish fetal from maternal cells. However, I have a few remarks that require clarification.

Major point

- There is no discussion on the presence of fetal cells still present from previous pregnancies. Fetal cells have been detected decades after a pregnancy. Can the method described also distinguish if a fetal cell has a different haplotype than other fetal cells? If so, this is a strength of the methods described. If not, this would be a limitation.

- In addition, what would be the effect of twin pregnancies?

- Extra discussion is needed on the effects of the analysis on downstream analyses. Does the procedure cause lower quality results in downstream analysis of selected cells, since extra WGA amplification is needed to perform this test.

The following is added to the Introduction. This genotyping uses a small aliquot of the WGA product and does not interfere with downstream analysis.

The following is added to the Discussion. Although there is evidence that some cells from previous pregnancies can persist for decades, especially CD34+ cells,{PMID: 8570620} there is no evidence that trophoblasts can persist from previous pregnancies, and we expect, based on the biology of these cells including apoptosis, that they are unlikely to persist. A future method that provides deeper genotyping of individual cells could distinguish cells from previous pregnancies and cells from same-sex nonidentical twins, but the method described here would not be sufficient to make such a distinction reliably.

Minor points,

- Line 999: Settings used for BWA-MEM. Does ‘conventional BWA-MEM’ mean all default settings?

Because this is amplicon sequencing and the data size is relatively small, the mapping can be done on a regular laptop rather than a server, so we adjusted the BWA-MEM settings for small memory and minimum seed length (see supplementary script S3 for the exact settings).

- Table 1 is rather a list of amplicons containing SNP regions than a list of SNPs.

The title of the table has been changed to “List of Amplicons”.

- In table 1: the start or stop of the HLA-A amplicon seems to be incorrect (Start>stop and not matching amplicon size)

Here we used 1-based rather than 0-based coordinates, so that users can more easily locate the amplicons in the UCSC Genome Browser, which is also 1-based. In the 1-based format, both the first and the last base of the amplicon are included, so subtracting one coordinate from the other gives a number one less than the product size. We added a footnote to the table to clarify the format.
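The coordinate convention described in this response can be illustrated with a minimal Python sketch. The function names and the example coordinates are hypothetical and are not taken from Table 1; the sketch only shows why, with 1-based inclusive coordinates, the product size equals stop minus start plus one.

```python
def amplicon_length(start: int, stop: int, one_based: bool = True) -> int:
    """Return the amplicon size for a start/stop pair.

    one_based=True  : both start and stop are included (genome-browser style).
    one_based=False : 0-based, half-open interval (BED-file style).
    """
    if stop < start:
        raise ValueError("stop must not be smaller than start")
    return stop - start + 1 if one_based else stop - start


def to_zero_based_half_open(start: int, stop: int) -> tuple[int, int]:
    """Convert a 1-based inclusive interval to a 0-based half-open interval."""
    return start - 1, stop


# Hypothetical coordinates, for illustration only.
start, stop = 1000, 1199
size = amplicon_length(start, stop)               # 1-based: stop - start + 1
bed_start, bed_stop = to_zero_based_half_open(start, stop)
bed_size = amplicon_length(bed_start, bed_stop, one_based=False)
```

Both conventions describe the same bases, so `size` and `bed_size` agree; only the plain difference `stop - start` in the 1-based format is one less than the product size.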

- Lines 321 and 322, perhaps add the Protein accession numbers to the variants

We added ClinGen accession numbers to the variants in the figure legend.

- One method to infer haplotypes in cfDNA was not yet mentioned. https://pubmed.ncbi.nlm.nih.gov/28844486/

We added this article to the references: 16. Vermeulen C, Geeven G, de Wit E, Verstegen MJAM, Jansen RPM, van Kranenburg M, et al. Sensitive Monogenic Noninvasive Prenatal Diagnosis by Targeted Haplotyping. Am J Hum Genet. 2017;101(3):326-39. Epub 2017/08/24. doi: 10.1016/j.ajhg.2017.07.012.

- Spelling mistake in fig 4: mother → mother and (informatic → informative?)

We fixed this typo.

- The GitHub code would benefit from some polishing, adding comments and removing commented-out code.

We have made revisions to address this.

To Reviewer #2’s comments:

In this paper, the authors study the application of WGA followed by SNP and / or haplotype analysis in multiple amplicons to determine the fetal or maternal origin of presumably fetal cells, retrieved from the maternal blood. They do so by using different amplicons, i.e. amplicons from regions such as the HLA regions, with multiple SNPs per amplicon for haplotyping, and Human Identification SNP test amplicons, with one SNP per amplicon. They also studied the possibility of using these methods for genotyping for monogenic traits. Below my comments and questions:

1. My main comment is that, even though the authors do mention the drawbacks of their method, the paper is quite optimistic. The theory behind the methods they describe is solid, but the actual data are still quite preliminary. In their abstract the authors say that the method allowed reliable differentiation between fetal and maternal cells, in the introduction in line 75 they mention “in most cases”, but this was only in cases with sufficient information, which is certainly not 100%. Not all data is presented in a clear way, such as numbers of cells that are informative, but in how many cases? See my comments below. The methods that are described certainly have potential for this application, but need to be improved and the authors do acknowledge this in the discussion. In the discussion, the authors do mention multiple options for improvements, but how feasible are these? If feasible, why didn’t the authors already implement these, or at least show some data of improved versions, as proof-of-principle. The authors show data on fetal genotyping for monogenic traits, but in the discussion, they mention themselves that a low failure rate is to be expected due to failure to isolate fetal cells and/or to ADO.

The following sentences were added to the discussion, acknowledging the points made by the reviewer.

“The data demonstrate the principle that detailed genotyping of individual cells can distinguish fetal from maternal cells, but the data are somewhat limited and preliminary, and new methods to more extensively compare the genotype of individual cells to the genotype of the mother are needed.”

The text has been modified as follows to address allele drop out and failure rate.

“Single cell analysis of circulating trophoblasts can be used to determine the genotype of fetuses at risk for monogenic disorders, although allele dropout must be ruled out by analysis of multiple cells or it can be addressed using karyomapping as has been used in single cell preimplantation testing.{PMID: 19858130} Hopefully improved methods for recovering fetal cells from mother’s blood will reduce any failure due to lack of cells, but use of cell-free NIPT in combination with haplotype analysis or digital PCR, amniocentesis, and CVS would be three alternative strategies for difficult cases.”

2. Throughout the paper, the authors use the term cell-based NIPT, whereas in the title they use the term single circulating trophoblast prenatal diagnosis. I would suggest to use the same terminology throughout the paper, and would suggest to also use the term cell-based NIPT in the title.

Single circulating trophoblast testing is one form of cell-based NIPT; the two terms do not mean the same thing. We have modified the title to clarify this. Fetal nucleated RBC-based NIPT would be another form of cell-based NIPT.

3. Line 40: analysis cell free = analysis of cell-free

Fixed

4. In the introduction, the authors mention some drawbacks of cf-NIPT. One major drawback, of course, is the fact that cf-NIPT studies cf-DNA from the placenta and not from the fetus itself. This major drawback is not solved by the cell-based technique the authors use, as trophoblast cells are used for this method. The authors should comment on this. Furthermore, even though it is outside the scope of this paper, in the introduction the authors should also (briefly) comment on drawbacks of the cell-based NIPT, such as costs as compared to cf-NIPT and high-throughput possibilities.

There are many complexities around many different forms of testing and mosaicism. The following sentence has been added to the introduction. “Cell-free NIPT, trophoblast-based NIPT, and CVS all can detect placental mosaicism, while amniocentesis, fetal nucleated red blood cell (fnRBC)-based NIPT when feasible, and fetal blood sampling can help to clarify whether mosaicism involves the fetus or is confined to the placenta.”

The following was also added. “Limitations of cost and throughput would need to be overcome for cell-based NIPT to be a routine alternative. At present to our knowledge, no laboratory offers cell-based NIPT as a clinical test.”

5. With SNP typing, DNA from 156 blood samples were compared, with data from 518 cells: 357 cells were informative. Does being informative mean that the cells were fetal, or were there also maternal cells identified? If all fetal, were these cells originally also from 68.9% of the 156 samples, or were cells from some samples overrepresented and others underrepresented? In other words, could informative (fetal?) cells be retrieved from 68.9% of the 156 samples? (This refers to the comment that not all data support the conclusions, questions 1 and 3 of the review).

The following sentence has been added as noted above in response to similar comments earlier. “The data demonstrate the principle that detailed genotyping of individual cells can distinguish fetal from maternal cells, but the data are somewhat limited, and new methods to more extensively compare the genotype of individual cells to the genotype of the mother are needed.”

The following has been added. “The results for cases can be divided into four groups as shown in Table 2. First, there were 7/154 (4.5%) cases where none of the cells passing WGA could be proven to be fetal; second, there were 42/154 (27.3%) cases with only one cell scored as fetal; and third, there were 103/154 (66.9%) with two or more cells scored as fetal as defined in methods. At the single cell level for cells that pass WGA, cells can be scored in only two ways: 1) uninformative, meaning no or inadequate evidence of fetal status, or 2) adequate evidence for fetal status. Some of the “uninformative” cells were certainly maternal, and we cannot distinguish an uninformative fetal cell from a maternal cell. Deeper genotyping would make healthy uninformative fetal cells informative, but some fetal cells are known to be apoptotic with degraded DNA and may or may not pass WGA.” The number of cases was reduced from the original 156 to 154 because 2 cases had poor-quality maternal gDNA, which hindered the analysis.

6. For haplotyping, how many cells from how many samples were tested? Again 156 samples? Also here, were the informative cells distributed evenly over the samples. In how many samples could fetal cells be identified based on haplotyping only? Is 50% of all cells the same as 50% of the cases? (Again, see questions 1 and 3 of the review)

Insertion of new Table 2 addresses this.

7. The authors state they tested the haplotyping method with matched gDNA, WBC and fetal cells. In Line 278, they only mention WBC, not the fetal cells. What is the differentiating rate for the fetal cells? Please mention this in the text, not only in a supplementary figure, as this is the main subject of this paper. Moreover, according to Figure S3A, this is about 50% for both WBC and fetal cells, and this percentage is determined by one amplicon (HLA-B).

Both the text and Figs S2 and S3A mention fetal cells together with WBCs (maternal cells). Since haplotyping makes a relatively smaller contribution to the differentiation than SNP typing, we kept these results in the supplement.

Fig S3B was used to estimate the power to identify a cell of different genetic origin, a WBC (single maternal cell), by comparison with unmatched gDNA. The WBCs were picked intentionally and went through all the same processing steps as potential fetal cells, which offers two benefits as a control. First, they are certainly of maternal origin. Second, they usually preserve a better shape; in contrast, fetal cells were occasionally in the process of apoptosis or aggregated together. Comparing WBCs with unmatched gDNA allows us to estimate the limit of our method when it encounters an intact cell.

8. Figure S3A shows fetal cells with a detection rate of about 50%. In Figure S3B for HAP this seems to be more than 60%. What is the difference?

First, S3B for HAP shows the detection rate for WBCs (~62%), while in S3A the combined rate for WBCs is about 55%, a difference of about 7%. Second, S3A uses only cases with both fetal cells and WBCs, whereas S3B includes cases with or without fetal cells. Because they use different groups of cases for the calculation, they cannot be compared directly.

We edited the text as follows: “when four amplicons were combined, we can still distinguish about 50% of all the cells (from selected cases with both fetal cells and WBCs) (S3 Fig A). With combined power of both SNP typing and haplotyping, we can increase the solving rate of differentiating a WBC from around 60% to more than 70% (from cases independent of fetal cell existence) (S3 Fig B).”

9. Lines 302-304: the authors state that 5 cells were genotyped: two were heterozygous and two had ADO. What happened to the fifth one?

The fifth cell did not show the mutation at that position. It is possible that the mutant allele dropped out in this cell.

10. In line 288, the authors state that in the cultured lymphoblasts, the ADO rate was 15% and 8%. When using true fetal cells from the three families, the ADO is very much higher, about 50%, as in each family some cells show ADO. How do the authors explain this difference?

Sentence added. “The higher rate of allele drop out in fetal cells is presumably caused by some combination of DNA degradation caused by time in the maternal circulation including apoptosis and by cell isolation including fixation and permeabilization steps.”

11. From two of the three families more cells were retrieved than were used for genotyping. Why did the authors not genotype all available cells? Could not all cells be identified as true fetal? If so, that the identification of true fetal cells is lower than 60-70%. How were these cells identified as true fetal? If the other cells were maternal in origin, the chance of picking up a maternal cell is much higher than 10%, as stated in the introduction in line 65.

We perform a quality check after WGA to determine whether a cell should move on to the next step. For cells that pass QC, we genotype most of them. Some cells either suffer from poor quality (a very low read number in sequencing) or have dropout of reads at the position (such as the cell discussed in question #9).

To Reviewer #3’s comments:

This paper describes the use of SNPs genotyping by sequencing in order to identify circulating cells of fetal origin in the context of NIPT applications.

Although the subject matter and the approach are interesting, there are significant issues with the way some of the results are presented and especially with the conclusions reached by the authors.

Because of the major limitations of the proposed methods, highlighted by several results (e.g. the combined performance of SNPs genotyping and haplotyping that gives only a 70% rate of success for identifying fetal cells from WBC (line 278, fig S3 B), a high rate of allele dropout and amplification artefacts contributing to false-negatives and false positives (line 335-341)), a clear and detailed presentation of false positive/negative rates should be provided.

Based on this comment and similar comments above, we made a new Table 2. Significant modifications have been introduced into the Results and Discussion as below.

“The results for cases can be divided into four groups as shown in Table 2. First, there were 7/154 (4.5%) cases where none of the cells passing WGA could be proven to be fetal; second, there were 42/154 (27.3%) cases with only one cell scored as fetal; and third, there were 103/154 (66.9%) with two or more cells scored as fetal as defined in methods. At the single cell level for cells that pass WGA, cells can be scored in only two ways: 1) uninformative, meaning no or inadequate evidence of fetal status, or 2) adequate evidence for fetal status. Some of the “uninformative” cells were certainly maternal, and we cannot distinguish an uninformative fetal cell from a maternal cell. Deeper genotyping would make healthy uninformative fetal cells informative, but some fetal cells are known to be apoptotic with degraded DNA and may or may not pass WGA. False positives and false negatives at a single cell level are discussed below. This method is a work in progress, and we consider one fetal cell as partial success and two or more fetal cells as success from a clinical perspective. Improved cell recovery and improved genotyping would be needed to achieve an optimal test. The relative roles of failure to isolate and amplify cells, uninformative genotyping, and suboptimal numbers of fetal cells can be calculated from Table 2.”

Unfortunately, these rates are not comprehensively and clearly presented in the paper. Table 2 is not clearly explained. For example, I'm puzzled by the apparently negative correlation between true positives (TPV) and % of SNP differences (diff SNP%) in the "Fetal cell" column, shouldn't it be the opposite ? More info about the table headers, interpretation, etc. should be given. One of the main focus of the paper should be to present, analyse and discuss these rates in details.

We agree that the rates were not presented clearly, and we have constructed a different presentation with a new Table 3, which we believe is much improved.

The following is inserted in the results.

“If one assumes that the NGS data are 100% reliable for sex based on X and Y data, which we believe is the case, the sensitivity and specificity of the genotyping can be estimated from normal male singleton pregnancies as shown in Table 3. Definitions of other contingencies are included in the same table. Identifying a fetal cell as fetal by genotyping and confirming that it is male based on NGS is a true positive. The results vary depending on the cutoff for scoring a SNP allele as present. If we accept a cutoff of two different SNPs + 6%, based on normal male singleton pregnancies the true positive rate was 82.1%, true negative 6.6%, false positive 3.3%, and false negative 7.9%. This is equal to a sensitivity of 91.2% and a precision of 96.1%. These results would not be sufficient for clinical diagnosis, but they demonstrate the general validity of the approach and suggest that deeper genotyping of single cells could be completely reliable.”
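The sensitivity and precision quoted in this passage follow directly from the four reported rates. As a minimal sketch, assuming the standard definitions (sensitivity = TP / (TP + FN), precision = TP / (TP + FP)) and using hypothetical function and variable names, the arithmetic can be checked as follows:

```python
def sensitivity(tp: float, fn: float) -> float:
    """Fraction of truly fetal cells that genotyping scored as fetal."""
    return tp / (tp + fn)


def precision(tp: float, fp: float) -> float:
    """Fraction of cells scored as fetal that were truly fetal."""
    return tp / (tp + fp)


# Rates (in percent) quoted in the response for normal male singleton pregnancies.
rates = {"tp": 82.1, "tn": 6.6, "fp": 3.3, "fn": 7.9}

sens_pct = round(100 * sensitivity(rates["tp"], rates["fn"]), 1)
prec_pct = round(100 * precision(rates["tp"], rates["fp"]), 1)

print(f"sensitivity = {sens_pct}%")  # 82.1 / (82.1 + 7.9)
print(f"precision   = {prec_pct}%")  # 82.1 / (82.1 + 3.3)
```

Because both measures are ratios, percentages can be used in place of raw cell counts without changing the result.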

This relatively poor performance of the genotyping approach to confirm fetal origin seems to be caused at least in part by the low amount/quality of starting DNA material since the detection rate for genomic DNA is much higher (Fig S3 A). This impact of starting material is also evident from the much lower number of scorable SNPs from WGA products as compared to genomic DNA shown on fig. 2.

The reviewer is correct. The poor performance on genotyping is caused in part by low amount/quality of DNA.

The following sentence has been added “The higher rate of allele drop out in fetal cells is presumably caused by some combination of DNA degradation caused by time in the maternal circulation including apoptosis and by cell isolation including fixation and permeabilization steps.”

These results thus point to serious technical challenges associated with WGA from single cells. Moreover, as the authors acknowledge at lines 335-341, both a high rate of allele dropout and amplification artefacts contribute to false negatives and false positives.

One can address false positives and false negatives theoretically in genotyping a single cell. A false positive requires a sequencing error that happens to introduce the alternative of the two alleles at a position; because the alternate allele would have to appear by chance, this would be very rare. Single-base sequencing errors are moderately common but do not by themselves create a false positive. False negatives could occur due to allele dropout, which is higher in cells recovered from the mother’s blood than in lymphoblasts. Clearly ADO is a very common occurrence.

Limitations are failure to isolate fetal cells, failure of WGA (e.g., due to apoptosis), allele drop out, and insufficient depth of genotyping. We hypothesize that all of these can potentially be overcome by recovering more cells and by deeper genotyping of each cell (i.e. genotyping thousands of SNPs).

The allele dropout problem mentioned above was also seriously affecting the genotyping for monogenic diseases, with rates reaching 8-15% (line 288). Such a rate is clearly incompatible with any diagnostic application.

ADO is a problem for diagnosing monogenic disorders using cell-based NIPT. If a nucleotide position is heterozygous in even one cell, but preferably in multiple cells, the genotype at that position can be scored with confidence. If the genotype shows only the normal (N) allele at a position, interpretation can conclude that the genotype is NN or NM but not MM (affected). If the genotype shows only the mutant (M) allele at a position, interpretation can conclude that the genotype is MM or NM but not NN. This circumstance occurs for common variants causing phenotypes such as sickle cell anemia or cystic fibrosis. If one allele is found in some cells and the opposite allele is found in other cells in a singleton pregnancy, we would argue that finding two cells with only the N allele and two cells with only the M allele would allow the conclusion that the genotype is heterozygous at that position. For compound heterozygous genotypes, the circumstances are somewhat different. Observing both alleles in multiple cells is straightforward to interpret: the fetus carries the mutant allele at this position. Observing only the M allele would also allow the conclusion that the fetus carries the mutant allele at this position. Observing only the N allele, even in many cells, is challenging, since the fetus could carry the mutation with allele dropout in all five cells. This could occur especially if a SNP impairs the function of a primer. This concern can be reduced by demonstrating that the primers used amplify all parental alleles. The uncertainty of a genotype position could be addressed by cell-free NIPT for a paternal mutation, but a method such as haplotyping, karyomapping, digital PCR, or amniocentesis or CVS could be used for maternal mutations.
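The interpretation rules in this response can be written out as a small decision procedure. The following Python sketch is illustrative only: the function name, the call labels ("N", "M", "NM") and the `min_cells_per_allele` parameter are hypothetical, and the sketch assumes that an observed allele is real (that is, it ignores amplification artefacts) and that all cells come from a singleton pregnancy.

```python
from collections import Counter


def interpret_position(cell_calls, min_cells_per_allele=2):
    """Combine per-cell calls at one position into compatible genotypes.

    cell_calls: one entry per cell, "N" (only normal allele seen),
                "M" (only mutant allele seen) or "NM" (both seen).
    Returns (possible_genotypes, confident_heterozygous).
    """
    counts = Counter(cell_calls)
    unknown = set(counts) - {"N", "M", "NM"}
    if unknown:
        raise ValueError(f"unexpected call(s): {sorted(unknown)}")

    possible = {"NN", "NM", "MM"}
    # Seeing the N allele in any cell excludes MM; allele dropout can hide
    # an allele but, under the stated assumption, cannot create one.
    if counts["N"] + counts["NM"] > 0:
        possible.discard("MM")
    # Seeing the M allele in any cell excludes NN.
    if counts["M"] + counts["NM"] > 0:
        possible.discard("NN")

    # Heterozygosity is scored with confidence if one cell shows both
    # alleles, or if enough N-only and M-only cells are seen (two of each
    # in the argument made in the response).
    confident_het = counts["NM"] >= 1 or (
        counts["N"] >= min_cells_per_allele
        and counts["M"] >= min_cells_per_allele
    )
    return possible, confident_het


print(interpret_position(["N", "N", "N"]))       # N only: NN or NM, never MM
print(interpret_position(["N", "N", "M", "M"]))  # two of each: heterozygous
```

The N-only case shows why that outcome is the hard one: however many cells are tested, NM cannot be excluded by this logic alone, which is where the alternative methods listed in the response would come in.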

From these results, the realistic take home message is clearly one of skepticism towards the feasibility of using this kind of approach clinically in the short to mid term. However, although the authors acknowledge some of these limitations in the discussion, they nevertheless make statements that I consider misleading with respect to what the data show. For instance, "This single cell genotyping assay can provide an essential step for confirming the fetal origin of cells obtained from cell-based NIPT workflows. Through genotyping, we can reject cells that are of indeterminate or maternal origin and provide metrics for WGA DNA quality using multiple amplicons." (line 328-331). "...this assay is rapid and reliable..." (line 335). "Use of this method for genotyping fetuses at risk for specific monogenic mutations is feasible." (line 352-353).

We agree with skepticism about using the current method clinically, but it does demonstrate that a deeper genotyping method that detects thousands of SNPs in single cells could be used clinically.

More details should also be provided about the exact number of amplicons used rather than vague statements such as: “We sequenced approximately 90 highly polymorphic SNPs within about 40 amplicons.” (line 72). Likewise, in the methods section, three groups of amplicons are mentioned, but a number (37) is provided only for the second group; a reference is offered for the first group (Debeljak et al.), and no number is given for the third group. Table 1, listing the amplicons used, contains 49 entries.

The total number of amplicons is case-specific. The first 41 amplicons are used routinely for identifying the origin of a cell. Additional amplicons are then added if the parents have a monogenic disorder, so the total number varies case by case.

Overall, the way the methods and results are presented lacks clarity and detail, which complicates the evaluation of the technique.

I thus consider that in its current form, this paper suffers from 1) a lack of clarity and a lack of emphasis on false positive/negative rates and other performance statistics, and 2) needs a more sober and realistic interpretation of the results, less focused on optimistic hopes of better future performances and more concerned about current challenges.

We think that the revisions above address these weaknesses.

Attachment

Submitted filename: rebuttal letter oct31.docx

Decision Letter 1

Osman El-Maarri

4 Jan 2021

PONE-D-20-14518R1

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) as one form of cell-based NIPT

PLOS ONE

Dear Dr. Zhuo,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Feb 18 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Osman El-Maarri, Ph.D

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Partly

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: The authors have done a significant amount of work from which the paper certainly benefits.

Previously, I commented on the difference between cells and cases. Even though the data in the paper are presented in a much better way now, I still have some questions:

1. In the rebuttal the authors mention that the number of cases is now 154 instead of 156. In the text, they use the number of 154, while in Table 2 the numbers do not add up to 154 but to 152. Please explain or correct.

2. In Table 2, numbers of cases are presented, which in my opinion is indeed the most informative way of (at least) presenting the data. In lines 213-218 the authors again switch to information on number of cells instead of cases. This apparently is not the same as cases, as when calculating % from the data in Table 2, different % are obtained than mentioned in the text. Please use the same way of presenting data in both text and Table.

3. In Table 2 the last column seems to be the same as the “2 SNP + 6%” column as far as data is concerned. What does this last column show? According to Table 3 it should be “2 SNP + 0%”, but what does that mean? The numbers in the column in Table 2 are the same as in the “2 SNP + 6%” column. Is this correct?

4. Table 2 would benefit from mentioning also the percentages, for better comparison with the text.

5. Lines 226-230 and Table 3: this is again based on cells. Please also provide information of cases.

6. Lines 337-342: As mentioned in my previous comment, the data on the numbers of cells do not add up to the numbers of cells that are genotyped. Please add information on all the cells, for both the DHCR7 and the sickle cell family.

7. Line 337: “…but the data are somewhat limited…”. Please remove the word “somewhat”, as this single word raises a lot of questions and removing the word does not change the message of the sentence.

8. Lines 403-405: the authors suggest as alternatives to overcome the problem with their method to use their NIPT in combination with amniocentesis and CVS. This seems a strange alternative, as cell-based NIPT is being validated and optimized as a replacement for invasive methods. What would be the benefit of cell-based NIPT if it should be used in combination with an invasive procedure anyway?

Minor comments:

9. Abstract line 26: “This method allowed reliable differentiation of fetal and maternal cells.” This is still an optimistic, and possibly misleading, sentence. The sentence can be removed without any damage to the abstract.

10. Throughout the paper, the term “mutation” is used. There is an international agreement to use the word “variant” or “pathogenic variant” instead (see Richards et al. 2015). Please use this throughout the paper.

11. Line 199: “…where none the cells…” = “…where none of the cells…”

12. Line 202: “…meaning that no…” = “…meaning no….”

13. Line 214: “…used a more conservative….” = “…used a more conservative approach of…”

14. Line 226: “…two different 2 SNP…” = “two different SNPs…”

15. Line 241: “…per sample…” = “…per cell…”

16. Line 351: “…the sickle mutation…” = “…the sickle cell anemia variant…”

17. Line 377: spelling error in pregnancies and piology (=etiology??)

Reviewer #3: Although the authors have attempted to address all the comments raised in the previous review, a few issues persist.

Notably, in my opinion, the discussion still contains overly optimistic statements that do not adequately portray what the current data suggest. For instance, at line 381, I would replace: "...this assay is rapid and reliable…" by "...this assay is promising…".

At line 393, replace: "...monogenic mutations is feasible." by "...monogenic mutations is potentially feasible.". At line 410, replace: "An improved version of this assay may significantly reduce…" by "A future improved version may hopefully reduce…".

The paper would also greatly benefit from clarifying the terminology used when referring to "informative" cells and SNPs throughout. It seems like the authors refer to SNPs with more than one allele present in the NGS data from a sample as an informative SNP, whether or not there's a difference between the mother and the fetus at this SNP position (e.g. when both mother and fetus are heterozygotes for the same alleles), while an informative cell refers to a cell with feto-maternal SNP differences. These terms should be formally defined early on to avoid confusion. However, at line 299, the authors refer to informative SNPs as those where there's a difference between the genotype of the mother and of the putative fetal cell. The term scorable SNP is also used, which here seems to simply refer to a genotyped SNP position that passed quality control.

Additional minor suggestions and comments

line 98: "The first step is conventional PCR with a pool of amplicons with bridging adaptors…" should be : "The first step is conventional PCR of a pool of amplicons with bridging adaptors…"

line 140: "An 8-10 pM diluted denatured…"

line 188: "...three SNPs show a difference between one of the cell…"

line 198: I believe it should be three groups and not four: "The results for cases can be divided into three groups…"

line 214: "...we initially used a more conservative criterion of at least…"

line 215: "We found that 68.9 % of genotyped putative fetal cells…"

line 228: What does the precision of 96.1% refer to here? Do the authors mean global diagnostic accuracy? Please mathematically define the precision calculation used.

line 235: "The remaining three cells…"

line 348: The last sentence ("The fetus does carry the paternal mutation.") seems odd and should be removed or modified.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Apr 15;16(4):e0249695. doi: 10.1371/journal.pone.0249695.r004

Author response to Decision Letter 1


26 Jan 2021

Reviewer #2: The authors have done a significant amount of work from which the paper certainly benefits.

Previously, I commented on the difference between cells and cases. Even though the data in the paper are presented in a much better way now, I still have some questions:

1. In the rebuttal the authors mention that the number of cases is now 154 instead of 156. In the text, they use the number of 154, while in Table 2 the numbers do not add up to 154 but to 152. Please explain or correct.

The following sentence is added to the first paragraph of methods describing the samples.

“There were 154 usable blood samples from 152 pregnancies/cases; two women had two samples collected during one pregnancy.”

To avoid confusion, we also revised the number accordingly in the text.

2. In Table 2, numbers of cases are presented, which in my opinion is indeed the most informative way of (at least) presenting the data. In lines 213-218 the authors again switch to information on number of cells instead of cases. This apparently is not the same as cases, as when calculating % from the data in Table 2, different % are obtained than mentioned in the text. Please use the same way of presenting data in both text and Table.

The paragraph starting at line 213 is substantially rewritten as follows:

In Table 2, we examined what percent of cases had one or more, or two or more, cells scored as fetal. Individual cells were scored as fetal if two or more SNPs had at least 10% reads for an allele that was not present in the mother. Cases were then subdivided into those where the informative SNPs indicating fetal status were at least 10%, 8%, or 6% of the scorable SNPs. There were no cases where the informative SNPs were less than 6% of the scorable SNPs. There were 60 cells with one informative SNP, suggesting that requiring two SNPs may undercount fetal cells. It is important to distinguish the percent of cases (82.4%) that had two or more fetal cells (103/152) from the percent of cells (78.7%) that had two or more SNPs indicating fetal status (408/518).
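The scoring rule in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `score_cell`, the read-count dictionaries, and the treatment of any SNP with at least one read as "scorable" are hypothetical, since the response does not give the quality-control criteria that define a scorable SNP.

```python
from typing import Dict, List, Set


def score_cell(
    cell_reads: Dict[str, Dict[str, int]],
    maternal_alleles: Dict[str, Set[str]],
    min_snps: int = 2,                       # "two or more SNPs"
    min_read_fraction: float = 0.10,         # "at least 10% reads"
    min_informative_fraction: float = 0.06,  # 6%, 8% or 10% of scorable SNPs
) -> dict:
    """Score one cell as fetal or not from per-SNP allele read counts.

    Assumption: a SNP is treated as scorable when it has at least one read
    and a known maternal genotype; the real QC cutoff is not stated here.
    """
    scorable: List[str] = []
    informative: List[str] = []
    for snp, counts in cell_reads.items():
        total = sum(counts.values())
        if total == 0 or snp not in maternal_alleles:
            continue
        scorable.append(snp)
        # Informative: some allele absent from the mother reaches the read threshold.
        if any(
            allele not in maternal_alleles[snp] and n / total >= min_read_fraction
            for allele, n in counts.items()
        ):
            informative.append(snp)
    fraction = len(informative) / len(scorable) if scorable else 0.0
    return {
        "scorable_snps": scorable,
        "informative_snps": informative,
        "informative_fraction": fraction,
        "is_fetal": len(informative) >= min_snps
        and fraction >= min_informative_fraction,
    }


# Hypothetical example data: three SNPs, mother homozygous at two of them.
mother = {"rs1": {"A"}, "rs2": {"C"}, "rs3": {"G", "T"}}
candidate = {
    "rs1": {"A": 50, "G": 50},  # G is not maternal, 50% of reads
    "rs2": {"C": 80, "T": 20},  # T is not maternal, 20% of reads
    "rs3": {"G": 40, "T": 60},  # both alleles are maternal
}
print(score_cell(candidate, mother)["is_fetal"])
```

Raising `min_informative_fraction` to 0.08 or 0.10 corresponds to the stricter subdivisions mentioned in the paragraph.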

3. In Table 2 the last column seems to be the same as the “2 SNP + 6%” column as far as data is concerned. What does this last column show? According to Table 3 it should be “2 SNP + 0%”, but what does that mean? The numbers in the column in Table 2 are the same as in the “2 SNP + 6%” column. Is this correct?

The last column of Table 2 has been deleted as not contributing useful information.

4. Table 2 would benefit from mentioning also the percentages, for better comparison with the text.

We edited the table as suggested.

5. Lines 226-230 and Table 3: this is again based on cells. Please also provide information of cases.

This Table presents the reliability of scoring a cell as fetal, and it is essential to present the data as cells. There are multiple cells per case, and it is the cell, not the case, that is scored as a false positive or false negative. No change is made to the text.

6. Lines 337-342: As mentioned in my previous comment, the data on the numbers of cells do not add up to the numbers of cells that are genotyped. Please add information on all the cells, for both the DHCR7 and the sickle cell family.

For the sickle cell case, we made an error. Nine fetal cells were recovered and four were genotyped. The fifth cell was a maternal WBC intentionally picked as a control. The text is corrected. Information for all of the cells genotyped is now presented for the DHCR7 and sickle cell families.

7. Line 337: “…but the data are somewhat limited…”. Please remove the word “somewhat”, as this single word raises a lot of questions and removing the word does not change the message of the sentence.

The word was removed.

8. Lines 403-405: the authors suggest as alternatives to overcome the problem with their method to use their NIPT in combination with amniocentesis and CVS. This seems a strange alternative, as cell-based NIPT is being validated and optimized as a replacement for invasive methods. What would be the benefit of cell-based NIPT if it should be used in combination with an invasive procedure anyway?

The intent was to mention amniocentesis and CVS as alternatives if cell-based NIPT fails. The text is changed to indicate this.

Minor comments:

9. Abstract line 26: “This method allowed reliable differentiation of fetal and maternal cells.” This is still an optimistic, and possibly misleading, sentence. The sentence can be removed without any damage to the abstract.

Removed.

10. Throughout the paper, the term “mutation” is used. There is an international agreement to use the word “variant” or “pathogenic variant” instead (see Richards et al. 2015). Please use this throughout the paper.

There were 43 occurrences of the word mutation and 8 occurrences of the word mutant. Dozens of changes were made where appropriate.

11. Line 199: “…where none the cells…” = “…where none of the cells…”

Changed

12. Line 202: “…meaning that no…” = “…meaning no….”

changed

13. Line 214: “…used a more conservative….” = “…used a more conservative approach of…”

The entire paragraph was rewritten.

14. Line 226: “…two different 2 SNP…” = “two different SNPs…”

Changed

15. Line 241: “…per sample…” = “…per cell…”

changed

16. Line 351: “…the sickle mutation…” = “…the sickle cell anemia variant…”

Changed

17. Line 377: spelling error in pregnancies and piology (=etiology??)

Changed to biology

Reviewer #3: Although the authors have attempted to address all the comments raised in the previous review, a few issues persist.

Notably, in my opinion, the discussion still contains overly optimistic statements that do not adequately portray what the current data suggest. For instance, at line 381, I would replace: "...this assay is rapid and reliable…" by "...this assay is promising…".

At line 393, replace: "...monogenic mutations is feasible." by "...monogenic mutations is potentially feasible.". At line 410, replace: "An improved version of this assay may significantly reduce…" by "A future improved version may hopefully reduce…".

Changed as suggested.

The paper would also greatly benefit from clarifying the terminology used when referring to "informative" cells and SNPs throughout. It seems like the authors refer to SNPs with more than one allele present in the NGS data from a sample as an informative SNP, whether or not there's a difference between the mother and the fetus at this SNP position (e.g. when both mother and fetus are heterozygotes for the same alleles), while an informative cell refers to a cell with feto-maternal SNP differences. These terms should be formally defined early on to avoid confusion. However, at line 299, the authors refer to informative SNPs as those where there's a difference between the genotype of the mother and of the putative fetal cell. The term scorable SNP is also used, which here seems to simply refer to a genotyped SNP position that passed quality control.

The following sentence is added in the methods section under variant calling.

Throughout this manuscript, a cell or a SNP is referred to as informative if the putative fetal cell has an allele not carried by the mother (e.g., mother is AA and the putative fetal cell is AB or _B). An uninformative cell does not have alleles not carried by the mother and may be a maternal cell or a fetal cell with inadequate genotyping data.

The legend for Fig. 6 is very consistent with this explanation.
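The definition above can be expressed as a small genotype-level check. The sketch below is a hypothetical illustration rather than the authors' variant-calling code: genotypes are written as two-character strings such as "AA" or "_B", with "_" standing for an allele that was not observed, following the notation in the example given in the definition.

```python
from typing import Dict

MISSING = "_"  # placeholder for an allele that was not observed in the cell


def snp_is_informative(mother: str, cell: str) -> bool:
    """True if the cell shows at least one allele the mother does not carry."""
    maternal = set(mother) - {MISSING}
    observed = set(cell) - {MISSING}
    return bool(observed - maternal)


def cell_is_informative(
    mother_genotypes: Dict[str, str], cell_genotypes: Dict[str, str]
) -> bool:
    """A cell is informative if any genotyped SNP is informative."""
    return any(
        snp_is_informative(mother_genotypes[snp], genotype)
        for snp, genotype in cell_genotypes.items()
        if snp in mother_genotypes
    )


# Cases from the definition: mother AA, putative fetal cell AB or _B.
print(snp_is_informative("AA", "AB"))
print(snp_is_informative("AA", "_B"))
# Mother and cell heterozygous for the same alleles: not informative.
print(snp_is_informative("AB", "AB"))
```

Under this definition a cell that returns False is uninformative, which does not by itself show that it is maternal.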

Additional minor suggestions and comments

line 98: "The first step is conventional PCR with a pool of amplicons with bridging adaptors…" should be : "The first step is conventional PCR of a pool of amplicons with bridging adaptors…"

Changed

line 140: "An 8-10 pM diluted denatured…"

changed

line 188: "...three SNPs show a difference between one of the cell…"

changed

line 198: I believe it should be three groups and not four: "The results for cases can be divided into three groups…"

changed

line 214: "...we initially used a more conservative criterion of at least…"

This paragraph is rewritten

line 215: "We found that 68.9 % of genotyped putative fetal cells…"

This paragraph is rewritten

line 228: What does the precision of 96.1% refer to here? Do the authors mean global diagnostic accuracy? Please mathematically define the precision calculation used.

Changed the text:

“This is equal to a sensitivity [True Positive Rate = TP/(TP+FN)] of 91.2% [124/(124+12)] and a precision [Positive Predictive Value = TP/(TP+FP)] of 96.1% [124/(124+5)].”
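As a quick check of the arithmetic, the two definitions can be evaluated directly from the counts quoted in the response (TP = 124, FN = 12, FP = 5). The function names below are illustrative only.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)


tp, fn, fp = 124, 12, 5  # counts quoted in the revised text
print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 124/136 -> 91.2%
print(f"precision   = {precision(tp, fp):.1%}")    # 124/129 -> 96.1%
```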

line 235: "The remaining three cells…"

Changed

line 348: The last sentence ("The fetus does carry the paternal mutation.") seems odd and should be removed or modified.

Deleted sentence.

Attachment

Submitted filename: reply_to_comment.22Jan2021.docx

Decision Letter 2

Osman El-Maarri

9 Mar 2021

PONE-D-20-14518R2

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) as one form of cell-based NIPT

PLOS ONE

Dear Dr. Zhuo,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Apr 23 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript.

Kind regards,

Osman El-Maarri, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: (No Response)

Reviewer #3: The authors have addressed most of the issues previously identified, but there remain a few details that need to be clarified.

As requested, the authors have added a sentence defining and clarifying the "informative/uninformative" terminology in the methods section, where this definition is given: "Throughout this manuscript, a cell or a SNP is referred to as informative if the putative fetal cell has an allele not carried by the mother...".

However, at line 249, while referring to Fig. 4, they state that "Cell G106 has only 2 SNP difference in less than 20 informative SNPs...". The axes of Fig. 4 are indeed labelled "number of SNPs not present in mother" and "number of informative SNPs", for the y and x axes respectively.

Therefore, given the definition of informative SNP provided by the authors, I don't understand what the difference can be here between "number of SNPs not present in mother" and "number of informative SNPs". Aren't SNPs not present in mother supposed to be alleles not present in mother? Please clarify.

Regarding Fig. 4, there are other inconsistencies between the text and the figure. At line 245, it is said that cells G78, G212, G227 and G232 all have at least four SNPs which are different from maternal gDNA, while on the graph it seems to be at least five SNPs which are different (of these four cells, G78 and G212 have the lowest number of differences, and it seems to be five differences when judging from the axis and the points on the graph). Then it is said that the remaining three cells had 0-2 SNPs which differed from maternal gDNA, while it seems to be rather 0-1 judging from the figure.

Moreover, the caption for Fig. 4 says: "As expected, the maternal gDNA samples gave no alleles not present in the mother...". This sentence seems odd given that throughout the paper, we get the impression that maternal gDNA (from WBC) is used as the reference for the mother's genome. How could there be an allele not present in the mother in the data used to establish the mother's genome? That would be circular. Was there another reference used?

Finally, at line 100, the requested change of "...conventional PCR with a pool..." for "...conventional PCR of a pool..." has not been made, even if the authors claim to have made the change in their reply.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

PLoS One. 2021 Apr 15;16(4):e0249695. doi: 10.1371/journal.pone.0249695.r006

Author response to Decision Letter 2


20 Mar 2021

Dear Editor,

We would like to thank the reviewers for their specific and helpful comments. The manuscript has been improved according to the suggestions from the reviewers.

For specific comments, please see our response below:

Reviewer #3: The authors have addressed most of the issues previously identified, but there remain a few details that need to be clarified.

As requested, the authors have added a sentence defining and clarifying the "informative/uninformative" terminology in the methods section, where this definition is given: "Throughout this manuscript, a cell or a SNP is referred to as informative if the putative fetal cell has an allele not carried by the mother...".

However, at line 249, while referring to Fig. 4, they state that "Cell G106 has only 2 SNP difference in less than 20 informative SNPs...". The axes of Fig. 4 are indeed labelled "number of SNPs not present in mother" and "number of informative SNPs", for the y and x axes respectively.

Therefore, given the definition of informative SNP provided by the authors, I don't understand what the difference can be here between "number of SNPs not present in mother" and "number of informative SNPs". Aren't SNPs not present in mother supposed to be alleles not present in mother? Please clarify.

We changed the labels in figure 4 and the text accordingly.

“The X-axis indicates the number of variant sites passing coverage cutoff. The Y-axis is the number of informative SNPs in a cell.”

Regarding Fig. 4, there are other inconsistencies between the text and the figure. At line 245, it is said that cells G78, G212, G227 and G232 all have at least four SNPs which are different from maternal gDNA, while on the graph it seems to be at least five SNPs which are different (of these four cells, G78 and G212 have the lowest number of differences, and it seems to be five differences when judging from the axis and the points on the graph). Then it is said that the remaining three cells had 0-2 SNPs which differed from maternal gDNA, while it seems to be rather 0-1 judging from the figure.

Moreover, the caption for Fig. 4 says: "As expected, the maternal gDNA samples gave no alleles not present in the mother...". This sentence seems odd given that throughout the paper, we get the impression that maternal gDNA (from WBC) is used as the reference for the mother's genome. How could there be an allele not present in the mother in the data used to establish the mother's genome? That would be circular. Was there another reference used?

We corrected the typos in the text to “at least five” and “0-1” as suggested. We also removed the sentence "As expected, the maternal gDNA samples gave no alleles not present in the mother..." to avoid confusion.

Finally, at line 100, the requested change of "...conventional PCR with a pool..." for "...conventional PCR of a pool..." has not been made, even if the authors claim to have made the change in their reply.

We fixed this typo.

We think that the revisions above address these weaknesses.

We hope that our revision in its current form is suitable for publication in PLOS ONE.

Sincerely yours,

On behalf of the authors

Xinming Zhuo, Ph.D. and Arthur Beaudet, M.D.

Attachment

Submitted filename: reply_to_comment.12Mar2021.docx

Decision Letter 3

Osman El-Maarri

24 Mar 2021

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) as one form of cell-based NIPT

PONE-D-20-14518R3

Dear Dr. Zhuo,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Osman El-Maarri, Ph.D

Academic Editor

PLOS ONE

Acceptance letter

Osman El-Maarri

6 Apr 2021

PONE-D-20-14518R3

Use of amplicon-based sequencing for testing fetal identity and monogenic traits with single circulating trophoblast (SCT) as one form of cell-based NIPT

Dear Dr. Zhuo:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Priv.-Doz. Dr. Osman El-Maarri

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Fig. Grouping multiple amplicon haplotypes for a family trio.

    We use HLA-A amplicon haplotypes from the samples presented in Fig 6 to demonstrate how to identify the fetal cell. Haplotype groups are shown for the mother (red), fetal cell (green), and father (black). The Y-axis of the bar graphs indicates the fraction of total reads from each DNA type in the different read groups. The cluster tree shows the distance between read groups, calculated as Levenshtein distance.

    (TIF)
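The Levenshtein-based grouping mentioned in the S1 Fig caption can be illustrated with a minimal sketch. It is not the authors' implementation (their haplotype calling is in the R script of S5 File, which is not reproduced here). The function names `levenshtein` and `group_reads`, the greedy grouping strategy, the toy sequences, and the distance threshold are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two sequences."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def group_reads(reads, max_distance):
    """Hypothetical greedy grouping: a read joins the first group whose
    representative lies within max_distance, otherwise it starts a new group."""
    groups = []  # list of (representative, members)
    for read in reads:
        for representative, members in groups:
            if levenshtein(read, representative) <= max_distance:
                members.append(read)
                break
        else:
            groups.append((read, [read]))
    return groups


# Toy reads (invented): two near-identical sequences and one distant sequence.
toy_reads = ["ACGTACGT", "ACGTACGA", "TTTTCCCC"]
read_groups = group_reads(toy_reads, max_distance=1)
```

With these toy reads, the first two sequences differ by a single substitution and fall into one group, while the third starts its own group. The choice of distance threshold drives the grouping.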

    S2 Fig. Using ROC-AUC to estimate the performance of haplotyping with various distance settings.

    The Y-axis is the ROC-AUC distance to the diagonal line, from 0 to 1. The X-axis is the distance used for haplotype grouping. Three types of DNA were used for evaluation: fetal (blue), WBC (orange), and gDNA (grey).

    (TIF)

    S3 Fig. Evaluation of the performance of haplotyping in identifying DNA of non-maternal origin.

    A. Performance of different haplotyping amplicons in detecting non-maternal DNA. The Y-axis is the detection rate, which estimates the fraction of samples that can be differentiated with a particular amplicon. The X-axis indicates which amplicon was tested. Fetal cells, WBCs, and gDNA were tested accordingly from selected cases with both fetal cells and WBCs present. B. Improving the detection rate by combining SNP typing and haplotyping. WBCs from different cases (with or without fetal cells) were analyzed with the individual and combined approaches.

    (TIF)

    S1 File. Reference sequence FASTA file.

    The FASTA file used for alignment.

    (FA)

    S2 File. SNV reference BED file.

    The BED file gives the positions of the SNVs.

    (BED)

    S3 File. Linux shell script and parameters for alignment and analysis.

    (SH)

    S4 File. R script for genotype calling.

    (R)

    S5 File. R script for haplotype calling.

    (R)

    S6 File. R script to compare genotype of NIPT cells with parents.

    (R)

    S7 File. R script to compare haplotype of NIPT cells with parents.

    (R)

    S8 File. Table of selected clinical sample information.

    (XLSX)

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
