PLOS One. 2021 Apr 15;16(4):e0249930. doi: 10.1371/journal.pone.0249930

Deep sequencing of DNA from urine of kidney allograft recipients to estimate donor/recipient-specific DNA fractions

Aziz Belkadi 1, Gaurav Thareja 1, Darshana Dadhania 2,3, John R Lee 2,3, Thangamani Muthukumar 2,3, Catherine Snopkowski 3, Carol Li 3, Anna Halama 1, Sara Abdelkader 1, Silvana Abdulla 1, Yasmin Mahmoud 1, Joel Malek 1, Manikkam Suthanthiran 2,3, Karsten Suhre 1,*
Editor: Stanislaw Stepkowski 4
PMCID: PMC8049329  PMID: 33857204

Abstract

Kidney transplantation is the treatment of choice for patients with end-stage kidney failure, but the transplanted allograft can be affected by viral and bacterial infections and by immune rejection. The standard test for the diagnosis of acute pathologies in kidney transplants is kidney biopsy. However, noninvasive tests would be desirable. The transplantation community has developed various methods using different techniques, but these methods require improvement. We present here a cost-effective method for kidney rejection diagnosis that estimates the donor/recipient-specific DNA fraction in recipient urine by sequencing urinary cell DNA. We hypothesized that in the no-pathology stage, the largest tissue types present in recipient urine are donor kidney cells, and that in case of rejection, a larger number of recipient immune cells would be observed. Extensive in-silico simulation was used to tune the sequencing parameters: number of variants and depth of coverage. Sequencing of DNA mixtures from 2 healthy individuals showed that the method is highly predictive (maximum error < 0.04). We then demonstrated the insignificant impact of familial relationship and ethnicity using an in-house and a public database. Lastly, we performed deep DNA sequencing of urinary cell pellets from 32 biopsy-matched samples representing two pathology groups, acute rejection (AR, 11 samples) and acute tubular injury (ATI, 12 samples), plus 9 samples with no pathology. We found that the donor/recipient-specific DNA fraction differed significantly between each of the two pathology groups and the no-pathology group (P = 0.0064 for AR and P = 0.026 for ATI). We conclude that deep DNA sequencing of urinary cells from kidney allograft recipients offers a noninvasive means of diagnosing acute pathologies in the human kidney allograft.

Introduction

In 1933, Ukrainian surgeon Yurii Voronoy performed the first human kidney transplantation [1]. Kidney transplantation is the preferred treatment option for patients with end-stage renal failure compared to dialysis. Today, renal transplantation plays an important role in clinical medicine and has become a relatively safe intervention. However, various pathologies can still affect the transplanted organ, including infections, disease recurrence and immune rejections. The rejections can be related to a range of donor- and recipient-specific factors [2, 3]. Acute renal rejection affects 10 to 20% of transplants within three months after transplantation, and chronic rejection is an important cause of graft failure [3–5].

To diagnose rejection, kidney allograft biopsies are considered the gold-standard for detecting both acute and chronic immune rejection, as well as other associated pathologies that may eventually lead to allograft loss. However, biopsies are invasive, costly, and in rare cases can lead to organ loss, while the readout can potentially be erroneous if a non-affected part of the kidney is sampled by chance. Therefore, invasive biopsies in patients with low immunological risk could be criticized [6]. There is hence a strong need for noninvasive assays to detect injury in transplanted kidneys. Several studies to develop suitable biomarkers for allograft rejection have been conducted. These studies include the quantification of specific messenger RNAs (mRNA) in urine [7], large-scale transcriptomics analyses of peripheral blood [8], proteomics analyses of biopsies [9] and urine [10, 11], and metabolomics [12] and RNA sequencing [13] of urine pellet or supernatant. Nevertheless, non-invasive methods developed to date still have important caveats and require further improvement.

The presence of donor-specific DNA in the blood was first reported in women who had a kidney or a liver transplant [14]. The measurement of cell-free donor-specific DNA in blood for a differential diagnosis of kidney injury has been suggested recently [15–17]. These studies focused on females who received a kidney from male donors and identified the presence of DNA coding for the testis-specific protein Y-linked 1 (TSPY1) or the sex-determining region of the Y chromosome using quantitative polymerase chain reaction. With the improvement of next generation sequencing technologies, whole genome sequencing (WGS) [18, 19] and targeted sequencing [20] were used for measuring donor-specific DNA for solid organ transplant rejection. However, these studies focused on heart transplants and measured cell-free donor-specific DNA in blood plasma. More importantly, these methods require the sequencing of both donor and recipient DNA, which is more costly.

An algorithm for measuring donor-specific DNA in plasma of organ transplants without requiring donor or recipient genotyping was implemented by Gordon et al [21]. But this algorithm made the assumption that donor fraction is < 14%. More recently, Grskovic et al. used sequencing of 266 single nucleotide variants (SNVs) that discriminate best between two unrelated individuals to count reference and alternative allele frequency for estimating the donor-derived cell-free DNA fraction [22]. This method showed a high correlation between cell-free donor specific DNA levels in recipient blood and active rejection of the kidney allografts [23]. However, this method does not account for potential sequencing errors and requires a priori knowledge of the familial relationship between donor and recipient. Finally, a statistical method combining SNV array genotyping of donor and recipient before transplantation with recipient DNA sequencing was used to estimate recipient-derived DNA fraction in heart and lung transplants [24]. Nonetheless, this method requires SNV genotyping of donor and recipient DNA before transplantation. Most importantly, all of the previous studies focused on DNA extracts from blood.

The presence of donor-specific DNA in urine of kidney allograft recipients has been reported [25]. We recently conducted a study based on RNA sequencing of kidney allograft biopsies and found a correlation between the ratio of heterozygous to homozygous SNVs with the rejection phenotype [26]. Moreover, we have shown that DNA methylation could be used to accurately estimate the tissue type composition in recipient urine samples. We found that the largest tissue types present in recipient urine were kidney cells and neutrophils and that donor-specific DNA fraction correlates with the kidney derived cell fraction [27]. However, we restricted the analysis to kidney recipients with urinary tract infection and BK-virus nephropathy only. Most recently, we have identified different gene signatures and pathways associated with two different types of kidney rejection using RNA-seq of urinary cell pellets: acute T cell–mediated rejection and antibody-mediated rejection [13]. Deconvolution analysis showed a higher enrichment of immune cells in urine matched to rejection biopsies compared to urine matched to no-rejection biopsies.

Given that the fraction of donor-specific DNA can be determined using DNA sequencing, we here hypothesize that the recipient-specific DNA fraction in urine correlates with the level of active rejection in the kidney allograft, assuming that recipient-specific DNA originates mostly from allograft-invading immune cells while donor-specific DNA comes from the allograft [28]. Inspired by methods to estimate DNA contamination in sequencing projects [29, 30], we present a cost-effective method to determine the fraction of donor/recipient-specific DNA (denoted α hereafter) in urine by sequencing targeted regions. We demonstrate that the precision of this measure depends on sequencing depth and length of the targeted region. Most importantly, no prior knowledge of donor and recipient relationship is required. To the best of our knowledge, this is the first method for estimating donor/recipient-specific DNA fraction in DNA mixture extracted from kidney graft recipient’s urine. Our method provides an easy way to determine the donor/recipient-specific DNA fraction regardless of donor and recipient gender. We evaluate the applicability of our approach for the detection of kidney transplant rejection. Future applications could be routine tests of urine samples as a reference to adjust and personalize the dosage of immune suppressive drugs in kidney transplant patients.

Materials and methods

Algorithm

The proposed algorithm is inspired by the contamination estimation assessment in DNA sequencing methods [29, 30]. We hypothesize that recipient urine contains a mixture of recipient and donor DNA. Let N be the number of bi-allelic SNVs sequenced from recipient urine DNA, with each SNV i covered by Mi reads. Let giR and giD be the genotypes of recipient and donor at SNV i, respectively. Both giR and giD are unknown. Limiting the analysis to bi-allelic SNVs leaves three possible genotypes for recipient and donor at each SNV i: giR (giD) = {0, 1, 2} where 0 = homozygous wild type, 1 = heterozygous and 2 = homozygous for the alternative allele. The likelihood of the donor-specific DNA fraction (α) is then:

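$$L(\alpha) = \prod_{i=1}^{N} \sum_{g_i^R} \sum_{g_i^D} \left\{ \prod_{j=1}^{M_i} \sum_{e_{ij}} \left[ (1-\alpha)\, P(b_{ij} \mid g_i^R, e_{ij}) + \alpha\, P(b_{ij} \mid g_i^D, e_{ij}) \right] P(e_{ij}) \right\} P(g_i^R)\, P(g_i^D) \tag{1}$$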

Where bij represents the read j covering the SNV i, and eij represents the sequencing error of SNV i at the read j: P(eij = 1) = 10^(-Qij/10) and P(eij = 0) = 1 - P(eij = 1). Qij is the minimum of the base quality of the read j at the position of the variant i and the mapping quality of the read j. The probability of bij conditional on the recipient (donor) genotype giR (giD) and the sequencing error eij is described in Table 1. Finally, we used the simulated annealing approach together with a grid search to find the α that maximizes the likelihood function [31]. The method was implemented in a Python script.

Table 1. The probability of read bij carrying the reference (a), alternative (A) or a different allele (e) conditional on the recipient (donor) genotype giR (giD) and the sequencing error eij.

| Read allele | giR (giD) = 0, eij = 0 | giR (giD) = 0, eij = 1 | giR (giD) = 1, eij = 0 | giR (giD) = 1, eij = 1 | giR (giD) = 2, eij = 0 | giR (giD) = 2, eij = 1 |
|---|---|---|---|---|---|---|
| P(bij = a) | 1 | 0 | 1/2 | 1/6 | 0 | 1/3 |
| P(bij = A) | 0 | 1/3 | 1/2 | 1/6 | 1 | 0 |
| P(bij = e) | 0 | 2/3 | 0 | 2/3 | 0 | 2/3 |
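
To make Eq (1) and Table 1 concrete, the following is a minimal sketch of the likelihood and a plain grid search over α. It is not the authors' script. It assumes Hardy-Weinberg genotype priors for P(giR) and P(giD), omits the simulated annealing step, and groups identical reads at a site into counts for speed. The names `READ_PROB`, `read_term`, `log_likelihood`, `estimate_alpha` and `simulate_snvs` are hypothetical, as is the error-free read simulator used for the demonstration.

```python
import math
import random
from collections import Counter

# Table 1: P(read allele | genotype, error flag).
# "a" = reference, "A" = alternative, "e" = any other allele.
READ_PROB = {
    (0, 0): {"a": 1.0, "A": 0.0, "e": 0.0},
    (0, 1): {"a": 0.0, "A": 1 / 3, "e": 2 / 3},
    (1, 0): {"a": 0.5, "A": 0.5, "e": 0.0},
    (1, 1): {"a": 1 / 6, "A": 1 / 6, "e": 2 / 3},
    (2, 0): {"a": 0.0, "A": 1.0, "e": 0.0},
    (2, 1): {"a": 1 / 3, "A": 0.0, "e": 2 / 3},
}


def hwe_priors(alt_freq):
    """Genotype priors (0, 1, 2); Hardy-Weinberg is an assumption here."""
    ref_freq = 1.0 - alt_freq
    return (ref_freq ** 2, 2 * ref_freq * alt_freq, alt_freq ** 2)


def read_term(alpha, allele, qual, g_r, g_d):
    """Inner sum over e_ij of Eq (1) for a single read."""
    p_err = 10 ** (-qual / 10)
    total = 0.0
    for e, p_e in ((0, 1.0 - p_err), (1, p_err)):
        mix = ((1 - alpha) * READ_PROB[(g_r, e)][allele]
               + alpha * READ_PROB[(g_d, e)][allele])
        total += mix * p_e
    return total


def log_likelihood(alpha, snvs):
    """log L(alpha); snvs = [(alt_freq, Counter{(allele, qual): n_reads})]."""
    total = 0.0
    for alt_freq, read_counts in snvs:
        prior = hwe_priors(alt_freq)
        log_terms = []
        for g_r in range(3):
            for g_d in range(3):
                log_p = math.log(prior[g_r] * prior[g_d])
                for (allele, qual), n in read_counts.items():
                    t = read_term(alpha, allele, qual, g_r, g_d)
                    if t <= 0.0:
                        log_p = -math.inf
                        break
                    log_p += n * math.log(t)
                log_terms.append(log_p)
        peak = max(log_terms)
        if peak == -math.inf:
            return -math.inf
        # log-sum-exp over the nine genotype pairs avoids underflow
        total += peak + math.log(sum(math.exp(x - peak) for x in log_terms))
    return total


def estimate_alpha(snvs, step=0.01):
    """Grid search on [0, 0.5]; simulated annealing is omitted."""
    grid = [round(k * step, 10) for k in range(int(round(0.5 / step)) + 1)]
    return max(grid, key=lambda a: log_likelihood(a, snvs))


def simulate_snvs(alpha, n_snvs, depth, qual=30, seed=0):
    """Hypothetical error-free mixture of two unrelated genomes."""
    rng = random.Random(seed)
    snvs = []
    for _ in range(n_snvs):
        alt_freq = rng.uniform(0.05, 0.5)
        g_r = sum(rng.random() < alt_freq for _ in range(2))
        g_d = sum(rng.random() < alt_freq for _ in range(2))
        counts = Counter()
        for _ in range(depth):
            g = g_d if rng.random() < alpha else g_r
            allele = "A" if rng.random() < g / 2 else "a"
            counts[(allele, qual)] += 1
        snvs.append((alt_freq, counts))
    return snvs


snvs = simulate_snvs(alpha=0.2, n_snvs=300, depth=300, seed=1)
print(estimate_alpha(snvs))
```

Working in log space matters here because the product over thousands of reads per SNV would otherwise underflow to zero.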

The likelihood function for estimating α relies on a balanced distribution of alternative and reference alleles in heterozygous calls for both the recipient (giR) and donor (giD) genotypes. Very deep sequencing of recipient urine DNA provides this allele balance (Table 2).

Table 2. The probability and the 99% confidence interval of a perfect allele balance in a heterozygous call as a function of depth of coverage M.

A total of 10,000 simulations were performed for each proposed M.

| M | P(alt/(ref+alt) = 0.5) | 99% confidence interval |
|---|---|---|
| 10 | 0.24 | [0.10, 0.90] |
| 50 | 0.11 | [0.34, 0.66] |
| 100 | 0.08 | [0.39, 0.62] |
| 500 | 0.18 | [0.45, 0.55] |
| 1,000 | 0.27 | [0.46, 0.54] |
| 5,000 | 0.53 | [0.48, 0.52] |
| 10,000 | 0.69 | [0.49, 0.51] |
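
The dependence of allele balance on depth can be illustrated with the exact binomial distribution instead of simulation. The sketch below is an illustration under stated assumptions, not the procedure behind Table 2: it treats each read at a heterozygous site as an independent draw with probability 0.5, and it assumes that "perfect balance" means an alternative-allele fraction within 0.005 of 0.5 (that is, 0.50 after rounding to two decimals). The function names are hypothetical.

```python
import math


def binom_pmf(k, m, p=0.5):
    """Binomial probability of k alternative reads out of m, via log-gamma."""
    log_pmf = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
               + k * math.log(p) + (m - k) * math.log(1 - p))
    return math.exp(log_pmf)


def balance_probability(m):
    """P(|alt/m - 0.5| <= 0.005), tested in integers to avoid float issues."""
    return sum(binom_pmf(k, m) for k in range(m + 1)
               if 100 * abs(2 * k - m) <= m)


def central_interval(m, level=0.99):
    """Central interval of alt/m covering `level` of the distribution."""
    tail = (1 - level) / 2
    cdf, lower, upper = 0.0, None, None
    for k in range(m + 1):
        cdf += binom_pmf(k, m)
        if lower is None and cdf >= tail:
            lower = k / m
        if cdf >= 1 - tail:
            upper = k / m
            break
    return lower, upper


for depth in (10, 50, 100, 500, 1000, 5000, 10000):
    low, high = central_interval(depth)
    print(depth, round(balance_probability(depth), 2),
          [round(low, 2), round(high, 2)])
```

The interval narrows roughly as one over the square root of M, which is why only very deep coverage gives heterozygous calls a reliably balanced allele fraction.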

Simulated SNVs based on general population structure

To assess the effect of the number of SNVs (N) and mean depth of coverage (M) on the allele balance and thus the prediction accuracy of the likelihood function, we simulated 2 independent SNV-sets each set containing N common SNVs (minor allele frequency ≥ 5%) and covered on average by M reads. We varied N and M in 35 scenarios where N = {10, 50, 100, 500 and 1,000} and M = {10, 50, 100, 500, 1,000, 5,000 and 10,000}. We merged α reads from set1 and 1-α from set2 randomly generating a combined SNV-set and applied the likelihood function on the combined SNV-set to estimate the observed α. Because L(α) = L(1-α), we restricted 0 ≤ α ≤ 0.5 in steps of 0.01 generating 51 scenarios. A thousand replicates for each scenario and for each α were performed to obtain an empirical distribution.

Sequencing of urinary cell DNA from a pair of healthy individuals

We extracted DNA from the whole urinary cell pellet of two healthy individuals, S1, a 30-year-old European woman, and S2, a 30-year-old Arab woman, using the Qiagen® Allprep Mini Kit (S2 Table in S1 File). We omitted urinary cell-free DNA as it has been shown that urinary fragmented DNA needs suitable preservatives to avoid DNA degradation [32]. The DNA concentration was similar for the two individuals: 35 ng/μl. We mixed DNA from S1 and S2 to achieve 5 scenarios: i) 100% from S1; ii) 100% from S2; iii) 90% from S1 and 10% from S2; iv) 70% from S1 and 30% from S2; v) 50% from S1 and 50% from S2. Each scenario was replicated three times. We performed deep targeted DNA sequencing on each replicate. The GeneRead DNAseq Targeted Panels V2 Human Breast Cancer Panel (Qiagen, USA) was used to perform target enrichment by multiplex PCR. The breast cancer panel consists of four primer pools yielding 2,915 amplicons. Briefly, 40 ng of each gDNA was amplified using PCR reagents with the 4 primer pool mixes following the manufacturer’s protocol. After the completion of the 4 PCR reactions, the 4 products were combined, and the enriched DNA was purified using Agencourt AMPure XP beads (Beckman Coulter, USA). The concentration and the size of the purified amplicons were determined using the Qubit 2.0 Fluorometer dsDNA BR assay kit (LifeTechnologies, USA) and the Agilent BioAnalyzer 2100 High-Sensitivity DNA kit (Agilent Technologies, USA). A total amount of 80–160 ng of purified enriched DNA was used as template to generate NGS libraries. The NGS libraries were prepared using the NEBNEXT Ultra II DNA Library Prep Kit (New England Biolabs, USA) and NEXTflex DNA Barcodes (Bio Scientific, USA). All library preparation steps were performed according to the manufacturer’s protocol. The size and quality of the final libraries were analyzed using the Agilent BioAnalyzer 2100 with the 1000 DNA kit (Agilent Technologies, USA). The quantified libraries were then normalized, pooled, and spiked with 5% PhiX control library (Illumina, USA). Finally, the pooled libraries were sequenced on a single lane of an Illumina HiSeq 4000 (Illumina, USA) in a paired-end 150 bp run.

The reads obtained were aligned to the human genome reference hg19 using bwa [33]. A total of 51,893 bi-allelic SNVs from the ExAC project are included in the targeted genomic regions. The method works only on SNVs with different genotypes between donor and recipient. Under the Hardy-Weinberg assumption, we assessed the probability of having a different genotype for each SNV i as:

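$$P(G_i^D \neq G_i^R) = \sum_{G^D=0}^{2} \sum_{G^R=0}^{2} F(G_i^D = G^D)\, F(G_i^R = G^R), \quad G^D \neq G^R \tag{2}$$
$$F(G_i^{D|R} = 0) = \left(p_i^{D|R}\right)^2 \tag{3}$$
$$F(G_i^{D|R} = 1) = 2\, p_i^{D|R}\, q_i^{D|R} \tag{4}$$
$$F(G_i^{D|R} = 2) = \left(q_i^{D|R}\right)^2 \tag{5}$$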

Where GiD and GiR represent the donor and recipient genotype, respectively. piD and piR represent the donor and recipient reference allele frequency, respectively. qiD and qiR represent the donor and recipient alternative allele frequency, respectively.
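
Eqs (2) to (5) can be evaluated directly for any pair of allele frequencies. The short sketch below is illustrative only; the function names are hypothetical, and the inputs are the alternative allele frequencies qiD and qiR (with p = 1 - q).

```python
def hwe_genotype_freqs(alt_freq):
    """Eqs (3)-(5): Hardy-Weinberg frequencies of genotypes 0, 1 and 2."""
    ref_freq = 1.0 - alt_freq
    return (ref_freq ** 2, 2 * ref_freq * alt_freq, alt_freq ** 2)


def prob_different_genotype(alt_freq_donor, alt_freq_recipient):
    """Eq (2): probability that donor and recipient genotypes differ."""
    f_donor = hwe_genotype_freqs(alt_freq_donor)
    f_recipient = hwe_genotype_freqs(alt_freq_recipient)
    return sum(f_donor[g_d] * f_recipient[g_r]
               for g_d in range(3) for g_r in range(3) if g_d != g_r)


# An SNV with alternative allele frequency 0.5 in both populations.
print(prob_different_genotype(0.5, 0.5))
```

For a common SNV with q = 0.5 in both populations, the genotype frequencies are (0.25, 0.5, 0.25), so two unrelated individuals share a genotype with probability 0.375 and differ with probability 0.625. Rare variants are far less informative, which motivates selecting the most common SNVs.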

To avoid allele dropout due to primer annealing region, we filtered out 24,237 SNVs falling at the primer sequencing regions and SNVs carried by reads targeted by primers containing SNVs [34]. From the 27,656 remaining SNVs, we selected the 1,000 most common and applied the likelihood function after filtering out the reads carrying the variant at the last 20 base pairs [35].

Simulated SNVs in pairs of individuals from the same and different ethnicities

We used individuals from the 1,000 Genomes Project phase 3 representing five major populations: AFR, AMR, EAS, EUR and SAS [36]. We randomly selected two individuals and aimed to cover all possible situations: five cases where individual 1 and individual 2 belong to the same population and ten cases where individual 1 and individual 2 belong to different populations. We extracted the 1,000 SNVs described previously from individual 1 and individual 2 and then merged α reads from individual 1 and 1-α from individual 2, generating a combined SNV-set. We varied α from 0 to 0.5 in steps of 0.01 and fixed the mean depth of coverage at M = 5,000. We applied the likelihood function to assess α for each combined SNV-set generated. We repeated the selection of individuals 100 times to obtain an empirical distribution for each situation.

Simulated SNVs in pairs of biological siblings

We performed WGS on Illumina HiSeq 2500 sequencer of 91 Qatari siblings from 27 nuclear families containing at least 2 siblings and up to 7 siblings [37]. Reads were aligned to the hg19 reference genome using bwa [33]. Sequence alignment files were filtered and genotypes were called using the Genome Analysis Tool Kit best practices pipeline; variants were called using HaplotypeCaller [38, 39]. We used Plink identity by state to confirm the familial relationship [40] (S1 Fig in S1 File). We extracted the 1,000 SNVs described previously from each sibling and merged each pair generating 100 combined SNV-sets. We applied the likelihood function to assess α for each combined SNV-set generated by varying α from 0 to 0.5 with a step of 0.01 while the mean depth of coverage was set at M = 5,000.

Donor/recipient-specific DNA fraction in real kidney recipient urine DNA

We studied 32 biopsy-matched urine specimens collected from 26 enrolled kidney allograft recipients. The study was approved by the Weill Cornell Medicine Institutional Review Board (protocol 1207012730). All patients provided written informed consent. Kidney allograft biopsies were classified as acute rejection (AR, n = 11), acute tubular injury (ATI, n = 12) and normal histology (n = 9) using the Banff 2017 schema [41] (Table 3 and S2 Table in S1 File). DNA was extracted exclusively from urinary cell pellets and deep targeted DNA sequencing was performed on all samples. Briefly, 50 cc of fresh urine was centrifuged at 2,000 g for 30 minutes at room temperature and the urinary cell pellet was harvested after removing the supernatant. After washing the urine cell pellet with 1 ml PBS, the cells were lysed using 350 μl of Buffer RLT from Qiagen® and DNA was isolated from the cell pellet using the Allprep DNA/RNA/Protein Mini Kit from Qiagen®. Total DNA was quantified using the NanoDrop Spectrophotometer. DNA sequencing was performed as previously described for the pair of healthy individuals. The reads obtained were aligned to the human genome reference hg19 using bwa [33]. We filtered out low quality reads using an in-house Python script. We applied the likelihood function on the 1,000 SNV-set to estimate the recipient-specific DNA fraction. The nonparametric Kruskal-Wallis test was applied to assess the association between observed α and the diagnosis phenotypes. Dunn’s test was applied to test the pairwise associations. R software was used for statistical tests and generating graphs [42].
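
For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic and P value for three groups in plain Python. The study itself used R, so this is only an illustration. The function name `kruskal_wallis_3groups` and the example values are hypothetical and are not study data. The P value uses the chi-square approximation with 2 degrees of freedom, whose survival function is exp(-H/2); Dunn's pairwise test is not covered.

```python
import math


def kruskal_wallis_3groups(groups):
    """Kruskal-Wallis H with tie correction and chi-square (df = 2) P value."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = sorted((value, idx) for idx, grp in enumerate(groups)
                    for value in grp)
    n = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = (12 / (n * (n + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3 * (n + 1))
    h /= 1 - tie_term / (n ** 3 - n)
    return h, math.exp(-h / 2)


# Hypothetical alpha values for three diagnostic groups (not study data).
example = [[0.31, 0.42, 0.28, 0.39], [0.25, 0.36, 0.33, 0.44],
           [0.05, 0.12, 0.08, 0.15]]
print(kruskal_wallis_3groups(example))
```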

Table 3. Characteristics of kidney transplant recipients.

| Recipient Characteristics | Patient Total (N = 26) | Patients with AR (N = 8) | Patients with ATI (N = 10) | Patients with No Pathology (N = 8) |
|---|---|---|---|---|
| Number of Biopsy Associated Urine Specimens | 32 | 12 | 11 | 9 |
| Age, years | | | | |
| Mean (SD) | 45.5 (14) | 46.1 (19) | 43.8 (14) | 47.1 (8) |
| Median | 43 | 43 | 40 | 49 |
| Min, Max | 25, 80 | 25, 80 | 29, 74 | 34, 59 |
| Gender, N (%) | | | | |
| Male | 40 (100%) | 40 (100%) | 40 (100%) | 40 (100%) |
| Race, N (%) | | | | |
| White | 8 (31%) | 3 (38%) | 3 (30%) | 2 (25%) |
| Black | 11 (42%) | 4 (50%) | 3 (30%) | 4 (50%) |
| Hispanic | 3 (11%) | 1 (12%) | 0 (0%) | 2 (25%) |
| Asian | 2 (8%) | 0 (0%) | 2 (20%) | 0 (0%) |
| Mixed | 2 (8%) | 0 (0%) | 2 (20%) | 0 (0%) |
| Cause of ESRD, N (%) | | | | |
| Diabetes | 4 (15%) | 1 (13%) | 1 (10%) | 2 (25%) |
| Hypertension | 11 (42%) | 3 (37%) | 5 (50%) | 3 (50%) |
| Glomerulonephritis | 6 (23%) | 3 (37%) | 1 (10%) | 2 (25%) |
| Polycystic Kidney Disease | 3 (12%) | 0 (0%) | 2 (20%) | 1 (11%) |
| Other | 2 (8%) | 1 (13%) | 1 (10%) | 0 (0%) |
| Prior Transplant History, N (%) | 2 (8%) | 1 (13%) | 1 (10%) | 0 (0%) |
| Donor Source, N (%) | | | | |
| Living | 17 (65%) | 4 (50%) | 7 (70%) | 6 (75%) |
| Deceased | 9 (35%) | 4 (50%) | 3 (30%) | 2 (25%) |
| Induction Therapy, N (%) | | | | |
| Antithymocyte globulin | 23 (88%) | 5 (62%) | 10 (100%) | 8 (100%) |
| IL-2 Receptor Antibody | 3 (12%) | 3 (38%) | 0 (0%) | 0 (0%) |
| Steroid Maintenance Therapy, N (%) | 10 (38%) | 4 (50%) | 2 (20%) | 4 (50%) |
| Time since Transplant to Biopsy, Month, mean (SD) | 8.26 (12.32) | 9.79 (11.05) | 1.56 (1.64) | 15.10 (17.20) |
| Biopsy Creatinine | 3.1 (2.4) | 2.9 (1.9) | 4.4 (2.9) | 1.6 (0.25) |
| One Year Post Biopsy Creatinine | 2.2 (1.7) | 2.9 (2.9) | 2.0 (0.6) | 1.8 (0.7) |

Ethnicity estimation for donor and recipient

We combined the observed α in the kidney transplant patients with a cost function to predict the genotype of both recipient (giR) and donor (giD) at each SNV i. First, we computed the expected fraction of the alternative to the total allele (reference + alternative) expi for all 9 possible combinations of giR and giD (Table 4). The observed fraction of the alternative to total alleles (obsi) at the SNV i is defined by:

$$obs_i = \frac{\text{Number of reads carrying the alternative allele}}{\text{Total number of reads covering the SNV } i} \tag{6}$$

Table 4. Expected fraction of the alternative to total alleles (reference + alternative) as a function of observed α.

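| giR | giD | expi |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | α/2 |
| 0 | 2 | α |
| 1 | 0 | (1-α)/2 |
| 1 | 1 | 1/2 |
| 1 | 2 | (1+α)/2 |
| 2 | 0 | 1-α |
| 2 | 1 | (2-α)/2 |
| 2 | 2 | 1 |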

Then, we used the cost function to determine giR and giD that minimizes the difference between the 9 expected (expi) and the observed (obsi) fraction of the alternative to total alleles:

$$Loss(\alpha, g_i^R, g_i^D) = \min_{o = 1, \dots, 9} \left( exp_i^o - obs_i \right)^2 \tag{7}$$
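
The genotype assignment of Eqs (6) and (7) amounts to picking, for each SNV, the row of Table 4 closest to the observed alternative-allele fraction. A minimal sketch follows. The function names are hypothetical, and the closed form ((1-α)·giR + α·giD)/2 is simply a compact way of writing the nine entries of Table 4, with α taken as the donor fraction.

```python
def expected_fraction(alpha, g_r, g_d):
    """Expected alternative-allele fraction for one genotype pair (Table 4)."""
    return ((1 - alpha) * g_r + alpha * g_d) / 2


def infer_genotypes(alpha, alt_reads, total_reads):
    """Eq (7): genotype pair (g_r, g_d) minimizing the squared difference."""
    obs = alt_reads / total_reads  # Eq (6)
    pairs = [(g_r, g_d) for g_r in range(3) for g_d in range(3)]
    return min(pairs, key=lambda g: (expected_fraction(alpha, *g) - obs) ** 2)


# With alpha = 0.2, an SNV with 10% alternative reads is best explained by
# a homozygous-reference recipient and a heterozygous donor.
print(infer_genotypes(0.2, alt_reads=10, total_reads=100))
```

When α is close to 0 the expected fractions no longer depend on the donor genotype, and when α = 0.5 several pairs give the same expected value, so the assignment becomes ambiguous in both cases.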

Once giR and giD were estimated, we performed a partial least square analysis (PLS) using 3 subpopulations from the 1,000 Genomes Project African, East Asian and European populations using the mixOmics R package [43] and then predicted the ethnicity of donor and recipient in the real kidney transplant samples. The leave-two-out 1,000-fold cross-validation showed the highest prediction accuracy (81.6%) when using the Yoruba in Nigeria, the Southern Han Chinese and the Toscani in Italy amongst the African, East Asian and European subpopulations (S2 Fig in S1 File). We excluded the American and the South Asian populations because using just 1,000 SNVs is not sufficient to perform a reliable PLS on 5 populations, where the highest cross-validation prediction accuracy was too low at only 54.8% (S2 Fig in S1 File). More relevant, none of the donors or recipients involved in the study belonged to the South Asian or the American populations.

Results

In silico simulation of donor-recipient DNA mixtures

To determine the optimal sequencing parameters, we use numerical simulations. The simulation process is based on generating two different SNV-sets, then merging the two sets with a predefined proportion of each set; α from set 1 and (1-α) from set 2, and then applying a likelihood function (Methods) to estimate this proportion (observed α). Two major parameters affect the estimation of the observed α: the number of sequenced SNVs (N) and the depth of sequencing coverage (M). For a range of parameters N = {10, 50, 100, 500, 1,000} and M = {10, 50, 100, 500, 1,000, 5,000, 10,000} and varying α from 0 to 0.5 in steps of 0.01, we repeated the simulation process for each N x M x α combination 1,000 times to obtain an empirical distribution of observed α (S3 Fig in S1 File).

We computed the maximum error (ε) for each combination N x M over all tested α. ε ranges between 0 (best case where observed α = tested α) and 0.5 (worst case where tested α = 0.5 and observed α = 0 or tested α = 0 and observed α = 0.5) (Fig 1). As expected, our simulations show that increasing both N and M improves the accuracy of the observed α estimate. Moreover, the estimate is unstable when using a small number of SNVs (N < 100) or low coverage (M < 500). The prediction accuracy stabilizes for N > 500 and M > 1,000.

Fig 1. Maximum error for detecting the DNA fraction α in a simulated DNA sequencing experiment by varying sequencing depth and number of SNVs.

Maximum absolute (A) and maximum relative (B) errors are represented. A total of 35 scenarios combining five different numbers of SNVs N = {10, 50, 100, 500, 1,000} and seven depths of coverage M = {10, 50, 100, 500, 1,000, 5,000, 10,000} were simulated (S3 Fig in S1 File). Represented here are the maximum errors observed in 1,000 simulations for every tested α ranging from 0 to 0.5 in steps of 0.01.

Experimental estimation of α using a controlled mixture of urine from two individuals

To assess the accuracy of detecting observed α in a mixture of two real human urine samples, we performed targeted sequencing of urine DNA from two healthy individuals originating from different populations: S1, a healthy 35-year-old European woman, and S2, a healthy 34-year-old Arab woman (S2 Table in S1 File). A total of 1,850 exonic regions from a panel targeting 93 genes known to be associated with risk of breast cancer were sequenced. These sequenced genomic regions cover 370,942 base pairs across 22 chromosomes (S1 Table in S1 File). We chose this genomic panel because of its cost and the high number of SNVs present in the targeted genomic regions. Indeed, a total of 51,893 bi-allelic SNVs falling in these genomic regions were present in the Exome Aggregation Consortium (ExAC) [44]. As the method works on bi-allelic SNVs with different genotypes between donor and recipient, we computed for each SNV the probability of having different genotypes for two individuals (S1 Table in S1 File). Only 437 SNVs have a probability of having different genotypes for two individuals higher than 10%.

As a measure of quality control, we first checked the balance of reference and alternative alleles in heterozygous calls. The alternative allele frequency is expected to be around 50% in heterozygous genotypes. However, we observed the presence of SNVs with skewed alternative allele frequencies (S4 Fig in S1 File). We noticed the recurrence of such imbalance in every replicate of both samples (S5 Fig in S1 File for examples). We investigated whether the amplification-based strategies for DNA target enrichment cause allele dropout, leading to the skewed alternative allele distribution. We found that the SNVs with a skewed distribution all fall into the primer sequence regions or are carried by reads targeted by primers containing SNVs. We therefore filtered out SNVs falling into these regions and kept the 1,000 most common SNVs in the general population. These 1,000 SNVs were used as the SNV panel for detecting the DNA fraction in a combination of two DNA sources (observed α) in the rest of the study. The alternative allele frequency was balanced in these 1,000 SNVs (S6 Fig in S1 File). Moreover, the maximum error of estimating the observed α based on these 1,000 SNVs in all replicates was < 0.0034 (mean error = 0.0028 ± 0.00037). We then mixed 90% DNA from S1 and 10% DNA from S2 in three replicates. For each replicate, targeted DNA sequencing was performed and the observed α was estimated. The preparation of the mixture was based on total DNA content in the samples. However, the presence of bacterial DNA in urine samples can strongly skew the measurement of human DNA concentration [45]. We assessed the actual DNA concentrations of S1 and S2 in urine from the mean observed α over the three replicates (0.053). This indicates that the S1 DNA concentration is ~19 times lower than the S2 DNA concentration. Considering the estimated S1 and S2 DNA concentrations, the maximum error of the observed α was < 3.5% in the three replicates (Fig 2).

Fig 2. Estimation of DNA fraction (Alpha) in a combination of two healthy DNA sources.

Five scenarios of DNA mixtures and three replicates for each scenario were performed. From left to right: 100% from individual S1 and 0% from individual S2; 0% from individual S1 and 100% from individual S2; 50% from individual S1 and 50% from individual S2; 70% from individual S1 and 30% from individual S2; 90% from individual S1 and 10% from individual S2. The estimated fractions (estimated α) are represented by black dots. The expected fractions when the DNA concentration in individual S1 was 19 times lower than the DNA concentration in individual S2 are represented by red dots. The expected fractions before correction for DNA concentration are represented by blue dots.

We extended the analysis to two further DNA mixture scenarios: (i) 70% DNA from S1 and 30% DNA from S2, and (ii) 50% DNA from S1 and 50% DNA from S2. Each scenario was replicated three times and targeted DNA sequencing was performed for each replicate. The observed α was similar in the three replicates of both scenarios (scenario i: mean observed α = 0.11 ± 0.036, scenario ii: mean observed α = 0.032 ± 0.00048). Considering the estimated S1 and S2 DNA concentrations, the maximum error of the observed α was < 3.8% in all replicates of both scenarios (0.037 in scenario (i) and 0.018 in scenario (ii)) (Fig 2).

Simulation of the effect of family relationship and ethnicity on the estimation of α

The most challenging scenario is that of one sibling donating a kidney to another, as siblings share on average 50% of their genome. The extreme case of monozygotic twins, where both genomes are identical, cannot be addressed with our method. To numerically explore this "worst case" scenario, we used whole genome sequencing data from 91 siblings [37] and then generated 100 combinations of sibling pairs. Self-reported relationship was confirmed using identity by state (S1 Fig in S1 File). For each pair of siblings, we simulated donor and recipient DNA sequences by varying α from 0 to 0.5 in steps of 0.01 and resampling to a mean coverage of 5,000 reads. The maximum absolute error was observed when the expected α = 0.07: observed α = 0.034 (Fig 3).

Fig 3. Effect of family relationship and ethnicity on detecting DNA fraction in a combination of two DNA sources.

Each dot represents in A) the maximum absolute error and in B) the maximum relative error for each expected (α) from 0 to 0.5 in steps of 0.01 over 100 pairs of siblings (red), 100 pairs of individuals belonging to the same population (green) and 100 pairs of individuals belonging to different populations (blue). Afr = Africans. Amr = Americans. Eas = East Asians. Eur = Europeans. Sas = South Asians.

Simulation of the effect of population origin on the estimation of α

Using the same methods as when comparing siblings, we then assessed the effect of donor and recipient ethnicity on our method. We applied our method to simulated pairs of individuals belonging to the same and to different populations of the 1,000 genomes project [36]: Africans, Americans, East Asians, Europeans and South Asians. The absolute error was < 0.04 in all scenarios (Fig 3). As expected, the absolute error was lower when the two DNA sources belonged to different populations (mean maximum absolute error = 0.018 ± 0.005) than when they belonged to the same population (mean maximum absolute error = 0.022 ± 0.007). Additionally, the maximum relative error was comparable in all scenarios (mean maximum relative error = 0.836 ± 0.137) whether the two DNA sources belonged to the same (mean maximum relative error = 0.764 ± 0.207) or different populations (mean maximum relative error = 0.856 ± 0.077). These results confirm the power of the method for detecting the DNA fraction in a combination of two DNA sources independent of familial relationship or ethnicity.

Application to urine samples from clinical kidney allograft patients

To test our method in a real-case scenario, we used DNA extracted from 32 urine samples matched to 32 biopsies from 26 kidney allograft recipients and classified the urine samples into three groups based on the Banff classification of the kidney allograft biopsy: “Acute Tubular Injury” (ATI, N = 12), “Acute Rejection” (AR, N = 11) and “No Observed Pathology” (N = 9) (Table 3 and S2 Table in S1 File). DNA was extracted from the urinary cells and deep targeted sequencing was performed for the 32 samples. Given the effect of depth of coverage on the accuracy of detecting observed α in simulated data, we set the mean depth of coverage to ~ 14,000 reads. After read alignment, we applied our method to estimate the donor/recipient to total DNA fraction (Fig 4).

Fig 4. Donor/recipient to total DNA fraction in 32 urine samples from kidney allograft recipients.

Box plots and individual data points show the fraction (observed α) estimated from deep targeted DNA sequencing of urinary cells. AR: Acute Rejection. ATI: Acute Tubular Injury. A statistically significant difference was observed across the diagnostic categories (P = 0.035, Kruskal-Wallis test). By Dunn’s test, the difference in observed α between each of the two pathology groups and the no-pathology group was statistically significant (ATI vs no pathology: P = 0.0064; AR vs no pathology: P = 0.026). The pairwise comparison of AR and ATI was not statistically significant (P > 0.05).

The difference of observed α between the diagnosis phenotypes was statistically significant (P = 0.035, Kruskal-Wallis test). We observed a significant difference when comparing the two transplant kidney pathologies ATI and AR to the No Pathology group (P = 0.0064 and P = 0.026, Dunn’s test for ATI vs no pathology and AR vs no pathology, respectively). However, no significant difference was observed in observed α when comparing the two pathologies ATI to AR (P = 0.31, Dunn’s test).
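The omnibus comparison across the three diagnostic groups can be sketched as follows. This is a minimal, dependency-free illustration of the Kruskal-Wallis H statistic for exactly three groups, where the chi-square approximation with 2 degrees of freedom gives P = exp(−H/2). It is not the authors' analysis code. The function names and the α values are hypothetical, and Dunn’s post hoc test is not implemented here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, P) for three groups using the chi-square approximation."""
    groups = [a, b, c]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard correction for tied observations.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p


# Hypothetical observed-alpha values, invented for illustration only.
no_pathology = [0.02, 0.05, 0.08]
ati = [0.20, 0.25, 0.30]
ar = [0.35, 0.40, 0.45]

h_stat, p_value = kruskal_wallis_three_groups(no_pathology, ati, ar)
print(f"H = {h_stat:.2f}, P = {p_value:.4f}")
```

The chi-square approximation is rough for groups this small. In practice an established statistics package would be used, followed by a post hoc procedure such as Dunn’s test for the pairwise comparisons.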

Inference of donor and recipient ethnic origin

In the absence of donor and recipient genomes, it is impossible to determine whether the observed α represents the donor or the recipient fraction of the total DNA. However, in cases in which recipient and donor gender or ethnicity differ, this issue can be addressed. The urinary cell DNA sequencing we performed here did not target genomic regions of the Y chromosome. Thus, recipient and donor gender could not be determined from the present data, but this could easily be done with future sequencing panels.

To predict donor and recipient ethnicity, an estimation of both recipient and donor genotypes is needed. For each of 1,000 SNVs, we computed the fraction of the alternative to total alleles. We then used the observed α to compute the nine expected fractions of the alternative allele (Table 4). We then used a cost function to estimate the donor and recipient genotypes that minimize the difference between the nine expected fractions and the observed fraction of the alternative to total alleles. Based on these estimated genotypes, we applied a supervised classification method to predict the recipient and the donor ethnicity as follows: as the donor-specific DNA fraction has been shown to be higher in the no-pathology group [28], we supposed the observed α to represent the donor-to-total DNA fraction and computed the probability of donor and recipient belonging to one of the three populations: African, East Asian and European (see Methods). Both donor and recipient were assigned to the population showing the highest probability and then compared to the self-reported ethnicity. Seven recipients and eight donors were excluded from the prediction because they belonged to a mixed self-reported population or because the observed α was ~0, which made the prediction of donor genotypes impossible. The prediction was inconclusive (probability of prediction < 70%) for 5 recipients and 8 donors. In 16 of 20 recipients (80%) and 15 of 16 donors (94%), the probability of prediction was higher than 70%. However, only one AR sample and one ATI sample, for which the prediction was conclusive, had a donor and recipient ethnicity mismatch. In these two samples (European donor and African recipient for both samples), the prediction was in agreement with the self-reported ethnicity. Hence, due to the small number of self-reported ethnicity mismatches, it is impossible to confirm whether the observed α represents the donor or the recipient DNA fraction (as observed α < 0.5 by definition).
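The genotype-estimation step can be sketched as follows. This is a minimal illustration, assuming a linear two-source mixing model in which a source contributing a fraction α of the DNA with genotype g (0, 1 or 2 alternative alleles) contributes α·g/2 to the alternative-allele fraction. Table 4 and the authors' actual cost function are not reproduced in this section, so the absolute difference used here is a stand-in. The function and variable names are hypothetical.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternative alleles carried at an SNV


def expected_alt_fractions(alpha):
    """Expected alternative-allele fraction for each of the nine genotype pairs.

    Keys are (genotype of the source contributing alpha,
              genotype of the source contributing 1 - alpha).
    """
    return {
        (g_a, g_b): alpha * g_a / 2 + (1 - alpha) * g_b / 2
        for g_a, g_b in product(GENOTYPES, repeat=2)
    }


def closest_genotype_pair(observed_fraction, alpha):
    """Pick the genotype pair whose expected fraction is nearest the observed one.

    The absolute difference is an assumed stand-in for the cost function.
    """
    expected = expected_alt_fractions(alpha)
    return min(expected, key=lambda pair: abs(expected[pair] - observed_fraction))


# Hypothetical example: observed alpha = 0.2 and two SNVs with made-up
# alternative-allele fractions.
alpha = 0.2
for observed in (0.10, 0.58):
    pair = closest_genotype_pair(observed, alpha)
    print(f"observed fraction {observed:.2f} -> genotypes (alpha source, other source) = {pair}")
```

With α = 0.2, an observed fraction of 0.10 is best explained by a heterozygous α source and a homozygous-reference second source. As α approaches 0, the three genotypes of the α source yield nearly identical expected fractions, which is consistent with the exclusion of samples with observed α ~0 described above.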

Discussion

Different omics technologies, including mRNA measurement by PCR [7], metabolomics [12] and RNA-sequencing [13], have been applied by our group and others to identify non-invasive biomarkers for kidney allograft rejection. Here, we present a new approach based on targeted deep sequencing of DNA obtained from urine samples from kidney allograft recipients. We extended methods originally used for the assessment of DNA contamination to estimate the fraction of recipient DNA in a two-source mixed DNA sample [29, 30]. We used in silico simulations to obtain a suitable parameter range for the method to be sufficiently accurate in estimating the fraction of a two-source DNA mixture. We then experimentally evaluated the accuracy of the estimation method using controlled mixtures of two DNA sources. Allele drop-out occurs in amplification-based target enrichment when a variant is located in a primer region and prevents primer hybridization, leading to failed amplification and allele bias [34]. Our method overcomes these unexpected artefacts due to DNA sequencing. Other algorithms for estimating the donor-specific DNA fraction require the donor and recipient relationship information [22]. Here, we found that ethnicity and familial relationship between donor and recipient appear to have a lower impact as compared to previously presented methods.

We tested our method on clinical samples from patients with and without kidney allograft rejection events. We compared the α value obtained from urine DNA sequencing reads of kidney allograft recipients with kidney injury associated with AR and ATI. The α value was significantly different in patients with AR and ATI compared to those without kidney allograft pathology. The calculation of α is based on the assumption that the DNA isolated from the urine is derived from the transplanted kidney and that both the recipient and the donor DNA are present: recipient DNA from the infiltrating immune cells and donor DNA from the kidney parenchymal cells. Indeed, by counting Y chromosome-derived cell-free DNA, we have recently shown that, in kidney recipients with donor-recipient gender mismatch, the donor-specific DNA fraction was lower in recipients with UTI compared to those with no UTI, and higher in recipients with BKVN compared to those with no BKVN [28]. Thus, our approach might be considered as a potential new diagnostic signature measured in urine specimens.

We were not able to ascertain whether the DNA in recipient urine is derived mostly from the donor or the recipient. Studies have shown that both AR and ATI are associated with allograft damage, indicating that there will be some donor DNA in the urine. However, AR is also associated with recipient immune cell infiltration, while ATI usually is not [46], although tissue injury from ATI could be associated with inflammation. The ratio of recipient to donor cells in the urine should be higher for AR than for ATI, and the ratio of donor to recipient cells in the urine should be higher for ATI than for AR. Thus, AR patients should have a donor-to-recipient DNA fraction much lower than 0.5 and ATI patients should have a donor-to-recipient DNA fraction much greater than 0.5. To address this, a future complementary analysis of a larger sample set with donor and recipient ethnicity and/or gender mismatches is warranted.

Supporting information

S1 File

(ZIP)

Data Availability

Data cannot be shared publicly because the Weill Cornell Medicine ethics committee has imposed restrictions on sharing a de-identified data set. Data are available from the Weill Cornell Medicine Institutional Data Access / Ethics Committee (contact via irb@med.cornell.edu) for researchers who meet the criteria for access to confidential data. The data underlying the results presented in the study are available from irb@med.cornell.edu.

Funding Statement

This work was supported by the Biomedical Research Program at Weill Cornell Medicine–Qatar, a program funded by the Qatar Foundation (https://www.qf.org.qa) [to AB, GT, AH, SaA, SiA, YM, JM and KS], the National Priority Research Program of the Qatar National Research Fund (https://www.qnrf.org/en-us/) [grant # NPRP12S-0227-190173 to AB and MS], and the National Institutes of Health (https://www.nih.gov) [grants # K08DK087824 and R37AI051652 to MS]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Hume DM, Merrill JP, Miller BF, Thorn GW. Experiences with renal homotransplantation in the human: report of nine cases. J Clin Invest. 1955;34: 327–382. doi: 10.1172/JCI103085
  • 2. Tsai M-K, Wu F-LL, Lai I-R, Lee C-Y, Hu R-H, Lee P-H. Decreased acute rejection and improved renal allograft survival using sirolimus and low-dose calcineurin inhibitors without induction therapy. Int J Artif Organs. 2009;32: 371–380. doi: 10.1177/039139880903200608
  • 3. Nankivell BJ, Kuypers DRJ. Diagnosis and prevention of chronic kidney allograft loss. Lancet Lond Engl. 2011;378: 1428–1437. doi: 10.1016/S0140-6736(11)60699-5
  • 4. Jalalzadeh M, Mousavinasab N, Peyrovi S, Ghadiani MH. The impact of acute rejection in kidney transplantation on long-term allograft and patient outcome. Nephro-Urol Mon. 2015;7: e24439. doi: 10.5812/numonthly.24439
  • 5. McDonald S, Russ G, Campbell S, Chadban S. Kidney transplant rejection in Australia and New Zealand: relationships between rejection and graft outcome. Am J Transplant Off J Am Soc Transplant Am Soc Transpl Surg. 2007;7: 1201–1208. doi: 10.1111/j.1600-6143.2007.01759.x
  • 6. Rush D. Can protocol biopsy better inform our choices in renal transplantation? Transplant Proc. 2009;41: S6–8. doi: 10.1016/j.transproceed.2009.06.092
  • 7. Suthanthiran M, Schwartz JE, Ding R, Abecassis M, Dadhania D, Samstein B, et al. Urinary-cell mRNA profile and acute cellular rejection in kidney allografts. N Engl J Med. 2013;369: 20–31. doi: 10.1056/NEJMoa1215555
  • 8. Li L, Khatri P, Sigdel TK, Tran T, Ying L, Vitalone MJ, et al. A peripheral blood diagnostic test for acute rejection in renal transplantation. Am J Transplant Off J Am Soc Transplant Am Soc Transpl Surg. 2012;12: 2710–2718. doi: 10.1111/j.1600-6143.2012.04253.x
  • 9. Nakorchevsky A, Hewel JA, Kurian SM, Mondala TS, Campbell D, Head SR, et al. Molecular mechanisms of chronic kidney transplant rejection via large-scale proteogenomic analysis of tissue biopsies. J Am Soc Nephrol JASN. 2010;21: 362–373. doi: 10.1681/ASN.2009060628
  • 10. Sigdel TK, Kaushal A, Gritsenko M, Norbeck AD, Qian W-J, Xiao W, et al. Shotgun proteomics identifies proteins specific for acute renal transplant rejection. Proteomics Clin Appl. 2010;4: 32–47. doi: 10.1002/prca.200900124
  • 11. Ling XB, Sigdel TK, Lau K, Ying L, Lau I, Schilling J, et al. Integrative urinary peptidomics in renal transplantation identifies biomarkers for acute rejection. J Am Soc Nephrol JASN. 2010;21: 646–653. doi: 10.1681/ASN.2009080876
  • 12. Suhre K, Schwartz JE, Sharma VK, Chen Q, Lee JR, Muthukumar T, et al. Urine Metabolite Profiles Predictive of Human Kidney Allograft Status. J Am Soc Nephrol JASN. 2016;27: 626–636. doi: 10.1681/ASN.2015010107
  • 13. Verma A, Muthukumar T, Yang H, Lubetzky M, Cassidy MF, Lee JR, et al. Urinary cell transcriptomics and acute rejection in human kidney allografts. JCI Insight. 2020;5. doi: 10.1172/jci.insight.131552
  • 14. Lo YM, Tein MS, Pang CC, Yeung CK, Tong KL, Hjelm NM. Presence of donor-specific DNA in plasma of kidney and liver-transplant recipients. Lancet Lond Engl. 1998;351: 1329–1330. doi: 10.1016/s0140-6736(05)79055-3
  • 15. García Moreira V, Prieto García B, Baltar Martín JM, Ortega Suárez F, Alvarez FV. Cell-free DNA as a noninvasive acute rejection marker in renal transplantation. Clin Chem. 2009;55: 1958–1966. doi: 10.1373/clinchem.2009.129072
  • 16. Sigdel TK, Vitalone MJ, Tran TQ, Dai H, Hsieh S-C, Salvatierra O, et al. A rapid noninvasive assay for the detection of renal transplant injury. Transplantation. 2013;96: 97–101. doi: 10.1097/TP.0b013e318295ee5a
  • 17. Macher HC, Suárez-Artacho G, Guerrero JM, Gómez-Bravo MA, Álvarez-Gómez S, Bernal-Bellido C, et al. Monitoring of transplanted liver health by quantification of organ-specific genomic marker in circulating DNA from receptor. PloS One. 2014;9: e113987. doi: 10.1371/journal.pone.0113987
  • 18. Snyder TM, Khush KK, Valantine HA, Quake SR. Universal noninvasive detection of solid organ transplant rejection. Proc Natl Acad Sci U S A. 2011;108: 6229–6234. doi: 10.1073/pnas.1013924108
  • 19. De Vlaminck I, Valantine HA, Snyder TM, Strehl C, Cohen G, Luikart H, et al. Circulating cell-free DNA enables noninvasive diagnosis of heart transplant rejection. Sci Transl Med. 2014;6: 241ra77. doi: 10.1126/scitranslmed.3007803
  • 20. Hidestrand M, Tomita-Mitchell A, Hidestrand PM, Oliphant A, Goetsch M, Stamm K, et al. Highly sensitive noninvasive cardiac transplant rejection monitoring using targeted quantification of donor-specific cell-free deoxyribonucleic acid. J Am Coll Cardiol. 2014;63: 1224–1226. doi: 10.1016/j.jacc.2013.09.029
  • 21. Gordon PMK, Khan A, Sajid U, Chang N, Suresh V, Dimnik L, et al. An Algorithm Measuring Donor Cell-Free DNA in Plasma of Cellular and Solid Organ Transplant Recipients That Does Not Require Donor or Recipient Genotyping. Front Cardiovasc Med. 2016;3: 33. doi: 10.3389/fcvm.2016.00033
  • 22. Grskovic M, Hiller DJ, Eubank LA, Sninsky JJ, Christopherson C, Collins JP, et al. Validation of a Clinical-Grade Assay to Measure Donor-Derived Cell-Free DNA in Solid Organ Transplant Recipients. J Mol Diagn JMD. 2016;18: 890–902. doi: 10.1016/j.jmoldx.2016.07.003
  • 23. Bloom RD, Bromberg JS, Poggio ED, Bunnapradist S, Langone AJ, Sood P, et al. Cell-Free DNA and Active Rejection in Kidney Allografts. J Am Soc Nephrol JASN. 2017. doi: 10.1681/ASN.2016091034
  • 24. Sharon E, Shi H, Kharbanda S, Koh W, Martin LR, Khush KK, et al. Quantification of transplant-derived circulating cell-free DNA in absence of a donor genotype. PLoS Comput Biol. 2017;13: e1005629. doi: 10.1371/journal.pcbi.1005629
  • 25. Zhong XY, Hahn D, Troeger C, Klemm A, Stein G, Thomson P, et al. Cell-free DNA in urine: a marker for kidney graft rejection, but not for prenatal diagnosis? Ann N Y Acad Sci. 2001;945: 250–257.
  • 26. Thareja G, Yang H, Hayat S, Mueller FB, Lee JR, Lubetzky M, et al. Single nucleotide variant counts computed from RNA sequencing and cellular traffic into human kidney allografts. Am J Transplant Off J Am Soc Transplant Am Soc Transpl Surg. 2018;18: 2429–2442. doi: 10.1111/ajt.14870
  • 27. Cheng AP, Burnham P, Lee JR, Cheng MP, Suthanthiran M, Dadhania D, et al. A cell-free DNA metagenomic sequencing assay that integrates the host injury response to infection. Proc Natl Acad Sci U S A. 2019;116: 18738–18744. doi: 10.1073/pnas.1906320116
  • 28. Burnham P, Dadhania D, Heyang M, Chen F, Westblade LF, Suthanthiran M, et al. Urinary cell-free DNA is a versatile analyte for monitoring infections of the urinary tract. Nat Commun. 2018;9: 2412. doi: 10.1038/s41467-018-04745-0
  • 29. Jun G, Flickinger M, Hetrick KN, Romm JM, Doheny KF, Abecasis GR, et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am J Hum Genet. 2012;91: 839–848. doi: 10.1016/j.ajhg.2012.09.004
  • 30. Flickinger M, Jun G, Abecasis GR, Boehnke M, Kang HM. Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data. Am J Hum Genet. 2015;97: 284–290. doi: 10.1016/j.ajhg.2015.07.002
  • 31. Kirkpatrick S, Gelatt CD, Vecchi MP. Optimization by simulated annealing. Science. 1983;220: 671–680. doi: 10.1126/science.220.4598.671
  • 32. Augustus E, Van Casteren K, Sorber L, van Dam P, Roeyen G, Peeters M, et al. The art of obtaining a high yield of cell-free DNA from urine. PloS One. 2020;15: e0231058. doi: 10.1371/journal.pone.0231058
  • 33. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinforma Oxf Engl. 2010;26: 589–595. doi: 10.1093/bioinformatics/btp698
  • 34. Gray PN, Dunlop CLM, Elliott AM. Not All Next Generation Sequencing Diagnostics are Created Equal: Understanding the Nuances of Solid Tumor Assay Design for Somatic Mutation Detection. Cancers. 2015;7: 1313–1332. doi: 10.3390/cancers7030837
  • 35. Kwon S, Lee B, Yoon S. CASPER: context-aware scheme for paired-end reads from high-throughput amplicon sequencing. BMC Bioinformatics. 2014;15 Suppl 9: S10. doi: 10.1186/1471-2105-15-S9-S10
  • 36. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526: 68–74. doi: 10.1038/nature15393
  • 37. Kumar P, Al-Shafai M, Al Muftah WA, Chalhoub N, Elsaid MF, Aleem AA, et al. Evaluation of SNP calling using single and multiple-sample calling algorithms by validation against array base genotyping and Mendelian inheritance. BMC Res Notes. 2014;7: 747. doi: 10.1186/1756-0500-7-747
  • 38. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20: 1297–1303. doi: 10.1101/gr.107524.110
  • 39. Belkadi A, Bolze A, Itan Y, Cobat A, Vincent QB, Antipenko A, et al. Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proc Natl Acad Sci U S A. 2015;112: 5473–5478. doi: 10.1073/pnas.1418631112
  • 40. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81: 559–575. doi: 10.1086/519795
  • 41. Roufosse C, Simmonds N, Clahsen-van Groningen M, Haas M, Henriksen KJ, Horsfield C, et al. A 2018 Reference Guide to the Banff Classification of Renal Allograft Pathology. Transplantation. 2018;102: 1795–1814. doi: 10.1097/TP.0000000000002366
  • 42. R: a language and environment for statistical computing. https://www.gbif.org/tool/81287/r-a-language-and-environment-for-statistical-computing
  • 43. Rohart F, Gautier B, Singh A, Lê Cao K-A. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS Comput Biol. 2017;13: e1005752. doi: 10.1371/journal.pcbi.1005752
  • 44. Fu W, O’Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, et al. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature. 2013;493: 216–220. doi: 10.1038/nature11690
  • 45. Li X, Wu Y, Zhang L, Cao Y, Li Y, Li J, et al. Comparison of three common DNA concentration measurement methods. Anal Biochem. 2014;451: 18–24. doi: 10.1016/j.ab.2014.01.016
  • 46. Olsen S, Burdick JF, Keown PA, Wallace AC, Racusen LC, Solez K. Primary acute renal failure (“acute tubular necrosis”) in the transplanted kidney: morphology and pathogenesis. Medicine (Baltimore). 1989;68: 173–187. doi: 10.1097/00005792-198905000-00005

Decision Letter 0

Stanislaw Stepkowski

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

11 Feb 2021

PONE-D-20-40848

Deep sequencing of DNA from urine of kidney allograft recipients to estimate the donor-specific DNA fraction

PLOS ONE

Dear Dr. Belkadi,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

The authors need to address all of the reviewer's comments:

Reviewer # 1:

The authors have a creative concept that the degree of kidney allograft rejection can be estimated by deep sequencing DNA found in recipient urine.  An analysis of the SNV frequencies is then proposed as a means of predicting the relative contributions of donor versus recipient DNA; the hypothesis is that, with increasing recipient DNA contribution, one can infer an increased proportion of responding immune cells and therefore a higher likelihood of allograft rejection.

The concept seems promising, but I fear that the authors gloss over some significant caveats to this method. The first is the assumption that recipient-specific DNA originates mostly from tissue-invading immune cells; as the authors note, they were able to previously perform such an analysis on kidney recipients with urinary tract infections and showed lower proportions of donor DNA when individuals were diagnosed with UTI (compared to individuals without). This does seem to indicate that recipient-specific DNA can come from recipient immune cells responding to a urinary tract infection. In a clinical setting, do you anticipate being able to tell the difference between a low donor-specific fraction (as a result of low rejection) and a low donor-specific fraction (as a result of competition by infiltrating immune cells due to infection)?  This would be particularly relevant for subclinical UTIs.

Secondly, I would caution the authors against stating that they can determine the donor-specific DNA fraction, given that this method is unable to determine which DNA specimen is from which individual. I agree with their interpretation that the fraction of recipient to donor cells SHOULD be higher for AR than for ATI (and for no-pathology specimens), however, as they state, that future complementary study would be necessary.

Specific questions about the procedure:

1)     Could the authors address why they chose to use the Breast cancer risk panel rather than other available genomic sequencing panels?

2)     Does this method use cell free DNA, or does it include a cell lysis step? Are the authors concerned about the relative proportions of donor/recipient DNA in the cell free vs. cellular fractions?

3)     Regarding DNA target enrichment due to skewed amplification, I am curious about the SNVs that are NOT in the primer regions. I presume that, if one of the amplification primers fails to bind effectively due to a SNV, it affects the results not just of that SNV but of all SNVs within that amplicon? I understand and wholeheartedly agree with the authors’ decision to exclude SNVs falling within primer sequence regions; I wonder if the quantification of other SNVs is negatively affected by these binding issues as well.

I appreciate the commitment to patient privacy, and do not expect the authors to share specific point mutations for their study subjects.  However, the manuscript would benefit from the inclusion of some additional data, for example, concentrations of extracted DNA both before and after library preparation, specific numbers for the donor-specific fraction of each tested patient, etc.

Overall, the authors should be commended on a well-written paper that clearly expresses their concepts and logical process. Some sentences are awkwardly written and could benefit from a second look, but it doesn’t reach the level of incomprehension

==============================

Please submit your revised manuscript by Mar 22 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Stanislaw Stepkowski

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1) Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2) Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.

3) In line with PLOS' guidelines on detailed reporting (https://journals.plos.org/plosone/s/criteria-for-publication#loc-3), please ensure you have provided sufficient detail on participant recruitment.

4) Please improve statistical reporting and report exact p-values for all values greater than or equal to 0.001. P-values less than 0.001 should be expressed as p < 0.001. Our statistical reporting guidelines are available at https://journals.plos.org/plosone/s/submission-guidelines#loc-statistical-reporting.

5)  We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

6) We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

7) Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have a creative concept that the degree of kidney allograft rejection can be estimated by deep sequencing DNA found in recipient urine. An analysis of the SNV frequencies is then proposed as a means of predicting the relative contributions of donor versus recipient DNA; the hypothesis is that, with increasing recipient DNA contribution, one can infer an increased proportion of responding immune cells and therefore a higher likelihood of allograft rejection.

The concept seems promising, but I fear that the authors gloss over some significant caveats to this method. The first is the assumption that recipient-specific DNA originates mostly from tissue-invading immune cells; as the authors note, they were able to previously perform such an analysis on kidney recipients with urinary tract infections and showed lower proportions of donor DNA when individuals were diagnosed with UTI (compared to individuals without). This does seem to indicate that recipient-specific DNA can come from recipient immune cells responding to a urinary tract infection. In a clinical setting, do you anticipate being able to tell the difference between a low donor-specific fraction (as a result of low rejection) and a low donor-specific fraction (as a result of competition by infiltrating immune cells due to infection)? This would be particularly relevant for subclinical UTIs.

Secondly, I would caution the authors against stating that they can determine the donor-specific DNA fraction, given that this method is unable to determine which DNA specimen is from which individual. I agree with their interpretation that the fraction of recipient to donor cells SHOULD be higher for AR than for ATI (and for no-pathology specimens); however, as they state, a future complementary study would be necessary.

Specific questions about the procedure:

1) Could the authors address why they chose to use the Breast cancer risk panel rather than other available genomic sequencing panels?

2) Does this method use cell free DNA, or does it include a cell lysis step? Are the authors concerned about the relative proportions of donor/recipient DNA in the cell free vs. cellular fractions?

3) Regarding DNA target enrichment due to skewed amplification, I am curious about the SNVs that are NOT in the primer regions. I presume that, if one of the amplification primers fails to bind effectively due to a SNV, it affects the results not just of that SNV but of all SNVs within that amplicon? I understand and wholeheartedly agree with the authors’ decision to exclude SNVs falling within primer sequence regions; I wonder if the quantification of other SNVs is negatively affected by these binding issues as well.

I appreciate the commitment to patient privacy, and do not expect the authors to share specific point mutations for their study subjects. However, the manuscript would benefit from the inclusion of some additional data, for example, concentrations of extracted DNA both before and after library preparation, specific numbers for the donor-specific fraction of each tested patient, etc.

Overall, the authors should be commended on a well-written paper that clearly expresses their concepts and logical process. Some sentences are awkwardly written and could benefit from a second look, but it doesn’t reach the level of incomprehension.

Reviewer #2: I'm really sorry, but I must say that I'm not competent to review this paper. The research field is my field (kidney) and the paper seems very interesting (that's why I accepted to be a reviewer), but when I read the methods and results parts, there was too much informatics and mathematics for me.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Apr 15;16(4):e0249930. doi: 10.1371/journal.pone.0249930.r002

Author response to Decision Letter 0


23 Mar 2021

Comments from Reviewer #1

Comment 0-1: The first is the assumption that recipient-specific DNA originates mostly from tissue-invading immune cells; as the authors note, they were able to previously perform such an analysis on kidney recipients with urinary tract infections and showed lower proportions of donor DNA when individuals were diagnosed with UTI (compared to individuals without). This does seem to indicate that recipient-specific DNA can come from recipient immune cells responding to a urinary tract infection. In a clinical setting, do you anticipate being able to tell the difference between a low donor-specific fraction (as a result of low rejection) and a low donor-specific fraction (as a result of competition by infiltrating immune cells due to infection)? This would be particularly relevant for subclinical UTIs.

Response: Agree. Thank you for this interesting comment. In a clinical situation such as UTI, immune cells detected in urine could indeed originate from the recipient’s reaction to the infection. Based on the present observations alone, we cannot state the origin of a low donor-specific DNA fraction, especially in UTI patients, for two reasons: we did not recruit patients with UTI in this study, and deconvolution methods that estimate cell-type composition, such as the one in our previous paper mentioned by the reviewer, work on RNA-seq or methylation data but not on DNA-seq data. Further investigations of more phenotypes, including UTI, using state-of-the-art single-cell RNA-seq will help to address the reviewer’s insightful comment.

Comment 0-2: Secondly, I would caution the authors against stating that they can determine the donor-specific DNA fraction, given that this method is unable to determine which DNA specimen is from which individual. I agree with their interpretation that the fraction of recipient to donor cells SHOULD be higher for AR than for ATI (and for no-pathology specimens), however, as they state, that future complementary study would be necessary.

Response: We thank the reviewer for this important point. We have updated the manuscript by changing “donor-specific” to “donor/recipient-specific”.

Comment 0-3: I appreciate the commitment to patient privacy, and do not expect the authors to share specific point mutations for their study subjects. However, the manuscript would benefit from the inclusion of some additional data, for example, concentrations of extracted DNA both before and after library preparation, specific numbers for the donor-specific fraction of each tested patient, etc.

Response: Thank you for pointing this out. We added Supplementary Table 2, which shows the DNA concentration and the observed alpha for each sample included in the analysis.

Comment 0-4: Overall, the authors should be commended on a well-written paper that clearly expresses their concepts and logical process. Some sentences are awkwardly written and could benefit from a second look, but it doesn’t reach the level of incomprehension.

Response: Thank you for your comment. The manuscript has been edited by an expert scientific writer.

Comment 1: Could the authors address why they chose to use the Breast cancer risk panel rather than other available genomic sequencing panels?

Response: Thank you for pointing this out. Our aim was to estimate the donor (recipient)-specific DNA fraction in recipient urine. Our hypothesis was that any large targeted genomic region will present different alleles between donor and recipient (except for monozygotic twins), and that these differences could be used to compute the likelihood function. The breast cancer risk panel covers a large genomic region that displays this characteristic. Additionally, the cost of this panel supported our choice, and we have extensive experience working with the breast cancer risk panel in our lab. Nonetheless, we agree with the reviewer that any other DNA-sequencing panel targeting sufficiently large and variable (rich in single nucleotide variants) genomic regions will probably give similar results. We have updated the manuscript according to the reviewer’s comment.
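To illustrate how allele differences between two individuals can feed a likelihood function for a mixture fraction, the following is a minimal sketch and not the authors' implementation. It assumes that the genotypes of both individuals are unknown, that all nine genotype pairs are equally likely at each variant, and that reads carry a fixed error rate; the function names, the `error` parameter, and the grid search are hypothetical choices made for this example. Because the two individuals are interchangeable under these assumptions, alpha and 1 - alpha give the same likelihood, so the search is restricted to the interval from 0 to 0.5.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k alternative reads out of n."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)

def log_likelihood(alpha, counts, error=0.01):
    """Log-likelihood of mixture fraction alpha given (alt_reads, depth) per SNV.

    Genotypes of both individuals are unknown, so each SNV averages over the
    nine possible genotype pairs with a uniform prior (a simplifying assumption).
    """
    total = 0.0
    for alt, depth in counts:
        terms = []
        for g1 in (0, 1, 2):        # alt-allele copies in individual 1
            for g2 in (0, 1, 2):    # alt-allele copies in individual 2
                p = alpha * g1 / 2 + (1 - alpha) * g2 / 2
                p = p * (1 - error) + (1 - p) * error  # allow for read errors
                terms.append(log_binom_pmf(alt, depth, p))
        m = max(terms)              # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(t - m) for t in terms) / 9)
    return total

def estimate_alpha(counts, steps=500):
    """Grid search over [0, 0.5]; alpha and 1 - alpha are indistinguishable."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: log_likelihood(a, counts))
```

Calling `estimate_alpha` with a list of `(alt_reads, depth)` pairs, one per variant, returns the grid value with the highest likelihood. A population-based genotype prior would be a natural refinement of the uniform prior used here.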

Comment 2: Does this method use cell free DNA, or does it include a cell lysis step? Are the authors concerned about the relative proportions of donor/recipient DNA in the cell free vs. cellular fractions?

Response: We thank the reviewer for this comment. We included in our analysis only DNA from whole urinary cell pellets. Pellet DNA is of higher quality than fragmented DNA. For example, urinary cell-free DNA has a shorter half-life than blood cell-free DNA [27317895]. Additionally, loss of urinary cell-free DNA often occurs [32325682]. Although various commercial stabilization and preservation solutions have been developed and used for urinary cell-free DNA applications [33207777], studies evaluating the efficacy of these commercial products are largely missing. Finally, it has been shown that urinary single-cell DNA requires near-immediate storage and a suitable preservative to avoid cell lysis and DNA degradation [32251424]. For all these reasons, we decided to sequence urinary pellet DNA. We agree with the reviewer that it would be interesting to estimate the donor (recipient)-specific DNA fraction in urinary cell-free DNA and compare it with that in urinary pellet DNA, but this will need further investigation. We have updated the manuscript according to the reviewer’s comment.

Comment 3: Regarding DNA target enrichment due to skewed amplification, I am curious about the SNVs that are NOT in the primer regions. I presume that, if one of the amplification primers fails to bind effectively due to a SNV, it affects the results not just of that SNV but of all SNVs within that amplicon? I understand and wholeheartedly agree with the authors’ decision to exclude SNVs falling within primer sequence regions; I wonder if the quantification of other SNVs is negatively affected by these binding issues as well.

Response: Thank you for highlighting this point. In the final list of 1,000 variants used in our analysis, we filtered out the variants falling in the primer sequences +/- the read length (150 bp). We apologize for the lack of clarity. As requested by the reviewer, we checked the alternative allele frequency for single nucleotide variants falling in primer sequences and for those carried by reads targeted by a primer containing a variant. We identified 6,979 SNVs falling in primer sequences and 17,258 SNVs carried by reads targeted by primers containing SNVs. We filtered out uncovered SNVs in each replicate. For both variants falling in primer sequences (Fig A) and variants carried by reads targeted by a primer containing a variant (Fig B), the alternative allele distribution was skewed. This observation supports the reviewer’s hypothesis. We have updated the manuscript according to the reviewer’s comment.
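The positional filter described in this response (dropping variants within the primer sequences +/- the 150 bp read length) could be sketched roughly as follows. This is an illustrative example only, not the authors' pipeline: it assumes a single chromosome, closed `(start, end)` primer coordinates, and hypothetical function names.

```python
from bisect import bisect_right

READ_LENGTH = 150  # bp, the read length quoted in the response

def pad_and_merge(primers, pad=READ_LENGTH):
    """Pad each (start, end) primer interval by `pad` bp and merge overlaps."""
    merged = []
    for start, end in sorted((max(0, s - pad), e + pad) for s, e in primers):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def keep_snvs(snv_positions, primers, pad=READ_LENGTH):
    """Return SNV positions that fall outside every padded primer interval."""
    merged = pad_and_merge(primers, pad)
    starts = [s for s, _ in merged]
    kept = []
    for pos in snv_positions:
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos <= merged[i][1]:
            continue  # inside a padded primer region: drop the variant
        kept.append(pos)
    return kept
```

Merging the padded intervals first means each variant needs only one binary search, which keeps the filter fast even for tens of thousands of SNVs. A multi-chromosome version would keep one merged interval list per chromosome.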

Decision Letter 1

Stanislaw Stepkowski

29 Mar 2021

Deep sequencing of DNA from urine of kidney allograft recipients to estimate donor/recipient-specific DNA fractions

PONE-D-20-40848R1

Dear Dr. Belkadi,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Stanislaw Stepkowski

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Stanislaw Stepkowski

6 Apr 2021

PONE-D-20-40848R1

Deep sequencing of DNA from urine of kidney allograft recipients to estimate donor/recipient-specific DNA fractions

Dear Dr. Suhre:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Stanislaw Stepkowski

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File

    (ZIP)

    Data Availability Statement

    Data cannot be shared publicly because the Weill Cornell Medicine ethics committee has imposed restrictions on sharing a de-identified data set. Data are available from the Weill Cornell Medicine Institutional Data Access / Ethics Committee (contact via irb@med.cornell.edu) for researchers who meet the criteria for access to confidential data. The data underlying the results presented in the study are available from irb@med.cornell.edu.


    Articles from PLoS ONE are provided here courtesy of PLOS
