PLOS ONE. 2021 Jul 12;16(7):e0254407. doi: 10.1371/journal.pone.0254407

Whole genome sequencing identifies novel structural variant in a large Indian family affected with X-linked agammaglobulinemia

Abhinav Jain 1,2,#, Geeta Madathil Govindaraj 3,4,*,#, Athulya Edavazhippurath 3,5, Nabeel Faisal 3, Rahul C Bhoyar 1, Vishu Gupta 1,2, Ramya Uppuluri 6, Shiny Padinjare Manakkad 3, Atul Kashyap 1, Anoop Kumar 1, Mohit Kumar Divakar 1,2, Mohamed Imran 1,2, Sneha Sawant 7, Aparna Dalvi 7, Krishnan Chakyar 3, Manisha Madkaikar 7, Revathi Raj 6, Sridhar Sivasubbu 1,2,*, Vinod Scaria 1,2,*
Editor: Obul Reddy Bandapalli 8
PMCID: PMC8274882  PMID: 34252140

Abstract

X-linked agammaglobulinemia (XLA, OMIM #300755) is a primary immunodeficiency disorder caused by pathogenic variations in the BTK gene and characterized by failure of development and maturation of B lymphocytes. The estimated worldwide prevalence is 1 in 190,000 male births. Recently, genome sequencing has been widely used in difficult-to-diagnose and familial cases. We report a large Indian family affected by XLA, with five affected individuals. We performed complete blood count, immunoglobulin assay, and lymphocyte subset analysis for all patients and analyzed Btk expression for one patient and his mother. Whole exome sequencing (WES) was performed for four patients and whole genome sequencing (WGS) for two patients. Carrier screening was done for 17 family members using Multiplex Ligation-dependent Probe Amplification (MLPA), and haplotype ancestry mapping was performed using fineSTRUCTURE. All patients had hypogammaglobulinemia and low CD19+ B cells. The one patient who underwent Btk estimation had low expression, and his mother showed a mosaic pattern. We could not identify any single nucleotide variants or small insertions/deletions in the WES dataset that correlated with the clinical features of the patients. Structural variant analysis of the WGS data identified a novel large deletion of 5,296 bp at chrX:100,624,323–100,629,619 encompassing exons 3–5 of the BTK gene. Family screening revealed seven carriers of the deletion. Two patients had a successful hematopoietic stem cell transplantation (HSCT). Haplotype mapping revealed a South Asian ancestry. WGS led to identification of the precise genetic mutation, which could help in early diagnosis leading to improved outcomes, prevention of permanent organ damage and improved quality of life, as well as enabling genetic counselling and prenatal diagnosis in the family.

Introduction

X-linked agammaglobulinemia (XLA) is a monogenic primary immunodeficiency disorder with X-linked recessive inheritance, caused by pathogenic mutations in the BTK (Bruton’s tyrosine kinase) gene [1]. The BTK gene is involved in the development, maturation, and signaling of B cells [2]. The absence of plasma B cells leads to markedly reduced levels of all classes of immunoglobulins. Patients with pathogenic variants in the BTK gene typically present with recurrent infections between 3 and 18 months of age. The commonest infections are those of the respiratory tract caused by encapsulated bacteria [3, 4]. The prevalence of XLA has been estimated to be 1 in 190,000 male births or 1 in 379,000 total live births, with 40% of cases having a positive family history [5]. As per the diagnostic criteria, the patient must be male, with hypogammaglobulinemia or agammaglobulinemia, <2% CD19+ B cells, and either a male family member of maternal lineage documented to have agammaglobulinemia and <2% CD19+ B cells, or a confirmed (by DNA, messenger ribonucleic acid (mRNA), or protein analysis) defect in the BTK gene or Btk expression [6].

The BTK locus maps to Xq21.3-Xq22, spans 36.7 kb on the X chromosome, and comprises 19 exons that encode a protein with 5 functional domains. A large number of genetic variants at this locus have been systematically collected and deposited in the BTKbase database http://structure.bmc.lu.se/idbase/BTKbase/ [7]. Most of the reported mutations are single nucleotide variations (73%), which include missense, nonsense, and splice site variations. Small insertions and deletions (18%) and structural variations (9.5%) account for the remaining mutations. Only 3.5% of the gross deletions reported disrupt the functionality of the BTK gene [8]. Many patients with XLA have been reported with either a deletion confined to the BTK gene or a gross deletion encompassing BTK along with a few neighboring genes such as TIMM8A, TAF7L, Artemis, IGHM, and DRP2 [9, 10]. The contiguous deletion of the BTK and TIMM8A genes, which causes immunodeficiency with dystonia, optic neuronopathy, and sensorineural deafness, is known as XLA-Mohr-Tranebjærg syndrome (XLA-MTS) [11].

We report five patients belonging to a large Indian family affected by XLA. Whole genome sequencing identified a novel large deletion of 5,296 bp at chrX:100,624,323–100,629,619 spanning BTK exons 3–5. Screening by Multiplex Ligation-dependent Probe Amplification (MLPA) in the extended family identified seven carriers of the deletion. We surmise that this approach, moving from next-generation sequencing to a low-cost screening method, is replicable and could help to reduce the disease burden in the community.

Materials and methods

Patient and clinical workup

Five male children belonging to an Indian family were evaluated as a part of a programme on primary immune deficiency disorders at the Government Medical College, Kozhikode, Kerala, and the Institute of Genomics and Integrative Biology, Delhi between 2015 and 2019. The study was approved by the Institutional Ethics Committee, Government Medical College, Kozhikode and CSIR-IGIB (Ethics No. GMCKKD/RP2017/IEC/147). Written informed consent was obtained from the parents of all children who participated in the study. The index cases were P2 and P3. After elaborate pedigree analysis, clinical characteristics were recorded in a semi-structured proforma and included age at diagnosis, number of hospitalizations and PICU admissions, type of infections, complications, outcome, etc. Clinical investigations performed were complete blood count, lymphocyte subset analysis, and immunoglobulin assay.

Btk protein expression

A cytoplasmic staining procedure was used to analyze Btk protein expression by flow cytometry. We took 50 μL of whole blood from each of the patient (P2), his mother, and a healthy unrelated control, and surface stained it using CD3-FITC and CD14-PE for 30 min at 37°C. An unstained tube was maintained for all three samples. At the end of incubation, the samples were fixed with formaldehyde for 10 minutes, followed by permeabilization with Triton-X for 30 minutes at 37°C. Subsequently, they were washed, stained with an anti-Btk monoclonal antibody for 45 minutes, washed again, and analyzed by flow cytometry.

DNA isolation and whole exome sequencing

Approximately 5 ml of blood was drawn from each of the patients (P1, P2, P4, and P5) and their family members into an acid citrate dextrose (ACD) tube (Becton Dickinson, NJ, USA). Due to the unexpected death of P3, his sample could not be collected. Genomic DNA was isolated using the salting-out method [12], and 100 ng was used as a template to perform WES on the patient samples using the Truseq Exome library prep kit as per the manufacturer’s protocol (Cat no.: 20000408, Illumina Inc., CA, USA). The prepared libraries were sequenced on the HiSeq2500 platform for three patients (P1, P2, and P4) and on the NovaSeq6000 platform for P5, generating paired-end reads with a read length of 150 bp.

Exome data analysis

The raw reads underwent quality control at Phred score Q30 using Trimmomatic-0.38 [13]. The trimmed reads were mapped to the human reference genome GRCh37 using Burrows-Wheeler Aligner (BWA) version 0.7.17 [14]. The mapped reads were sorted using SAMtools [15], and duplicate reads were removed using Picard (https://broadinstitute.github.io/picard/). Variant calling was done using the HaplotypeCaller of the Genome Analysis ToolKit (GATK version 3.8.0), following best practices [16]. The variants were systematically annotated using ANNOVAR (2018-04-16 00:47:49) [17] with multiple datasets, i.e. refGene, dbsnp (avsnp150), dbnsfp35a, and clinvar 20190305. Allele frequencies were annotated from the global population datasets, i.e. the 1000 Genome project (1000g2015_all), the Genome Aggregation Database (gnomAD V2.1.1), and Esp6500. Protein-altering variants (missense, stop gain, stop loss, frameshift, non-frameshift, splicing, small insertion, and small deletion) were prioritized. Variants whose minor allele frequency was greater than 5% in the global population were then filtered out. A gene filter was applied comprising 454 genes known to be implicated in PID and recently catalogued by the International Union of Immunological Societies (IUIS) expert committee [18]. Finally, we correlated each variant’s clinical significance with the patient’s phenotype.
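The filtering cascade described above (effect class, then population frequency, then gene panel) can be sketched in a few lines. This is a minimal illustration only: the record layout, the field names (`effect`, `maf`, `gene`), and the example values are hypothetical and do not reflect ANNOVAR's output format or the authors' scripts.

```python
# Hypothetical annotated-variant records; field names are illustrative only.
PROTEIN_ALTERING = {
    "missense", "stopgain", "stoploss",
    "frameshift", "nonframeshift", "splicing",
}

def prioritize(variants, panel_genes, max_maf=0.05):
    """Keep protein-altering variants with MAF <= max_maf that fall in panel genes."""
    kept = []
    for v in variants:
        if v["effect"] not in PROTEIN_ALTERING:
            continue
        # Assumption: a variant absent from every population dataset has MAF 0.
        highest_maf = max(v.get("maf", {}).values(), default=0.0)
        if highest_maf > max_maf:
            continue
        if v["gene"] not in panel_genes:
            continue
        kept.append(v)
    return kept

example = [
    {"gene": "BTK", "effect": "missense", "maf": {"1000g": 0.001, "gnomad": 0.002}},
    {"gene": "BTK", "effect": "synonymous", "maf": {}},
    {"gene": "BTK", "effect": "missense", "maf": {"1000g": 0.20}},
    {"gene": "TTN", "effect": "stopgain", "maf": {}},
]
panel = {"BTK"}  # stand-in for the 454-gene PID panel
shortlisted = prioritize(example, panel)
```

Taking the highest frequency across datasets is one conservative way to apply a "global population" threshold; the source does not state how the three datasets were combined.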

Whole genome sequencing and analysis

Whole genome sequencing was performed using 100 ng of genomic DNA from two patients (P1 and P4). Paired-end reads of 150 bp were generated using the Truseq PCR free library kit as per the manufacturer’s instructions (Cat no.: FC-121-9006DOC, Illumina Inc., CA, USA) on the Illumina NovaSeq 6000 platform (San Diego, CA, USA) using sequencing by synthesis chemistry. Processing from raw reads to variant annotation followed the same steps as the whole exome analysis. We merged the variants of both individuals using the GATK (version 3.8.0) tool CombineVariants. We filtered out variants with MAF > 0.05 in the global population. We then adopted an overlap-based strategy [19] in which we prioritized the homozygous variants common to P1 and P4. Variants were further filtered on the in-silico CADD (Combined Annotation-Dependent Depletion) score, retaining those with a score > 15 [20]. For the remaining variants, we manually correlated the phenotypic characteristics of the patients with the filtered variants.
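The overlap-based step can be illustrated as a set intersection over homozygous calls followed by a CADD cut-off. The sketch below assumes a simple dictionary per call; the field names (`zygosity`, `cadd`) and the example variants are hypothetical, and the code is not the authors' pipeline.

```python
def variant_key(v):
    """Identify a variant by chromosome, position, and alleles."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])

def shared_homozygous(calls_a, calls_b, min_cadd=15.0):
    """Return calls from sample A that are homozygous in both samples with CADD > min_cadd."""
    hom_b = {variant_key(v) for v in calls_b if v["zygosity"] == "hom"}
    return [
        v for v in calls_a
        if v["zygosity"] == "hom"
        and variant_key(v) in hom_b
        and v["cadd"] > min_cadd
    ]

# Hypothetical calls for two affected relatives.
p1_calls = [
    {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G", "zygosity": "hom", "cadd": 22.1},
    {"chrom": "chr2", "pos": 2000, "ref": "C", "alt": "T", "zygosity": "het", "cadd": 30.0},
    {"chrom": "chr3", "pos": 3000, "ref": "G", "alt": "A", "zygosity": "hom", "cadd": 4.2},
]
p4_calls = [
    {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G", "zygosity": "hom", "cadd": 22.1},
    {"chrom": "chr2", "pos": 2000, "ref": "C", "alt": "T", "zygosity": "hom", "cadd": 30.0},
    {"chrom": "chr3", "pos": 3000, "ref": "G", "alt": "A", "zygosity": "hom", "cadd": 4.2},
]
candidates = shared_homozygous(p1_calls, p4_calls)
```

Only the first variant survives: the second is heterozygous in one sample and the third falls below the CADD threshold.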

We also performed structural variant (SV) analysis on the aligned whole genome sequencing reads of both patients (P1 and P4). We used LUMPY (version 0.2.13) for SV calling [21], and the called SVs were genotyped using SVtyper [22]. The SVs were then prioritized using an overlap-based strategy in which we selected homozygous variants common to both patients within the 454 PID genes [19]. To validate the SV result, we performed a manual coverage-based analysis of the WGS paired-end reads aligned to the reference genome hg19/GRCh37 using the Integrative Genomics Viewer (IGV) [23].
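As a rough sketch of the SV prioritization logic, the code below keeps SVs that are homozygous in both samples and overlap a panel gene. Several things here are assumptions made for illustration: the `SV` record, the `"1/1"` genotype string, matching SVs between samples by identical coordinates, and the gene window, which is a placeholder rather than an exact gene boundary.

```python
from typing import NamedTuple

class SV(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str
    genotype: str  # hypothetical genotype string, "0/1" or "1/1"

def overlaps(sv, region):
    """True if the SV shares at least one base with (chrom, start, end)."""
    chrom, start, end = region
    return sv.chrom == chrom and sv.start <= end and start <= sv.end

def shared_homozygous_svs(svs_a, svs_b, gene_regions):
    """Return (gene, SV) pairs for SVs homozygous in both samples that overlap a panel gene.

    Simplification: SVs are matched between samples by identical coordinates and type.
    """
    def site(sv):
        return (sv.chrom, sv.start, sv.end, sv.svtype)

    hom_b = {site(sv) for sv in svs_b if sv.genotype == "1/1"}
    hits = []
    for sv in svs_a:
        if sv.genotype != "1/1" or site(sv) not in hom_b:
            continue
        for gene, region in gene_regions.items():
            if overlaps(sv, region):
                hits.append((gene, sv))
    return hits

# Placeholder window around the gene, not an exact annotation.
gene_regions = {"BTK": ("chrX", 100_600_000, 100_650_000)}
deletion = SV("chrX", 100_624_323, 100_629_619, "DEL", "1/1")
elsewhere = SV("chr2", 5_000, 9_000, "DEL", "1/1")
hits = shared_homozygous_svs([deletion, elsewhere], [deletion, elsewhere], gene_regions)
```

In practice, breakpoints called in two samples rarely agree to the base, so a real comparison would usually allow some tolerance or require reciprocal overlap.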

Multiplex-ligation dependent probe amplification (MLPA) assay

To identify and validate the gross deletion in extended family members, an MLPA-based approach was adopted. We performed the test on 17 members of the family with their consent and institutional ethical approval (Ethics No. GMCKKD/RP2017/IEC/147). Genomic DNA was isolated using a standard salting-out method [12], and 100 ng of genomic DNA was used. MLPA was performed as per the manufacturer’s instructions (MRC-Holland, Amsterdam, The Netherlands) using the SALSA MLPA EK1 reagent kit-FAM (EK1-FAM, MRC Holland, Netherlands) along with the SALSA MLPA Probemix P210 BTK (P210-050R, MRC Holland, Netherlands). Capillary electrophoresis of the amplicons was performed on an ABI 3130 genetic analyser (Applied Biosystems™, California, USA). MLPA data analysis was performed using Coffalyser.net software (MRC Holland, Netherlands).

Ancestry haplotype mapping using chromosomal painting

Since this is a novel deletion and a number of primary immunodeficiency variants have previously been shown to have founder effects and specific ancestries [24, 25], we explored the haplotype similarity of our patients with the global population. We used the haplotype prediction tool fineSTRUCTURE [26] (version 2.1.3) on the whole genome sequenced variants of P1 and P4: their chrX variants were extracted and merged with the chrX variants of the 1000 Genome Project, used as a reference, and of 44 whole genomes from the Qatari population. The 1000 Genome Project consists of 2,504 individuals from five major populations: African (AFR), American (AMR), East Asian (EAS), European (EUR), and South Asian (SAS) [27]. We pruned the merged VCF using a bespoke bash script, applying a variant filter of allele frequency greater than 1% and an allele number filter of 1,000 to get the maximum genotype rate. We phased the merged VCF using SHAPEIT v2.r900 [28]. We then ran the fineSTRUCTURE pipeline, which involves four steps: painting each chromosome according to the population haplotypes shared with the individual (chromopainter), combining all painted data (chromocombine), assigning a population to each block (fineSTRUCTURE), and finally building a tree based on maximum posterior population inference. The regions 50KB, 500KB, 5MB, and 50MB upstream and downstream of the chromosomal locus chrX:100624323–100629619 (hg19/GRCh37) were plotted using the R scripts provided with fineSTRUCTURE.
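The pruning step was done with a bespoke bash script that is not shown. The following Python sketch only illustrates the stated criteria, assuming each site carries an allele count (`AC`) and allele number (`AN`) as in a VCF INFO field; the boundary handling (keeping sites exactly at the thresholds) and the example sites are assumptions.

```python
def prune(sites, min_af=0.01, min_an=1000):
    """Keep sites genotyped on at least min_an alleles with allele frequency >= min_af."""
    kept = []
    for s in sites:
        if s["AN"] < min_an:
            continue  # too few called alleles, low genotype rate
        if s["AC"] / s["AN"] < min_af:
            continue  # rare variant, uninformative for painting
        kept.append(s)
    return kept

# Hypothetical chrX sites from a merged callset.
sites = [
    {"pos": 100_500_000, "AC": 300, "AN": 3800},  # AF about 0.079, kept
    {"pos": 100_600_000, "AC": 10, "AN": 3800},   # AF about 0.003, dropped
    {"pos": 100_700_000, "AC": 200, "AN": 600},   # AN below 1,000, dropped
]
pruned = prune(sites)
```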

Results

Clinical details

The five male children presented to the Government Medical College, Kozhikode, Kerala with recurrent respiratory infections, diarrhea, pyoderma, and pyogenic meningitis with onset between 5 months and 12 months of age. They belonged to a large extended family as shown in Fig 1. Clinical features have been tabulated in Table 1 and detailed in the S1 File.

Fig 1. Extended family pedigree of the patients affected with X-linked agammaglobulinemia.

Fig 1

Arrow represents two index cases; *: individuals who underwent whole exome sequencing; $: individuals who underwent whole genome sequencing; #: individuals who underwent MLPA.

Table 1. Clinical and Immunological features of patients with XLA in the present cohort.

Clinical and immunological characteristics | P1 | P2 | P3 | P4 | P5
Age at onset | 1 year | 5 months | 7 months | 5 months | 6 months
Age at diagnosis | 7 years | 1 year 7 months | 1 year 6 months | 6 months | 8 years
Type of infections | Pneumonia, Pyoderma, Pyogenic meningitis | Pneumonia, Acute suppurative otitis media, Diarrhea, Oral thrush | Pneumonia, Diarrhea, Pyoderma, Diarrhea | Pneumonia, Pyogenic meningitis | Diarrhea, Pyoderma
Absent / atrophic tonsils | yes | yes | yes | yes | yes
Sibling death | yes | no | yes | yes | no
No. of hospitalizations before diagnosis | 2 | >15 | 1 | 2 | >10
No. of PICU admissions | none | 2 | 1 | none | 1
Immunoglobulin assay | IgG low, IgA low, IgM low | IgG low, IgA low, IgM normal | IgG low, IgA normal, IgM normal | IgG low, IgA low, IgM normal | IgG low, IgA low, IgM normal
CD19 count at diagnosis (normal range), cells/mm3 | 0 (91–610) | 0 (430–3300) | 8 (430–3300) | 10 (430–3300) | 10 (91–610)
CD3 count at diagnosis (normal range), cells/mm3 | 6458 (570–2400) | 5475 (1460–5440) | 6400 (1460–5440) | 4169 (1900–5900) | 9553 (570–2400)
CD56 count at diagnosis (normal range), cells/mm3 | 355 (78–470) | 7.0175 (80–340) | 160 (80–340) | 661 (160–950) | 528 (78–470)
Onset to IVIG start | Not on regular IVIG | 1 year, 2 months | 11 months | 1 month | Not on regular IVIG
Complications | Bronchiectasis | None | None | None | Bronchiectasis, Arthritis
Outcome | Stunted growth | Stunted growth, Cured–HSCT from matched sibling donor (twin) at 6.5 years | Stunted growth, Died | Cured–HSCT from matched sibling donor at 2 years | Stunted growth and delayed puberty

All the children had absent or atrophic tonsils, hypogammaglobulinemia, and severely reduced or absent CD19 counts, and were started on intravenous immunoglobulin (IVIG) infusions. The two children who presented late and were not on regular prophylaxis with IVIG developed bronchiectasis. Other complications included stunted growth, delayed puberty, and arthritis. One child died soon after diagnosis due to meningitis at 20 months of age. Analysis of the pedigree chart revealed that 22 male children had died in the early years of life, as shown in Fig 1.

Btk expression estimation using flow cytometry

Using this flow cytometric approach, patient P2 showed low Btk expression on monocytes (37%) compared to 87% in the control, the normal range being 90 +/- 5%. His mother showed a mosaic pattern, suggestive of carrier status for XLA.

Whole exome sequencing analysis

We performed whole exome sequencing for four patients (P1, P2, P4, and P5), with 99.5% alignment to the human reference genome for each patient and an average coverage of 90.8X. A total of 467,638 variants were called for each patient. Variants were annotated using ANNOVAR [17]. On average, 13,074 protein-altering variants, consisting of splicing, exonic splicing, and exonic variants other than synonymous SNVs, were prioritized. These variants were further filtered for minor allele frequency (MAF) < 5% in the global population, which reduced the average number of variants to 2,383. On average, 12 genetic variants mapped to the 454 primary immunodeficiency genes and were taken forward for downstream analysis. We could not find any variant that correlated with the clinical features. The whole exome sequencing data with variant filtering are tabulated in S1 Table.

Whole genome sequencing analysis

Since we could not find any causal variant using whole exome sequencing, we performed whole genome sequencing of P1 and P4, with mapping percentages of 99.58% and 99.49% and coverage of 21X and 86X respectively. Variant calling with GATK (version 3.8.0) HaplotypeCaller resulted in 4,905,687 and 4,904,616 variants for P1 and P4 respectively. The variant files of the two patients were merged and annotated using ANNOVAR [17], giving a total of 6.5 million variants. These variants were filtered for minor allele frequency (MAF) < 5% in the global population dataset, which reduced the number of variants to 890,387. We then adopted an overlap-based strategy [19] and prioritized 27,236 variants. On applying the in-silico CADD score filter (> 15), we prioritized 4 variants. On manual review, we could not correlate any of the prioritized variants with the clinical characteristics of the patients. The whole genome sequencing data with variant filtering are tabulated in S2 Table.

We also performed structural variant analysis, using LUMPY for SV calling. We obtained a total of 8,925 and 29,638 SVs in patients P1 and P4 respectively. Prioritizing SVs that were homozygous and common to both patients led to a total of 768 SVs. Applying the PID gene filter reduced the number to 8 SVs. Finally, on clinical correlation, we prioritized a novel large deletion of 5,296 bp on chrX spanning positions 100,624,323 to 100,629,618. The variant filtering at each step is tabulated in Table 2. On visualizing the alignments against the human reference genome hg19/GRCh37 using IGV, we found no reads on chrX between positions 100,624,323 and 100,629,618, i.e. 5,296 bases (~5 kb). This led to the identification of a large deletion spanning exon 3 to exon 5 of the BTK gene in both P1 and P4, as shown in Fig 2. The regions flanking the deleted locus showed adequate mapped reads, and the region deleted in the patients was properly covered in the control sample. This prompted us to visualize the whole exome data for P2 and P5 in IGV, where we found the same BTK deletion encompassing exons 3–5, as shown in S1 Fig. On intersecting the patients’ deleted locus chrX:100624323–100629619 with the 1000 Genome Project SVs, gnomAD SVs, and the in-house IndiGen SV database, we could not find any structural variant falling within exons 3–5 of the BTK gene in any of the databases.
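The 5,296 bp length can be checked against the quoted coordinates with simple interval arithmetic. The sketch below shows the calculation under two common conventions (inclusive end versus half-open end); which convention the SV caller or the text uses is not stated in the source, so this is only an illustration of the arithmetic.

```python
START = 100_624_323

def length_inclusive(start, end):
    """Number of bases when both start and end positions are part of the interval."""
    return end - start + 1

def length_half_open(start, end):
    """Number of bases when the end position is the first base after the interval."""
    return end - start

# End quoted as 100,629,618, counted inclusively.
inclusive_618 = length_inclusive(START, 100_629_618)
# End quoted as 100,629,619, treated as a half-open end.
half_open_619 = length_half_open(START, 100_629_619)
# End quoted as 100,629,619, counted inclusively.
inclusive_619 = length_inclusive(START, 100_629_619)
```

Both `inclusive_618` and `half_open_619` evaluate to 5,296, whereas `inclusive_619` gives 5,297, so the two end coordinates are consistent with the reported length only under different conventions.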

Table 2. Whole genome sequencing structural variant analysis for P1 and P4.

Data | P1 | P4
Total structural variants | 8,925 | 29,638
Common homozygous variants | 768
PID genetic variants (454 genes) | 8
Phenotype-associated variant | 1
Structural variant | chrX:100,624,323–100,629,619 (5,296 bp deletion)

Fig 2. Identification of putative large deletion in whole genome data in P1 and P4.

Fig 2

The figure represents the visualization of sequencing reads aligned on the coding region of the BTK gene.

Hematopoietic stem cell transplantation (HSCT)

After the family was counseled regarding the feasibility of HSCT as a curative option, two patients, P2 and P4, underwent HSCT from a matched sibling donor. They required myeloablative conditioning to prevent graft rejection, and a treosulfan-based reduced toxicity protocol was used to ensure adequate myeloablation. Graft versus host disease prophylaxis consisted of short-course methotrexate and tacrolimus. Both of them are now disease-free.

Family screening using the multiplex ligation-dependent probe amplification (MLPA) assay

We performed whole genome sequencing on two samples, P1 and P4, and found a large hemizygous deletion encompassing exons 3–5 of the BTK gene. To determine whether the mutation was de novo or inherited, an MLPA-based approach was adopted. The assay was first validated using the P1 and P4 samples as positive controls and two reference samples from healthy individuals. The positive control samples P1 and P4 had a probe ratio of zero for exons 3–5 of the BTK gene, i.e. no copy of this region was present, as shown in Fig 3. The deletion was therefore detected, corroborating the whole genome sequencing results shown in Fig 2.

Fig 3. Multiplex ligation-dependent probe amplification (MLPA) assay for detection of the deletion encompassing exons 3–5 of the BTK gene: Representation of ratio charts generated using the Coffalyser.net software.

Fig 3

(A) The reference sample shows normal copy number for the BTK gene. (B) Sample P1, (C) sample P2, (D) sample P4, and (E) sample P5 are all from males and show a probe ratio of zero for exons 3–5 of the BTK gene, which lies on the X chromosome, i.e. they carry a hemizygous deletion. In all ratio charts, the vertical axis (Y-axis) represents the final ratios after inter- and intra-normalisation of the probe ratios, and the horizontal axis (X-axis) represents the probe names along with their lengths (both axis titles have been manually enlarged for clarity). The blue and red horizontal lines depict the arbitrary borders of ratio 1.3 and 0.7 respectively. The black and red dots denote the final ratio obtained for each of the probes, and the vertical bars represent the 95% confidence range for each probe. The Roman and Arabic numerals on top of each ratio chart identify the individual as marked in the pedigree in Fig 1.

The standardized and validated MLPA assay was then used to screen additional family members of P1 and P4 for the deletion in the BTK gene. The probe ratios for the mothers of both P1 and P4 (IV-32 and IV-40) were ~0.5, as shown in S2 Fig, indicating that they were heterozygous for the same deletion. This shows that both P1 and P4 inherited the deletion from their mothers. The brother (V-21) and father (IV-41) of P4 were also tested and were found to have a normal copy of the gene. The other two patients, P2 and P5, were tested using the MLPA-based assay and were found to harbor the same deletion in hemizygous form, as shown in Fig 3. Following this, the families of P2 and P5 were tested. The maternal grandmother (III-9), mother (IV-36), and two sisters (V-16 and V-17) of P2 were found to be heterozygous for the deletion, and one brother (V-19) was found to carry a normal copy of the gene (S2 Fig). The mother (III-47) of P5 was found to be heterozygous for the deletion, and her two sisters (IV-102 and IV-103) were found to have inherited two normal copies of the gene. Additionally, two other members of the extended family (V-34 and V-38), one male and one female, were tested and were found to have normal copies of the gene. The MLPA-based assay thus showed that the large deletion has been inherited by multiple members of the family. Of the 17 members tested, four were hemizygous and seven were carriers for the deletion. From the pedigree in Fig 1, we predict that more than half a dozen additional family members could be carriers of the deletion. The MLPA probe ratios for each individual at the 19 BTK exon test probes, as well as for the reference probes, are tabulated in S3 Table.
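The ratio interpretation used above (a ratio of zero in hemizygous males, ~0.5 in heterozygous females, with 0.7 and 1.3 as the borders drawn on the ratio charts) can be sketched as a small classifier. This is a minimal illustration and not the Coffalyser.net algorithm; the `absent` cut-off of 0.1 and the example probe ratios are assumptions made for the example.

```python
def classify_ratio(ratio, lower=0.7, upper=1.3, absent=0.1):
    """Map a normalised probe ratio to a coarse copy-number label."""
    if ratio < absent:
        return "absent"     # no copy detected, e.g. hemizygous deletion in a male
    if ratio < lower:
        return "reduced"    # e.g. heterozygous deletion in a female carrier
    if ratio <= upper:
        return "normal"
    return "increased"

def exons_by_status(probe_ratios):
    """Group exon numbers by the label assigned to their probe ratio."""
    summary = {}
    for exon, ratio in sorted(probe_ratios.items()):
        summary.setdefault(classify_ratio(ratio), []).append(exon)
    return summary

# Hypothetical ratios for the 19 BTK exon probes.
exons = range(1, 20)
affected_male = {e: (0.0 if e in (3, 4, 5) else 1.0) for e in exons}
carrier_female = {e: (0.5 if e in (3, 4, 5) else 1.0) for e in exons}

male_summary = exons_by_status(affected_male)
carrier_summary = exons_by_status(carrier_female)
```

Real probe ratios carry a confidence range, so borderline values would need the kind of statistical handling the analysis software provides rather than fixed thresholds alone.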

Chromosomal painting analysis

Since this variation is absent in all the control populations, we sought to determine the haplotype ancestry around the deleted region. To predict the haplotype ancestry of the region flanking the deletion on chromosome X, we merged the chrX variants of whole genome sequenced P1 and P4 individually with those of 2,504 individuals of five major populations from the 1000 Genome Project and 44 whole genome sequenced Qatari individuals, comprising 3,821,263 variants. We pruned variants with allele frequency less than 0.01 or allele number less than 1,000, which reduced the number of variants to 185,258 and 186,417 for P1 and P4 respectively. After phasing and chromosomal painting using fineSTRUCTURE, we found that both samples had South Asian ancestry. The painted chromosomal region 5MB upstream and downstream of the deleted locus is shown for both patients P1 and P4 in Fig 4. For finer visualization, we also painted the chromosomal regions 50KB, 500KB, and 50MB upstream and downstream of the deleted locus, shown in S3 Fig.
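The flanking windows used for painting follow directly from the locus coordinates. A minimal sketch of that arithmetic is given below; the function name and the choice to clamp the lower bound at position 1 are assumptions for illustration, and the plots themselves were produced with the R scripts distributed with fineSTRUCTURE.

```python
LOCUS_START, LOCUS_END = 100_624_323, 100_629_619  # hg19/GRCh37, chrX

FLANKS = {
    "50KB": 50_000,
    "500KB": 500_000,
    "5MB": 5_000_000,
    "50MB": 50_000_000,
}

def flank_window(start, end, flank):
    """Extend a locus by `flank` bases on both sides, never going below position 1."""
    return max(1, start - flank), end + flank

windows = {
    name: flank_window(LOCUS_START, LOCUS_END, size)
    for name, size in FLANKS.items()
}
```

For example, `windows["5MB"]` is `(95_624_323, 105_629_619)`, the region shown in Fig 4.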

Fig 4. Haplotype ancestry prediction with the global population using fineSTRUCTURE.

Fig 4

South Asian ancestry was predicted for the haplotype spanning 5MB upstream and 5MB downstream of the locus chrX:100,624,323–100,629,619 (hg19/GRCh37) in two affected patients (V14 and V23), compared with 2,504 individuals of five major populations (AFR-African, AMR-American, EAS-East Asian, EUR-European, and SAS-South Asian) of the 1000 Genome Project and 44 individuals of Qatari ancestry.

Discussion

XLA was the first primary immunodeficiency disorder to be described, by the pediatrician Ogden Carr Bruton in 1952 [29]. The onset of symptoms is usually before one year of age. Infections do not occur in early infancy due to the protective effect of maternally derived IgG [5]. Apart from an increased frequency and severity of respiratory and gastrointestinal infections, other rare manifestations have been described, including arthritis, which was a feature in P5 and significantly impinges on quality of life [30]. Autoimmunity and autoinflammation can occur in XLA, and arthritis could be due either to a dysfunctional immune system or to infection with mycoplasma, which was not the case in our patient P5 [30]. The challenge lies in identifying the index case [31]. Although panhypogammaglobulinemia is the rule, there are reports of normal immunoglobulin levels or selective immunoglobulin deficiency [32, 33]. Antibody responses to isohemagglutinins and vaccine antigens were not assessed since B cell numbers were negligible and it was possible to test directly for a BTK variant.

Identifying the variant is critical for carrier detection and genetic counseling, without which the disease frequency in the community cannot be reduced. We performed WES in all four patients but were unable to find the causal variant. This could be due to the limitations of WES: inadequate coverage for some genes [34], inability to call structural variants [35], and missed non-coding variants [36] as well as some high-quality coding SNVs [37]. WGS has paved the way for the diagnosis of such difficult-to-diagnose disorders. In the present study, we identified a novel large deletion of 5,296 bp encompassing exons 3–5 of the BTK gene in four patients. In a recent study, Thaventhiran and colleagues performed WGS analysis on 1,318 PID patients and identified eight structural variants which could have been missed by WES [38]. WGS has also been used to infer the breakpoints of de novo 3 Mb deletions in two probands affected by DiGeorge syndrome [39].

XLA is caused by pathogenic variants in the BTK gene, which encodes a cytosolic tyrosine kinase composed of five domains, i.e. the pleckstrin homology (PH), Tec homology (TH), Src homology 2 (SH2), Src homology 3 (SH3), and kinase (catalytic) domains [40]. In our patients, the 5,296 bp deletion spans exons 3–5 of the BTK gene. Exon 3 is an integral part of the PH domain, which regulates the binding of molecules like inositol 1,3,4,5-tetrakisphosphate (IP4), inositol 1,3,4,5,6-pentakisphosphate (IP5), and inositol 1,2,3,4,5,6-hexakisphosphate (IP6), which in turn activates BTK function [41]. Multiple mutations have been reported in exon 3 which affect the functionality of the PH domain [42] and cause XLA. Mutations in exon 4, which encodes the C-terminal immunoglobulin domain [43], and in exon 5 have also been reported in a patient with decreased Btk expression and CD19+ B cell number [44].

Early diagnosis is imperative to avoid morbidity, mortality, and vaccination with oral polio vaccine (OPV), since these children are at risk for paralytic poliomyelitis [45]. In the family studied, 22 male children had died due to infections, as shown in Fig 1. These individuals might have harboured the same hemizygous deletion, and early diagnosis could have been life-saving. Using a low-cost MLPA-based assay to screen additional family members, we found seven female carriers of the BTK exon 3–5 deletion. Unfortunately, we were able to test only 17 out of 159 individuals, but from the pedigree we predict that more than half a dozen further family members could be carriers. Carrier detection in families affected by Mendelian disorders will enable genetic counseling and antenatal diagnosis, ultimately resulting in a reduced disease burden [46, 47]. MLPA has been used to screen for multiple diseases, as well as for prenatal screening, because it is swift, highly sensitive, and cost-effective [48, 49]. Both children in whom the diagnosis was delayed and who were not on regular IVIG prophylaxis (P1 and P5) developed permanent lung damage.

There are reports of successful HSCT for XLA, but it has not been used extensively since immunoglobulin replacement is widely available. However, when the cost and availability of lifelong prophylaxis are limiting factors, parents have chosen HSCT over lifelong immunoglobulin replacement [50, 51]. This was the case for patients P2 and P4.

While the mean age at onset was 7 months, the mean age at diagnosis was 44.6 months, a delay of 37.6 months, resulting in multiple hospital admissions for treatment of infections. Clinician and patient/parent education would help to ensure early diagnosis, enhance compliance with treatment, and prevent poor outcomes such as those that occurred in P1 and P5. Transitioning to adult services is also a challenge [52]. Accurate genetic workup and counseling of the extended family will reduce the burden of care [53].
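The means quoted above follow from the ages in Table 1 once they are converted to months. The short calculation below reproduces them; the per-patient month values are conversions of the tabulated ages (for example, 1 year 7 months becomes 19 months).

```python
# Ages from Table 1, converted to months.
onset_months = {"P1": 12, "P2": 5, "P3": 7, "P4": 5, "P5": 6}
diagnosis_months = {"P1": 84, "P2": 19, "P3": 18, "P4": 6, "P5": 96}

def mean(values):
    """Arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)

mean_onset = round(mean(onset_months.values()), 1)          # mean age at onset
mean_diagnosis = round(mean(diagnosis_months.values()), 1)  # mean age at diagnosis
mean_delay = round(mean_diagnosis - mean_onset, 1)          # diagnostic delay

print(mean_onset, mean_diagnosis, mean_delay)
```

The per-patient delays range widely (from 1 month for P4 to 90 months for P5), so the mean is driven largely by the two late-diagnosed patients.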

The combination of WGS and a low-cost MLPA assay is a replicable approach for reducing the disease burden of affected families and the community at large. The effectiveness of this approach hinges on the availability and accessibility of a system for genetic counseling that the community would accept.

Supporting information

S1 Fig. Visualization of putative large deletion encompassing BTK gene exon 3–5 of whole exome data of patients P1, P2, P4, P5, and control sample.

(TIF)

S2 Fig. MLPA for detection of the deletion encompassing exons 3–5 of the BTK gene in the additional family members of the proband.

Representation of ratio charts generated using the Coffalyser.net software. The vertical axis represents the final ratios after inter- and intra-normalisation of the probe ratios. The horizontal axis represents the probe names along with their lengths (the axis titles have been manually enlarged for clarity). The blue and red horizontal lines depict the arbitrary borders of ratio 1.3 and 0.7 respectively. The black and red dots denote the final ratio obtained for each of the probes, and the vertical bars represent the 95% confidence range for each probe. The test probes are for the BTK gene and the rest are reference probes. The Roman and Arabic numerals on top of each ratio chart identify the individual as marked in the pedigree in Fig 1.

(TIF)
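As a rough illustration of how the ratio borders described for S2 Fig could be read, the sketch below sorts final probe ratios into bands using the 0.7 and 1.3 borders named in the caption. The function names, band labels, probe names, and example ratios are hypothetical and are not taken from the study data; Coffalyser.net performs its own normalisation and statistics, which this sketch does not reproduce.

```python
# Minimal sketch: band MLPA final ratios by the borders drawn in the ratio charts.
LOWER_BORDER = 0.7  # red horizontal line in the caption
UPPER_BORDER = 1.3  # blue horizontal line in the caption


def band_ratio(ratio, lower=LOWER_BORDER, upper=UPPER_BORDER):
    """Return a coarse band label for one final probe ratio (labels are hypothetical)."""
    if ratio < lower:
        return "below_lower_border"
    if ratio > upper:
        return "above_upper_border"
    return "within_borders"


def band_probes(ratios):
    """Apply band_ratio to a mapping of probe name -> final ratio."""
    return {probe: band_ratio(value) for probe, value in ratios.items()}


# Hypothetical values for illustration only; not study data.
example = {"BTK_exon3": 0.52, "BTK_exon4": 0.49, "reference_1": 1.02}
bands = band_probes(example)
print(bands)
```

Under the usual reading of MLPA ratios, test probes falling below the lower border while reference probes stay within the borders would be consistent with a reduced copy number at the test locus; that interpretation is an assumption of this sketch and would need the confidence ranges shown in the charts to be taken into account.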

S3 Fig. South Asian ancestry predicted for the locus.

A) 50MB upstream and 5KB downstream, B) 500KB upstream and 500KB downstream, and C) 50KB upstream and 50KB downstream of the locus chrX:100,624,323–100,629,619 (hg19/GRCh37) for two affected first cousins (V14 and V23), compared with 2504 individuals from five major populations (AFR-African, AMR-American, EAS-East Asian, EUR-European, and SAS-South Asian) of the 1000 Genomes Project and 44 individuals of Qatari ancestry.

(TIF)
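The coordinates in the S3 Fig caption can be checked with simple interval arithmetic. The sketch below computes the span of the deletion and a flanking window around it; the helper names are hypothetical, and treating 1 KB as 1,000 bp is an assumption, since the caption does not define the unit.

```python
# Minimal sketch: deletion span and flanking windows (hg19/GRCh37 coordinates from the caption).
DEL_START = 100_624_323
DEL_END = 100_629_619


def deletion_span(start, end):
    """Span computed as end minus start, which matches the reported 5,296 bp."""
    return end - start


def flanking_window(start, end, upstream, downstream):
    """Hypothetical helper: extend an interval on both sides, clamping the left edge at 1."""
    return max(1, start - upstream), end + downstream


span = deletion_span(DEL_START, DEL_END)
# Panel C of the caption: 50KB on each side (assuming 1 KB = 1,000 bp).
window_c = flanking_window(DEL_START, DEL_END, 50_000, 50_000)
print(span, window_c)
```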

S1 Table. Whole exome sequencing data summary for each patient.

(PDF)

S2 Table. Whole genome sequencing data summary for P1 and P4.

(PDF)

S3 Table. MLPA copy number ratios for each of the reference and test probes of the proband and all the extended family members tested.

The Roman and Arabic numerals on top of each ratio chart identify the individual as marked in the pedigree in Fig 1.

(PDF)

S1 File. Detailed clinical features of the XLA large deletion family.

(DOCX)

Acknowledgments

We thank Disha Sharma, Bani Jolly, Mukta Poojary, and Afra Shamnath for suggestions that enriched the manuscript.

Data Availability

The data involved in the study will be provided upon request, as they are potentially identifiable and contain sensitive patient information. Access to the data can be requested by emailing Dr. Jyoti Yadav (j.yadav@igib.res.in), who is the convener of the Institutional Human Ethics Committee of CSIR-IGIB.

Funding Statement

The work was supported by the Council of Scientific and Industrial Research (CSIR), India, through grant number MLP1801 (RareGen-CSIR India); the Science & Engineering Research Board (SERB) through grant number EMR/2016/006828/HS (SERB, DST); and the Foundation for Primary Immunodeficiency Diseases (FPID, USA).

References

  • 1. Tsukada S, Saffran DC, Rawlings DJ, Parolini O, Allen RC, Klisak I, et al. Deficient expression of a B cell cytoplasmic tyrosine kinase in human X-linked agammaglobulinemia. Cell. 1993. Jan 29;72(2):279–90. doi: 10.1016/0092-8674(93)90667-f
  • 2. Maas A, Hendriks RW. Role of Bruton’s tyrosine kinase in B cell development. Dev Immunol. 2001;8(3–4):171–81. doi: 10.1155/2001/28962
  • 3. Moin M, Aghamohammadi A, Farhoudi A, Pourpak Z, Rezaei N, Movahedi M, et al. X-linked agammaglobulinemia: a survey of 33 Iranian patients. Immunol Invest. 2004. Feb;33(1):81–93. doi: 10.1081/imm-120027687
  • 4. Tsukada S, Saffran DC, Rawlings DJ, Parolini O, Allen RC, Klisak I, et al. Deficient expression of a B cell cytoplasmic tyrosine kinase in human X-linked agammaglobulinemia. Cell. 1993. Jan 29;72(2):279–90. doi: 10.1016/0092-8674(93)90667-f
  • 5. Winkelstein JA, Marino MC, Lederman HM, Jones SM, Sullivan K, Burks AW, et al. X-linked agammaglobulinemia: report on a United States registry of 201 patients. Medicine. 2006. Jul;85(4):193–202. doi: 10.1097/01.md.0000229482.27398.ad
  • 6. Zwebb, www.zwebb.com. ESID - European Society for Immunodeficiencies [Internet]. [cited 2020 Sep 12]. Available from: https://esid.org/Working-Parties/Clinical-Working-Party/Resources/Diagnostic-criteria-for-PID2#Q15
  • 7. Väliaho J, Smith CIE, Vihinen M. BTKbase: the mutation database for X-linked agammaglobulinemia. Hum Mutat. 2006. Dec;27(12):1209–17. doi: 10.1002/humu.20410
  • 8. Conley ME, Broides A, Hernandez-Trujillo V, Howard V, Kanegane H, Miyawaki T, et al. Genetic analysis of patients with defects in early B-cell development. Immunol Rev. 2005. Feb;203:216–34. doi: 10.1111/j.0105-2896.2005.00233.x
  • 9. van Zelm MC, Geertsema C, Nieuwenhuis N, de Ridder D, Conley ME, Schiff C, et al. Gross deletions involving IGHM, BTK, or Artemis: a model for genomic lesions mediated by transposable elements. Am J Hum Genet. 2008. Feb;82(2):320–32. doi: 10.1016/j.ajhg.2007.10.011
  • 10. Jo E-K, Wang Y, Kanegane H, Futatani T, Song C-H, Park J-K, et al. Identification of mutations in the Bruton’s tyrosine kinase gene, including a novel genomic rearrangements resulting in large deletion, in Korean X-linked agammaglobulinemia patients. J Hum Genet. 2003. May 24;48(6):322–6. doi: 10.1007/s10038-003-0032-4
  • 11. Richter D, Conley ME, Rohrer J, Myers LA, Zahradka K, Kelecić J, et al. A contiguous deletion syndrome of X-linked agammaglobulinemia and sensorineural deafness. Pediatr Allergy Immunol. 2001. Apr;12(2):107–11. doi: 10.1034/j.1399-3038.2001.0129999107.x
  • 12. Miller SA, Dykes DD, Polesky HF. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 1988. Feb 11;16(3):1215. doi: 10.1093/nar/16.3.1215
  • 13. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014. Aug 1;30(15):2114–20. doi: 10.1093/bioinformatics/btu170
  • 14. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009. Jul 15;25(14):1754–60. doi: 10.1093/bioinformatics/btp324
  • 15. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009. Aug 15;25(16):2078–9. doi: 10.1093/bioinformatics/btp352
  • 16. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011. May;43(5):491–8. doi: 10.1038/ng.806
  • 17. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010. Sep;38(16):e164. doi: 10.1093/nar/gkq603
  • 18. Bousfiha A, Jeddane L, Picard C, Al-Herz W, Ailal F, Chatila T, et al. Human Inborn Errors of Immunity: 2019 Update of the IUIS Phenotypical Classification. J Clin Immunol. 2020. Jan;40(1):66–81. doi: 10.1007/s10875-020-00758-x
  • 19. Gilissen C, Hoischen A, Brunner HG, Veltman JA. Disease gene identification strategies for exome sequencing. Eur J Hum Genet. 2012. May;20(5):490–7. doi: 10.1038/ejhg.2011.258
  • 20. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019. Jan 8;47(D1):D886–94. doi: 10.1093/nar/gky1016
  • 21. Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014. Jun 26;15(6):R84. doi: 10.1186/gb-2014-15-6-r84
  • 22. Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP, et al. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Methods. 2015. Oct;12(10):966–8. doi: 10.1038/nmeth.3505
  • 23. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative genomics viewer. Nat Biotechnol. 2011. Jan;29(1):24–6. doi: 10.1038/nbt.1754
  • 24. Naamane H, El Maataoui O, Ailal F, Barakat A, Bennani S, Najib J, et al. The 752delG26 mutation in the RFXANK gene associated with major histocompatibility complex class II deficiency: evidence for a founder effect in the Moroccan population. Eur J Pediatr. 2010. Sep;169(9):1069–74. doi: 10.1007/s00431-010-1179-6
  • 25. Sorte HS, Osnes LT, Fevang B, Aukrust P, Erichsen HC, Backe PH, et al. A potential founder variant in in three Norwegian families with warts, molluscum contagiosum, and T-cell dysfunction. Mol Genet Genomic Med. 2016. Nov;4(6):604–16. doi: 10.1002/mgg3.237
  • 26. Lawson DJ, Hellenthal G, Myers S, Falush D. Inference of population structure using dense haplotype data. PLoS Genet. 2012. Jan;8(1):e1002453. doi: 10.1371/journal.pgen.1002453
  • 27. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015. Oct 1;526(7571):68–74. doi: 10.1038/nature15393
  • 28. Delaneau O, Coulonges C, Zagury J-F. Shape-IT: new rapid and accurate algorithm for haplotype inference. BMC Bioinformatics. 2008. Dec 16;9:540. doi: 10.1186/1471-2105-9-540
  • 29. Bruton OC. Agammaglobulinemia. Pediatrics. 1952. Jun;9(6):722–8.
  • 30. Soresina A, Nacinovich R, Bomba M, Cassani M, Molinaro A, Sciotto A, et al. The quality of life of children and adolescents with X-linked agammaglobulinemia. J Clin Immunol. 2009. Jul;29(4):501–7. doi: 10.1007/s10875-008-9270-8
  • 31. Zhang Y-N, Gao Y-Y, Yang S-D, Cao B-B, Zheng K-L, Wei P, et al. Delayed diagnosis of X-linked agammaglobulinaemia in a boy with recurrent meningitis. BMC Neurol. 2019. Dec 12;19(1):320. doi: 10.1186/s12883-019-1536-7
  • 32. Preece K, Lear G. X-linked Agammaglobulinemia With Normal Immunoglobulin and Near-Normal Vaccine Seroconversion. Pediatrics. 2015. Dec;136(6):e1621–4. doi: 10.1542/peds.2014-3907
  • 33. Lim L-M, Chang J-M, Wang I-F, Chang W-C, Hwang D-Y, Chen H-C. Atypical X-linked agammaglobulinaemia caused by a novel BTK mutation in a selective immunoglobulin M deficiency patient. BMC Pediatr. 2013. Sep 27;13:150. doi: 10.1186/1471-2431-13-150
  • 34. Sheppard S, Biswas S, Li MH, Jayaraman V, Slack I, Romasko EJ, et al. Utility and limitations of exome sequencing as a genetic diagnostic tool for children with hearing loss. Genet Med. 2018. Dec;20(12):1663–76. doi: 10.1038/s41436-018-0004-x
  • 35. Biesecker LG, Shianna KV, Mullikin JC. Exome sequencing: the expert view. Genome Biol. 2011. Sep 14;12(9):128. doi: 10.1186/gb-2011-12-9-128
  • 36. Chou J, Ohsumi TK, Geha RS. Use of whole exome and genome sequencing in the identification of genetic causes of primary immunodeficiencies. Curr Opin Allergy Clin Immunol. 2012. Dec;12(6):623–8. doi: 10.1097/ACI.0b013e3283588ca6
  • 37. Belkadi A, Bolze A, Itan Y, Cobat A, Vincent QB, Antipenko A, et al. Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proc Natl Acad Sci U S A. 2015. Apr 28;112(17):5473–8. doi: 10.1073/pnas.1418631112
  • 38. Thaventhiran JED, Allen HL, Burren OS, Rae W, Greene D, Staples E, et al. Whole-genome sequencing of a sporadic primary immunodeficiency cohort. Nature. 2020. May 6;583(7814):90–5. doi: 10.1038/s41586-020-2265-1
  • 39. Guo X, Delio M, Haque N, Castellanos R, Hestand MS, Vermeesch JR, et al. Variant discovery and breakpoint region prediction for studying the human 22q11.2 deletion using BAC clone and whole genome sequencing analysis. Hum Mol Genet. 2016. Sep 1;25(17):3754–67. doi: 10.1093/hmg/ddw221
  • 40. Mohamed AJ, Yu L, Bäckesjö C-M, Vargas L, Faryal R, Aints A, et al. Bruton’s tyrosine kinase (Btk): function, regulation, and transformation with special emphasis on the PH domain. Immunol Rev. 2009. Mar;228(1):58–73. doi: 10.1111/j.1600-065X.2008.00741.x
  • 41. Fukuda M, Kojima T, Kabayama H, Mikoshiba K. Mutation of the pleckstrin homology domain of Bruton’s tyrosine kinase in immunodeficiency impaired inositol 1,3,4,5-tetrakisphosphate binding capacity. J Biol Chem. 1996. Nov 29;271(48):30303–6. doi: 10.1074/jbc.271.48.30303
  • 42. Yip KL, Chan SY, Ip WK, Lau YL. Bruton’s tyrosine kinase mutations in 8 Chinese families with X-linked agammaglobulinemia. Hum Mutat. 2000. Apr;15(4):385.
  • 43. Ochs HD, Smith CIE, Puck JM, editors. Primary Immunodeficiency Diseases. Oxford University Press; 2013.
  • 44. Teocchi MA, Domingues Ramalho V, Abramczuk BM, D’Souza-Li L, Santos Vilela MM. BTK mutations selectively regulate BTK expression and upregulate monocyte XBP1 mRNA in XLA patients. Immun Inflamm Dis. 2015. Sep;3(3):171–81. doi: 10.1002/iid3.57
  • 45. Wright PF, Hatch MH, Kasselberg AG, Lowry SP, Wadlington WB, Karzon DT. Vaccine-associated poliomyelitis in a child with sex-linked agammaglobulinemia. J Pediatr. 1977. Sep;91(3):408–12. doi: 10.1016/s0022-3476(77)81309-7
  • 46. van Maarle MC, Stouthard MEA, Bonsel GJ. Quality of life in a family based genetic cascade screening programme for familial hypercholesterolaemia: a longitudinal study among participants. J Med Genet. 2003. Jan;40(1):e3. doi: 10.1136/jmg.40.1.e3
  • 47. Ahmed S, Saleem M, Modell B, Petrou M. Screening extended families for genetic hemoglobin disorders in Pakistan. N Engl J Med. 2002. Oct 10;347(15):1162–8. doi: 10.1056/NEJMsa013234
  • 48. Madrigal I, Rodríguez-Revenga L, Badenas C, Sánchez A, Martinez F, Fernandez I, et al. MLPA as first screening method for the detection of microduplications and microdeletions in patients with X-linked mental retardation. Genet Med. 2007. Feb;9(2):117–22. doi: 10.1097/gim.0b013e318031206e
  • 49. Lalic T, Vossen RHAM, Coffa J, Schouten JP, Guc-Scekic M, Radivojevic D, et al. Deletion and duplication screening in the DMD gene using MLPA. Eur J Hum Genet. 2005. Nov;13(11):1231–4. doi: 10.1038/sj.ejhg.5201465
  • 50. Vellaichamy Swaminathan V, Uppuluri R, Patel S, Melarcode Ramanan K, Ravichandran N, Jayakumar I, et al. Treosulfan-based reduced toxicity hematopoietic stem cell transplantation in X-linked agammaglobulinemia: A cost-effective alternative to long-term immunoglobulin replacement in developing countries. Pediatr Transplant. 2020. Feb;24(1):e13625. doi: 10.1111/petr.13625
  • 51. Ikegame K, Imai K, Yamashita M, Hoshino A, Kanegane H, Morio T, et al. Allogeneic stem cell transplantation for X-linked agammaglobulinemia using reduced intensity conditioning as a model of the reconstitution of humoral immunity. J Hematol Oncol. 2016. Feb 13;9:9. doi: 10.1186/s13045-016-0240-y
  • 52. El-Sayed ZA, Abramova I, Aldave JC, Al-Herz W, Bezrodnik L, Boukari R, et al. X-linked agammaglobulinemia (XLA): Phenotype, diagnosis, and therapeutic challenges around the world. World Allergy Organ J. 2019. Mar 22;12(3):100018. doi: 10.1016/j.waojou.2019.100018
  • 53. Rudilla F, Franco-Jarava C, Martínez-Gallo M, Garcia-Prat M, Martín-Nalda A, Rivière J, et al. Expanding the Clinical and Genetic Spectra of Primary Immunodeficiency-Related Disorders With Clinical Exome Sequencing: Expected and Unexpected Findings. Front Immunol. 2019. Oct 1;10:2325. doi: 10.3389/fimmu.2019.02325

Decision Letter 0

Obul Reddy Bandapalli

16 Mar 2021

PONE-D-20-38741

Whole genome sequencing identifies novel structural variant in a large Indian family affected with X-linked agammaglobulinemia

PLOS ONE

Dear Dr. Scaria,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Based on internal evaluation and external review, this manuscript was found to be interesting. However, the reviewers raised points that need to be addressed carefully. In particular, the claim that WES did not yield any causal variants was not well received by one of the reviewers, who suggested looking for CNVs before drawing that conclusion. Please try to address all of these points by performing the analysis or by writing a rebuttal that puts forward your arguments.

Please submit your revised manuscript by Apr 23 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Obul Reddy Bandapalli, MSc, PhD

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for including your ethics statement:  "The study was approved by the Institutional Ethics Committee (Ethics No. GMCKKD/RP2017/IEC/147).".   

a. Please amend your current ethics statement to include the full name of the ethics committee/institutional review board(s) that approved your specific study.

b. Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research

3. Please provide additional details regarding participant consent.

In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed).

If your study included minors, state whether you obtained consent from parents or guardians.

If the need for consent was waived by the ethics committee, please include this information.

4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors report the detection of a large deletion in the BTK gene in a large family with X-linked agammaglobulinemia. More than 500 mutations in this gene have previously been identified to cause the disease, so the paper adds one more mutation to the list. The paper is clearly written and the data analysis is well described. The authors have attempted to show that WGS is needed for identifying the pathogenic variant, however, the data presented does not support this claim (see below).

My main criticism is that the authors performed whole-exome sequencing on four affected individuals but say that they did not find "any causal variant using whole-exome sequencing". However, this is not well justified since the authors only looked at small variants identified from exome data. Given the strong association of mutations in BTK with the phenotype, the authors could easily have looked for copy number variants specifically in the BTK gene (the deletion would easily be seen as zero read depth on the X chromosome in males). There are many tools for detection of CNVs from exome data that could also have been used. Simply visualizing read depth using the IGV viewer can reveal deletions that overlap coding regions. Therefore, the claim that WGS was needed to detect the deletion in the BTK gene is not supported. WGS is definitely beneficial for identifying the precise breakpoints but exome sequencing is still 3-4 times cheaper than WGS. I would recommend that the authors report the analysis of read depth from exome data.

The rationale for performing chromosome painting analysis of the region around the deletion is not clear to me. The large deletion identified in this family cannot be expected to be seen in controls since it would cause a severe disease in males. Therefore, it cannot be expected to have a specific population origin. The individuals studied in this paper are from India and it is not surprising that they have South Asian ancestry. This should be true for their entire genome, rather than just the region around the deletion.

Reviewer #2: A. Is the manuscript technically sound, and the data support the conclusions?

The manuscript by Jain et al. describes the identification of a large deletion (5,296 bp) encompassing exons 3-5 of the BTK gene in patients with X-linked agammaglobulinemia (XLA) using whole-genome sequencing (WGS). The authors have performed whole-exome sequencing (WES) and WGS of four and two patients with XLA, respectively. Also, the authors have performed several assays, including a multiplex ligation-dependent probe amplification (MLPA) assay. The manuscript can be accepted after the following minor concerns have been addressed.

In abstract section, the authors must rephrase the following sentence as they concluded (in the results section) that they could not find any variant which could correlate with the clinical features: “Whole genome sequencing led to identification of the accurate genetic mutation which could help in early diagnosis leading to improved outcomes, prevention of permanent organ damage and improved quality of life, as well as enabling prenatal diagnosis”.

In the methodology section the authors can use multiple algorithms for the identification of alterations including copy number alterations using WES and WGS data of the patients with XLA.

B. Has the statistical analysis been performed appropriately and rigorously?

The statistical analysis has not been performed by the authors.

C. Have the authors made all data underlying the findings in their manuscript fully available?

The authors have provided analyzed data in the figures/tables and supplementary information in the manuscript wherever necessary. The raw data can be accessed through the convener of the Institutional Human Ethics Committee of the institute.

D. Is the manuscript presented in an intelligible fashion and written in standard English?

Overall, the manuscript is adequately written; however, the authors could rephrase some sentences in the methodology section, for example, “small indels, small insertion, and small deletion”.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jul 12;16(7):e0254407. doi: 10.1371/journal.pone.0254407.r002

Author response to Decision Letter 0


13 May 2021

Review Comments to the Author


Reviewer #1: The authors report the detection of a large deletion in the BTK gene in a large family with X-linked agammaglobulinemia. More than 500 mutations in this gene have previously been identified to cause the disease, so the paper adds one more mutation to the list. The paper is clearly written and the data analysis is well described. The authors have attempted to show that WGS is needed for identifying the pathogenic variant, however, the data presented does not support this claim (see below).

My main criticism is that the authors performed whole-exome sequencing on four affected individuals but say that they did not find "any causal variant using whole-exome sequencing". However, this is not well justified since the authors only looked at small variants identified from exome data. Given the strong association of mutations in BTK with the phenotype, the authors could easily have looked for copy number variants specifically in the BTK gene (the deletion would easily be seen as zero read depth on the X chromosome in males). There are many tools for detection of CNVs from exome data that could also have been used. Simply visualizing read depth using the IGV viewer can reveal deletions that overlap coding regions. Therefore, the claim that WGS was needed to detect the deletion in the BTK gene is not supported. WGS is definitely beneficial for identifying the precise breakpoints but exome sequencing is still 3-4 times cheaper than WGS. I would recommend that the authors report the analysis of read depth from exome data.

Response: We agree with the reviewer that visualizing the whole exome sequencing (WES) reads in IGV could potentially identify the deletion. However, in the case of WES, one cannot be sure whether such a loss of coverage is due to a chromosomal deletion or to the efficiency of the probes. Additionally, the tools used for calling CNVs from WES datasets are not accurate and robust, and they lead to false-positive calls even in normal cases. In this context, WGS is beneficial for identifying the precise breakpoints of the deletion and offers sensitive and specific tools for structural variant calling, given that it is independent of the probes; additionally, the paired-end reads can accurately identify not just the deletion but also the breakpoints. In addition, WGS provides an appropriate basis for further analyses, such as haplotype and ancestry assessments, which are not possible using a WES dataset.
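The read-depth check discussed in this exchange can be sketched as follows. This is a minimal illustration, not the analysis used in the study: it assumes per-base depth in the three-column, tab-separated form produced by `samtools depth` (chromosome, 1-based position, depth) and treats positions missing from the input as zero depth, since that tool omits them unless `-a` is given. The function name and the toy data are hypothetical.

```python
# Minimal sketch: fraction of an interval with zero read depth.
def zero_depth_fraction(depth_lines, chrom, start, end):
    """Fraction of positions in [start, end] (1-based, inclusive) with depth 0.

    Positions absent from depth_lines are counted as zero depth.
    """
    covered = set()
    for line in depth_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and an optional header
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # ignore malformed rows
        name, pos, depth = fields[0], int(fields[1]), int(fields[2])
        if name == chrom and start <= pos <= end and depth > 0:
            covered.add(pos)
    total = end - start + 1
    return (total - len(covered)) / total


# Toy data for illustration only; not study data.
toy = ["chrX\t100\t30", "chrX\t101\t28", "chrX\t102\t0", "chrX\t105\t31"]
frac = zero_depth_fraction(toy, "chrX", 100, 109)
print(frac)
```

In a hemizygous male, a value close to 1 over targeted exons would be consistent with a deletion, but, as the response above notes, loss of coverage in exome data can also reflect probe efficiency, so a comparison against control samples captured with the same kit would still be needed.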

The rationale for performing chromosome painting analysis of the region around the deletion is not clear to me. The large deletion identified in this family cannot be expected to be seen in controls since it would cause a severe disease in males. Therefore, it cannot be expected to have a specific population origin. The individuals studied in this paper are from India and it is not surprising that they have South Asian ancestry. This should be true for their entire genome, rather than just the region around the deletion.

Response: We apologize to the reviewer for not making this clear in the manuscript. It has previously been described in the literature that a number of variants in PID have a founder effect and can be mapped to a specific ancestry (PMID: 20414676, 27896283, 32822427). The basic purpose of performing the haplotype ancestry analysis for the patients in our study affected with PID is to identify whether the variant in the South Indian family might have a specific ancestry and to verify whether it could be associated with a founder effect in the Indian population. We have included a statement in the revised manuscript to make this clear for the reader as well.

Reviewer #2: A. Is the manuscript technically sound, and the data support the conclusions?

The manuscript by Jain et al. describes the identification of a large deletion (5,296 bp) encompassing exons 3-5 of the BTK gene in patients with X-linked agammaglobulinemia (XLA) using whole-genome sequencing (WGS). The authors have performed whole-exome sequencing (WES) and WGS of four and two patients with XLA, respectively. The authors have also performed several assays, including a multiplex ligation-dependent probe amplification (MLPA) assay. The manuscript can be accepted after the following minor concerns have been addressed.

In the abstract section, the authors must rephrase the following sentence, as they concluded (in the results section) that they could not find any variant which could correlate with the clinical features: “Whole genome sequencing led to identification of the accurate genetic mutation which could help in early diagnosis leading to improved outcomes, prevention of permanent organ damage and improved quality of life, as well as enabling prenatal diagnosis”.

Response: We thank the reviewer for the comment. We have included the statement in the Abstract section that “We could not identify any variant from the WES dataset that correlates with the clinical feature of the patient”. We have also rephrased the concluding statement of the Abstract section.

In the methodology section the authors can use multiple algorithms for the identification of alterations including copy number alterations using WES and WGS data of the patients with XLA.

Response: We completely agree with the reviewer on using multiple tools for identification of CNVs from WES and WGS data. However, the tools for CNV calling from WES datasets are not robust and can result in false-negative CNV calls. For WGS datasets, there are multiple tools with high sensitivity and specificity for CNV calling. In our study, we used LUMPY (version 0.2.13) for structural variant calling, which identified the large deletion in our patient.

B. Has the statistical analysis been performed appropriately and rigorously?

The statistical analysis has not been performed by the authors.

Response: Since the study involved individual cases, no statistical analysis was performed.

C. Have the authors made all data underlying the findings in their manuscript fully available?

The authors have provided analyzed data in the figures/tables and supplementary information in the manuscript wherever necessary. The raw data can be accessed through the convener of the Institutional Human Ethics Committee of the institute.

Response: Since the WGS/WES constitute data that can be potentially identifiable and contains patient sensitive information, the data is available only on request. Access to the data could be requested by mailing to Dr. Jyoti Yadav (j.yadav@igib.res.in) who is the convener of the Institutional Human Ethics Committee of CSIR-IGIB.

D. Is the manuscript presented in an intelligible fashion and written in standard English?

Overall, the manuscript is adequately written; however, the authors can rephrase some sentences in the methodology section, for example, “small indels, small insertion, and small deletion”.

Response: We thank the reviewer. We have rephrased the statement by removing “small indels”.

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 1

Obul Reddy Bandapalli

28 Jun 2021

Whole genome sequencing identifies novel structural variant in a large Indian family affected with X-linked agammaglobulinemia.

PONE-D-20-38741R1

Dear Dr. Scaria,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Obul Reddy Bandapalli, MSc, PhD

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Obul Reddy Bandapalli

1 Jul 2021

PONE-D-20-38741R1

Whole genome sequencing identifies novel structural variant in a large Indian family affected with X-linked agammaglobulinemia

Dear Dr. Scaria:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Obul Reddy Bandapalli

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Visualization of putative large deletion encompassing BTK gene exon 3–5 of whole exome data of patients P1, P2, P4, P5, and control sample.

    (TIF)

    S2 Fig. MLPA for detection of deletion encompassing 3–5 exons of the BTK gene in the additional family members of the proband.

    Representation of ratio charts generated using the Coffalyser.net software. Longitudinal axis represents the final ratios after inter and intra normalisations of the probe ratios. Horizontal axis represents the probe names along with the length (The axis titles have been manually enlarged for clarity). The blue and red horizontal lines depict the arbitrary borders of ratio 1.3 and 0.7 respectively. The black and red dots denote the final ratio obtained for each of the probes and the vertical bars represent the 95% confidence range for each probe. The test probes are of BTK gene and the rest are the reference probes. The Roman and numeric numbers on top of each ratio chart represent the individual marked as per the pedigree in Fig 1.

    (TIF)
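The ratio borders described for S2 Fig can be applied programmatically to exported probe ratios. This is a minimal sketch, assuming the final normalised ratios are already available as plain numbers; the function name, probe labels, and example ratios are hypothetical and are not taken from S3 Table.

```python
LOWER_BORDER = 0.7  # red horizontal line in the ratio charts
UPPER_BORDER = 1.3  # blue horizontal line in the ratio charts


def classify_ratio(ratio: float) -> str:
    """Place a final MLPA probe ratio relative to the two borders."""
    if ratio < LOWER_BORDER:
        return "below lower border"
    if ratio > UPPER_BORDER:
        return "above upper border"
    return "within borders"


# Hypothetical ratios for illustration only.
example_ratios = {"BTK_exon3": 0.52, "BTK_exon6": 1.01, "reference_1": 0.98}
calls = {probe: classify_ratio(r) for probe, r in example_ratios.items()}
print(calls)
```

A test-probe ratio below the lower border is the kind of signal that would prompt a closer look for a deletion at that probe; the 95% confidence range shown in the charts is not modelled here.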

    S3 Fig. South Asian ancestry predicted for locus.

    A) 50MB upstream and 5KB downstream, B) 500KB upstream and 500KB downstream, and C) 50KB upstream and 50KB downstream of the locus chrX:100,624,323–100,629,619 (hg19/GRCh37) of two affected first cousins (V14 and V23) with 2504 individuals of five major populations (AFR-African, AMR-American, EAS-East Asian, EUR-European, and SAS-South Asian) of the 1000 Genomes Project and 44 individuals of Qatar ancestry.

    (TIF)
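The symmetric windows in panels B and C follow directly from the deletion coordinates. As a minimal sketch, the span of the deletion and the flanking windows can be computed as below; the helper name is hypothetical, and the coordinates are the hg19/GRCh37 positions quoted in the caption.

```python
from typing import Tuple

DEL_START = 100_624_323  # chrX, hg19/GRCh37
DEL_END = 100_629_619


def flanking_window(start: int, end: int, flank: int) -> Tuple[int, int]:
    """Extend an interval by `flank` bases on each side, clamped at position 1."""
    return (max(1, start - flank), end + flank)


# Difference between the two reported coordinates.
deletion_span = DEL_END - DEL_START

window_b = flanking_window(DEL_START, DEL_END, 500_000)  # panel B: 500 kb each side
window_c = flanking_window(DEL_START, DEL_END, 50_000)   # panel C: 50 kb each side
print(deletion_span, window_b, window_c)
```

The computed span matches the 5,296 bp deletion size reported in the abstract.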

    S1 Table. Whole exome sequencing data summary for each patient.

    (PDF)

    S2 Table. Whole genome sequencing data summary for P1 and P4.

    (PDF)

    S3 Table. MLPA copy number ratios for each of the reference and test probes of the proband and all the extended family members tested.

    The roman and numeric numbers on top of each ratio chart represent the individual marked as per the pedigree in Fig 1.

    (PDF)

    S1 File. Detailed clinical feature of XLA large deletion family.

    (DOCX)

    Data Availability Statement

    The data involved in the study will be provided upon request, as it could be potentially identifiable and contains patient sensitive information. Access to the data could be requested by mailing to Dr. Jyoti Yadav (j.yadav@igib.res.in) who is the convener of the Institutional Human Ethics Committee of CSIR-IGIB.


    Articles from PLoS ONE are provided here courtesy of PLOS
