PLOS One. 2020 Apr 15;15(4):e0231651. doi: 10.1371/journal.pone.0231651

Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil

Juliana Carla Gomes Rodrigues 1, Tatiane Piedade de Souza 1, Lucas Favacho Pastana 1, André Maurício Ribeiro dos Santos 2, Marianne Rodrigues Fernandes 1, Pablo Pinto 1,2, Alayde Vieira Wanderley 3, Sandro José de Souza 4, José Eduardo Kroll 4, Adenilson Leão Pereira 2, Leandro Magalhães 2, Laís Reis das Mercês 2, Amanda Ferreira Vidal 2, Tatiana Vinasco-Sandoval 2, Giovanna Chaves Cavalcante 2, João Farias Guerreiro 2, Paulo Pimentel de Assumpção 1, Ândrea Ribeiro-dos-Santos 1,2, Sidney Santos 1,2, Ney Pereira Carneiro dos Santos 1,2,*
Editor: Francesc Calafell5
PMCID: PMC7159207  PMID: 32294118

Abstract

Introduction

The nudix hydrolase 15 (NUDT15) gene acts in the metabolism of thiopurines by catabolizing the active metabolite thioguanosine triphosphate into its inactive form, thioguanosine monophosphate. The frequency of alternative NUDT15 alleles, in particular those that cause a drastic loss of gene function, varies widely among geographically distinct populations. In the general population of northern Brazil, high toxicity rates (65%) have been recorded in patients treated with the standard protocol for acute lymphoblastic leukemia, which involves thiopurine-based drugs. The present study characterized the molecular profile of the coding region of the NUDT15 gene in two groups, non-admixed Amerindians and admixed individuals from the Amazon region of northern Brazil.

Methods

The entire NUDT15 gene was sequenced in 64 Amerindians from 12 Amazonian groups and 82 admixed individuals from northern Brazil. The DNA was extracted using phenol-chloroform. The exome libraries were prepared using the Nextera Rapid Capture Exome (Illumina) and SureSelect Human All Exon V6 (Agilent) kits. The allelic variants were annotated in the ViVa® (Viewer of Variants) software.

Results

Four NUDT15 variants were identified: rs374594155, rs1272632214, rs147390019, and rs116855232. The variants rs1272632214 and rs116855232 were in complete linkage disequilibrium, and were assigned to the NUDT15*2 haplotype. These variants had high frequencies in both our study populations in comparison with other populations catalogued in the 1000 Genomes database. We also identified the NUDT15*4 haplotype in our study populations, at frequencies similar to those reported in other populations from around the world.

Conclusion

Our findings indicate that Amerindian and admixed populations from northern Brazil have high frequencies of the NUDT15 haplotypes that alter the metabolism profile of thiopurines.

Introduction

The analysis of the polymorphisms of the Drug Absorption, Distribution, Metabolism, and Excretion (ADME) genes has been commonly employed in personalized medicine as a diagnostic tool for the selection of the most appropriate drugs and/or dosage for the treatment of a range of diseases. A number of recent studies [1, 2, 3] have identified varying frequencies in many genetic polymorphisms in genetically distinct populations, which may modulate the prognosis of patients and the therapeutic efficacy of treatments.

One ADME-related gene that represents a major breakthrough in the field of pharmacogenetics is the nudix hydrolase 15 (NUDT15) gene. This gene acts in the metabolism of thiopurine, catabolizing its active metabolite thioguanosine triphosphate (TGTP) into its inactivated form, thioguanosine monophosphate (TGMP) [4]. Studies have shown that genetic variants responsible for the loss of NUDT15 function lead to an increased incorporation of thioguanine nucleotides into the DNA strand, which would result in an exacerbation of the associated cytotoxic effects, culminating in the appearance of hematological toxicities, such as severe myelosuppression [5, 6].

Thiopurine-based drugs, such as 6-mercaptopurine (6-MP) and azathioprine (AZA), are widely used to treat Acute Lymphoblastic Leukemia (ALL) and Inflammatory Bowel Disease (IBD), respectively. However, the efficacy or tolerance of these drugs may vary considerably according to the level of activity of the NUDT15 enzyme in the patient [7, 8]. The American Food and Drug Administration (FDA) recently recommended specific dose adjustments of both 6-MP and AZA according to the NUDT15 genotype profile of the patient [9, 10].

The NUDT15 gene has six main haplotypes (*1-*6), formed by combinations of four variants (p.Val18_Val19insGlyVal, p.Val18Ile, p.Arg139Cys, p.Arg139His), as described by Moriyama and colleagues [11]. In addition, eight actionable diplotypes have been assigned to NUDT15 and sorted by enzymatic activity level: normal (*1/*1), intermediate (*1/*2, *1/*3, *1/*4, *1/*5) and low (*3/*5, *2/*3, *3/*3) [11, 12].
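The diplotype classification above can be written as a simple lookup. The following is a minimal sketch that transcribes only the eight diplotypes listed in the text; the function name and the "not listed" fallback are hypothetical and are not part of any published tool.

```python
# Activity levels for the eight actionable NUDT15 diplotypes listed in the text.
# Keys are stored in sorted order so that allele order does not matter.
ACTIVITY_BY_DIPLOTYPE = {
    ("*1", "*1"): "normal",
    ("*1", "*2"): "intermediate",
    ("*1", "*3"): "intermediate",
    ("*1", "*4"): "intermediate",
    ("*1", "*5"): "intermediate",
    ("*2", "*3"): "low",
    ("*3", "*3"): "low",
    ("*3", "*5"): "low",
}


def classify_diplotype(allele_a, allele_b):
    """Return the activity level for a diplotype (hypothetical helper).

    Diplotypes outside the eight listed above return "not listed"
    rather than a guessed activity level.
    """
    key = tuple(sorted((allele_a, allele_b)))
    return ACTIVITY_BY_DIPLOTYPE.get(key, "not listed")


if __name__ == "__main__":
    for pair in [("*1", "*1"), ("*2", "*1"), ("*5", "*3"), ("*4", "*4")]:
        print(pair, "->", classify_diplotype(*pair))
```

Returning "not listed" for unknown combinations is a deliberate choice in this sketch: it avoids assigning an activity level to diplotypes the text does not classify.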

Data available from international databases (the 1000 Genomes Project, the Exome Aggregation Consortium [ExAC], and the Genome Aggregation Database [gnomAD]) show that the frequencies of alternative NUDT15 alleles, especially those that cause a drastic loss of gene function, vary greatly between geographically distinct populations. A C>T transition in the rs116855232 (p.Arg139Cys) variant is assigned to two haplotypes (NUDT15*2 and NUDT15*3), which present an extreme loss of gene function, associated with the presence of the T allele of this polymorphism. The T allele of rs116855232 is found more commonly in East Asian [13], Latin [14] and Native American [15] populations, but is rare in populations from Europe, Africa, and South Asia.

The NUDT15*9 haplotype is derived from an in-frame deletion in the sequence responsible for the catalytic activity of NUDT15 (rs746071566), and is associated with extreme loss of enzyme function. This deletion has been found exclusively in patients of European or African descent [16].

An in-frame insertion in this same region (rs1272632214; p.Gly17_Val18dup) is assigned to two haplotypes (NUDT15*2 and NUDT15*6), which are both related to a drastic loss of gene function. This insertion has been identified in populations from Guatemala, Singapore, and Japan, and enzymatic assays have shown that this insertion is responsible for a reduction in the catalytic activity of the NUDT15 enzyme [11].

In Brazil, thiopurine-based therapy is part of the standard treatment protocol for ALL. In a recent study developed by our research group on a northern Brazilian population, severe toxicities (grades 3 and 4) were found in 65% of the patients treated with the standard therapeutic protocol for ALL [7]. This toxicity rate is much higher than that observed in other populations (worldwide average = 26%) treated using the same protocol [7, 17].

Previous data demonstrated that Brazilian Native American populations have a high frequency of a polymorphism (rs116855232) that drastically reduces NUDT15 activity, and that these ethnic groups contribute 30%, on average, of the genetic makeup of the admixed populations of the Brazilian Amazon region [15, 18, 19]. Therefore, a possible cause of the severe toxic effects observed in northern Brazil during the standard thiopurine-based treatment protocol is that individuals from this region have higher frequencies of deleterious NUDT15 alleles, resulting from the genomic contribution inherited from Amerindian groups. Our objective is to evaluate the frequency of deleterious alleles of the NUDT15 gene in Amazonian Amerindians and admixed populations of northern Brazil.

To this end, we employed next-generation sequencing (NGS) data obtained by our research group to define the molecular profile of the NUDT15 gene in two samples, one containing 64 individuals from 12 different indigenous groups of the Brazilian Amazon region and the other containing 82 individuals representing the admixed population from the same region.

Methods

Study population

The study population is composed of 64 Amerindians and 82 admixed individuals from the Amazon region of northern Brazil. The Amerindians represent 12 different Amazonian ethnic groups that were grouped together as the Native American (NAM) group. Details such as the name, location, and number of individuals in each ethnic group are described in S1 Table. The 82 admixed individuals live in Belém, a city in northern Brazil whose population, owing to its colonization history, is characterized mainly by three ancestral genetic components: European, Native American, and African. These admixed individuals were enrolled in a broader project to investigate germline mutations in patients with gastric cancer, and are referred to in the present study as the Brazilian Admixed Population (BAP). The present study was approved by the National Committee for Ethics in Research (CONEP) and the Research Ethics Committee of the UFPA Tropical Medicine Center, under CAAE number 20654313.6.0000.5172. All participants, as well as the tribe leaders, signed a free and informed consent form, and, when necessary, a translator explained the project and its importance. The recruitment period for participants was from September 2017 to December 2018.

We compared our results with those of populations from other continents obtained from the phase 3 release of the 1000 Genomes database (available at http://www.1000genomes.org). These populations include those of African (AFR), Latin American (AMR), East Asian (EAS), European (EUR), and South Asian (SAS) descent. We also compared our findings with data from two databases of genomic variants in Brazilian populations, specifically individuals from southeastern Brazil: the Online Archive of Brazilian Mutations (ABraOM) (available at http://abraom.ib.usp.br) and the Brazilian Initiative on Precision Medicine (BIPMed) (available at https://bipmed.org/).

Extraction of the DNA and preparation of the exome library

The DNA was extracted using the phenol-chloroform method described by Sambrook et al. [20]. The genetic material was quantified using a Nanodrop-8000 spectrophotometer (Thermo Fisher Scientific Inc., Wilmington, DE, USA) and the integrity of the DNA was assessed by electrophoresis in 2% agarose gel.

The libraries were prepared using the Nextera Rapid Capture Exome (Illumina) and SureSelect Human All Exon V6 (Agilent) kits following the manufacturer’s recommendations. The sequencing reactions were run in the NextSeq 500® platform (Illumina®, US) using the NextSeq 500 High-output v2 300 cycle kit (Illumina®).

Bioinformatic analysis

The quality of the FASTQ reads was analyzed (FastQC v.0.11; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and the samples were filtered to eliminate low-quality reads (fastx_tools v.0.13; http://hannonlab.cshl.edu/fastx_toolkit/). The sequences were mapped and aligned to the reference genome (GRCh38) using the BWA v.0.7 tool (http://bio-bwa.sourceforge.net/). Following this alignment, the file was indexed and sorted (SAMtools v.1.2; http://sourceforge.net/projects/samtools/). Subsequently, the alignment was processed for PCR duplicate removal (Picard Tools v.1.129; http://broadinstitute.github.io/picard/), mapping quality recalibration, and local realignment (GATK v.3.2; https://www.broadinstitute.org/gatk/). The results were then processed to call variants against the reference genome (GATK v.3.2). The variant annotations were analyzed in the ViVa® (Viewer of Variants) software, developed by the bioinformatics team of the Federal University of Rio Grande do Norte (UFRN). The databases used for variant annotation, with their versions, were SnpEff v.4.3.T, Ensembl Variant Effect Predictor (Ensembl release 99), and ClinVar (v.2018-10). The following tools were used for the in silico prediction of pathogenicity: SIFT (v.6.2.1), PolyPhen-2 (v.2.2), LRT (November, 2009), Mutation Assessor (v.3.0), Mutation Taster (v.2.0), FATHMM (v.2.3), PROVEAN (v.1.1.3), MetaSVM (v1.0), M-CAP (v1.4), and FATHMM-MKL (http://fathmm.biocompute.org.uk/about.html).

Statistical analyses

The allele frequencies of the study populations were obtained directly by gene counting, and compared with those of the other populations (AFR, EUR, AMR, EAS, SAS, ABraOM and BIPMed). Differences in frequencies between populations were analyzed by Fisher’s exact test, and results were considered significant when the p-value was ≤ 0.05. The inter-population variability of the polymorphisms was assessed using Wright’s fixation index (FST). We performed a linkage disequilibrium estimation using Haploview v. 4.2. The analyses were run in Arlequin v.3.5 [21] and in RStudio v.3.5.1.

Results

We identified four NUDT15 variants, two INDELs (insertions/deletions) and two SNVs (single nucleotide variants), in the 146 individuals analyzed, with a mean coverage of 77X. Table 1 provides details of the characteristics of these variants, including their chromosomal location and genic region, the reference number, ClinVar significance, the in silico prediction of functional impact, and pathogenicity.

Table 1. Description of the NUDT15 variants found in the 146 individuals sampled in the present study.

| SNP or INDEL position | Reference SNP ID | Location | Impact predicted by SnpEff | Change in AA | ClinVar significance | In silico prediction of pathogenicity (a) |
| --- | --- | --- | --- | --- | --- | --- |
| 13:48037782 | rs1272632214 | Exon 1 | Moderate | p.Gly17_Val18dup | - | - |
| 13:48045720 | rs147390019 | Exon 3 | Moderate | p.Arg139His | Drug response to thiopurines | Yes (LRT/Mutation Taster) |
| 13:48045719 | rs116855232 | Exon 3 | Moderate | p.Arg139Cys | Drug response to thiopurines | Yes (LRT) |
| 13:48037953 | rs374594155 | Intron 1 | Modifier | - | - | - |

(a) The other in silico analysis tools used did not show pathogenic impact.

The rs374594155 variant corresponds to a deletion of 25 nucleotide pairs located in intron 1 of the NUDT15 gene. No published information is available on the possible phenotypic or functional effects of this mutation.

Two variants (rs147390019 and rs116855232) change the amino acid sequence. Both have clinical significance for the response to thiopurine-based drugs, as indicated by ClinVar, and both are predicted to be pathogenic by at least one in silico tool (Table 1). The rs1272632214 variant is derived from the insertion of six nucleotide pairs into exon 1, which results in the in-frame insertion of two amino acids (glycine and valine) into the catalytic region of the enzyme.

The Hardy-Weinberg Equilibrium (HWE) was calculated and p-values of less than 0.05 were considered significant. All the polymorphisms reported were in HWE, except for rs374594155 (p-value < 0.001).
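The text does not state which test was used to assess HWE. As a minimal sketch, assuming a standard chi-square goodness-of-fit test with one degree of freedom, the check can be written as follows. The genotype counts in the usage example are hypothetical and are not taken from the study; for rare alleles with small expected counts, an exact test is generally preferable to this approximation.

```python
from math import erfc, sqrt


def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic locus.

    Returns (chi2, p_value). The p-value uses the identity that the
    survival function of a chi-square variable with 1 degree of freedom
    equals erfc(sqrt(chi2 / 2)).
    """
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference-allele frequency
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_alt_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical counts: an excess of homozygotes relative to HWE.
    chi2, p_value = hwe_chi_square(n_ref_hom=130, n_het=6, n_alt_hom=10)
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```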

The allele frequencies recorded for the four variants in the two study populations are, in general, relatively high in comparison with the five populations obtained from the 1000 Genomes database and the samples of Brazilian populations from the ABraOM and BIPMed databases, as shown in Table 2.

Table 2. Comparison of the allele frequencies recorded for the NUDT15 variants in the two populations analyzed in the present study (NAM and BAP) with those of five continental populations (AFR, AMR, EAS, EUR and SAS) described in the 1000 Genomes database and of two Brazilian databases (ABraOM and BIPMed).

Minor allele frequency, by population:

| Reference SNP ID | NAM | BAP | AFR | AMR | EAS | EUR | SAS | ABraOM | BIPMed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| rs1272632214 | 0.0938 | 0.0680 | 0 | 0.050 | 0.060 | 0.010 | 0 | 0.001 | 0.005 |
| rs147390019 | 0.0160 | 0.0067 | 0 | 0.007 | 0.001 | 0 | 0 | 0.004 | - |
| rs116855232 | 0.0938 | 0.0680 | 0.001 | 0.045 | 0.095 | 0.002 | 0.070 | 0.012 | - |
| rs374594155 (a) | 0.0550 | 0.1300 | 0.087 | 0.120 | 0.080 | 0.008 | 0.062 | 0.060 | - |

(a) Intronic variant.

The in-frame insertion of six nucleotide pairs (rs1272632214) has an allele frequency of 9.4% in the Amerindian population analyzed here, and a frequency of 6.8% in the admixed population. These frequencies are slightly higher than those recorded in the AMR (5%) and EAS (6%) populations, and much higher than those observed in the EUR (1%), ABraOM (0.1%) and BIPMed (0.5%) populations. The in-frame mutation is absent from the AFR and SAS populations.

The C>T transition that defines rs116855232 was identified only in the individuals that carried the above-mentioned in-frame polymorphism, both in the Amerindian population (9.4%) and in the Amazonian admixed population (6.8%). In both populations, rs1272632214 and rs116855232 are in complete linkage disequilibrium (D′ = 1 and r² = 1). This variant (T allele) is more frequent in Asian (EAS = 9.5% and SAS = 7%) and Latin American (AMR = 4.5%) populations, but is very rare in Africans (AFR = 0.1%), Europeans (EUR = 0.2%) and southeastern Brazilians (ABraOM = 1.2%; absent from BIPMed).
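The two linkage disequilibrium statistics reported above can be computed from haplotype counts. The sketch below uses the standard definitions of D′ and r² for two biallelic loci; it is not the Haploview implementation, and the function name is hypothetical. The usage example assumes 12 haplotypes carrying both alternative alleles and 116 carrying neither, which is consistent with a frequency of 0.0938 in 64 individuals but is an inference, not a count reported in the paper.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) for two biallelic loci from phased haplotype counts.

    n_AB: haplotypes with the alternative allele at both loci
    n_Ab, n_aB: haplotypes with the alternative allele at one locus only
    n_ab: haplotypes with the reference allele at both loci
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n  # alternative-allele frequency at locus 1
    p_B = (n_AB + n_aB) / n  # alternative-allele frequency at locus 2
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("LD is undefined when a locus is monomorphic")

    d = n_AB / n - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r2


if __name__ == "__main__":
    # Assumed counts: the two alternative alleles always occur together.
    print(linkage_disequilibrium(n_AB=12, n_Ab=0, n_aB=0, n_ab=116))
```

When the two alternative alleles are always found on the same haplotypes, both statistics equal 1, which is the situation described for rs1272632214 and rs116855232.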

The G>A transition that defines rs147390019 was identified in two Amerindians (frequency of 1.6%) and one individual from the admixed population (0.6%). This variant is considered rare (frequency less than 1%) in all other populations from around the world.

Table 3 shows the pairwise comparisons of the frequencies found in the Amerindians and the admixed Brazilian population of this study with those of the five populations of the 1000 Genomes project and the sample of Brazilians from the ABraOM database.

Table 3. Pairwise comparison of allelic frequencies in the Amerindians and the admixed population of Brazil with each of the five continental populations from the 1000 Genomes database and the southeastern Brazilian population of the ABraOM database.

P-values (a) by reference SNP ID:

| Pairwise comparison | rs1272632214 | rs147390019 | rs116855232 | rs374594155 (b) |
| --- | --- | --- | --- | --- |
| NAM vs. BAP | 1 | 1 | 1 | 1 |
| NAM vs. AFR | - | - | 1.358e-05 | 1 |
| NAM vs. AMR | 0.5708 | 1 | 0.7583 | 1 |
| NAM vs. EAS | 1 | 0.6385 | 1 | 1 |
| NAM vs. EUR | 0.0018 | - | 5.858e-05 | 0.0439 |
| NAM vs. SAS | - | - | 1 | 1 |
| NAM vs. ABraOM | 1.409e-05 | 0.7251 | 2.207e-03 | 1 |
| BAP vs. AFR | - | - | 6.831e-05 | 1 |
| BAP vs. AMR | 1 | 1 | 1 | 1 |
| BAP vs. EAS | 1 | 1 | 1 | 0.9482 |
| BAP vs. EUR | 0.0082 | - | 2.807e-04 | 1.3203e-06 |
| BAP vs. SAS | - | - | 1 | 0.2347 |
| BAP vs. ABraOM | 7.496e-05 | 0.8936 | 0.0143 | 0.1267 |

(a) Calculated by Fisher’s exact test; p-values adjusted by Bonferroni’s method.

(b) Intronic variant.

The Brazilian population from BIPMed database was not included since only one allele of one variant was observed.
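The Bonferroni adjustment mentioned in the note to Table 3 multiplies each raw p-value by the number of tests and caps the result at 1, which is why many entries in the table are exactly 1. The sketch below shows the general procedure; the number of tests the authors corrected for is not stated in the text, so the raw p-values in the usage example are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    raw = [0.00001, 0.004, 0.2, 0.6]  # hypothetical raw p-values
    for before, after in zip(raw, bonferroni_adjust(raw)):
        print(f"{before:g} -> {after:g}")
```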

The results show that, of the four variants, rs116855232 has the most divergent frequencies between our two study populations and the comparison populations: both the Amerindian and the admixed population differed significantly from the African, European, and ABraOM populations. Indeed, when the pairwise differences are viewed through the lens of the parental populations, the European population is the most distinct from both the Amerindians and the admixed population from the Amazon (Table 3).

Table 4 reports the inter-population variability of the three exonic variants found in the Amazonian Amerindian and admixed Brazilian populations, assessed by Wright’s fixation index (FST). The largest differences were found between the NAM and AFR groups (0.47507), followed by the NAM and EUR groups (0.39153) and the NAM and ABraOM populations (0.29119).

Table 4. Pairwise FST among the Amerindians, the admixed Brazilian population, the five continental populations from the 1000 Genomes database and the Brazilian population from ABraOM.

| | NAM | AFR | AMR | EAS | EUR | SAS | ABraOM | BAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NAM | 0.00000 | | | | | | | |
| AFR | 0.47507 | 0.00000 | | | | | | |
| AMR | 0.05655 | 0.04984 | 0.00000 | | | | | |
| EAS | 0.04913 | 0.08537 | 0.01038 | 0.00000 | | | | |
| EUR | 0.39153 | -0.00034 | 0.03826 | 0.07116 | 0.00000 | | | |
| SAS | 0.12447 | 0.07298 | 0.01209 | 0.01250 | 0.05958 | 0.00000 | | |
| ABraOM | 0.29119 | 0.00655 | 0.02847 | 0.06181 | 0.00409 | 0.04099 | 0.00000 | |
| BAP | 0.03128 | 0.10400 | -0.00323 | 0.01100 | 0.06932 | 0.01380 | 0.03603 | 0.00000 |

*The individuals from BIPMed database were not included in the test since only one allele of one variant was identified in this database.
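To make the fixation index concrete, the sketch below computes a textbook single-locus, two-population version of Wright’s FST from allele frequencies, as (H_T − H_S) / H_T. This is a simplification that assumes equal sample sizes and a single biallelic locus; Arlequin combines loci and corrects for sample size, so this function is not expected to reproduce the values in Table 4 (including the small negative estimates). The function name is hypothetical, and the usage example takes the rs116855232 frequencies for NAM and AFR from Table 2.

```python
def wright_fst(p1, p2):
    """Two-population Wright's FST for one biallelic locus.

    p1, p2: frequency of the same allele in each population.
    H_S is the mean within-population expected heterozygosity and
    H_T the expected heterozygosity of the pooled population.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # rs116855232 frequencies from Table 2: NAM = 0.0938, AFR = 0.001.
    print(wright_fst(0.0938, 0.001))
```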

Discussion

The present study is the first to investigate the complete sequence of the NUDT15 gene in Amazonian Amerindians and a highly admixed population with a major Amerindian component, groups that are under-represented, in general, in pharmacogenetic studies.

We described four variants of the NUDT15 gene, one of which (rs374594155) is intronic, while the other three (rs147390019, rs1272632214, and rs116855232) are exonic. The three exonic variants have considerable allelic frequencies and well-established clinical impacts in terms of tolerance of thiopurine-based treatment protocols and the development of severe toxicity. Two of the variants (rs1272632214 and rs116855232) are in complete linkage disequilibrium.

Based on the Pharmacogene Variation Consortium (PharmVar) classification, only two variant haplotypes were observed in our samples [21]. One (NUDT15*4) is represented by the isolated variant rs147390019. The other (NUDT15*2) is defined by the combined presence of rs1272632214 and rs116855232.

More than 20 allelic variants described in the NUDT15 gene are capable of reducing or even deactivating the enzymatic activity of the protein and provoking cytotoxic effects [22, 23, 24]. However, most of these variants are rare and unique to a specific population or group.

The international population databases show that the frequency of the mutations that reduce the functional activity of the enzyme, such as those described by the PharmVar Consortium, exceeds 1% only in Asian, Latin American, and Amerindian populations [14, 15, 25]. The most frequent mutations in these groups include the rs116855232 variant, either in isolation or combined with rs1272632214.

The rs147390019 and rs1272632214 polymorphisms were first associated with hematopoietic toxicity in a cohort of children diagnosed with ALL from Japan, Guatemala, and Singapore [11]. That study also described the functional nucleotide phosphatase activity of the NUDT15 enzyme. The evidence indicates that the presence of rs147390019 and rs1272632214 may result in the loss of 75% and 85% of in vitro enzyme function, respectively. The study [11] also confirmed the absence of the catabolizing function of the enzyme when the rs1272632214 and rs116855232 variants were combined. Overall, then, the three variants identified here substantially modify the enzymatic function of NUDT15 and can have a significant impact on the response of carriers to thiopurine-based treatments.

Our analysis showed that 22% of the Amerindians investigated and 15% of the admixed individuals have haplotypes with known clinical impacts. The NUDT15*2 haplotype is more frequent in both populations (NAM = 9.4%, BAP = 6.8%) than NUDT15*4 (NAM = 1.6%, BAP = 0.7%).
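The text does not say how the carrier proportions were obtained; they were presumably counted directly from the genotypes. As a rough cross-check only, and assuming Hardy-Weinberg proportions, the expected proportion of individuals carrying at least one variant haplotype can be approximated from the summed haplotype frequencies q as 1 − (1 − q)². The sketch below applies this to the NUDT15*2 and NUDT15*4 frequencies from Table 2; the function name is hypothetical.

```python
def carrier_proportion(haplotype_freqs):
    """Expected share of individuals with at least one variant haplotype.

    Assumes Hardy-Weinberg proportions: with combined variant frequency q,
    the share of non-carriers is (1 - q) ** 2.
    """
    q = sum(haplotype_freqs)
    return 1 - (1 - q) ** 2


if __name__ == "__main__":
    # NUDT15*2 and NUDT15*4 frequencies from Table 2.
    print(f"NAM: {carrier_proportion([0.0938, 0.0160]):.3f}")
    print(f"BAP: {carrier_proportion([0.0680, 0.0067]):.3f}")
```

Under this assumption the expected values are about 0.21 for NAM and 0.14 for BAP, close to the 22% and 15% reported above.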

One previous investigation evaluated the presence of rs116855232 of the NUDT15 gene in Amerindian populations from southern and midwestern Brazil [15]. In that study, the mean frequency of rs116855232 was 25%, ranging from 5% to 32%, generally higher than the frequency recorded in the present study (9.4%). The differences observed among Amerindian ethnic groups may be due to the stochastic effects of genetic drift, which are common in Brazilian Amerindian groups [26, 27, 28]. In fact, we observed that the frequencies of the variants vary greatly among the Amerindian tribes represented by five or more individuals (S2 Table). For example, NUDT15*2 has a frequency of 3% in the Asurini de Trocará group, whereas in the Zo’é population it is 30%. This reinforces the existence of allelic fluctuations among Amerindian tribes, which arise from evolutionary processes, particularly genetic drift.

In the case of the admixed population, we compared our data with those available in two public genomic databases from the state of São Paulo, in southeastern Brazil: BIPMed, which comprises 106 individuals, and ABraOM, which comprises 609 individuals. In the BIPMed database, 11 polymorphisms were reported in the NUDT15 gene, but most of them had a frequency lower than 1%. Of the variants reported in this study, only one, rs1272632214, was also described in the database, at a very low frequency (only one out of 212 alleles) (http://bipmed.iqm.unicamp.br).

In the ABraOM database, it was possible to identify eight exonic mutations with possible phenotypic effects, of which five were identified in only one (rs746071566, rs149436418, rs115012598, and rs200980982) or two (rs139551410) individuals of the 609 evaluated (http://abraom.ib.usp.br/). The other three mutations were the same as those described in our study, i.e., rs116855232, with a frequency of 1.2%, rs1272632214 (0.1%), and rs147390019 (0.4%) [29].

In summary, the maximum proportion of individuals with deleterious alleles in the population from southeastern Brazil is much lower than the one we recorded in the admixed population from northern Brazil. One possible reason for this discrepancy is the greater contribution of Amerindian ancestry to the northern (Amazonian) population in comparison with that from southeastern Brazil: the former has a mean of 30% and the latter a mean of 8.8% Amerindian genetic background [17, 18, 30].

Amerindian ancestry has also been linked to higher frequencies of deleterious NUDT15 alleles, specifically the rs1272632214 and rs116855232 polymorphisms, in other populations, such as the subpopulations of the Latin American group in the 1000 Genomes database (S2 Table). The Mexican (MXL), Colombian (CML), Puerto Rican (PUR), and Peruvian (PEL) populations have approximately 25%, 25%, 13% [31], and 77% [32] Amerindian genetic background, respectively. The rs1272632214 and rs116855232 frequencies observed in these populations are, respectively, 2.1% and 2.1% for CML, 3.1% and 4.7% for MXL, 10.6% and 11.8% for PEL, and 0.5% and 0.5% for PUR. From these values it can be inferred that populations with greater Amerindian genomic ancestry have higher frequencies of the deleterious alleles, i.e., Peruvians, followed by Mexicans.
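The trend described above can be checked with a simple correlation between the approximate Amerindian ancestry proportions and the rs116855232 frequencies quoted in the text. This is an illustrative calculation on four data points, not an analysis performed by the authors, and with so few points it only describes the direction of the trend; the variable and function names are hypothetical.

```python
from math import sqrt

# Approximate Amerindian ancestry (%) and rs116855232 frequency (%)
# for the four 1000 Genomes subpopulations, as quoted in the text.
ANCESTRY_AND_FREQUENCY = {
    "MXL": (25.0, 4.7),
    "CML": (25.0, 2.1),
    "PUR": (13.0, 0.5),
    "PEL": (77.0, 11.8),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    ancestry, frequency = zip(*ANCESTRY_AND_FREQUENCY.values())
    print(f"r = {pearson_r(ancestry, frequency):.3f}")
```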

Conclusions

Our results demonstrate that Amerindian and admixed populations from northern Brazil have a high frequency of two NUDT15 haplotypes (*2 and *4) that significantly alter the thiopurine metabolization profile. Our findings indicate that these deleterious mutations could partly account for the high rates of toxicity found in ALL patients from northern Brazil. It will nevertheless be essential to evaluate the practical effects of these haplotypes in case-control studies with patients undergoing 6-mercaptopurine therapy, in order to establish their potential association with poor clinical outcomes.

Supporting information

S1 Table. Name, location and number of individuals in each population group studied.

(PDF)

S2 Table. Allelic frequency independently for each of the Amerindian populations (N ≥ 5), of the admixed Brazilian population, of the subpopulations of the American group from the 1000 genomes, and of the southeast Brazilian samples.

CML = Colombians from Medellin, Colombia; MXL = Mexican Ancestry from Los Angeles, USA; PEL = Peruvians from Lima, Peru; PUR = Puerto Ricans from Puerto Rico.

(PDF)

S1 Dataset

(PDF)

Data Availability

All relevant data are within the paper and its Supporting Information files.

Funding Statement

We acknowledge funding from Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq (http://www.cnpq.br), through the call Chamada Universal MCTI/CNPq No. 14/2013, awarded to researcher Paulo Pimentel de Assumpção; Universidade Federal do Pará, UFPA (https://portal.ufpa.br); Pró-Reitoria de Pesquisa e Pós-Graduação da UFPA, PROPESP (http://www.propesp.ufpa.br); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, CAPES (https://www.capes.gov.br); and Fundação Amazônia de Amparo a Estudos e Pesquisas, FAPESPA (http://www.fapespa.pa.gov.br). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Li J, Lou H, Yang X, Lu D, Li S, Jin L, et al. Genetic architectures of ADME genes in five Eurasian admixed populations and implications for drug safety and efficacy. J Med Genet. 2014; 51(9):614–22. doi:10.1136/jmedgenet-2014-102530
  • 2. Mukerjee G, Huston A, Kabakchiev B, Piquette-Miller M, van Schaik R, Dorfman R. User considerations in assessing pharmacogenomic tests and their clinical support tools. NPJ Genom. Med. 2018; 11:3–26.
  • 3. Rodrigues JCG, Fernandes MR, Guerreiro JF, da Silva ALDC, Ribeiro-Dos-Santos Â, Santos S, et al. Polymorphisms of ADME-related genes and their implications for drug safety and efficacy in Amazonian Amerindians. Sci Rep. 2019; 9(1):7201–7209. doi:10.1038/s41598-019-43610-y
  • 4. Valerie NC, Hagenkort A, Page BD, Masuyer G, Rehling D, Carter M, et al. NUDT15 hydrolyzes 6-thio-deoxyGTP to mediate the anticancer efficacy of 6-thioguanine. Cancer Res. 2016; 76(18):5501–11. doi:10.1158/0008-5472.CAN-16-0584
  • 5. Nishii R, Moriyama T, Janke LJ, Yang W, Suiter CC, Lin TN, et al. Preclinical evaluation of NUDT15-guided thiopurine therapy and its effects on toxicity and antileukemic efficacy. Blood. 2018; 131(22):2466–2474. doi:10.1182/blood-2017-11-815506
  • 6. Cheng J, Zhang H, Ma HZ, Li J. Homozygous mutation in NUDT15 in childhood acute lymphoblastic leukemia with increased susceptibility to mercaptopurine toxicity: A case report. Exp Ther Med. 2019; 17(5):4285–4288. doi:10.3892/etm.2019.7434
  • 7. Carvalho DC, Wanderley AV, Santos AMR, Fernandes MR, Castro ANCL, Leitão LPC, et al. Pharmacogenomics and variations in the risk of toxicity during the consolidation/maintenance phases of the treatment of pediatric B-cell leukemia patients from an admixed population in the Brazilian Amazon. Leuk Res. 2018; 74:10–13. doi:10.1016/j.leukres.2018.09.003
  • 8. Kakuta Y, Kinouchi Y, Shimosegawa T. Pharmacogenetics of thiopurines for inflammatory bowel disease in East Asia: prospects for clinical application of NUDT15 genotyping. J Gastroenterol. 2018; 53(2):172–180. doi:10.1007/s00535-017-1416-0
  • 9. PURIXAN® (mercaptopurine) [package insert on the internet]. United Kingdom: Nova Laboratories Ltd.; https://www.accessdata.fda.gov/drugsatfda_docs/label/2018/205919s003lbl.pdf (2018).
  • 10. IMURAN® (azathioprine) [package insert on the internet]. Sebela Pharmaceuticals, Inc. https://www.accessdata.fda.gov/drugsatfda_docs/label/2018/016324s039lbl.pdf (2018).
  • 11. Moriyama T, Nishii R, Perez-Andreu V, Yang W, Klussmann FA, Zhao X, et al. NUDT15 polymorphisms alter thiopurine metabolism and hematopoietic toxicity. Nat Genet. 2016; 48(4):367–73. doi:10.1038/ng.3508
  • 12. Kakuta Y, Izumiyama Y, Okamoto D, Nakano T, Ichikawa R, Naito T, et al. High-resolution melt analysis enables simple genotyping of complicated polymorphisms of codon 18 rendering the NUDT15 diplotype. J Gastroenterol. 2020; 55(1):67–77. doi:10.1007/s00535-019-01638-x
  • 13. Yang JJ, Landier W, Yang W, Liu C, Hageman L, Cheng C, et al. Inherited NUDT15 variant is a genetic determinant of mercaptopurine intolerance in children with acute lymphoblastic leukemia. J Clin Oncol. 2015; 33(11):1235–42. doi:10.1200/JCO.2014.59.4671
  • 14. Relling MV, Schwab M, Whirl-Carrillo M, Suarez-Kurtz G, Pui CH, Stein CM, et al. Clinical Pharmacogenetics Implementation Consortium guideline for thiopurine dosing based on TPMT and NUDT15 genotypes: 2018 Update. Clin Pharmacol Ther. 2019; 105(5):1095–1105. doi:10.1002/cpt.1304
  • 15. Suarez-Kurtz G, Brisson GD, Hutz MH, Petzl-Erler ML, Salzano FM. NUDT15 Polymorphism in Native American Populations of Brazil. Clin Pharmacol Ther. 2019; 105(6):1321–1322. doi:10.1002/cpt.1379
  • 16. Moriyama T, Yang YL, Nishii R, Ariffin H, Liu C, Lin TN, et al. Novel variants in NUDT15 and thiopurine intolerance in children with acute lymphoblastic leukemia from diverse ancestry. Blood. 2017; 130(10):1209–1212. doi:10.1182/blood-2017-05-782383
  • 17. Stary J, Zimmermann M, Campbell M, Castillo L, Dibar E, Donska S, et al. Intensive chemotherapy for childhood acute lymphoblastic leukemia: results of the randomized intercontinental trial ALL IC-BFM 2002. J Clin Oncol. 2014; 32(3):174–84. doi:10.1200/JCO.2013.48.6522
  • 18. Santos NP, Ribeiro-Rodrigues EM, Ribeiro-Dos-Santos AK, Pereira R, Gusmão L, Amorim A, et al. Assessing individual interethnic admixture and population substructure using a 48-insertion-deletion (INSEL) ancestry-informative marker (AIM) panel. Hum Mutat. 2010; 31(2):184–90. doi:10.1002/humu.21159
  • 19. Ramos BR, D’Elia MP, Amador, Santos NP, Santos SE, da Cruz Castelli E, et al. Neither self-reported ethnicity nor declared family origin are reliable indicators of genomic ancestry. Genetica. 2016; 144(3):259–65. doi:10.1007/s10709-016-9894-1
  • 20. Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, 1989.
  • 21. Excoffier L, Lischer HEL. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010; 10(3):564–567. doi:10.1111/j.1755-0998.2010.02847.x
  • 22. Sutiman N, Chen S, Ling KL, Chuah SW, Leong WF, Nadiger V, et al. Predictive role of NUDT15 variants on thiopurine-induced myelotoxicity in Asian inflammatory bowel disease patients. Pharmacogenomics. 2018; 19(1):31–43. doi:10.2217/pgs-2017-0147
  • 23. Chang JY, Park SJ, Jung ES, Jung SA, Moon CM, Chun J, et al. Genotype-based treatment with thiopurine reduces incidence of myelosuppression in patients with inflammatory bowel diseases. Clin Gastroenterol Hepatol. 2019; pii:S1542-3565(19)30909-7. doi:10.1016/j.cgh.2019.08.034
  • 24. Yang JJ, Whirl-Carrillo M, Scott SA, Turner AJ, Schwab M, Tanaka Y, et al. Pharmacogene Variation Consortium Gene Introduction: NUDT15. Clin Pharmacol Ther. 2019; 105(5):1091–1094. doi:10.1002/cpt.1268
  • 25.Zhu Y, Yin D, Su Y, Xia X, Moriyama T, Nishii R, et al. Combination of common and novel rare NUDT15 variants improves predictive sensitivity of thiopurine-induced leukopenia in children with acute lymphoblastic leukemia. Haematologica. 2018; 103(7):e293–e295. 10.3324/haematol.2018.187658 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Callegari-Jacques SM, Tarazona-Santos EM, Gilman RH, Herrera P, Cabrera L, Santos SE, et al. Autosome STRs in native South America-Testing models of association with geography and language. Am J Phys Anthropol. 2011; 145(3):371–81. 10.1002/ajpa.21505 [DOI] [PubMed] [Google Scholar]
  • 27.Callegari-Jacques SM, Salzano FM. Genetic variation within two linguistic Amerindian groups: relationship to geography and population size. Am J Phys Anthropol. 1989; 79(3):313–20. 10.1002/ajpa.1330790307 [DOI] [PubMed] [Google Scholar]
  • 28.Amorim CE, Wang S, Marrero AR, Salzano FM, Ruiz-Linares A, Bortolini MC. X-chromosomal genetic diversity and linkage disequilibrium patterns in Amerindians and non-Amerindian populations. Am J Hum Biol. 2011; 23(3):299–304. 10.1002/ajhb.21110 [DOI] [PubMed] [Google Scholar]
  • 29.Naslavsky MS, Yamamoto GL, Almeida TF, Ezquina SAM, Sunaga DY, Pho N, et al. Exomic variants of an elderly cohort of Brazilians in the ABraOM database. Hum Mutat. 2017; 38(7):751–763. 10.1002/humu.23220 [DOI] [PubMed] [Google Scholar]
  • 30.Secolin R, Mas-Sandoval A, Arauna LR, Torres FR, de Araujo TK, Santos ML, et al. Distribution of local ancestry and evidence of adaptation in admixed populations. Sci Rep. 2019; 9(1):13900 10.1038/s41598-019-50362-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Gravel S, Zakharia F, Moreno-Estrada A, Byrnes JK, Muzzio M, Rodriguez-Flores JL, et al. Reconstructing Native American migrations from whole-genome and whole-exome data. PLoS Genet. 2013; 9(12):e1004023 10.1371/journal.pgen.1004023 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet. 2017; 100(4):635–649. 10.1016/j.ajhg.2017.03.004 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Francesc Calafell

12 Dec 2019

PONE-D-19-30849

Whole exome sequencing of Amerindians and admixed individuals from northern Brazil: identification of NUDT15 gene variants

PLOS ONE

Dear Fernandes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Both reviewers have serious objections to what appears to be the main message of the manuscript, namely that NUDT15 gene variants could explain the higher rate of toxicity of thiopurine treatments in NE Brazil, and that those variants are linked to Amerindian ancestry. I agree that you should either reframe the goals and conclusions of your manuscript, or provide the necessary evidence (which implies new analyses) to back up this statement.

We would appreciate receiving your revised manuscript by Jan 26 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Francesc Calafell

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified whether consent was written or verbal/oral. If consent was verbal/oral, please specify: 1) whether the ethics committee approved the verbal/oral consent procedure, 2) why written consent could not be obtained, and 3) how verbal/oral consent was recorded. If your study included minors, please state whether you obtained consent from parents or guardians in these cases. If the need for consent was waived by the ethics committee, please include this information.

3. Please include in your Methods section the date ranges over which you recruited participants to this study.

4. To comply with PLOS ONE submission guidelines, please deposit your sequencing data in a publicly available repository (a list of repositories is available at https://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories).

5. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: No

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is an interesting paper, which describes differences in the frequencies of NUDT15 gene variants between Amazonian Amerindians and admixed Brazilian samples, compared with the 1000 Genomes database. However, the main point of the paper, which is to relate NUDT15 variants to the high frequency of toxic effects of thiopurine treatment in northern Brazilian patients, was not achieved: the authors report only minor allele frequencies and did not perform any genetic association or similar analysis comparing patients with thiopurine toxic effects against controls or against patients without such effects. Below I describe additional major and minor corrections:

Page 1, line 4: I would change the short title to something like "Investigating NUDT15 gene variants in Amerindians and population from northern Brazil", since we cannot perform a "Whole-Exome" analysis on one gene;

Page 2, line 37: The authors should change this phrase, since only the coding region of NUDT15 gene was sequenced;

Page 4, line 78: The authors should include a brief explanation about NUDT15 diplotypes;

Page 5, lines 98-101: The hypothesis seems confusing to me. Studies have already shown that NUDT15 genotypes or haplotypes are related to toxic effects of thiopurine-based treatments. However, here the hypothesis is precisely to test whether the link between NUDT15 genotypes and thiopurine toxic effects exists. I think what the authors mean is whether the frequency of NUDT15 haplotypes is different in this specific population, which would explain the high frequency of severe toxicities. In addition, the authors cite "alternative NUDT15 genotypes"; what are these alternative genotypes? Therefore, I suggest that the authors rewrite this part to better explain the hypothesis;

Page 5, lines 119-121: The authors should briefly explain which ancestral populations this admixed sample contains (Europeans, sub-Saharan Africans and Native Americans, or an additional ancestral one?). This could be achieved by global ancestry estimation (such as the ADMIXTURE algorithm), in order to strengthen the ancestry information;

Page 6, line 137: In the Bioinformatic analysis section, the authors should provide the version of every variant annotation database used (ClinVar, LRT, Mutation Taster, and others), regardless of the variant annotation software used to obtain the database information (Annovar, VEP, VarStation, ViVa), since pathogenicity and clinical information could change depending on the database version used.

Page 7, lines 152-153. The authors should report at least whether these variants were well covered by the exome. It could be included in one column in Table 1;

Table 1: Did the authors analyze pathogenicity prediction algorithms other than LRT and Mutation Taster, for example SIFT, PolyPhen or CADD? If yes, they should be included in Table 1. If not, the authors should analyze pathogenicity with these other algorithms and include the results in Table 1;

Page 8, lines 167-169: The difference in frequencies among the populations is very interesting; in this case, the authors should include a statistical test to show that these differences are not due to chance. If they are not, it would also be interesting to answer the question “Why are these frequencies different in this specific population?”. It could be due to evolutionary processes, such as natural selection or genetic drift, or to problems in the data (too many related individuals, high inbreeding). I think the authors should explore these issues to better explain these high frequencies (if they are not due to chance);

Page 9, line 179: The authors cite a linkage disequilibrium estimate, but they do not describe in the Methods how linkage disequilibrium was estimated. In addition, important variant information is lacking, for example Hardy-Weinberg disequilibrium reports.

Page 11, lines 223-226: This passage does not make sense, because the authors state that no previous information about NUDT15 variation has been reported, and the next sentence cites a paper in which one variant was investigated in Amerindians (ref 13);

Page 11, lines 230-231: The authors should also investigate the NUDT15 variation in the database provided by the Brazilian Initiative on Precision Medicine (BIPMed) (https://bipmed.org/brave/index.html);

Page 11, lines 239-240: The authors should clearly reference this information, since it could be based on one paper only and might not reflect the southeastern Brazil samples;

Page 12, lines 247-250: The hypothesis raised by the authors is that alternative NUDT15 genotypes could be involved in the toxic effects observed in northern Brazilian patients exposed to thiopurine-based treatment. However, the authors did not perform any experiment to clearly link the presence of these NUDT15 haplotypes with levels of thiopurine: they just report the minor allele frequencies in the two groups (which, it seems, were not exposed to thiopurine treatment). Therefore, these conclusions do not answer the objective of the paper, and I suggest that the authors either design an experiment to test this hypothesis or change the hypothesis and the objectives of the paper.

The authors should correct grammatical and typographical errors throughout the manuscript.

Reviewer #2: The authors address an interesting topic, the population-based metabolism of thiopurine, which has important implications for the treatment of several diseases such as acute lymphoblastic leukemia or inflammatory bowel disease. Admixed populations from northern Brazil have shown much higher levels of toxicity than other populations worldwide, according to the references cited by the authors.

The authors aim to identify the variants of the NUDT15 gene and assess their frequencies in two populations: a group of samples from several Native American populations from northern Brazil and an admixed population from the same area. Moreover, the authors claim a correlation between the proportion of Native American ancestry in the admixed populations and the variants related to the high toxicity levels after the administration of thiopurine-based drugs.

I acknowledge the significance of the study. However, I want to highlight a few points that the authors should address:

First, the title of the manuscript is confusing. The title claims whole exome sequencing of Amerindians and admixed individuals from northern Brazil. However, the study analyzes only the NUDT15 gene. This could be misleading. The authors should clarify whether they have generated whole exome data. They should also explain why they analyze only this gene and not the rest of the exome. The title should not only be coherent with the text; both should be consistent with the data provided.

Second, as the authors discuss, it is relevant to know whether the pathogenic variants in the admixed populations are related to Native American ancestry. However, there is a lack of evidence to support this hypothesis, and the authors should perform the additional analyses needed for a solid discussion. Ideally, the authors should use the whole exome data to accurately assess how pathogenic variants are correlated with global and local Native American genetic ancestry.

Third, in the interest of personalized medicine, it would be interesting to see whether some Native American populations have higher frequencies of the pathogenic alleles than others. Similarly, the frequencies in the American admixed populations may or may not be correlated with the average proportions of Native American ancestry, depending on whether the Native American genetic sources were different and had different frequencies of the pathogenic alleles.

Therefore, assuming the authors only have access to the data regarding the NUDT15 gene, the authors should address the following analyses.

• An extra table with the allele frequencies for each of the 11 Native American populations separately, where the sample size is five or more. This table should also analyze the four American admixed populations of the 1000 Genomes separately and include the southeastern Brazilian admixed population. The authors should use this information to confirm and discuss more accurately what they say in line 227.

• A table reporting the FST between populations for all variants in the region analyzed.

• If the authors do not have access to the whole exome data and cannot compute the average genomic proportions of ancestral components (Native American, European and sub-Saharan African) for the American admixed populations, they should at least report the values found in the literature for these populations (the Amazonian Brazilian admixed population, the southeastern Brazilian admixed population and the four American admixed populations) and check whether the frequency of the studied variants is correlated with Native American ancestry, as the authors claim in line 241.
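
As an illustration of the FST table requested above, the following is a minimal sketch, not the authors' analysis. It assumes the basic heterozygosity-based definition FST = (HT − HS)/HT for a single biallelic variant and two equally weighted populations; the function name and the frequencies in the usage line are hypothetical placeholders, not values from the manuscript.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic variant in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within populations (HS)
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population (HT)
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    # A variant absent from (or fixed in) both populations carries no information
    return 0.0 if ht == 0 else (ht - hs) / ht


# Illustrative frequencies only (hypothetical, not taken from the study)
print(round(fst_two_populations(0.10, 0.30), 4))  # 0.0625
```

This simple form ignores sample size. For small groups such as the individual Amerindian populations discussed here, estimators that correct for sample size (for example Weir and Cockerham's) are generally preferred.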

Authors must also make minor but important changes:

Introduction

• Line 58: replace "ethnically" with "genetically"

Methods

• Line 114: add detailed information on the sampling locations of the admixed and Native American populations

• Detail which regions are analyzed and on which chromosome they are found.

Results

• Sort Tables 1 and 2 by position

• The paragraph starting at line 167 should be supported by a statistical test, such as the chi-squared test.

• Correct the sentence at line 206 stating that Native American populations are descendants of the analyzed East Asian populations. Native Americans may be more closely related to East Asian populations than to other populations, but this does not mean they are their descendants.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No


While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Apr 15;15(4):e0231651. doi: 10.1371/journal.pone.0231651.r002

Author response to Decision Letter 0


24 Jan 2020

Reviewer #1: This is an interesting paper, which describes differences in the frequencies of NUDT15 gene variants between Amazonian Amerindians and admixed Brazilian samples, compared with the 1000 Genomes database. However, the main point of the paper, which is to relate NUDT15 variants to the high frequency of toxic effects of thiopurine treatment in northern Brazilian patients, was not achieved: the authors report only minor allele frequencies and did not perform any genetic association or similar analysis comparing patients with thiopurine toxic effects against controls or against patients without such effects. Below I describe additional major and minor corrections:

1) Page 1, line 4: I would change the short title to something like "Investigating NUDT15 gene variants in Amerindians and population from northern Brazil", since we cannot perform a "Whole-Exome" analysis on one gene;

Response: We have changed the short title in accordance with this suggestion.

2) Page 2, line 37: The authors should change this phrase, since only the coding region of NUDT15 gene was sequenced;

Response: We have changed the phrase to make clear that we analyzed only the coding region of the NUDT15 gene.

3) Page 4, line 78: The authors should include a brief explanation about NUDT15 diplotypes;

Response: The text has been modified accordingly.

4) Page 5, lines 98-101: The hypothesis seems confusing to me. Studies have already shown that NUDT15 genotypes or haplotypes are related to toxic effects of thiopurine-based treatments. However, here the hypothesis is precisely to test whether the link between NUDT15 genotypes and thiopurine toxic effects exists. I think what the authors mean is whether the frequency of NUDT15 haplotypes is different in this specific population, which would explain the high frequency of severe toxicities. In addition, the authors cite "alternative NUDT15 genotypes"; what are these alternative genotypes? Therefore, I suggest that the authors rewrite this part to better explain the hypothesis;

Response: We agree that the manuscript's hypothesis was not very clear and therefore caused confusion about what we actually intended to accomplish. The objective of our paper was to describe the frequency of deleterious alleles of the NUDT15 gene in a parental group of the current population of northern Brazil (Native Americans) and in the admixed northern population itself. We have rewritten this part of the article and hope that the hypothesis is now clear.

5) Page 5, lines 119-121: The authors should briefly explain which ancestral populations this admixed sample contains (Europeans, sub-Saharan Africans and Native Americans, or an additional ancestral one?). This could be achieved by global ancestry estimation (such as the ADMIXTURE algorithm), in order to strengthen the ancestry information;

Response: The main ancestral populations of the northern Brazilian population are European, African and Native American. This is well established for the Brazilian population, since it is a consequence of our colonization process. This information has also been confirmed by genetic analysis, using a dataset of 61 Ancestry Informative Markers.

Details of the methodology can be found in “Santos NP et al. Assessing individual interethnic admixture and population substructure using a 48-insertion-deletion (INSEL) ancestry-informative marker (AIM) panel. Hum Mutat. 2010;31(2):184-90.” and “Ramos et al. Neither self-reported ethnicity nor declared family origin are reliable indicators of genomic ancestry. Genetica. 2016; 144(3):259-65.”

6) Page 6, line 137: In the Bioinformatic analysis section, the authors should provide the version of every variant annotation database used (ClinVar, LRT, Mutation Taster, and others), regardless of the variant annotation software used to obtain the database information (Annovar, VEP, VarStation, ViVa), since pathogenicity and clinical information could change depending on the database version used.

Response: The manuscript has been modified accordingly.

7) Page 7, lines 152-153. The authors should report at least whether these variants were well covered by the exome. It could be included in one column in Table 1;

Response: We have included the information about the exome coverage in the first paragraph of the Results section.

8) Table 1: Did the authors analyze pathogenicity prediction algorithms other than LRT and Mutation Taster, for example SIFT, PolyPhen or CADD? If yes, they should be included in Table 1. If not, the authors should analyze pathogenicity with these other algorithms and include the results in Table 1;

Response: All the pathogenicity prediction algorithms used are described in the “Bioinformatic analysis” subsection of the Methods section. We have not included them in Table 1 because the only algorithms that predicted pathogenicity were LRT and Mutation Taster.

9) Page 8, lines 167-169: The difference in frequencies among the populations is very interesting; in this case, the authors should include a statistical test to show that these differences are not due to chance. If they are not, it would also be interesting to answer the question “Why are these frequencies different in this specific population?”. It could be due to evolutionary processes, such as natural selection or genetic drift, or to problems in the data (too many related individuals, high inbreeding). I think the authors should explore these issues to better explain these high frequencies (if they are not due to chance);

Response: As the other reviewer also suggested, we performed Fisher’s exact test to statistically evaluate the observed differences in population frequencies. We have also explained in the Discussion section that we believe these differences in frequencies may be due to genetic drift, a genetic event commonly observed in Amerindian populations.
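
For readers who want to reproduce this kind of comparison, the sketch below implements a two-sided Fisher's exact test on a 2×2 table of allele counts using only the Python standard library. It is an illustration under stated assumptions, not the authors' script: the function name is hypothetical, and the variant allele counts in the example are placeholders. Only the chromosome totals (2 × 64 = 128 and 2 × 82 = 164) follow from the sample sizes reported in the Methods.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are variant and reference allele counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x variant alleles in the first group
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: 10 variant alleles among 128 chromosomes in one group,
# 4 among 164 in the other (placeholders, not results from the study)
print(fisher_exact_two_sided(10, 118, 4, 160))
```

An exact test is a reasonable choice here because rare variants produce small expected cell counts, where the chi-squared approximation is unreliable.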

10) Page 9, line 179: The authors cite a linkage disequilibrium estimate, but they do not describe in the Methods how linkage disequilibrium was estimated. In addition, important variant information is lacking, for example Hardy-Weinberg disequilibrium reports.

Response: We added a new subsection called “Statistical analysis” to the Methods section to better explain the tests and software we used in our data analyses. Linkage disequilibrium was estimated using the Haploview software. Additionally, the Hardy-Weinberg disequilibrium results are described in lines 198-200.
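
The response does not say which Hardy-Weinberg test was applied. As a hedged illustration only, the sketch below shows the textbook chi-squared goodness-of-fit version (1 degree of freedom) for one biallelic variant. The function name and the genotype counts in the example are hypothetical and are not taken from the manuscript.

```python
from math import erfc, sqrt


def hardy_weinberg_chi2(n_ref_hom: int, n_het: int, n_alt_hom: int):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg proportions.

    Returns (chi2, p_value) for one biallelic variant, with 1 degree of freedom.
    """
    n = n_ref_hom + n_het + n_alt_hom
    # Reference allele frequency estimated from the genotype counts
    p = (2 * n_ref_hom + n_het) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_alt_hom)
    # Skip genotype classes with zero expectation (monomorphic variants)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-squared distribution with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts for 64 individuals (placeholders only)
chi2, p_value = hardy_weinberg_chi2(50, 13, 1)
print(chi2, p_value)
```

For rare variants in samples of this size, an exact Hardy-Weinberg test is usually preferred over the chi-squared approximation; the version above is shown only because it is the simplest to follow.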

11) Page 11, lines 223-226: This passage does not make sense, because the authors state that no previous information about NUDT15 variation has been reported, and the next sentence cites a paper in which one variant was investigated in Amerindians (ref 13);

Response: The text has been rewritten.

12) Page 11, lines 230-231: The authors also should investigate the NUDT15 variation in the database provided by the Brazilian Initiative on Precision Medicine (BIPMed) (https://bipmed.org/brave/index.html);

Response: We agree with the suggestion, and data from the BIPMed database were included in our analyses.

13) Page 11, lines 239-240: The authors should clearly reference this information, since it could be based on one paper only and might not reflect the southeastern Brazil samples;

Response: We have supported this information using additional data on the subpopulations of the Latin population of the 1000 Genomes database, i.e. Colombians, Mexican descendants, Peruvians and Puerto Ricans. The paragraph comprises lines 308-317.

14) Page 12, lines 247-250: The hypothesis raised by the authors is that alternative NUDT15 genotypes could be involved in the toxic effects observed in northern Brazilian patients exposed to thiopurine-based treatment. However, the authors did not perform any experiment to clearly link the presence of these NUDT15 haplotypes with levels of thiopurine: they just report the minor allele frequencies in the two groups (which, it seems, were not exposed to thiopurine treatment). Therefore, these conclusions do not answer the objective of the paper, and I suggest that the authors either design an experiment to test this hypothesis or change the hypothesis and the objectives of the paper.

Response: We have clarified our hypothesis and hope that it now agrees with our results and conclusions.

15) The authors should correct grammatical and typographical errors throughout the manuscript.

Response: We sent the manuscript to a professional translator who is a native English speaker for revision, in order to correct typos and language errors throughout the text (a declaration has been provided and is attached at the end of this document).

REVIEWER #2: The authors address an interesting topic that concerns the population-based metabolism of thiopurine, which has important implications for the treatment of several diseases such as Acute Lymphoblastic Leukemia or Inflammatory Bowel Disease. Admixed populations from northern Brazil have revealed much higher levels of toxicity than other worldwide populations, according to the references cited by the authors.

The authors aim to identify the variants of the NUDT15 gene and assess their frequencies in two populations: a group of samples belonging to several Native American populations from northern Brazil and an admixed population from the same area. Moreover, the authors claim a correlation between the Native American ancestry proportion in the admixed populations and the variants related to the high toxicity levels after the administration of thiopurine-based drugs.

I acknowledge the significance of the study. However, I want to highlight few points that the authors should address:

• First, the title of the manuscript is confusing. The title claims whole exome sequencing of Amerindians and admixed individuals from northern Brazil. However, the study only analyses the NUDT15 gene. This could be misleading. The authors should clarify whether they have generated whole exome data. They should also explain why they analyze only this gene and not the rest of the exome. The title should not only be coherent with the text, but both should be consistent with the data provided.

Commentary: We have changed the title of the manuscript. We analyzed only the NUDT15 gene because our region has a particular clinical condition regarding the toxicities arising from thiopurine treatments, which depend on NUDT15 enzyme activity. We therefore performed a molecular epidemiology study of the frequencies of deleterious alleles of this gene in our population.

Second. As the authors discuss, it is relevant to know if the pathogenic variants of the admixed populations are related to the Native American ancestry. However, there is a lack of evidence to support this hypothesis and the authors should perform the additional analyses needed for a solid discussion. Ideally, the authors should use the whole exome data to accurately assess how pathogenic variants are correlated with Native American genetic global and local ancestry.

Third, in the interest of personalized medicine, it would be interesting to see if some Native American populations have higher frequencies of the pathogenic alleles than others. Similarly, the frequencies in the American admixed populations can be correlated or not to the average proportions of the Native American ancestry, depending if the Native American genetic sources were different and had different frequencies of the pathogenic alleles.

Therefore, assuming the authors only have access to the data regarding the NUDT15 gene, the authors should address the following analyses.

1) An extra table for the allele frequencies independently for each of the 11 Native American populations when the population size is five or higher. This table should also analyze independently the four American admixed populations of the 1000 genomes and include the southeastern Brazilian admixed population. The authors should use this information to confirm and discuss more accurately what they say in line 227.

Response: We accepted this suggestion. Supplementary table 2 contains the data requested. The discussion section has been modified according to the results of the analyses performed (lines 281-288).

2) A table reporting the FST between populations regarding all variants of the region analyzed.

Response: We have accepted the suggestion and created Table 4, placed in the Results section.

3) If the authors do not have access to the whole exome data and cannot compute the average genomic proportions of ancestral components (Native American, European and sub-Saharan African) for the American admixed populations, they should at least report the values found in the literature for these populations (the Amazonian Brazilian admixed population, the southeastern Brazilian admixed population and the four American admixed populations) and see if the frequency of the studied variants is correlated with the Native American ancestry as the authors claim in line 241.

Response: We have added the information requested (lines 305-317).
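
As an illustration of the FST table requested in point 2 above, the following is a minimal Python sketch, not the authors' actual computation. It assumes Wright's definition FST = (HT - HS) / HT for a single biallelic variant with equally weighted populations. The function name fst_biallelic and the example frequencies are hypothetical, and the estimator actually used in the manuscript is not stated in this correspondence.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: alternative-allele frequencies, one per population
           (populations are weighted equally; this is an assumption).
    """
    if len(freqs) < 2:
        raise ValueError("at least two populations are required")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled (total) population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The locus is monomorphic everywhere: no differentiation to measure.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical frequencies for two populations (illustrative only).
print(fst_biallelic([0.1, 0.3]))
```

Under this definition the result is a proportion between 0 (identical frequencies) and 1 (fixed differences), which is why such values are normally reported as decimals rather than percentages.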

• Authors must also make minor but important changes:

Introduction

1) Line 58: replace “ethnically” with “genetically”

Response: The text has been modified accordingly.

Methods

2) Line 114: add detailed information of the location of sampling of the admixed and Native American populations

Response: We have added the information requested (Supplementary table 1).

3) Detail which regions are analyzed and in which chromosome they are found.

Response: The regions and their chromosomal locations are detailed in Table 1.

Results

4) Sort tables 1 and 2 by position

Response: We have changed the tables in accordance with this suggestion.

5) Paragraph starting in line 167 should be supported by a statistical test, like the Chi-squared test.

Response: We performed a Fisher’s exact test to support the differences in frequencies observed.

6) Correct the sentence of line 206 where it is said that Native American populations are descendants of the analyzed East Asian populations. Native American might be more closely related to East Asian populations than to other populations, but this does not mean they are their descendants.

Response: We have corrected the sentence in accordance with this suggestion.
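
Point 5 above mentions a Fisher's exact test on the observed frequency differences. A minimal Python sketch of a two-sided test on a 2x2 table of allele counts is given below for illustration. It uses the common convention of summing the probabilities of all tables no more likely than the observed one; the function name, the table layout and the counts are hypothetical and are not taken from the manuscript.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be two populations and columns the counts of the
    alternative and reference alleles (an assumed layout).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # when all four margins are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over every table that is at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical allele counts (illustrative only).
print(fisher_exact_two_sided(8, 2, 1, 5))
```

An exact test of this kind is often preferred over the chi-squared test when some expected cell counts are small, as is typical for rare alleles in small samples.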

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Francesc Calafell

20 Feb 2020

PONE-D-19-30849R1

Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil

PLOS ONE

Dear Dr. Fernandes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Apr 05 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Francesc Calafell

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: The authors have addressed all the concerns I had pointed out in the previous submission. However, there are a few issues that I think can still be improved.

Line 36. “Non-admixed” should be written instead of “pure”.

Line 123. “Genetic components” should replace “ethnic matrices”.

Line 177. The R-packages used should be listed and cited. I.e. the R packages used to compute the FST. If no package was used for the FST computation and a self-made script was used, the FST formula should be written and a citation to the reference article added.

Line 214. The p-values for the BAP population compared to the others should also be listed in Table 3. Moreover, the authors should apply a multiple-testing p-value correction. The p.adjust() function in the R package stats is one way to do it (although of course not the only one).

Line 234. FST values should not be described as percentages.

Line 237. The BAP population must be included in Table 4.

Line 261. Latin Americans with low percentages of Native American ancestry are not genetically similar to East Asians. In my opinion, this sentence is still not accurate. The authors can try reformulating the sentence to highlight that East Asians have the highest allele frequencies apart from Native Americans, or directly replace “exceeds 1% only in Asians and in populations genetically similar to them, that is, the Latin and Amerindian populations” with “exceeds 1% only in Asians, Latin American and Amerindian populations”.

Line 216. The text should be rewritten regarding the new table 3.

Line 281. The authors should avoid the expression “we believe”.

Line 296. The results shown in this paragraph do not match with table 2.

Line 302. The results shown in this paragraph do not match with table 2.

Line 312. The authors should cite a paper where CLM, PUR, MXL and PEL populations are analyzed all together. Moreover, some of the articles cited have analyzed similar populations but not the 1000 genomes. The following articles perform local ancestry analyses of CLM, PUR, MXL and PEL populations, while the last one, also includes local ancestry information for a South-Eastern Brazilian population. https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004023

https://www.cell.com/ajhg/fulltext/S0002-9297(17)30107-6

https://www.nature.com/articles/s41598-019-50362-2

Line 315. To be able to state “demonstrate”, the authors should carry out a correlation test. Either the text should be adapted or the correlation test performed.

Supplementary table 1 is a presentation of the new data generated by the authors.

Thus, the BAP population should be included (differentiated from the Native Americans), and therefore the second column can be renamed as “population”.

Supplementary table 2 should also include the BAP population.

Finally, a more coherent nomenclature can be used, i.e. a unified criterion for using either the rs ID or the amino acid substitution. Moreover, the authors should reference the tables in the text when they discuss them explicitly.

To conclude, the genetic information of the BAP population should also be available, together with the Native Americans.
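
The comment on line 214 above points to p.adjust() in the R package stats as one way to correct for multiple testing. As a rough illustration of what such a correction does, the sketch below reimplements two of its methods in plain Python: Holm (the default method of p.adjust) and Benjamini-Hochberg. The function names and the example p-values are hypothetical, and nothing here indicates which method the authors eventually applied.

```python
def p_adjust_holm(pvalues):
    """Holm step-down adjusted p-values (controls the family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # ascending p-values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


def p_adjust_bh(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)  # descending
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        value = min(1.0, pvalues[idx] * m / rank)
        # Enforce monotonicity from the largest p-value downwards.
        running_min = min(running_min, value)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons (illustrative only).
raw = [0.01, 0.04, 0.03]
print(p_adjust_holm(raw))
print(p_adjust_bh(raw))
```

Both functions return the adjusted values in the same order as the input, so they can be placed next to the raw p-values in a table.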

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Apr 15;15(4):e0231651. doi: 10.1371/journal.pone.0231651.r004

Author response to Decision Letter 1


26 Mar 2020

We are hereby submitting a new revised version of the Original Article entitled “Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil”, submitted previously as manuscript PONE-D-19-30849R1. We would like to thank the Reviewer for their high-quality and constructive reviews, which greatly helped improve our manuscript, and the Editor for his careful reading.

In this revised version of the manuscript, we did our best to address the new comments raised by Reviewer 2 (all the major modifications are marked in green in the file “Revised Manuscript with Track Changes”). Additionally, we have no legal or ethical restrictions on sharing data publicly. Therefore, we have uploaded the minimal anonymized data set necessary to replicate our study findings as a Supporting Information file, named “Anonymised datasets”.

A detailed item-by-item response to each of the Reviewer’s points has been uploaded to the PLOS ONE submission page. Our replies are marked in italics.

Given the above, we hope that your suggestions have been addressed. Thank you for your attention.

Attachment

Submitted filename: Rebuttal Letter.pdf

Decision Letter 2

Francesc Calafell

30 Mar 2020

Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil

PONE-D-19-30849R2

Dear Dr. Fernandes,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Francesc Calafell

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Francesc Calafell

1 Apr 2020

PONE-D-19-30849R2

Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil

Dear Dr. Fernandes:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Francesc Calafell

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Name, location and number of individuals in each population group studied.

    (PDF)

    S2 Table. Allelic frequency independently for each of the Amerindian populations (N ≥ 5), of the admixed Brazilian population, of the subpopulations of the American group from the 1000 genomes, and of the southeast Brazilian samples.

    CLM = Colombians from Medellin, Colombia; MXL = Mexican Ancestry from Los Angeles, USA; PEL = Peruvians from Lima, Peru; PUR = Puerto Ricans from Puerto Rico.

    (PDF)

    S1 Dataset

    (PDF)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Rebuttal Letter.pdf

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
