Abstract
Currently, many of the world’s most culturally and genetically diverse populations, located in Africa, risk exclusion from advancements in pharmacogenomics (PGx) and personalized medicine. Optimizing treatment outcomes for these populations is crucial, particularly for widespread diseases such as tuberculosis (TB). Reducing adverse drug reactions is essential for improving treatment adherence and overall outcomes. However, investigating the PGx landscape in African populations is challenging due to the lack of genotype and phenotype data, as well as limited computational tools and resources tailored to their genetic diversity. This study assessed various bioinformatic methodologies to characterize variations in the absorption, distribution, metabolism, and excretion (ADME) of anti-TB drugs in a large African cohort (>21 populations from public and in-house datasets). Special focus was placed on the Khoe-San, one of Africa’s most genetically diverse groups, and the South African Coloured (SAC) community, whose richly diverse genetic background arises from recent admixture. We developed a graphic resource to support the investigation of anti-TB drug PGx in Africa. African-specific genomic studies addressing major health challenges on the continent are critical for informing the development of relevant genotyping and reference panels, enabling more cost-efficient personalized care in the region. This study offers a comprehensive assessment of the TB PGx landscape in Africa and highlights the potential of computational methods to promote the inclusion of genomically diverse African populations in PGx research.
Keywords: bioinformatics, interactive web application, genotype imputation, tuberculosis, pharmacogenetics, precision medicine
Introduction
Tuberculosis (TB) remains the deadliest disease caused by a single bacterial agent, claiming ~1.6 million lives in 2022. Efforts by the World Health Organization (WHO) to reduce TB infection and mortality rates have experienced tremendous setbacks. Personalized medicine offers the potential for more effective TB treatments [1, 2], and, if implemented at scale, could transform healthcare systems by improving treatment outcomes, reducing hospital admissions, and lowering costs [3, 4]. Although TB treatment, including first-line and second-line drugs, is lifesaving, the African region bears the highest burden of TB-HIV coinfection [5]. Alarmingly, a substantial proportion of patients in this region experience treatment failure (>8%) [6], drug resistance (>18%) [5], and adverse drug reactions (2–28%) [7]. While molecular mechanisms underlying these outcomes remain incompletely understood, TB treatment success is known to be drug-concentration dependent [8]. Both genetic and nongenetic factors contribute to inter-individual and inter-population differences in responses to TB drug therapy, with varying effect sizes. Notably, TB drug exposure differs across populations, and many African ancestry groups exhibit subtherapeutic drug levels [9–13].
The pharmacokinetic parameters, including the area under the curve (AUC) and peak serum concentration (Cmax), of TB drugs such as rifampicin (RIF) [10], isoniazid (INH) [10, 14], pyrazinamide (PZA), bedaquiline (BDQ) [15], moxifloxacin (MXF) [16], and clofazimine (CFZ) [15] are associated with distinct PGx variations across different African (and other) populations [17]. However, replication of associations between genotypes and phenotypes in diverse populations is inconsistent [18, 19]. Consequently, none of the TB PGx markers are currently classified as actionable by PharmGKB, underscoring the need for a population-specific, comprehensive approach towards understanding the effects of genetic variations on TB drug responses. A wide range of PGx genes are involved in the ADME of drugs, broadly falling into three categories: drug transporters, drug metabolizers, and drug targets [20]. In addition, genes involved in the vitamin D pathway (VDR) [21] and in oxidative stress response (NOS) [22] could also influence a patient’s predisposition to develop TB drug-induced hepatotoxicity. The complex interplay between concomitantly administered drugs and PGx genes requires a holistic approach in which not only single variants and genes, but the complete PGx landscape is considered to predict complex phenotypes and identify patients at risk of adverse events or treatment failure.
African populations remain largely understudied to date, yet carry the most diverse genomes, with complex substructures and highly varying allele frequencies across the African continent [23, 24]. The largest portion (78%) of unreported variation in African populations is population-specific [25]. Da Rocha and colleagues (2021) showed that most variants within ADME genes in African populations are unique and that allele frequencies of important coding variants vary by more than 10% between African populations. Although most variants in African ADME genes are rare (minor allele frequency [MAF] < 1%), an estimated 99.8% of African individuals can be expected to carry at least one variant that is of importance as identified by PharmGKB [26]. Furthermore, it is recognized that, compared to whole genome sequencing (WGS) data, commonly used genotyping panels such as the MEGA array and OmniExpress captured only 5% and up to 20%, respectively, of the more common variation (MAF > 10%) in important PGx genes in Africans [26]. For less common variation (MAF < 1%), even fewer variants (~2%) are detected using these two arrays.
Imputation could provide a powerful tool to compensate for the limitations of current genotyping data, leveraging linkage disequilibrium (LD) to computationally predict genotypes. Imputation has become an indispensable tool in increasing the power of genome-wide association studies (GWAS), but imputation accuracy is lowest in African populations [27], largely owing to the limited diversity of available reference panels and because imputation accuracy decreases with rare and low frequency variants [24]. INFO scores (or R2 values) are lower in Southern Africans than in Europeans for alleles occurring at a frequency of 5–50% [28]. Population-specific, high-coverage WGS reference panels, however, significantly improve imputation accuracy even in low frequency variation [29]. Although sequencing technologies currently remain the gold standard to discover the numerous rare and novel variants that are characteristic of African genomes, imputation may offer a cost-effective methodology while sequencing possibilities are becoming ever more cost-effective and are providing increasingly more appropriate reference panels for African populations. In this study, we therefore initially set out to compare the imputation accuracy between two commonly used reference panels, the African Genome Resource (AGR) offered by the Sanger Imputation Server (SIS) and the Trans-Omics for Precision Medicine (TOPMed) reference panels, specifically focusing on principal ADME genes and a highly complex, five-way admixed population.
Secondly, we focus on the Southern African Khoe-San population, who are the most diverse lineage of all human populations [30, 31] and contribute between 15% and 75% of their genome to other groups in southern Africa [32]. Using a local ancestry adjusted association model (LAAA) [33], we test for a correlation between Khoe-San ancestry and TB PGx variation. This methodology robustly predicts population-specific association signals in heterogeneous, complex African genomes by incorporating global and local ancestry information [34]. Only 2% of GWAS conducted up to 2020 contained data from individuals of African descent [35, 36]. Population-specific factors such as LD, varying allele frequencies, and environmental factors, as well as limited power, result in poor replication of GWAS in Africans [37]. Leveraging the power of imputation and ancestry-specific information could assist in linking TB drug response related phenotypes with PGx genotypes and in gaining a better understanding of the TB PGx landscape in Africa.
Given the rapid technological advances in computational biology, capturing and cataloguing the rich genetic diversity of the African continent to the benefit of personalized medicine is within reach. African populations are poor proxies for each other [38], as allele frequencies and genetic structure do vary with geographic proximity [26]. The final aim of our study therefore is to provide the first publicly available, visually comprehensive platform for African population-specific PGx information relating to TB treatment, combining allele frequencies with TB PGx information and admixture proportions from 21 distinct African populations, providing a useful tool for other researchers in the field.
Methods
Genetic data acquisition, ethics, and quality control
Ethics approval for this study was obtained from the Health Research Ethics Committee of Stellenbosch University, reference number N21/11/136. All studies and associated datasets detailed in Table 1 obtained informed consent from the respective study participants for further research.
Table 1.
Datasets, QC and imputation results. 1000 GDP: 1000 Genomes Diversity Project. HGDP: Human Genome Diversity Project. SGDP: Simons Genome Diversity Project. Ind.: Individuals. Imp.: Imputation.
| ID | Country | Population/Study site | Dataset ID | Genotype technique | Ind. before QC | Ind. after QC/Imp. | SNPs before QC/Imp. | SNPs after QC/Imp. |
|---|---|---|---|---|---|---|---|---|
| UGA | Uganda | Makerere University | PHS002528 | Illumina Global Screening Array | 186 | 186 | 343 173 | 63 242 719 |
| KEN_K | Kenya | KEMRI Wellcome Trust | | | 187 | 187 | | |
| KEN_M | Kenya | Moi University | | | 182 | 182 | | |
| ETH | Ethiopia | Aari | PHS000449 [45] | Human1M-Duo Array | 7 | 7 | 1 074 966 | 73 511 891 |
| ETH | | Hamer | | | 7 | 7 | | |
| ETH | | Amhara | | | 28 | 28 | | |
| PYG | Cameroon | Baka | 46 | Illumina 1M SNP Array | 25 | 17 | 1 083 500 | 71 249 264 |
| PYG | | Bakola | | | 29 | 24 | | |
| PYG | | Bedzan | | | 13 | 7 | | |
| PYG | | Lemande | | | 19 | 19 | | |
| PYG | | Ngumba | | | 20 | 18 | | |
| PYG | | Tikar South | | | 19 | 17 | | |
| PED | South Africa | Pedi | EGAD00001009067 [47] | HiSeq X Ten, Illumina NovaSeq 6000 | 21 | 21 | 17 941 588 (WGS) | 14 068 879 |
| NAM | | Nama | 48 | WGS | 180 | 127 | 17 405 384 (WGS) | 88 294 417 |
| | | | | Omni Express Plus Array | | | | |
| | | | | MEGA Array | | | | |
| XHO | | Xhosa | Möller/Uren (unpublished data) | WGS | 148 | 148 | 27 128 982 (WGS) | 23 875 535 |
| SAC | | South African Coloured | 34 | H3A Array | 2494 | 1695* | 89 387 186 (WGS) | 89 387 186 |
| | | | | Affymetrix 500k Array | | | | |
| | | | | MEGA Array | | | | |
| KHM | | Khomani | 49 | Omni Express Plus Array | 270 | 104 | 1 800 655 | 88 759 168 |
| | | | | Omni Express Array | | | | |
| | | | | MEGA Array | | | | |
| GHA | Ghana | Ghana | EGAD00010001734 (MalariaGEN)* | Illumina Omni 2.5M genotyping | 782 | 120* | 2 314 069 | 49 046 152 |
| MAL | Malawi | Malawi | EGAD00010000903 (MalariaGEN)* | | 3086 | 120* | 2 314 174 | 73 148 123 |
| CAM | Cameroon | Cameroon | EGAD00010001740 (MalariaGEN)* | | 1471 | 120* | 2 314 174 | 48 495 467 |
| TAN | Tanzania | Tanzania | EGAD00010001743 (MalariaGEN)* | | 979 | 120* | 2 314 174 | 38 914 942 |
| BFA | Burkina Faso | Burkina Faso | EGAD00010001739 (MalariaGEN)* | | 1446 | 120* | 2 115 586 | 70 247 745 |
| MZB | Algeria | Mozabite | HGDP [50] | WGS | 29 | 27 | 597 573 | 50 810 341 |
| GWD | Gambia | Gambia Western Division | 1000 GDP [51] | WGS | 180 | 104 | 40 071 253 | 42 699 202 |
| MSL | Sierra Leone | Mende | | | 128 | 75 | | 42 699 202 |
| ESN | Nigeria | Esan | | | 173 | 92 | | 42 699 202 |
| YRI | Nigeria | Yoruba | | | 108 | 108 | | 42 699 202 |
| LWK | Kenya | Luhya | | | 115 | 82 | | 42 699 202 |
| EGY | Egypt | Egyptian | SGDP [52] | Affymetrix Human Origin Array | 17 | 17 | 597 573 | 9 115 051 |
| BIA | DRC | Biaka (Pygmy) | HGDP [50] | WGS | 20 | 17 | 1 083 500 | 5 741 039 |
All datasets underwent quality control (QC) using Plink (v1.90b6.26) [39]. Sex chromosomes and mtDNA variants were removed. Variants for which more than 5% of individuals had missing information were excluded, as were all insertions, deletions and monomorphic sites, and variants that did not satisfy conditions for Hardy-Weinberg Equilibrium (HWE) (P < .00001). Individuals for whom <10% of genotyping data was available were removed. Sequences were aligned to GRCh37 using Plink and, where necessary, lifted over with chain files obtained from the UCSC Genome Browser (https://genome.ucsc.edu/cgi-bin/hgLiftOver). Relatedness between individuals was assessed using King (v2.2.9) [40]; where two individuals were related to at least the second degree (2°), one individual of the pair was removed.
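The variant-level filters described above can be illustrated with a minimal sketch. It is not the Plink workflow used in the study; it only shows the logic of the 5% missingness and monomorphic-site checks on a hypothetical genotype encoding (0, 1, or 2 alternate alleles per individual, with `None` for a missing call). The HWE and relatedness filters are deliberately left out.

```python
MAX_VARIANT_MISSINGNESS = 0.05  # variants missing in >5% of individuals are excluded


def variant_passes_qc(genotypes, max_missing=MAX_VARIANT_MISSINGNESS):
    """Return True if a variant passes the missingness and monomorphism checks.

    genotypes: alternate-allele counts per individual (0, 1 or 2), with None
               marking a missing call. This encoding is a hypothetical one
               chosen for illustration.
    """
    n = len(genotypes)
    if n == 0:
        return False
    missing = sum(1 for g in genotypes if g is None)
    if missing / n > max_missing:
        return False
    called = [g for g in genotypes if g is not None]
    alt_count = sum(called)
    # A site is monomorphic if no called genotype carries the alternate
    # allele, or if every called chromosome carries it.
    if alt_count == 0 or alt_count == 2 * len(called):
        return False
    return True
```

Under these assumptions, a variant with one missing call among 20 individuals (exactly 5%) is retained, whereas one with two missing calls (10%) is dropped.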
Imputation and phasing performance
All genotyped datasets were prephased with Shapeit and imputed on the SIS (https://imputation.sanger.ac.uk), using the AGR as a reference panel, which yields the most accurate imputation results in African populations [41]. To quantitatively assess the performance of imputation in ADME genes, a second phasing and imputation method was employed, namely that hosted on the TOPMed imputation server [42].
To compare imputation performance in ADME genes to the rest of the genome, 32 core ADME genes were selected as defined by the PharmaADME Consortium (http://www.pharmaadme.org). Ensembl (https://grch37.ensembl.org/index.html) and Vcftools (v0.1.17) were used to locate and select these regions (10 000 bp upstream and downstream from gene start/end sites) according to GRCh37.p13 coordinates. Imputation performance was assessed using R2 values and INFO scores. R (v4.2.0) was used to visualize results and to compare the two imputation servers by performing an independent t-test.
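As an illustration of the region selection, the sketch below widens a gene interval by 10 000 bp on each side and tests whether a variant position falls inside any padded interval. The function names and the example coordinates are placeholders introduced here, not actual gene boundaries or the commands used in the study.

```python
FLANK_BP = 10_000  # 10 000 bp upstream and downstream of the gene


def padded_region(chrom, start, end, flank=FLANK_BP):
    """Return (chrom, start, end) widened by `flank` bp, clamped at position 1."""
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, max(1, start - flank), end + flank


def in_any_region(chrom, pos, regions):
    """True if a variant at chrom:pos lies inside any region (bounds inclusive)."""
    return any(c == chrom and s <= pos <= e for c, s, e in regions)


# Placeholder interval for illustration only; not a real gene boundary.
example_regions = [padded_region("8", 18_250_000, 18_260_000)]
```

With this placeholder, a variant at 8:18 245 000 would be classed as ADME (inside the 10 kb flank), while one at 8:18 239 999 would be classed as non-ADME.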
Global and local ancestry inference
Global ancestry inference was performed using ADMIXTURE [43]. Reference populations representing world populations served to generate a master file (Supplementary Table 2). To mitigate bias, each group was represented by 40 individuals, the smallest number of individuals available for any one dataset. African target populations were LD-pruned and merged with the master file using Plink (v1.90b6.26). Admixture proportions (k = 2–6) were visualized in R.
RFMix v2.03 was employed with default parameters (https://github.com/slowkoni/rfmix/blob/master/MANUAL.md) to obtain local ancestry proportions at a chromosomal segment level. The reference ancestral populations (Supplementary Table 1) were refined based on the ADMIXTURE run and past analyses of the specific datasets. As above, 40 individuals were randomly selected from each population.
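The equal-size sampling of reference populations can be sketched as follows. This is an assumption-laden illustration: the function name, the input structure (a mapping from population label to sample IDs), and the fixed seed are all hypothetical and are not taken from the study's scripts.

```python
import random


def downsample(samples_by_population, n=40, seed=1):
    """Randomly select n sample IDs per population, reproducibly.

    samples_by_population: dict mapping a population label to a list of
    sample IDs (hypothetical input structure).
    """
    rng = random.Random(seed)  # fixed seed so the selection can be repeated
    selected = {}
    for pop in sorted(samples_by_population):
        samples = samples_by_population[pop]
        if len(samples) < n:
            raise ValueError(f"{pop} has fewer than {n} samples")
        selected[pop] = sorted(rng.sample(samples, n))
    return selected
```

Sampling every group down to the same size keeps large reference datasets from dominating the inferred ancestry components.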
Statistical analysis of the relationship between Khoe-San ancestry and TB drug ADME
The specific genotype and local ancestry calls of the PGx markers in the database were used to create allele, ancestry, and allele-ancestry dosage files using a publicly available Python script (https://github.com/TBHostGenetics/LAAA-model) (Swart et al., 2021). For the allele dosing, 0 represents the major allele and 1 the minor allele. For the ancestry dosing, 1 represents Khoe-San ancestry and 0 identifies other ancestries. In the allele-ancestry dosage files, 1 represents a minor allele located within a region of Khoe-San ancestry and 0 represents the minor allele falling within a region of other ancestry. These were used to calculate the biallelic state of each locus as 0, 1, or 2 copies for allele and ancestry.
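The dosage coding can be made concrete with a short sketch. It is not the published LAAA script; it assumes phased data in which each haplotype is a pair (allele, ancestry), with allele 1 for the minor allele and ancestry 1 for Khoe-San, and it assumes that a major allele on a Khoe-San segment contributes 0 to the allele-ancestry dosage.

```python
def dosages(hap1, hap2):
    """Return (allele, ancestry, allele_ancestry) dosages for one individual.

    Each haplotype is a tuple (allele, ancestry):
      allele   -- 0 = major allele, 1 = minor allele
      ancestry -- 0 = other ancestry, 1 = Khoe-San ancestry
    The allele-ancestry dosage counts minor alleles that sit on a Khoe-San
    segment (assumed here to be the product of the two indicators).
    """
    for allele, ancestry in (hap1, hap2):
        if allele not in (0, 1) or ancestry not in (0, 1):
            raise ValueError("allele and ancestry must be coded 0 or 1")
    allele_dose = hap1[0] + hap2[0]
    ancestry_dose = hap1[1] + hap2[1]
    allele_ancestry_dose = hap1[0] * hap1[1] + hap2[0] * hap2[1]
    return allele_dose, ancestry_dose, allele_ancestry_dose
```

For example, an individual homozygous for the minor allele with one Khoe-San haplotype, `dosages((1, 1), (1, 0))`, receives dosages of 2, 1, and 1.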
The association analysis between PGx SNPs and ancestry was conducted exclusively within the southern African populations known to have Khoe-San ancestry, and not across the full dataset listed in Table 1. As the study specifically focused on uncovering Khoe-San-specific associations—an area with minimal prior research—conducting a meta-analysis across unrelated populations would not have been appropriate for this ancestry-focused approach. To determine the association between the PGx SNPs and ancestry, multinomial logistic regression (MLR) was performed in R using the vglm() function. The dependent variable is the allele dose, and the ancestry dose the independent variable. Age, sex, global ancestry proportions (excluding the Han Chinese to avoid collinearity) as well as the interaction between sex and local ancestry dosage, were selected as covariates. The reference level selected was “0”, which represents homozygosity for the major allele. As a Hauck-Donner effect (HDE) was detected for most variants, global ancestry proportions were removed from the model. The final model was as follows:
PGx marker ~ Age + Sex + Sex*Local ancestry + Local ancestry
The statistical model was adapted from the Local Ancestry Adjusted Allelic Association (LAAA) framework [34]. Sex was included due to its potential to influence allele presence [44]. Age was included to account for potential variation in allele distribution that may emerge over time, acknowledging that while age-related somatic mutations are rare, age remains a relevant covariate in population-based genetic models. Sex*LocalAncestry was added to the model to account for a differential ancestry effect between sexes [45]. After obtaining significant P-values (P < .05), Bonferroni correction was applied to address inflated Type I error rates due to multiple comparisons. The chosen significance level (0.05) was divided by the number of tests conducted (n = 17), adjusting the threshold for each individual comparison.
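The Bonferroni step can be checked with a few lines of arithmetic. The sketch below uses the 17 SNPs tested and, for each nominally significant SNP, the smallest genotype-level P-value reported in Table 3; the function names are illustrative only.

```python
ALPHA = 0.05
N_TESTS = 17  # number of SNPs tested, per Table 3

# Smallest genotype-level P-value per nominally significant SNP (Table 3).
P_VALUES = {
    "rs9332096": 0.0343,
    "rs3813867": 5.81e-05,
    "rs4646244": 0.0131,
    "rs1799930": 0.00521,
    "rs1208": 0.0154,
    "rs3814055": 0.04229,
    "rs11125883": 0.000119,
}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_correction(p_values, alpha, n_tests):
    """Return the sorted SNP IDs whose P-value is below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return sorted(snp for snp, p in p_values.items() if p < threshold)
```

Here 0.05 / 17 ≈ 0.0029, and only rs11125883 and rs3813867 fall below that threshold, in line with the two markers reported as significant after correction.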
PGx marker database generation and allele frequency determination
A total of 44 PGx variants were curated through a systematic literature search of the PharmGKB database, filtering for variants with statistically significant associations (P < .05), including only those where P-values had been adjusted for multiple testing where applicable. Studies reporting associations between genetic variants and TB treatment outcomes or the pharmacokinetic/pharmacodynamic effects of TB drugs were included. Additionally, genome-wide association studies were included, applying more stringent significance thresholds consistent with GWAS standards.
PharmGKB was selected due to its peer-reviewed curation process, which ensures consistency, reliability, and comparability of PGx data. We also included recent peer-reviewed publications not yet indexed by PharmGKB, particularly those focusing on African populations, due to their relevance and credibility within the field.
Inclusion criteria were based on the presence of population-specific data and reported associations with TB drug outcomes. All populations were considered in line with the “Out of Africa” hypothesis, which supports the relevance of African genetic variation for understanding PGx associations globally. However, only variants present in the African populations within our study were included in the final dataset displayed on the web application.
The frequencies of all selected bi-allelic SNPs were determined in all African cohorts using Plink, and significant differences in frequencies as compared to the African average were calculated in R using Fisher’s Exact Test. GnomAD [46] provided a resource to compare allele frequencies to non-African populations. Using the R Shiny package, the TBPGxForAfrica App was created, which, upon entry of an rsID, retrieves a geographic choropleth map, a histogram showing the allele frequency for specific African populations, and the TB PGx information for the queried SNP. Google Open Street Map and polygon data from Natural Earth (naturalearthdata.com) were employed to create the map.
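For readers unfamiliar with the frequency comparison, the following is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2×2 table of allele counts (for example, minor and major allele counts in one population versus the remaining African cohorts). The study ran the test in R; this Python version and its function name are illustrative, and the small relative tolerance used when summing probabilities is an implementation choice made here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, total)
```

In use, `a` and `b` would hold the minor and major allele counts in the population of interest, and `c` and `d` the corresponding counts in the comparison group.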
Results
QC and Imputation
In total, 3916 samples underwent QC and 3559 samples were imputed (Table 1). Due to relatedness, up to 10% of samples were removed from the public datasets, and 50% of the Khomani and 27% of the Nama samples were removed from the in-house datasets. Variants were lost to further analysis (Table 1) due to genotype missingness of >5%, INFO scores < 0.6, and minor allele frequency cut-offs.
Comparison of imputation performance
The TOPMed reference panel imputed more SNPs than AGR, but AGR outperformed TOPMed by imputing higher quality SNPs (Fig. 1A and B). For both ADME and non-ADME genes, AGR imputed more SNPs with an INFO score between 0.8 and 1.0 than the number of SNPs TOPMed imputed with an INFO score between 0.2 and 1.0. Approximately 33% of SNPs imputed by AGR have an INFO score < 0.2, while for TOPMed 88% of SNPs have an INFO score < 0.2. This trend was seen for both ADME and non-ADME genes (Fig. 2A and B). The AGR reference panel achieved the highest mean INFO score (surpassing 0.90) while the TOPMed imputation panel had a maximum mean INFO score of 0.85.
Figure 1.
Total number of SNPs imputed for each panel and their distribution across all INFO (or R2) bins. (A) ADME genes. (B) Non-ADME genes.
Figure 2.
Mean INFO scores for non-ADME and ADME variation across MAF ranges. (A) AGR. (B) TOPMed.
Splitting the data into MAF bins, the AGR reference panel achieved higher INFO scores for rare variants (MAF 0%–5%, mean score AGR = 0.69, TOPMed = 0.12), for both ADME and non-ADME genes. Above a MAF of 5%, the AGR mean INFO scores ranged from 0.83 to 0.91, while TOPMed mean INFO scores ranged from 0.78 to 0.82. Thus, imputation with the AGR reference panel yielded imputed SNPs of superior quality compared to TOPMed.
The means of the two reference panels were significantly different in imputation performance (P < 2e-16), statistically confirming that AGR has a stronger imputation performance in terms of SNP quality.
Comparison of ADME genes and non-ADME genes
SNPs in ADME genes had higher mean INFO scores than those in non-ADME genes, except at MAF = 10%–20% for the AGR reference panel (Fig. 2A) and MAF = 20%–30% for TOPMed (Fig. 2B). For both panels, the INFO scores for ADME and non-ADME genes are matched for variants with MAF < 10%. Discrepancies between non-ADME and ADME genes were larger for the AGR, whereas TOPMed showed similar mean INFO scores for both. Overall, ADME genes were imputed with higher quality than non-ADME genes for both reference panels (Fig. 2), and for the TOPMed reference panel this difference was significant (t-statistic -14.9560, P < 2.2e-16).
Global ancestry inference results
The lowest cross-validation (CV) value for admixture analysis was obtained at k = 4 (Table 2), and Q-matrices (example shown in Fig. 4D) for k = 2–6 of all African populations are available online (https://tbpgxforafrica.shinyapps.io/PGxForAfrica/). Northern African populations EGY, MZB, and ETH share European ancestry proportions (81.5%, 77.6%, and 28.9%), which are almost absent in Eastern (LWK) and Western (YRI) African populations (1.3% and 0%) (Table 2). The admixed SAC population carries a substantial proportion of European admixture (39.1%), as well as Southern African admixture (39%) (Table 2). At k = 5, the BIA separate distinctly from the other populations, which is in accordance with other studies showing the rainforest hunter-gatherer populations to be genetically distinctive [47]. The geographic grouping of all populations coincides with other studies [47], with the Eastern African proportion the largest across African populations.
Table 2.
Admixture proportions (%) across all African populations (k = 4). EGY: Egypt. MZB: Mozabite, Algeria. ETH: Ethiopia. KEN_M: Kenya Moi. UGA: Uganda. CAM: Cameroon. GWD: Gambia, Western Division. GHA: Ghana. TAN: Tanzania. MAL: Malawi. BFA: Burkina Faso. ESN: Esan, Nigeria. MSL: Mende, Sierra Leone. YRI: Yoruba in Ibadan, Nigeria. LWK: Luhya, Western Kenya. PYG: Pygmy. BIA: Biaka. XHO: Xhosa, South Africa. KHM: Khomani. NAM: Nama, South Africa. PED: Pedi. SAC: South African Coloured, South Africa.
| | EGY | MZB | ETH | KEN_M | UGA | CAM | GWD | GHA | TAN | MAL | BFA | ESN | MSL | YRI | LWK | PYG | BIA | XHO | KHM | NAM | PED | SAC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| European | 81.5 | 77.6 | 28.9 | 9.9 | 8.1 | 11 | 5.8 | 10.3 | 10.3 | 5.9 | 7.1 | 5.4 | 6.4 | 0 | 1.3 | 5.6 | 0 | 0.7 | 15.7 | 9.3 | 0 | 39.1 |
| Eastern Africa | 13.1 | 22.1 | 47.7 | 69.8 | 73.1 | 74.6 | 86 | 78.9 | 79.8 | 82.3 | 80.9 | 84.5 | 81.2 | 93.5 | 84.3 | 73.1 | 62.2 | 71.8 | 68 | 15.6 | 0 | 13.7 |
| Southern Africa Bantu-Speaking | 0.5 | 0 | 9.5 | 8.8 | 9.3 | 7.5 | 0 | 4.7 | 7.7 | 4.8 | 4.6 | 3.6 | 4.6 | 0 | 7.2 | 16.2 | 36.7 | 24.8 | 8.8 | 74.9 | 100 | 39 |
| Asian | 4.8 | 0 | 13 | 11.5 | 8.1 | 6.9 | 7.9 | 6.1 | 6.1 | 7 | 7.5 | 6.4 | 7.7 | 6.3 | 7.2 | 5.1 | 1 | 2.7 | 7.5 | 0 | 0 | 8.3 |
Figure 4.
TBPGxforAfrica Application results (https://tbpgxforafrica.shinyapps.io/PGxForAfrica/). A. Interactive map showing frequency results for the NAT2*5 allele, the popup showing frequency results hovering over Ethiopia. B. Frequency results for the NAT2*5 allele across populations. C. PGx information for the NAT2*5 allele obtained from current literature. D. Admixture proportions (k = 4) for the XHO population in relation to reference populations, divided into sets of 40 individuals. JPT: Tokyo, Japan. CHS: Han, Southern China. KHV: Kinh in Vietnam. BEB: Bengali in Bangladesh. PJL: Punjabi, Pakistan. IBS: Iberian, Spain. GBR: British in England and Scotland. FIN: Finnish. EGY: Egyptian. MZB: Mozabite, Algeria. YRI: Yoruba, Nigeria. LWK: Luhya of Webuye, Kenya. BIA: Biaka, Congo. XHO: Xhosa, South Africa. NAM: Nama, South Africa.
Relationship between TB drug metabolism and Khoe-San ancestry
A literature search and the PharmGKB resource (as of December 2023) identified a list of 44 SNPs (Supplementary Tables 3–6) as biomarkers significantly associated with TB treatment outcome (P < .05).
Of the 44 PGx markers in the developed database, 14 were not imputed across all populations and were excluded from the Khoe-San ancestry association analysis. A further nine SNPs were removed because of low imputation quality scores (INFO < 0.6) or because they were monomorphic, and four were lost during data merging and the overlapping of genotyping and local ancestry calls. A total of 17 SNPs were included in the statistical analysis (Table 3).
Table 3.
SNPs (n = 17) included in the multinomial logistic regression analysis, SNPs significantly associated with Khoe-San ancestry (P < .05) (n = 7), and SNPs remaining significant after Bonferroni correction (bold) (P < .0029) (n = 2).
| Gene/Allele | rsID | CHR:POS | Reference | Ref>Alt | Genotype | Alleles of Khoe-San origin | Standard error (SE) | OR (95% CI) | P-value |
|---|---|---|---|---|---|---|---|---|---|
| AADAC | rs1803155 | 3:151545601 | 76 | | | | | | not significant |
| AGBL4 | rs320003 | 1:49126778 | 36 | | | | | | not significant |
| CYP27B1 | rs4646536 | 12:58157988 | 20 | | | | | | not significant |
| CYP2C19 | rs9332096 | 10:96696875 | 77 | C>T | CT | Both | 0.3093 | 1.924 (1.049-3.528) | 0.0343 |
| CYP2E1 | rs3813867 | 10:135339605 | 78 | G>C | CC | One | 0.5260 | 0.120 (0.043-0.338) | **5.81e-05** |
| FAM65B | rs10946739 | 6:24993127 | 36 | | | | | | not significant |
| NAT2 | rs4646244 | 8:18247718 | 77 | T>A | TA | Both | 0.214995 | 1.625 (1.067-2.477) | 0.0238 |
| | | | | | AA | Both | 0.433615 | 2.931 (1.253-6.857) | 0.0131 |
| NAT2*5 | rs1801280 | 8:18257854 | 60 | | | | | | not significant |
| NAT2*11 | rs1799929 | 8:18257994 | 79 | | | | | | not significant |
| NAT2*6 | rs1799930 | 8:18258103 | 60 | G>A | GA | Both | 0.214214 | 1.561 (1.026-2.376) | 0.03751 |
| | | | | | AA | Both | 0.397083 | 3.032 (1.392-6.604) | 0.00521 |
| NAT2*12 | rs1208 | 8:18258316 | 80 | | AA | Both | 0.290697 | 2.022 (1.144-3.575) | 0.0154 |
| NAT2*7 | rs1799931 | 8:18258370 | 60 | | | | | | not significant |
| PXR | rs3814055 | 3:119500035 | 81 | C>T | TT | Both | 0.1663489 | 1.401 (1.012-1.942) | 0.04229 |
| SOD2 | rs4880 | 6:160113872 | 82 | | | | | | not significant |
| UGT1A1 | rs3755319 | 2:234667582 | 15 | | | | | | not significant |
| VDR | rs1544410 | 12:48239835 | 20 | | | | | | not significant |
| XPO1 | rs11125883 | 2:61710573 | 21 | A>C | AC | Both | 0.177223 | 0.505 (0.357-0.715) | **0.000119** |
| | | | | | CC | Both | 0.465644 | 0.346 (0.139-0.863) | 0.022746 |
A total of 1998 individuals were available for multinomial logistic regression analysis. Out of the 17 PGx markers, seven exhibited significant associations with Khoe-San ancestry (Table 3). After Bonferroni correction (adjusted threshold P < .0029), only two markers (rs11125883 and rs3813867) remained significantly associated. Notably, PGx markers rs4646244, rs1799930, and rs1208 displayed positive associations between the heterozygous and/or homozygous alternate (alt) allele genotypes and having Khoe-San ancestry present at both alleles. For rs11125883, a negative association was observed. PGx markers rs3814055 and rs9332096 both demonstrated significant positive associations between having Khoe-San ancestry at both alleles and the homozygous alt and heterozygous genotypes, respectively.
PGx allele frequencies
Of the 44 SNPs that were previously associated with TB treatment outcomes according to current literature, 12 were selected as the most relevant according to high-quality studies [48], having repeatedly been significantly associated with PK parameters and/or treatment outcome for INH and RIF, particularly in African populations (Supplementary Tables 3 and 4). These SNPs presented with varying frequencies across populations (Fig. 3). Frequencies of seven of these 12 SNPs are significantly different from the African average frequency in at least one of the individual populations (Table 4). Frequencies of SNPs in SLCO1B1 and NAT2 in ETH were much higher than in other African populations. The frequencies of the NAT2*5 and NAT2*6 alleles, which play a key role in determining metabolizer phenotype, differ significantly from the African average in the NAM and KHM populations, respectively. Of note, the variant rs3813867, which was significantly associated with Khoe-San ancestry, is significantly more frequent (P = 5.2e-05) in the KHM (0.11) compared to the African average (0.04).
Figure 3.
Frequencies of the 12 most important PGx variants across African and non-African populations.
Table 4.
Important TB PGx SNP frequencies varying significantly across African populations. *SNP associated with Khoe-San ancestry.
| Gene | SNP | Avg African frequency | Population | Frequency | p-value | OR(95% CI) |
|---|---|---|---|---|---|---|
| SLCO1B1 | rs4149032 | 0.25 | AA | 0.68 | 2.2e-16 | 0.1608305 (0.101-0.249) |
| SLCO1B1 | rs4149032 | 0.25 | ETH | 0.46 | 0.0136 | 0.42208 (0.211-0.854) |
| AGBL4 | rs393994 | 0.21 | ETH | 0.16 | 0.004178 | 0.368225 (0.183-0.753) |
| AGBL4 | rs320003 | 0.24 | ETH | 0.45 | 0.006502 | 0.3910119 (0.195-0.791) |
| NAT2*11 | rs1799929 | 0.23 | ETH | 0.42 | 0.01035 | 0.4093347 (0.204-0.836) |
| NAT2*7 | rs1799931 | 0.02 | ETH | 0.05 | 2.2e-16 | 757.1287 (194.336-8192.000) |
| NAT2*5 | rs1801280 | 0.27 | NAM | 0.19 | 0.04235 | 1.597202 (1.011-2.616) |
| NAT2*6 | rs1799930 | 0.21 | KHM | 0.13 | 0.0006384 | 1.839998 (1.271-2.737) |
| CYP2E1 | rs3813867* | 0.04 | KHM | 0.11 | 5.2e-05 | 0.3896951 (0.253-0.616) |
| CYP2E1 | rs6413432 | 0.07 | KHM | 0.27 | 2.2e-16 | 0.2377769 (0.177-0.321) |
Interactive tool: TBPGxforAfrica
Using R Shiny, we provide an interactive tool (https://tbpgxforafrica.shinyapps.io/PGxForAfrica/) to catalogue the significance (Supplementary Tables 3–6) and frequency (Supplementary Tables 7–10) of important TB PGx SNPs across >21 African populations for a total of 44 known TB PGx SNPs, including the South African XHO and KHM populations that have thus far not been represented in the literature. An rsID entered by the user is searched within the internal database, outputting an interactive geographic choropleth map (Fig. 4A), a histogram (Fig. 4C) showing the allele frequency across specific populations, as well as the relevant PGx information (Fig. 4B) for the SNP. The intensity of the shading of a country represents the frequency of the queried variant in that region, with frequency numbers displayed when hovering over the map (Fig. 4A). Admixture proportions for each population are indicated (Fig. 4D). The database is built to include additional information as it becomes available.
Discussion
Pre-emptive genetic testing is increasingly implemented in Europe and Asia [49], reducing the incidence of ADRs by up to 30% [50]. While PGx guidelines are progressively fine-tuned in developed countries, they are presently not available for African populations [51], even though including these populations could greatly improve the tailoring of precision medicine to increasingly admixed populations globally. To date, only 3.6% of the PGx data in the pharmacogenomic database PharmGKB is sourced from African populations [52]. Current public platforms that capture allele frequencies, such as gnomAD or dbSNP, distinguish mostly between African and African American datasets, which inadequately represents the vast genetic diversity of the estimated 200 ethnolinguistic groups residing on the continent. In Africa, countries and populations are not only genetically unique, with varying allele frequencies and admixture proportions, but also have distinct disease burdens, with HIV and TB playing a predominant role. Within the next few decades, the African population is estimated to double and is likely to remain challenged by a high TB/HIV burden. A significant number of ADR reports in South Africa identified INH as the suspect drug [51], and even though NAT2 is not regarded as actionable according to current guidelines (PharmGKB), the high TB burden in the region justifies its consideration [51]. Implementation could be technically feasible and cost-effective in an African setting [2, 4].
Our study identified a large number of variants (n = 44) that could play a role in TB PGx outcomes, two of which were linked to Khoe-San ancestry. Only a limited number of variants (n = 12) are likely to be common and to have a significant impact across populations. We show that computational methods such as imputation and population-specific LAAA could be pivotal in leveraging the unexplored data contained in these unique and diverse African genomes to improve treatment options in under-served populations [24, 26, 53–58]. To our knowledge, this is the first study to assess imputation performance between ADME and non-ADME genes and between two imputation platforms in a five-way admixed African population.
Both TOPMed and AGR have been identified as the best performing tools for imputing African genomes [41]. In this study, TOPMed imputed a greater number of SNPs, but the AGR imputed higher quality SNPs (Fig. 2). TOPMed is by far the largest panel, containing 97 256 multiethnic samples, whereas the AGR encompasses 4965 predominantly African genomes, including samples from Egypt, Ethiopia, Namibia (Khoe-San/Nama), and South Africa (Zulu). The admixed SAC were better represented by the AGR reference panel, leading to improved imputation scores (Fig. 2). The difference in imputation scores between ADME genes and non-ADME genes was apparent but not significant in AGR, but highly significant in TOPMed. The greater number of imputed SNPs in TOPMed likely facilitated this association.
ADME genes in Africans are highly diverse, with short LD blocks [26], which could put imputation efforts at a disadvantage. Nonetheless, when compared to the rest of the genome, this result (Fig. 1) may indicate that ADME genes and LD patterns are more stable and conserved, favoring imputation performance in ADME genes compared with the rest of the genome. This result is encouraging for studies, such as this one, that use imputation to further PGx research in ADME genes in African populations, where WGS data are scarce. Whilst imputation does increase power at the genome level, its use in fine-scale, focused analysis of specific variants within specific PGx genes in African populations may be limited by the current representation of African genomes in reference panels. For example, data for the NAT2*13 allele are absent from this study in 12 of the 21 populations due to low info scores, even though the allele is highly frequent in all populations (Supplementary Table 7), is associated with hepatotoxicity [59], and is a key predictor of NAT2 metabolizer phenotype [2]. For the SNP rs2032582 in the ABCB1 gene, associated with MXF metabolism, frequency data were available in only two of the 21 populations, although it is a common variant and important in the metabolism of many drugs [60, 61].
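The way low info scores translate into missing population-level data can be sketched as a simple filter. In the Python sketch below, the threshold of 0.8, the function name `retained_populations`, the population codes used as keys, and every info score are hypothetical values for illustration only; they are not the cutoff or the scores from this study.

```python
def retained_populations(info_scores, threshold=0.8):
    """Populations whose imputed variant passes the info-score cutoff.

    info_scores maps population code -> imputation info score, with None
    meaning the variant was not imputed at all in that population.
    """
    return sorted(
        pop
        for pop, score in info_scores.items()
        if score is not None and score >= threshold
    )


# Hypothetical info scores for one variant in four populations.
example = {"POP_A": 0.93, "POP_B": 0.55, "POP_C": None, "POP_D": 0.81}
kept = retained_populations(example)
coverage = len(kept) / len(example)
print(kept, f"coverage = {coverage:.2f}")
```

A variant can therefore be common in a population and still be missing from a frequency catalogue if its imputation quality falls below the chosen cutoff in that population.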
Population differences in drug disposition and ADR risk arise because of differences in allele frequencies, and thus, populations sharing recent ancestry will have similar risk allele frequencies. By testing for an association between TB ADME-associated variation and the supremely diverse Khoe-San ancestry, we can identify ancestral alleles (vs derived alleles) that play a role in TB drug response. The CYP2E1 variant rs3813867 associated with Khoe-San ancestry in this study (Table 8, P = .00019) was shown to significantly contribute to ATDILI risk [62, 63] and was significantly more frequent in Southern African populations (Table 4). The rs11125883 SNP in the XPO1 gene was negatively associated with Khoe-San ancestry (P = .000119). The protein exportin 1 is involved in antioxidant enzyme expression and the major A-allele was found to be associated with ATDILI in only one study involving Japanese patients, whereas the less frequent C-allele exhibited a protective effect [22].
Compared to all other human populations, the Khoe-San carry the highest level of heterozygosity and private alleles, with 25.5% of variation occurring exclusively in the Khoe-San population [31]. In comparison, only 18% of genetic variation is exclusive to other African populations, and only 10.6% occurs only in non-Africans [31]. Considering the “Out of Africa” theory, the Khoe-San thus represent a catalogue of ancestral alleles for other population groups, and exploration of their unique genomes could provide valuable insights for PGx research, GWAS, and polygenic risk scores. Firstly, ancestral alleles are found at a 9.51% higher frequency in Africans than non-Africans, whereas derived alleles have a 5.4% lower frequency in Africans, leading to less biased GWAS results in African vs non-African populations [64]. Secondly, differentiating between ancestral alleles (stemming from the original ancestral populations and typically present at higher frequency in Africans) and derived alleles (accumulated de novo mutations, typically less frequent in Africans) could lead to less SNP ascertainment bias in SNP arrays and improved interpretation of GWAS and polygenic risk scores [64]. Caution is advised when extrapolating GWAS results to predict differential drug treatment responses [64], but our study shows that correcting for ancestry-associated allele frequency differences could help to distinguish true risk alleles from allele frequency bias. This study underscores the complex genetic diversity and relatedness underlying the PGx landscape in African populations and demonstrates the feasibility of using imputation and LAAA to enrich population genetic data.
Global admixture analysis (Fig. 4D and Table 2) indicates that the African datasets are heterogeneous, with great differences in admixture proportions at both the population and the individual level (https://tbpgxforafrica.shinyapps.io/PGxForAfrica/). This reinforces that individuals within an African population group can be expected to vary substantially from each other with regard to admixture, allele frequency distribution, and, therefore, TB drug response. The five-way, recently admixed SAC population stands out as the most heterogeneous population (Table 2). As the human population is becoming increasingly admixed globally, studying the unique SAC population and the effects of admixture on drug disposition could provide valuable insights for modern populations.
Scarce alleles are often inherently pathogenic and could carry greater weight in different populations, necessitating the inclusion of diverse population groups in research. As NAT2 is responsible for 88% of INH metabolism [65], and its effect on INH metabolism is well established, differences in NAT2 allele frequencies are of considerable importance to TB PGx. Notably, the NAT2*14 allele is exclusive to Africans (Supplementary Table 3), and its inclusion in genotyping tests greatly improves predictions of NAT2 metabolizer status [2]. The frequencies of the NAT2*14 allele correlate with those found in other studies [14, 66], varying from 0 (PED) to 0.14 (GWD) in African populations. With regard to effect size, NAT2*7 has the most pathogenic effect on enzyme functionality, followed by NAT2*6 and NAT2*5 [14], but the frequency of NAT2*7 is relatively low across all populations (highest in YRI = 0.04), followed by the relatively common alleles NAT2*6 (highest in UGA = 0.28) and NAT2*5 (highest in KEN = 0.42, Supplementary Table 8). The relatively lower frequency of both the NAT2*5 and NAT2*6 alleles in the Southern African NAM and KHM populations could have implications for genotype-directed TB PGx treatment.
CYP2E1 is of importance mostly in carriers of NAT2 slow metabolizer phenotypes, when alternative pathways of INH metabolism become more important [67]. Although the frequency of CYP2E1 SNPs is low across African (and non-African) populations, two CYP2E1 SNPs are markedly more common in the XHO population than elsewhere (rs3813867 = 0.11, rs6413432 = 0.23, Supplementary Table 7). Variation in the NAT2 and CYP2E1 genes may be a worthy candidate for inclusion in precision medicine, as associations have been replicated in various populations (Supplementary Table 4). Of note, the association between the SLCO1B1 locus and hepatotoxicity has been well established [9, 17, 68–70], and variants within this gene are more frequent in the ETH than in Southern African populations. The strongest associations are described with metabolizer phenotypes rather than individual variation, suggesting a benefit for genotype-adjusted dosing in African populations [10, 71]. Cataloguing data on TB PGx variation across populations is the foundation for guiding precision medicine in Africa. Within the 21 African populations investigated here, there are limitations to the generalizability or transferability of PGx data from one population to predict clinical outcomes in another. Selecting actionable variation to be included in African-specific PGx guidelines will require balancing the technical feasibilities that underlie effect size, allele frequencies, linkage and relatedness, and evidence for economic and clinical benefits. The data presented here may serve to inform future African-specific PGx research and the development of more appropriate genotyping panels and TB PGx guidelines, based on African-specific catalogues that record SNP significance, frequencies, and relatedness.
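How genotyped NAT2 alleles translate into a metabolizer phenotype can be illustrated with a simple allele-counting rule. The Python sketch below assumes a widely used simplification in which a diplotype with zero, one, or two reduced-function alleles is called rapid, intermediate, or slow, respectively. The set `SLOW_ALLELES`, the function `nat2_phenotype`, and the use of *4 as the reference allele are illustrative assumptions, not the genotype-to-phenotype procedure of this study or of any specific assay.

```python
# Reduced-function NAT2 alleles discussed in the text (assumed "slow" here).
SLOW_ALLELES = {"*5", "*6", "*7", "*14"}


def nat2_phenotype(diplotype):
    """Classify a NAT2 diplotype by counting reduced-function alleles.

    diplotype is a pair of star-allele labels, e.g. ("*4", "*5").
    """
    if len(diplotype) != 2:
        raise ValueError("a diplotype must contain exactly two alleles")
    n_slow = sum(allele in SLOW_ALLELES for allele in diplotype)
    return ("rapid", "intermediate", "slow")[n_slow]


for pair in [("*4", "*4"), ("*4", "*5"), ("*6", "*14")]:
    print(pair, "->", nat2_phenotype(pair))
```

Under this rule, a panel that omits NAT2*14 would treat a carrier of that allele as if they carried the reference allele, which is one way an incomplete panel can misclassify metabolizer status in African populations.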
Future work
The majority of association studies for TB PGx described in the literature to date were conducted in Europeans and Asians (Supplementary Tables 3–6), possibly leading to enrichment of alleles that are polymorphic and of intermediate frequency in these populations [64], but not in the African populations investigated in this study. Furthermore, the extraordinarily high number of novel and private SNPs in African genomes [24] indicates that the list of variants selected for this study has limitations in representing TB PGx in Africans. Rare and as yet uncharacterized genetic variation is likely to play a noteworthy role in African populations, and in silico analysis could provide an indication of its pathogenicity. We intend to pursue this line of research [54] and to add more information to the platform as it becomes available.
Key Points
Existing databases and reference panels fail to sufficiently capture the genetic diversity of African populations, placing them at a disadvantage for pharmacogenetic applications.
Significant variations in allele frequencies and admixture proportions across African populations make it unfeasible to generalize pharmacogenetic findings from one population to another.
Computational approaches such as local ancestry adjusted association models (LAAA), genotype imputation, and interactive web applications could play a pivotal role in analyzing, enhancing, and visualizing African genomic data.
Cataloging TB pharmacogenetic variation across African populations could serve as a foundation for informing TB pharmacogenetic guidelines, which could enhance treatment outcomes at scale.
Supplementary Material
Contributor Information
Carola Oelofse, South African Medical Research Council Centre for Tuberculosis Research; Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa.
Anwani Siwada, South African Medical Research Council Centre for Tuberculosis Research; Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa.
Khaleila Flisher, South African Medical Research Council Centre for Tuberculosis Research; Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa.
Marlo Möller, South African Medical Research Council Centre for Tuberculosis Research; Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa; Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, South Africa; National Institute for Theoretical and Computational Sciences (NITheCS), South Africa; Genomics for Health in Africa (GHA), Africa-Europe Cluster of Research Excellence (CoRE), South Africa.
Caitlin Uren, South African Medical Research Council Centre for Tuberculosis Research; Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa; Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, South Africa; National Institute for Theoretical and Computational Sciences (NITheCS), South Africa; Genomics for Health in Africa (GHA), Africa-Europe Cluster of Research Excellence (CoRE), South Africa.
Conflict of interest statement: None Declared.
Funding
Research reported in this publication was supported by the Grants Innovation and Product Development unit of the South African Medical Research Council with funds received from Novartis and GSK R&D (Grant # GSKNVS1/202101/001).
References
- 1. Khan A, Abbas M, Verma S. et al. Genetic variants and drug efficacy in tuberculosis: a step toward personalized therapy. Glob Med Genet 2022;9:90–6. 10.1055/s-0042-1743567
- 2. Verma R, Patil S, Zhang N. et al. A rapid pharmacogenomic assay to detect NAT2 polymorphisms and guide isoniazid dosing for tuberculosis treatment. Am J Respir Crit Care Med 2021;204:1317–26. 10.1164/rccm.202103-0564OC
- 3. Turner RM, Magavern EF, Pirmohamed M. Pharmacogenomics: relevance and opportunities for clinical pharmacology. Br J Clin Pharmacol 2022;88:3943–6. 10.1111/bcp.15329
- 4. Rens NE, Uyl-De Groot CA, Goldhaber-Fiebert JD. et al. Cost-effectiveness of a pharmacogenomic test for stratified isoniazid dosing in treatment of active tuberculosis. Clin Infect Dis 2020;71:3136–43. 10.1093/cid/ciz1212
- 5. World Health Organisation. Global Tuberculosis Report 2022. Geneva: World Health Organisation. Licence: CC BY-NC-SA 3.0 IGO.
- 6. Teferi MY, El-Khatib Z, Boltena MT. et al. Tuberculosis treatment outcome and predictors in Africa: a systematic review and meta-analysis. Int J Environ Res Public Health 2021;18:10678. 10.3390/ijerph182010678
- 7. Tostmann A, Boeree MJ, Aarnoutse RE. et al. Antituberculosis drug-induced hepatotoxicity: concise up-to-date review. J Gastroenterol Hepatol 2008;23:192–202. 10.1111/j.1440-1746.2007.05207.x
- 8. Donald PR, Sirgel FA, Botha FJ. et al. The early bactericidal activity of isoniazid related to its dose size in pulmonary tuberculosis. Am J Respir Crit Care Med 1997;156:895–900. 10.1164/ajrccm.156.3.9609132
- 9. Weiner M, Peloquin C, Burman W. et al. Effects of tuberculosis, race, and human gene SLCO1B1 polymorphisms on rifampin concentrations. Antimicrob Agents Chemother 2010;54:4192–200. 10.1128/AAC.00353-10
- 10. Sileshi T, Mekonen G, Makonnen E. et al. Effect of genetic variations in drug-metabolizing enzymes and drug transporters on the pharmacokinetics of rifamycins: a systematic review. Pharmgenomics Pers Med 2022;15:561–71. 10.2147/pgpm.s363058
- 11. Ngo HX, Xu AY, Velásquez GE. et al. Pharmacokinetic-pharmacodynamic evidence from a phase 3 trial to support flat-dosing of rifampicin for tuberculosis. Clin Infect Dis 2024;78:1680–9. 10.1093/cid/ciae119
- 12. Wilkins JJ, Langdon G, Mcilleron H. et al. Variability in the population pharmacokinetics of isoniazid in South African tuberculosis patients. Br J Clin Pharmacol 2011;72:51–62. 10.1111/j.1365-2125.2011.03940.x
- 13. Chideya S, Winston CA, Peloquin CA. et al. Isoniazid, rifampin, ethambutol, and pyrazinamide pharmacokinetics and treatment outcomes among a predominantly HIV-infected cohort of adults with tuberculosis from Botswana. Clin Infect Dis 2009;48:1685–94. 10.1086/599040
- 14. Fukunaga K, Kato K, Okusaka T. et al. Functional characterization of the effects of N-acetyltransferase 2 alleles on N-acetylation of eight drugs and worldwide distribution of substrate-specific diversity. Front Genet 2021;12:12. 10.3389/fgene.2021.652704
- 15. Haas DW, Abdelwahab MT, van Beek SW. et al. Pharmacogenetics of between-individual variability in plasma clearance of bedaquiline and clofazimine in South Africa. J Infect Dis 2022;226:147–56. 10.1093/infdis/jiac024
- 16. Naidoo A, Ramsuran V, Chirehwa M. et al. Effect of genetic variation in UGT1A and ABCB1 on moxifloxacin pharmacokinetics in South African patients with tuberculosis. Pharmacogenomics 2018;19:17–29. 10.2217/pgs-2017-0144
- 17. Yang S, Hwang SJ, Park JY. et al. Association of genetic polymorphisms of CYP2E1, NAT2, GST and SLCO1B1 with the risk of anti-tuberculosis drug-induced liver injury: a systematic review and meta-analysis. BMJ Open 2019;9:e027940. 10.1136/bmjopen-2018-027940
- 18. Thomas L, Miraj SS, Surulivelrajan M. et al. Influence of single nucleotide polymorphisms on rifampin pharmacokinetics in tuberculosis patients. Antibiotics 2020;9:1–15. 10.3390/antibiotics9060307
- 19. Richardson M, Kirkham J, Dwan K. et al. Influence of genetic variants on toxicity to anti-tubercular agents: a systematic review and meta-analysis (protocol). Syst Rev 2017;6:142. 10.1186/s13643-017-0533-4
- 20. Chamboko CR, Veldman W, Tata RB. et al. Human cytochrome P450 1, 2, 3 families as pharmacogenes with emphases on their antimalarial and antituberculosis drugs and prevalent African alleles. Int J Mol Sci 2023;24:3383. 10.3390/ijms24043383
- 21. Allegra S, Fatiguso G, Calcagno A. et al. Role of vitamin D pathway gene polymorphisms on rifampicin plasma and intracellular pharmacokinetics. Pharmacogenomics 2017;18:865–80. 10.2217/pgs-2017-0176
- 22. Nanashima K, Mawatari T, Tahara N. et al. Genetic variants in antioxidant pathway: risk factors for hepatotoxicity in tuberculosis patients. Tuberculosis 2012;92:253–9. 10.1016/j.tube.2011.12.004
- 23. Tishkoff SA, Reed FA, Friedlaender FR. et al. The genetic structure and history of Africans and African Americans. Science 2009;324:1035–44. 10.1126/science.1172257
- 24. Pereira L, Mutesa L, Tindana P. et al. African genetic diversity and adaptation inform a precision medicine agenda. Nat Rev Genet 2021;22:284–306. 10.1038/s41576-020-00306-8
- 25. Fan S, Kelly DE, Beltrame MH. et al. African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations. Genome Biol 2019;20:82. 10.1186/s13059-019-1679-2
- 26. da Rocha JEB, Othman H, Botha G. et al. The extent and impact of variation in ADME genes in sub-Saharan African populations. Front Pharmacol 2021;12:12. 10.3389/fphar.2021.634016
- 27. Huang L, Li Y, Singleton AB. et al. Genotype-imputation accuracy across worldwide human populations. Am J Hum Genet 2008;84:235–50. 10.1016/j.ajhg.2009.01.013
- 28. Cahoon JL, Rui X, Tang E. et al. Imputation accuracy across global human populations. Am J Hum Genet 2024;111:979–89. 10.1101/2023.05.22.541241
- 29. Mitt M, Kals M, Pärn K. et al. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel. Eur J Hum Genet 2017;25:869–76. 10.1038/ejhg.2017.51
- 30. Uren C. Investigating Southern African Genetic Diversity and its Role in TB Susceptibility; Diss. Stellenbosch: Stellenbosch University; 2017. https://scholar.sun.ac.za
- 31. Schlebusch CM, Sjödin P, Breton G. et al. Khoe-San genomes reveal unique variation and confirm the deepest population divergence in Homo sapiens. Mol Biol Evol 2020;37:2944–54. 10.1093/molbev/msaa140
- 32. Van Eeden G, Uren C, Möller M. et al. Inferring recombination patterns in African populations. Hum Mol Genet 2021;30:R11–6. 10.1093/hmg/ddab020
- 33. Duan Q, Xu Z, Raffield LM. et al. A robust and powerful two-step testing procedure for local ancestry adjusted allelic association analysis in admixed populations. Genet Epidemiol 2018;42:288–302. 10.1002/gepi.22104
- 34. Swart Y, Uren C, van Helden PD. et al. Local ancestry adjusted allelic association analysis robustly captures tuberculosis susceptibility loci. Front Genet 2021;12:12. 10.3389/fgene.2021.716558
- 35. Popejoy AB. Diversity in precision medicine and pharmacogenetics: methodological and conceptual considerations for broadening participation. Pharmgenomics Pers Med 2019;12:257–71. 10.2147/PGPM.S179742
- 36. Cruz LA, Cooke Bailey JN, Crawford DC. Importance of diversity in precision medicine: generalizability of genetic associations across ancestry groups toward better identification of disease susceptibility variants. Annu Rev Biomed Data Sci 2023;2023:339–56. 10.1146/annurev-biodatasci-122220
- 37. Petros Z, Lee MTM, Takahashi A. et al. Genome-wide association and replication study of anti-tuberculosis drugs-induced liver toxicity. BMC Genomics 2016;17:755. 10.1186/s12864-016-3078-3
- 38. Twesigomwe D, Drögemöller BI, Wright GEB. et al. Characterization of CYP2D6 pharmacogenetic variation in sub-Saharan African populations. Clin Pharmacol Ther 2023;113:643–59. 10.1002/cpt.2749
- 39. Purcell S, Neale B, Todd-Brown K. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75. 10.1086/519795
- 40. Manichaikul A, Mychaleckyj JC, Rich SS. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 2010;26:2867–73. 10.1093/bioinformatics/btq559
- 41. Sengupta D, Botha G, Meintjes A. et al. Performance and accuracy evaluation of reference panels for genotype imputation in sub-Saharan African populations. Cell Genomics 2023;3:100332. 10.1016/j.xgen.2023.100332
- 42. Taliun D, Harris DN, Kessler MD. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed program. Nature 2021;590:290–9. 10.1038/s41586-021-03205-y
- 43. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 2009;19:1655–64. 10.1101/gr.094052.109
- 44. Zuo L, Wang T, Lin X. et al. Sex difference of autosomal alleles in populations of European and African descent. Genes Genomics 2015;37:1007–16. 10.1007/s13258-015-0332-z
- 45. Ongaro L, Molinaro L, Flores R. et al. Evaluating the impact of sex-biased genetic admixture in the Americas through the analysis of haplotype data. Genes (Basel) 2021;12:1580. 10.3390/genes12101580
- 46. Gudmundsson S, Singer-Berk M, Watts NA. et al. Variant interpretation using population databases: lessons from gnomAD. Hum Mutat 2022;43:1012–30. 10.1002/humu.24309
- 47. Scheinfeldt LB, Soi S, Lambert C. et al. Genomic evidence for shared common ancestry of East African hunting-gathering populations and insights into local adaptation. Proc Natl Acad Sci U S A 2019;116:4166–75. 10.1073/pnas.1817678116
- 48. Verma R, da Silva KE, Rockwood N. et al. A nanopore sequencing-based pharmacogenomic panel to personalize tuberculosis drug dosing. Am J Respir Crit Care Med 2024;209:1486–96. 10.1164/rccm.202309-1583OC
- 49. Kim JA, Ceccarelli R, Lu CY. Pharmacogenomic biomarkers in US FDA-approved drug labels (2000–2020). J Pers Med 2021;11:1–13. 10.3390/jpm11030179
- 50. Swen JJ, van der Wouden CH, Manson LE. et al. A 12-gene pharmacogenetic panel to prevent adverse drug reactions: an open-label, multicentre, controlled, cluster-randomised crossover implementation study. Lancet 2023;401:347–56. 10.1016/S0140-6736(22)01841-4
- 51. Hurrell T, Naidoo J, Masimirembwa C. et al. The case for pre-emptive pharmacogenetic screening in South Africa. J Pers Med 2024;14:114. 10.3390/jpm14010114
- 52. Corpas M, Siddiqui MK, Soremekun O. et al. Addressing ancestry and sex bias in pharmacogenomics. Annu Rev Pharmacol Toxicol 2023;64:53–64. 10.1146/annurev-pharmtox-030823
- 53. Mpye KL, Matimba A, Dzobo K. et al. Disease burden and the role of pharmacogenomics in African populations. Glob Health Epidemiol Genom 2017;2:e1. 10.1017/gheg.2016.21
- 54. Ndong Sima CAA, Othman H, Möller M. et al. Advancing pharmacogenetics research in Africa: the “project Africa GRADIENT” initiative. Drug Discov Today 2024;29:103939. 10.1016/j.drudis.2024.103939
- 55. Othman H, Jemimah S, da Rocha JEB. SWAAT bioinformatics workflow for protein structure-based annotation of ADME gene variants. J Pers Med 2022;12:263. 10.3390/jpm12020263
- 56. Zhang H, De T, Zhong Y. et al. The advantages and challenges of diversity in pharmacogenomics: can minority populations bring us closer to implementation? Clin Pharmacol Ther 2019;106:338–49. 10.1002/cpt.1491
- 57. Warnich L, Drögemöller BI, Pepper MS. et al. Pharmacogenomic research in South Africa: lessons learned and future opportunities in the rainbow nation. Curr Pharmacogenomics Person Med 2011;9:191–207. 10.2174/187569211796957575
- 58. Dandara C, Masimirembwa C, Haffani YZ. et al. African pharmacogenomics consortium: consolidating pharmacogenomics knowledge, capacity development and translation in Africa. AAS Open Res 2019;2:2. 10.12688/aasopenres.12965.1
- 59. Possuelo LG, Castelan JA, De Brito TC. et al. Association of slow N-acetyltransferase 2 profile and anti-TB drug-induced hepatotoxicity in patients from southern Brazil. Eur J Clin Pharmacol 2008;64:673–81. 10.1007/s00228-008-0484-8
- 60. Gréen H, Falk IJ, Lotfi K. et al. Association of ABCB1 polymorphisms with survival and in vitro cytotoxicity in de novo acute myeloid leukemia with normal karyotype. Pharmacogenomics J 2012;12:111–8. 10.1038/tpj.2010.79
- 61. Gervasini G, Jara C, Olier C. et al. Polymorphisms in ABCB1 and CYP19A1 genes affect anastrozole plasma concentrations and clinical outcomes in postmenopausal breast cancer patients. Br J Clin Pharmacol 2017;83:562–71. 10.1111/bcp.13130
- 62. Roy B, Chowdhury A, Kundu S. et al. Increased risk of antituberculosis drug-induced hepatotoxicity in individuals with glutathione S-transferase M1 ‘null’ mutation. J Gastroenterol Hepatol 2001;16:1033–7. 10.1046/j.1440-1746.2001.02585.x
- 63. Bose PD, Sarma MP, Medhi S. et al. Role of polymorphic N-acetyl transferase2 and cytochrome P4502E1 gene in antituberculosis treatment-induced hepatitis. J Gastroenterol Hepatol 2011;26:312–8. 10.1111/j.1440-1746.2010.06355.x
- 64. Kim MS, Patel KP, Teng AK. et al. Genetic disease risks can be misestimated across global populations. Genome Biol 2018;19:179. 10.1186/s13059-018-1561-7
- 65. Kinzig-Schippers M, Tomalik-Scharte D, Jetter A. et al. Should we use N-acetyltransferase type 2 genotyping to personalize isoniazid doses? Antimicrob Agents Chemother 2005;49:1733–8. 10.1128/AAC.49.5.1733-1738.2005
- 66. Patin E, Harmant C, Kidd KK. et al. Sub-Saharan African coding sequence variation and haplotype diversity at the NAT2 gene. Hum Mutat 2006;27:720. 10.1002/humu.9438
- 67. Wang P, Pradhan K, Zhong X. et al. Isoniazid metabolism and hepatotoxicity. Acta Pharm Sin B 2016;6:384–92. 10.1016/j.apsb.2016.07.014
- 68. Li LM, Chen L, Deng GH. et al. SLCO1B1*15 haplotype is associated with rifampin-induced liver injury. Mol Med Rep 2012;6:75–82. 10.3892/mmr.2012.900
- 69. Weiner M, Gelfond J, Johnson-Pais TL. et al. Elevated plasma moxifloxacin concentrations and SLCO1B1 g.11187G>A polymorphism in adults with pulmonary tuberculosis. Antimicrob Agents Chemother 2018;62:e01802-17. 10.1128/AAC.01802-17
- 70. Chigutsa E, Visser ME, Swart EC. et al. The SLCO1B1 rs4149032 polymorphism is highly prevalent in South Africans and is associated with reduced rifampin concentrations: dosing implications. Antimicrob Agents Chemother 2011;55:4122–7. 10.1128/AAC.01833-10
- 71. Masiphephethu MV, Sariko M, Walongo T. et al. Pharmacogenetic testing for NAT2 genotypes in a Tanzanian population across the lifespan to guide future personalized isoniazid dosing. Tuberculosis 2022;136:102246. 10.1016/j.tube.2022.102246