OMICS: A Journal of Integrative Biology. 2021 Apr 8;25(4):213–233. doi: 10.1089/omi.2021.0004

Human OMICs and Computational Biology Research in Africa: Current Challenges and Prospects

Yosr Hamdi 1,2, Lyndon Zass 3,*, Houcemeddine Othman 4,*, Fouzia Radouani 5, Imane Allali 3,6, Mariem Hanachi 7,8, Chiamaka Jessica Okeke 9, Melek Chaouch 7, Maureen Bilinga Tendwa 9, Chaimae Samtal 10,11, Reem Mohamed Sallam 12,13, Nihad Alsayed 14, Michael Turkson 15, Samah Ahmed 14, Alia Benkahla 7, Lilia Romdhane 1,8, Oussema Souiai 7, Özlem Tastan Bishop 9, Kais Ghedira 7, Faisal Mohamed Fadlelmola 14, Nicola Mulder 3, Samar Kamal Kassim 12
PMCID: PMC8060717  PMID: 33794662

Abstract

Following the publication of the first human genome, OMICs research, including genomics, transcriptomics, proteomics, and metagenomics, has been on the rise. OMICs studies revealed the complex genetic diversity among human populations and challenged our understandings of genotype–phenotype correlations. Africa, being the cradle of the first modern humans, is distinguished by a large genetic diversity within its populations and rich ethnolinguistic history. However, the available human OMICs tools and databases are not representative of this diversity, therefore creating significant gaps in biomedical research. African scientists, students, and publics are among the key contributors to OMICs systems science. This expert review examines the pressing issues in human OMICs research, education, and development in Africa, as seen through a lens of computational biology, public health relevant technology innovation, critically-informed science governance, and how best to harness OMICs data to benefit health and societies in Africa and beyond. We underscore the disparities between North and Sub-Saharan Africa at different levels. A harmonized African ethnolinguistic classification would help address annotation challenges associated with population diversity. Finally, building on the existing strategic research initiatives, such as the H3Africa and H3ABioNet Consortia, we highly recommend addressing large-scale multidisciplinary research challenges, strengthening research collaborations and knowledge transfer, and enhancing the ability of African researchers to influence and shape national and international research, policy, and funding agendas. This article and analysis contribute to a deeper understanding of past and current challenges in the African OMICs innovation ecosystem, while also offering foresight on future innovation trajectories.

Keywords: Africa, computational biology, African OMICs innovation ecosystem, foresight, diversity, bioinformatics, public health

Introduction

Biomedical OMICs research is based on systems science and driven by a multitude of high-throughput technologies such as genomics, transcriptomics, proteomics, and metabolomics, not to mention computational biology. OMICs science and scholarship contribute to both therapeutic and diagnostic innovation in health care and, through their systems focus, help advance public and population health, as well as preventive and personalized medicine.

African scientists, students, and publics have been and remain among the key contributors to the global systems science and innovation ecosystem. This expert review and analysis brings together some of the leading experts and ideas in the field. We discuss the pressing issues in human OMICs Research and Development (R&D) in Africa, along with the prospects and challenges as seen through a lens of computational biology, technology innovation, and critical governance of emerging data, and we consider how best to harness OMICs data and science to benefit population health in Africa and beyond.

In this expert review, we highlight the challenges to conducting OMICs research in Africa, specifically focusing on genomics, transcriptomics, proteomics, and metagenomics, since a substantial amount of human data for these OMICs areas has been reported. First, we discuss the availability of African OMICs data in public databases. Thereafter, we discuss the population and genetic diversity associated with populations from North and Sub-Saharan Africa. Then, technical, ethical, and socioeconomic challenges and opportunities related to African human OMICs studies are discussed. Finally, we describe the key issues related to African OMICs data analysis, storage, sharing, and annotation in public databases, highlighting the need for standardized methods by which human African data should be presented with respect to ethnicities and populations.

Overarching Systems Science Context

Due to the reduced cost of sequencing and increased access to high throughput technologies, OMICs data are currently being produced at unprecedented rates, resulting in increased sizes of data being submitted to public repositories (Luo et al., 2016). This represents an important milestone in biomedical research as it facilitates the generation, access, and use of data to promote and maintain the scientific integrity of biomedical sciences; however, it has also come with unexpected challenges. Due to the diversity and complexity of these data, organizing and integrating datasets represent not only a monumental conceptual challenge but also a practical hurdle in the routine analysis of generated data. Several publicly available tools and databases have been developed to address such concerns, including those provided by UCSC (Kent et al., 2002), Ensembl (Cunningham et al., 2019), and NCBI (NCBI Resource Coordinators, 2017).

Due to the growing number of research studies being conducted on African populations, the amount of OMICs data from this region is on the rise (Dandara et al., 2014). Continental collaborations and consortia, such as the Human Heredity and Health in Africa (H3Africa) consortium and the H3Africa Bioinformatics Network (H3ABioNet), have enabled and facilitated world-class OMICs research across the African continent (Adoga et al., 2014; Mulder et al., 2016). Despite these advances, few guidelines and standards are available to describe, annotate, and classify OMICs data generated from African populations, and few African-specific OMICs databases exist (Khan, 2017; Kruse et al., 2016). Although high-throughput technologies offer unsurpassed OMICs data generation, significant challenges remain in applying the generated data to advance health ecosystems.

Additional steps are required to move OMICs from proof of concept to a more functional and useful tool that can be applied to environmental monitoring for critically-informed and astute governance of ecological crises.

Methods Used in the Expert Review

The review describes and discusses information retrieved from both literature curation and data exploration in selected (existing) sources. A comprehensive literature review was conducted on the topics of genetic diversity in African populations, as well as the state of OMICs research in Africa (and the related challenges). When exploring the availability of African OMICs data generated from human research, several existing databases were consulted.

First, we investigated the OMICs Discovery Index (OmicsDI) to identify the overall distribution of African genomics, transcriptomics, and proteomics datasets. Thereafter, we explored more specific databases for each of the aforementioned fields. Because the curation of metagenomics datasets is still lacking in OmicsDI, we conducted a more holistic data search for metagenomics using the Sequence Read Archive (SRA). To retrieve OMICs records related to African research, standard, targeted querying methods were used where possible. These methods are summarized in Table 1.

Table 1.

Methods Used to Explore the Available African Data in the Selected Resources

Omics platform | Resource | Access date | Method used
Multiomics | OmicsDI | February 2021 | Targeted query and filter (“Africa” OR “African”; Homo sapiens)
Genomics | GWAS Catalog | February 2021 | Custom Python script counting Ancestry entries
Genomics | PharmGKB | February 2021 | Custom Python script counting Region entries
Transcriptomics | GEO | February 2021 | Targeted query and filter (“Africa” AND “African”; Homo sapiens)
Transcriptomics | ArrayExpress | February 2021 | Targeted query and filter (“Africa” OR “African”; Homo sapiens)
Proteomics | PRIDE | February 2021 | Provided by database
Metagenomics | SRA | February 2021 | Targeted querying, e.g., “(Uganda OR Algeria OR Bantu) AND (microbiota OR microflora) AND (human or ‘homo sapiens’)”

GWAS, Genome-Wide Association Study; PharmGKB, PharmacoGenomics Knowledge Base; PRIDE, Proteomics Identification Database; SRA, Sequence Read Archive.
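
The "custom Python script counting Ancestry entries" listed in Table 1 is not reproduced in the review. The following is only a minimal sketch of what such a count could look like, assuming a tab-separated export with one row per study. The column name `ancestry`, the function `count_entries`, and the toy rows are hypothetical placeholders, not the real schema or records of the GWAS Catalog or PharmGKB.

```python
import csv
import io
from collections import Counter

# Hypothetical toy export: the column name "ancestry" and these rows are
# placeholders for illustration, not real database records.
SAMPLE_TSV = (
    "study\tancestry\n"
    "study_a\tEuropean\n"
    "study_b\tAfrican-American\n"
    "study_c\tEuropean\n"
    "study_d\tSub-Saharan African\n"
)


def count_entries(handle, column):
    """Count how many rows carry each value of `column` in a tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[column].strip() for row in reader if row.get(column))


counts = count_entries(io.StringIO(SAMPLE_TSV), "ancestry")
total = sum(counts.values())

# Print each category with its count and its share of all rows.
for label, n in counts.most_common():
    print(f"{label}\t{n}\t{100 * n / total:.1f}%")
```

With a real export, the in-memory string would be replaced by an opened file handle and the column name by whatever field the chosen resource actually provides.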

Challenges Related to Data Availability and Diversity

Availability of human African OMICs data

The rapid evolution of OMICs-related technologies has been supported by the development of computational tools used in the processing and analysis of large datasets. However, African researchers still face considerable challenges with regard to generating OMICs data. Nevertheless, a large collection of resources provides access to information relevant to African OMICs research, although the rigor of labeling differs among these data sources (summarized in Supplementary Table S1).

Genomics

Several existing databases host relevant genomics data. They range from repositories of large-scale sequencing and phenotypic data, such as the SRA (Leinonen et al., 2011) and the European Genome-Phenome Archive (Lappalainen et al., 2015), to databases and warehouses that gather genetic information at the gene and variant level, such as the Genome-Wide Association Studies (GWAS) Catalog (Buniello et al., 2019; Morales et al., 2018) and the PharmacoGenomics Knowledge Base (PharmGKB) (Thorn et al., 2013). However, African information is typically underrepresented in these resources. A targeted search in OmicsDI, a knowledge discovery framework across heterogeneous OMICs data linked to several existing OMICs databases (Perez-Riverol et al., 2017), revealed 812 African genomic datasets, fewer than half the number of European datasets. Table 2 summarizes some of the most notable African genomics studies published in the past decade.

Table 2.

Genomic Studies in Different African Subpopulations (GWAS, WES, and WGS)

Region/population; Country/ethnic group; No. of samples (GWAS/haplotype analysis, WES, WGS); PMID
African-American Afro-American 23,827 (stage one)
11,544 (replication study)
25102180
Afro-American   761   23103231
African Ancestry in Southwest USA (ASW)     68 27638885
African Caribbean   98 27638885
North Africa Tunisia 135 59 30594178
29652911
29879995
29169765
31098807
31073722
239     21915847
21833004
24312208
Morocco 3 28951997
87 24312208
Algeria 102 17909833
Libya 215   24312208
Touareg (Libya) 47 21312181
Egypt 110 19686289
East Africa Luhya in Webuye, Kenya (LWK)   106 27638885
Niger-Congo
Nilo-Saharan
740 31293624
Luhya, Kikuyu, Kalenjin (Kenya) 320 25470054
Kenya 2741 cases
4183 controls (89 genotyped SNPs)
25261933
Tanzania 501 cases
504 controls (89 genotyped SNPs)
25261933
Malawi 1815 cases
3272 controls (89 genotyped SNPs)
25261933
Amhara, Oromo, Somali (Ethiopia)     320 25470054
Ethiopian (Weth) 120 25069476
Baganda, Banyarwanda, Barundi (Uganda) 320 25470054
West & Central Africa Yoruba, Fula, Jola, Mandinka, Manjago, Serehule, Serere, Wollof, Bambara, Malinke, Mossi, Akans, Kasem, Namkam, Bantu, Semi-Bantu 1122 31293624
Congo 19 23 21627702
Gabon 163   21627702
Cameroon 166 179 21627702
Cameroon 914 cases
914 controls (89 genotyped SNPs)
25261933
Nigeria 132 21627702
Esan in Nigeria (ESN) 111 27638885
Yoruba in Ibadan, Nigeria 100 27638885
Yoruba, Igbo (Nigeria) 320 25470054
Nigeria 114 cases
88 controls (89 genotyped SNPs)
25261933
Ga-Adangbe (Ghana) 320 25470054
Ghana 1032   28408189
Ghana (Navrongo/Noguchi) 2459 cases
2129 controls (89 genotyped SNPs)
25261933
Ghana (Kumasi) 1923 cases
2326 controls (89 genotyped SNPs)
25261933
Mali 510 cases
389 controls (89 genotyped SNPs)
25261933
Burkina Faso 983 cases
816 controls (89 genotyped SNPs)
25261933
Mende in Sierra Leone 98 27638885
Gambia 120 27638885
The Kombos region of The Gambia 2500 (replication study in 3500 children from the same region)     19465909
Jola, Fula, Wolof, Mandinka (Gambia) 320 25470054
Gambia 2801 cases
4527 controls (89 genotyped SNPs)
25261933
Southern Africa Malawi, Amaxhosa, Sebantu, JU/Hoansi, Ixun, Karretjie, Khomani, Herero, Nama, Gui/Gana, Khwe 328 31293624
Botswana 164 29706352
Uganda 150 29706352
Black Xhosa 25 Unpublished
Sotho, Zulu (South Africa) 320 25470054
South Africa 24 (8 Colored and 16 black south eastern Bantu-speakers) 29233967

GWAS, Genome Wide Association Study; WES, number of samples from Whole Exome Sequencing; WGS, number of samples from Whole Genome Sequencing.

The GWAS Catalog contains manually curated information on GWASs, intended to facilitate the identification of causal variants and the understanding of disease mechanisms (Buniello et al., 2019). Although this represents an increase over previous analyses, African data still account for only 10% of the studies reported in this database (Fig. 1). These studies fall under the ancestral categories “African-American,” “African-unspecified,” and “Sub-Saharan African,” and the majority of them are African-American.

FIG. 1.


Size of data entries by world region in the GWAS Catalog and PharmGKB databases. The illustration displays a notable bias toward European subjects and poor representation of subjects from Africa. GME, Greater Middle Eastern; GWAS, Genome-Wide Association Study; PharmGKB, PharmacoGenomics Knowledge Base.

In addition, a large percentage of North African studies are clustered in the “Greater Middle Eastern” category, which contains “Middle Eastern,” “North African,” or “Persian” data as defined by the authors. Similar representation was identified in PharmGKB, a comprehensive database which provides relevant information for clinicians and researchers on the impact of genetic variants on drug response and toxicity (Thorn et al., 2013). African studies comprise ∼4% of studies within PharmGKB (Fig. 1), including “African Americans/Afro-Caribbeans” and “Sub-Saharan Africans.” A considerable number of additional African studies are clustered in the categories “Other” and “Near Eastern” (Thorn et al., 2013).
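
To make the labeling problem concrete, the sketch below sorts the category labels quoted in this section into three bins. The bin names (`african`, `ambiguous`, `non_african`) and the function `classify_label` are illustrative assumptions introduced here. They are not a scheme used by the GWAS Catalog or PharmGKB, and only labels mentioned in the text are included.

```python
from collections import Counter

# Labels quoted in the text. Grouping them this way is an assumption made
# for illustration only.
AFRICAN_LABELS = {
    "African-American",
    "African-unspecified",
    "Sub-Saharan African",
    "African Americans/Afro-Caribbeans",
    "Sub-Saharan Africans",
}
# Categories that may hide North African or other African studies.
AMBIGUOUS_LABELS = {"Greater Middle Eastern", "Near Eastern", "Other"}


def classify_label(label):
    """Return 'african', 'ambiguous', or 'non_african' for a category label."""
    label = label.strip()
    if label in AFRICAN_LABELS:
        return "african"
    if label in AMBIGUOUS_LABELS:
        return "ambiguous"
    return "non_african"


def summarize(labels):
    """Count how many labels fall into each bin."""
    return Counter(classify_label(label) for label in labels)


example = ["European", "Greater Middle Eastern", "Sub-Saharan African", "Other"]
print(summarize(example))
```

Any study landing in the `ambiguous` bin would need manual review to decide whether it concerns an African population, which is exactly the effort a harmonized classification would spare.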

These disparities extend to other genomic resources where African samples and data are significantly underrepresented. When investigating the most common cancer cell lines and tumor samples available at the National Cancer Institute Patient-Derived Models Repository, we found that a significant proportion of samples (48.3%) have no records on their donor's race/ethnicity, and African-Americans only represent 3.8% of the total number of donors (n = 794) (Guerrero et al., 2018). Similar proportions are observed in 15 international cancer biobanks (Vaught et al., 2009), 3 U.S. biorepositories, and the Komen Tissue Bank (McCarty et al., 2017) where, among African ethnicities, only African-American samples are reported, representing 11.05% of the total cohort (n = 8293).

In addition, data collected from five major cancer genomics projects (TCGA, TARGET, cancer-related GWAS, OncoArray Consortium, and The Chinese Cancer Genome Consortium) showed that tumor samples were collected predominantly from Caucasians (91.1%); 0.64% of individuals have no data on ethnicity, and among the total number of samples (n = 6,765,447), a very modest African-American representation was observed (1.7%) (Guerrero et al., 2018). Moreover, among the 27,591,555 human DNA and cDNA sequences (“Homo sapiens”) that have been submitted to GenBank, only 24,567 African sequences are reported (about 0.09%), more than half of which were derived from one African country (South Africa: 14,801 DNA sequences) (Fig. 2).
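
As a quick arithmetic check, the shares implied by the GenBank counts quoted above can be recomputed directly. The variable names are chosen here for illustration, and the three counts are taken from the text.

```python
# Counts quoted in the text for GenBank human DNA and cDNA sequences.
total_human_sequences = 27_591_555
african_sequences = 24_567
south_africa_sequences = 14_801

# Share of all human sequences that are reported as African.
african_share = 100 * african_sequences / total_human_sequences

# Share of the African sequences that come from South Africa.
south_africa_share = 100 * south_africa_sequences / african_sequences

print(f"African share of human sequences: {african_share:.3f}%")
print(f"South African share of African sequences: {south_africa_share:.1f}%")
```

The first ratio is about 0.09% and the second about 60%, so African sequences make up well under one percent of the human records, with a single country contributing the majority of them.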

FIG. 2.


Representation of Human genomic sequences with African origins in NCBI nucleotide database. Only entries with “country” qualifiers were mined from the database. A total of 21,456 human genome entries (sequences of some target loci) have been submitted by these African countries.

Transcriptomics

The transcriptome refers to the total RNA within a cell or tissue and reflects all genes that are actively expressed at any given moment. The transcriptome is the template for protein synthesis during translation, resulting in a corresponding protein complement or proteome; it also includes noncoding RNAs (Lowe et al., 2017). Transcriptomics data associated with African individuals are largely underrepresented: OmicsDI contains <600 African transcriptomics datasets, compared to over 1000 European datasets. Targeted querying of GEO at NCBI (Clough and Barrett, 2016) returns 23 hits, while 278 hits are returned from the ArrayExpress Archive (Parkinson et al., 2005).

Notably, country or region is not among the metadata collected by either GEO or the ArrayExpress Archive, and results are only a reflection of study title and/or description. Although the ArrayExpress Archive provides the possibility to use the Experimental Factor Ontology for browsing the database, the ontology subclass “ancestry category” only includes the “Sub-Saharan Africa” and “Yoruban” terms in the tree hierarchy downstream of the “African” term.

Proteomics

Numerous proteomic studies have provided evidence for the influence of ethnicity on protein expression (Cho et al., 2017; Kim et al., 2010; Nguyen et al., 2019). We are currently in what might be considered the “golden age” of proteomics, characterized by the creation of public proteomics databases, the development of structural analysis tools that predict variant effects, such as HUMA and PRIMO (Brown and Tastan Bishop, 2018; Hatherley et al., 2016), and the standardization of data submission under the ProteomeXchange consortium (Vizcaíno et al., 2014). Despite international progress, few projects studying the proteomes of African populations, or of the pathogens responsible for diseases endemic to Africa, have been uploaded to publicly accessible databases. Approximately 23 African proteomic studies have been reported by ProteomeXchange, with the majority conducted in Egypt, Ghana, and South Africa (Vizcaíno et al., 2014). Similarly, OmicsDI contains only 114 African proteomics datasets, and the Proteomics Identification Database (PRIDE) (Perez-Riverol et al., 2019) contains only 16 (14 from South Africa and 1 each from Tunisia and Egypt). This may result from hesitance to share data or from a lack of awareness of the importance of sharing such data. The importance of data sharing needs to be emphasized, and it is the responsibility of African scientists to carefully annotate their data in a manner that highlights the ethnic and regional origin of the studied populations. To this end, H3Africa and H3ABioNet often host training events and/or awareness campaigns on the importance of data sharing and annotation.

Metagenomics

Metagenomic research is of high interest due to the therapeutic potential of microbiota and its close link to health (Cani, 2018; Hadrich, 2018; Martin et al., 2018). As in other fields, human metagenomic data of African origin are underrepresented in public databases, accounting for only 5% of the data retrieved when querying the SRA with targeted expressions, for example, “(Uganda OR Algeria OR Bantu) AND (microbiota OR microflora) AND (human or ‘homo sapiens’).” Such expressions search for SRA entries from studies of the metagenome and microbiota isolated from humans that are curated with metadata naming an African country (for example, Algeria or Uganda) or an African ethnicity (for example, Bantu or Xhosa).
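
A query expression of the kind quoted above can be assembled programmatically, which helps when the list of countries or ethnic groups is long. The following is a minimal sketch. The helper names `or_group` and `build_query` are hypothetical, and the code only builds the Boolean string without contacting the SRA.

```python
def or_group(terms):
    """Join terms with OR inside parentheses, quoting multi-word terms."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine several OR-groups with AND into one Boolean expression."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(
    ["Uganda", "Algeria", "Bantu"],   # African countries or ethnicities
    ["microbiota", "microflora"],     # topic terms
    ["human", "homo sapiens"],        # host organism
)
print(query)
```

Extending the first group with further country or ethnicity names only requires adding entries to the list; how the resulting string is submitted depends on the search interface used.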

Among the projects identified, while 84% implicitly involved African collaborators, only 42% of the projects were conducted in African laboratories (based on data owners and/or first, last, or corresponding authors). In addition, the majority of these data originated from Sub-Saharan Africa, with an obvious lack of North African representation (derived mostly from Egypt) (Allali et al., 2018; Aly et al., 2016; Awany et al., 2018; Shankar et al., 2017). Moreover, the metadata associated with the metagenomic data is insufficient, highlighting the lack of appropriate annotation guidelines. In this regard, a number of metadata reporting frameworks have been developed and are being adapted for African use (Field et al., 2010).

Population diversity associated with African OMICs data

Despite the fact that a substantial amount of African OMICs data is available in public databases, there are many challenges related to finding such data. These challenges are rooted in the population diversity of Africans and are related to the metadata reporting requirements emphasized by public databases and the manner in which these datasets are described. African populations are characterized by high levels of within-population genetic diversity, unique genetic variants that are not reported in other populations, and reduced linkage disequilibrium among loci compared to non-African populations (Campbell and Tishkoff, 2008).

Genetic studies in Africa suggest that genetic variation is broadly correlated with geography, language classification, and ethnicities (Choudhury et al., 2018; Gomez et al., 2014; Li et al., 2014). Although studied to a much lesser degree, population diversity, even between different African populations, has also been associated with differential gene expression and regulation as determined by transcriptome sequencing (Martin et al., 2014). Similarly, population diversity has also been linked to protein diversity; however, research on such diversity is lacking, particularly so in Africa (Nedelkov, 2008).

Because of this complex genetic architecture, current databases typically use substandard classification methods for African OMICs data, often leading to erroneous classifications. For example, North African populations are often reported with European, Eurasian, Mediterranean, Middle Eastern, or Greater Middle Eastern populations. This clustering makes it difficult for African researchers to find the information that is relevant and appropriate for their specific populations of interest. Linguistic identity has also been used to classify African populations, erroneously assuming a correlation between dialectal identity and ethnicity. This classification does not consider the diversity among African populations (including several admixed populations), further described below.

North Africa

North Africa is defined as the most northerly region of the continent, extending from the shores of the Atlantic Ocean in the west to the Red Sea in the east. It is composed of seven countries: Mauritania, Morocco, Algeria, Tunisia, Libya, Egypt, and Sudan, as defined by the United Nations (www.un.org/en).

The majority of the North African countries have an ancient Berber background with influences from several civilizations during different historical periods. Recent studies investigating the origin of human populations in North Africa (Arauna et al., 2017; Fadhlaoui-Zid et al., 2013; Henn et al., 2012) suggest ancestral contributions from four main sources: (1) an autochthonous Maghrebi component related to back migration to Africa from Eurasia; (2) a Middle Eastern component associated with the Arab conquest; (3) a Sub-Saharan component derived from trans-Saharan migrations; and (4) a European component linked to prehistoric and recent historic movements (Fregel et al., 2018).

Information related to North African subpopulations is summarized in Table 3. Historically, most North Africans share both common and mixed ancestry, although Tunisian Berbers have shown long periods of genetic isolation and appear to have diverged from surrounding populations without subsequent mixture. In contrast, continuous gene flow from the Middle East has made Egyptians genetically closer to Eurasians than to North Africans (Fadhlaoui-Zid et al., 2013).

Table 3.

North African Ethnolinguistic Groups

Ethnic group (language family) | Country | Region | Language | Population (millions)
Baggara (Afro-Asiatic, Semitic) | Sudan | Chad Basin | Sudanese Arabic | 6
Berbers (Afro-Asiatic, Berber) | Morocco, Algeria, Libya, Mauritania, Tunisia, Egypt | Maghreb | Berber | 27
Copts (Afro-Asiatic, Egyptian) | Egypt, Sudan | Nile Valley | Coptic | 10
Fur (Nilo-Saharan, Eastern Sudanic) | Sudan | Nile Valley | Fur | 1.0
Haratin (Afro-Asiatic, Semitic) | Mauritania, Morocco | Maghreb | Hassaniya Arabic | 2
Maghrebis (Afro-Asiatic, Semitic) | Morocco, Algeria, Tunisia, Libya | Maghreb | Maghrebi Arabic | 72
Moors (Afro-Asiatic, Semitic) | Mauritania, Morocco | Maghreb | Hassaniya Arabic | 2
Nubians (Nilo-Saharan, Eastern Sudanic) | Sudan, Egypt | Nile Valley | Nobiin | 1.0
Sudanese Arabs (Afro-Asiatic, Semitic) | Sudan | Nile Valley | Sudanese Arabic | 28
Toubou (Nilo-Saharan) | Libya, Sudan | Tibesti | Tebu | 0.35
Tuareg (Afro-Asiatic, Berber) | Algeria, Libya, Morocco, Tunisia | Maghreb/Sahara | Tuareg | 1.2
Zaghawa (Nilo-Saharan, Eastern Saharan) | Sudan | Chad Basin | Zaghawa | 0.2
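
One way to picture how a harmonized ethnolinguistic annotation might be stored alongside OMICs metadata is a small structured record per group. The sketch below is an assumption-laden illustration: the class `EthnolinguisticGroup`, its field names, and the helper `groups_in_country` are hypothetical, and only three rows from Table 3 are included as examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EthnolinguisticGroup:
    """One annotation record; field names are illustrative, not a standard."""
    name: str
    language_family: str
    countries: tuple
    region: str
    language: str


# Three example rows copied from Table 3.
GROUPS = [
    EthnolinguisticGroup(
        "Copts", "Afro-Asiatic, Egyptian",
        ("Egypt", "Sudan"), "Nile Valley", "Coptic"),
    EthnolinguisticGroup(
        "Nubians", "Nilo-Saharan, Eastern Sudanic",
        ("Sudan", "Egypt"), "Nile Valley", "Nobiin"),
    EthnolinguisticGroup(
        "Tuareg", "Afro-Asiatic, Berber",
        ("Algeria", "Libya", "Morocco", "Tunisia"), "Maghreb/Sahara", "Tuareg"),
]


def groups_in_country(country, groups=GROUPS):
    """Return the names of groups whose records list the given country."""
    return [group.name for group in groups if country in group.countries]


print(groups_in_country("Egypt"))
```

Keeping country, region, and language as separate fields lets a dataset be found by any of them, instead of relying on a single free-text population label.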

The complex genetic structure of North African populations has been the subject of many studies focused on the Y chromosome, mitochondrial DNA, and single nucleotide polymorphism (SNP) genotyping methods. These studies showed the admixed and intermediate genetic architecture of the North African populations (Hamdi et al., 2018; Kefi et al., 2018; Sánchez-Quinto et al., 2012). Previous studies conducted on North African groups have detected vast genetic variability and heterogeneity, as well as a lack of discrete genetic grouping by either geographical or linguistic criteria (Gaibar et al., 2018). Indeed, the main demographic features of North African populations are their familial structure and high rates of consanguinity and endogamy, which have a proven impact on the genetic structure of these populations and, consequently, on the incidence rate of several hereditary diseases, mainly rare forms of autosomal recessive genetic disorders.

Recent evidence has pushed back the origin of our species to about 300,000 years ago, following the discovery of the world's oldest well-dated Homo sapiens fossil in the “Jebel Irhoud” cave in Morocco (Hublin et al., 2017). This highlights the importance of the region in describing modern human history. Despite the genetic wealth of North African populations, they are not well represented in the primary human genomics projects (e.g., HapMap, 1000 Genomes) and they are poorly covered in public OMICs databases.

Recently, a custom-built ancestry panel containing 111 SNPs combined with a previous set of 126 SNPs that have patterns of allele frequency differentiation was developed to analyze North African and Middle Eastern populations (Pereira et al., 2019). The study results show that the Middle Eastern and North African populations tend to cluster close to the Southwest Asia and Southern European populations. This study confirmed results from previous studies suggesting that geography was not the only cause of genetic variation and that patterns of population substructure were influenced by factors such as culture and religion. Indeed, the predominantly Muslim Middle Eastern populations show signs of genetic admixture with African populations, while Christian groups have higher proportions of Western European ancestry (Haber et al., 2011).

Sub-Saharan Africa

Sub-Saharan African populations are spread across 46 countries, over a wide variety of biomes (Sahara, savannah, tropical forest, etc.), shaping great linguistic and cultural diversity (Choudhury et al., 2018). There are >2,000 different languages in Sub-Saharan Africa, classified into four subcategories termed Afro-Asiatic, Niger-Kordofanian, Nilo-Saharan, and Khoisan (Busby et al., 2016; Campbell and Tishkoff, 2010; Choudhury et al., 2018; Fan et al., 2019). For example, the hunter-gatherers of central African rain forests are categorized into 20 culturally heterogeneous groups who practice either hunting-gathering or fishing and agriculture (Lorente-Galdos et al., 2019; Verdu and Destro-Bisol, 2012).

Figure 3 describes the distribution of the main ethnic groups in different Sub-Saharan African countries. This observed diversity may be largely explained by the complex history of population migration and evolution. Recent studies dated the oldest split, between the ancestors of the Khoe and San hunter-gatherers and other modern human populations, to between 100,000 and 350,000 years ago (Choudhury et al., 2018; Schlebusch et al., 2017; Veeramah et al., 2012). Thereafter, a succession of widespread events led to the establishment of (1) northern and southern Khoe and San in Southern Africa, (2) rain forest foragers in Central Africa, (3) Afro-Asiatic speakers and Nilo-Saharans in the East, and (4) the Nilo-Saharan and Niger-Congo in the West (Choudhury et al., 2018). The expansion of the Bantu to the South led to an increase in diversity and a decrease of the Khoe and San genetic influence (Choudhury et al., 2018; Li et al., 2014). Additional Eurasian admixture was also reported in the Southern, East, and West African populations (Gurdasani et al., 2015; Pickrell et al., 2014). Arab expansion, mainly on the eastern side of the continent, had significant influence on Swahili culture (Capredon et al., 2013; Msaidie et al., 2011).

FIG. 3.


Primary Sub-Saharan African Ethno-Linguistic Groups. Data have been collected only for groups with populations of more than one million members. Data were extracted from the Joshua project database (https://joshuaproject.net/).

Finally, recent colonial history further contributed to this diversification (de Wit et al., 2010), and new studies covering ethnolinguistic groups with little or no prior representation are continuously being published (Busby et al., 2016; Lorente-Galdos et al., 2019). The environmental and cultural specificities, as well as the complex demographic history of Sub-Saharan African populations, have contributed to their high genetic heterogeneity (Gomez et al., 2014; Tishkoff et al., 2009). Despite the genetic diversity of Sub-Saharan African populations, their representation is limited to specific regions in HapMap and 1000 Genomes projects, while most other ethnic groups are underrepresented.

Challenges Related to African OMICs Studies

A number of key questions are raised with regards to the genetic wealth of Africa, as well as the importance of African populations in the origin of human species on one hand and the contrasting modest representation of these populations in OMICs research on the other hand. In the following section, we discuss many of the challenges that limit progress in African OMICs research (Fig. 4).

FIG. 4.


Primary challenges related to African OMICs data processing.

Infrastructure

Despite the efforts led by key stakeholders to tackle the infrastructure challenges that hold back research in Africa, much progress still needs to be made (Laurance et al., 2015). Progress is hindered by numerous concerns, including lack of governmental support, weak scientific programs, poor financial resources, environmental issues, and weak infrastructure (Rotimi et al., 2017). One of the biggest infrastructural challenges in Africa remains the electricity supply. Limited access to electricity affects the majority of Sub-Saharan countries, with some exceptions, while North Africa benefits from coverage levels comparable with those of developed countries (The Economist, November 2019).

The majority of the Sub-Saharan African countries suffer from frequent and extended power shortages, which hinder and limit connectivity and communication (Laurance et al., 2015). This limited access creates a bottleneck with regards to education opportunities and scientific research development across the continent. Therefore, providing access to comprehensive communication technology and opportunities for high-speed Internet connectivity across the continent will greatly improve the quality of education, scientific research, and collaborations.

Furthermore, the capacity to conduct OMICs studies in Africa is limited by the availability of the required equipment in many countries (Inzaule et al., 2021). In terms of instrumentation, many countries in Africa, such as Mauritania, Niger, Chad, Burkina Faso, and Libya, still lack sequencing equipment and capacity. Others are relatively well equipped, such as South Africa (https://www.diplomics.org.za/) and Nigeria in Sub-Saharan Africa, and Tunisia and Egypt in North Africa, which have recently launched two large genomics projects entitled Genome Tunisia and Genome Egypt. The hardware distribution for sequencing technologies is also unbalanced across the continent, and it is clear that for some countries the available resources would not be able to cover the need for generating representative data. For example, Sudan and Angola, with populations of 42 and 31 million, respectively, host one device each for third-generation sequencing.

Despite these infrastructural challenges, many initiatives have been established to improve scientific research on the continent. These include: the H3Africa consortium, aiming to investigate health and disease in diverse African subpopulations, the One-African Science Partnership for Intervention Research Excellence (2016–2021), aiming to build world-leading Pan-African research capacity, and the African Academy of Sciences (AAS), which supports projects related to health and natural sciences mainly with The Developing Excellence in Leadership, Training, and Science in Africa (DELTAS). DELTAS is a US$100 million program supporting the Africa-led development of world-class researchers and scientific leaders.

Biobanking

With support from H3Africa, infrastructure for OMICs sample storage has recently been established in several locations across Africa. The consortium promotes best practices using internationally accepted guidelines such as the International Society for Biological and Environmental Repositories (ISBER) and the African Society of Laboratory Medicine (ASLM). Three regional H3Africa biobanks were established in the West (Nigeria), East (Uganda), and South Africa, using STARLIMS and FreezerWorks Laboratory Information Management Systems (LIMS) to automate storage, sample, and data management. These biobanks store African biological samples and associated metadata from thousands of volunteers participating in H3Africa projects, which aim to study the role of genetics in disease susceptibility and adverse drug reactions. The samples could facilitate further development of drugs and vaccines for use in the developing world (Fakunle and Loring, 2012) and are available to other researchers. Researchers not affiliated with H3Africa can apply to access these samples after approval by a Data and Biospecimen Access Committee.

The genomic and phenotype data related to the biospecimens available from the three H3Africa biorepositories are housed in the European Genome-phenome Archive, available for searching and access request using the H3Africa catalog (https://catalog.h3africa.org). Similarly, the Bridging Biobanking and Biomedical Research across Europe and Africa (B3Africa) initiative has also supported research, data analysis, and sample storage across Europe and Africa. Together with H3Africa, they have facilitated the testing of LIMS and bioinformatics tools developed for less resourced settings, as well as promoting the certification and accreditation of these biorepositories (Abimiku et al., 2017; Kinkorová and Topolčan, 2018). Notably, there are no biorepositories set up in North Africa, limiting sample storage and sharing capacity in this region. Thus, more needs to be done to implement sample storage facilities that follow international guidelines in all African regions.

Training and education challenges

Generating and processing OMICs data require a well-trained multidisciplinary team and expertise in over 20 different professions (Fig. 5). Human biological material (HBM) is first collected by nurses, surgeons, radiologists, and/or pathologists, while phenotypic data are collected by medical doctors or specifically trained persons that have access to participants or their medical records. The H3ABioNet and H3Africa consortia, together with several other stakeholders, launched an African Genomics Medicine Training initiative for nurses and medical doctors, which aims to provide introductory and basic genomics education to health care workers in the biomedical field (Nembaware and Mulder, 2019).

FIG. 5.

Key steps, actors, and skills needed during the whole life cycle of OMICs data.

Subsequent steps consist of storing biological samples and their associated data. Biorepository members have to be trained in good laboratory practice, ethics, data management, and governance. The third step of the process is the use of OMICs technologies to perform suitable experiments such as DNA, RNA, and/or protein extraction and sequencing. At this stage, well-trained laboratory technicians with good laboratory practice skills and OMICs technology expertise are needed. Once OMICs data are generated, bioinformaticians and computer engineers have to ensure appropriate data storage, annotation, and analysis. Notably, the bioinformatics community in Africa remains relatively small compared to developed countries. Consequently, there is a need for investment in local education programs. Bioinformatics education and training are now fully integrated in academic curricula in developed countries; however, few African countries have introduced bioinformatics degree programs (Karikari et al., 2015; Tastan Bishop et al., 2015). In most cases bioinformatics is taught as an optional subject at different degree levels (Bachelors and Masters). More recently, a Fogarty International grant opportunity was dedicated to bioinformatics research training in Africa and the establishment of Bioinformatics degrees in multiple African institutions.

Finally, geneticists are needed to interpret the analyzed data to make reliable phenotype–genotype (or other OMICs data) correlations. Biostatisticians are also required to make valid statistical inferences and predictions from the processed data. Depending on the project, the final report will be communicated to either a genetic counselor and the referent medical doctor or to the researcher who will conclude the process by producing relevant and impactful scientific outputs. Ethicists, data protection officers (DPOs), project managers, and study coordinators are required through all the steps highlighted above. The research lifecycle highlights the need for graduate-level training in various fields to appropriately conduct OMICs research. Training programs on many of the aforementioned skills still need to be implemented and/or expanded in several African countries.

While the aforementioned capacity needs are being addressed, there is also a need to increase the emphasis on providing appropriate incentives for the researchers being trained in these efforts to prevent African brain drain, both at an institutional level and more widely. Notably, in terms of scientific outputs, large-scale OMICs studies in Africa are often being conducted in collaboration with institutions in developed countries (Hedt-Gauthier et al., 2019), and African researchers remain underrepresented in key authorship positions in articles published from health research done in Africa (Mbaye et al., 2019).

Regionally, scientific outputs are most often dominated by specific countries within West-, East-, and South Africa. Data from North Africa were notably excluded from the aforementioned studies. This emphasizes the need for greater investment in capacity building and equitable research partnerships in African OMICs research. Researchers from high-income countries should challenge persistent power differentials in such research, and more African–African collaborations should be encouraged to facilitate the transfer of technical expertise and authorship position on the resulting research outputs.

Challenges related to OMICs data processing

Several data processing challenges are discussed in the following section and summarized in Table 4. These include data generation, standardization, analysis, storage, and sharing.

Table 4.

Technical Challenges and Suggested Solutions Related to African OMICs and Computational Biology Research

Specific challenges | Needs | Solutions
Data sharing and standardization | Data management and documentation | Guidelines and frameworks for data sharing in Africa; adhering to FAIR principles
Data sharing and standardization | Data standardization | OMICs and phenotype standards for Africa; guidelines for data standardization; adhering to FAIR principles
Data and sample storage | Implementation of accredited biobanks in North Africa that meet international guidelines | South-North collaboration in biobanking (knowledge transfer, sharing experience, and lessons learned from H3Africa and B3Africa biorepository implementation)
Data and sample storage | Data security, licensing, and ownership; ergonomic usage and availability of technical support for data storage | Improving the informatics and bioinformatics infrastructure in Africa
Data annotation | Standardized nomenclature and vocabulary for description of ethnic and language groups in Africa | Ethnolinguistic and other ontologies, as well as associated guidelines of use
Data annotation | Awareness and use of existing ontologies and data reporting frameworks | Increase awareness; promote and facilitate use; capacity development efforts
Data analysis | Robust bioinformatics capacity and computational infrastructure | Local and international investment in bioinformatics and OMICs infrastructure in Africa; increased and consistent bioinformatics capacity development efforts; increased and consistent bioinformatics collaboration between African institutions to encourage knowledge transfer

Data generation

Cost efficiency in OMICs technologies has led to their widespread usage in developed countries (Sims et al., 2014). However, it is still costly to perform OMICs experiments in many African countries (Helmy et al., 2016), and many African institutions lack the adequate facilities and trained personnel (Helmy et al., 2016; Nchinda, 2002) to implement such experiments (Adedokun et al., 2016). In addition, due to profit margins added by suppliers, lengthy procurement procedures, and high shipping costs, obtaining research consumables for locally run experiments can be complicated, with consumables often arriving late, delaying ongoing data generating projects. Therefore, increased financial support is required to generate OMICs data in low-to-middle income countries (LMICs), as well as the building of proper infrastructure to support these endeavors (Adedokun et al., 2016). This requires significant investment in technical staff to run the machines and long-term maintenance costs. Such investment should be sought from various stakeholders, starting from governmental bodies and organizations and including nongovernmental organizations, local universities, and international funding agencies.

That being said, as previously described, a number of OMICs data generating centers have been established across the continent more recently, including for-profit and nonprofit organizations (Inzaule et al., 2021). These organizations often establish collaborative networks and/or relationships to facilitate knowledge, skills, and technology transfer, thereby building on existing establishments and relationships at the regional level and facilitating accessibility across Africa. A notable limitation remains, however: many of the tools and products used in OMICs data generation are developed and produced by companies established outside of Africa, and thus African data generating centers may struggle to remain financially competitive with centers outside the continent. Due to financial limitations, researchers often have to opt for institutions outside of Africa for data generation, as the cost may in fact be higher within Africa.

Data analysis

OMICs studies are heavily dependent on high-throughput sequencing approaches and typically rely on an assembled reference genome and public databases generated from OMICs projects. Due to the genetic diversity of African populations, customization of analysis workflows is becoming more and more valued in the OMICs field. For example, in GWAS, imputation is often needed to infer the genotypes from tag SNPs. The portability of Caucasian tagging SNPs to African GWASs is not ideal, mainly due to low levels of linkage disequilibrium in the latter. Thus, the imputation reference panel and tools should be carefully selected (Huang et al., 2011). Limited infrastructure to manage the data and poor Internet connectivity are some of the major limitations to the current utilization of these tools by African researchers.
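The role of linkage disequilibrium (LD) in tag SNP portability can be illustrated with a short calculation. The sketch below computes the standard pairwise statistic r² from phased haplotypes; the haplotype vectors are invented toy data, not drawn from any of the cited studies, and the function name is hypothetical. A tag SNP predicts an untyped SNP well only when r² is high, so lower LD means fewer variants are captured by the same tag.

```python
from typing import Sequence


def ld_r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    Each vector holds one allele (0 or 1) per haplotype, in the same order.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n  # frequency of allele 1 at SNP B
    # Frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # at least one SNP is monomorphic; r^2 is undefined
    return d * d / denom


# Toy haplotypes (assumed values for illustration only)
tag_snp = [1, 1, 1, 1, 0, 0, 0, 0]
high_ld_snp = [1, 1, 1, 1, 0, 0, 0, 0]  # always co-inherited with the tag
low_ld_snp = [1, 1, 0, 0, 1, 1, 0, 0]   # independent of the tag

print(ld_r_squared(tag_snp, high_ld_snp))
print(ld_r_squared(tag_snp, low_ld_snp))
```

In this toy case the first pair is in complete LD and the second pair shows none, which is the situation in which a tag SNP chosen in one population carries little information about nearby variants in another.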

Recently there has been more evidence pointing to the necessity of providing population-specific tools for variant discovery, especially for structural variations. Moreover, using deep sequencing of 910 African individuals, it was shown that a large number of reads failed to align to any part of the GRCh38 H. sapiens reference genome assembly and that the African pan-genome is estimated to contain about 10% more genetic material (Shukla et al., 2019). In addition, data from the 1000 Genomes project showed that false positive and negative rates of variant calling differ between populations (Marigorta and Navarro, 2013). In response, the H3Africa consortium has designed a cost-effective African-enriched array of 2.4M SNPs that is now available for Sub-Saharan African-specific GWAS research. In addition, efforts have been made to generate ethnic-specific reference genomes, which improve the true positive and negative calling of variants, permitting more reliable downstream analysis (Cho et al., 2016; Sherman et al., 2019).

Interest in using graph-based approaches has been increasing in recent years to overcome the limitations of using the linear reference genome. Graph genomes are a specific data structure that allows the combination of genetic information of many genomes to which sequencing reads can be mapped. Recently the advent of the HISAT2 algorithm and HISAT genotype allowed efficient mapping and sensitive genotyping using a graph-based approach (Kim et al., 2019). The H3ABioNet consortium is attempting to construct an appropriate graph genome for African populations. Moreover, the consortium aims to develop novel solutions for simulation of multiscenario- and multi-OMICs-based medical population genetics data for assessing and implementing novel variant detection and validation approaches that are more relevant to African descent populations. In this context, several variant calling tools were found to perform differently in identifying African and European polymorphisms from simulated high and low coverage whole genome sequencing datasets (Alosaimi et al., 2020).
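To make the idea of a graph genome concrete, the following is a minimal sketch of a variation graph with a single SNP "bubble". It is not the HISAT2 index or any H3ABioNet implementation; the class, node names, and sequences are hypothetical, and read mapping is reduced to exact substring matching for brevity.

```python
from typing import Dict, List


class VariationGraph:
    """Toy acyclic sequence graph: nodes carry sequence, edges give adjacency."""

    def __init__(self) -> None:
        self.seq: Dict[str, str] = {}
        self.edges: Dict[str, List[str]] = {}

    def add_node(self, node_id: str, sequence: str) -> None:
        self.seq[node_id] = sequence
        self.edges.setdefault(node_id, [])

    def add_edge(self, src: str, dst: str) -> None:
        self.edges[src].append(dst)

    def paths(self, start: str, end: str) -> List[str]:
        """Spell out every sequence on a walk from start to end."""
        if start == end:
            return [self.seq[start]]
        spelled = []
        for nxt in self.edges[start]:
            for tail in self.paths(nxt, end):
                spelled.append(self.seq[start] + tail)
        return spelled

    def read_maps(self, read: str, start: str, end: str) -> bool:
        """True if the read matches exactly somewhere along any path."""
        return any(read in path for path in self.paths(start, end))


# Linear reference carries only the "A" allele at the variable site.
LINEAR_REFERENCE = "ACGTATTCA"

graph = VariationGraph()
graph.add_node("left", "ACGT")
graph.add_node("ref_allele", "A")
graph.add_node("alt_allele", "G")  # allele absent from the linear reference
graph.add_node("right", "TTCA")
for src, dst in [("left", "ref_allele"), ("left", "alt_allele"),
                 ("ref_allele", "right"), ("alt_allele", "right")]:
    graph.add_edge(src, dst)

read = "GTGTT"  # read spanning the alternative allele
print(read in LINEAR_REFERENCE)               # exact match on linear reference
print(graph.read_maps(read, "left", "right"))  # exact match on the graph
```

The read carrying the alternative allele has no exact match on the linear reference but does match a path through the graph, which is the property that motivates graph references for highly diverse populations. Production tools use indexed, approximate matching rather than path enumeration.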

Therefore, appropriate computational resources are necessary to cope with the specific genetic makeup of African populations. Evaluating the performance of available bioinformatics tools should help users to optimize their choices in workflow development, and this benchmarking should be continuous due to the dynamic nature of the bioinformatics field. Moreover, it is necessary to establish and update open access and gold standard benchmarking OMICs datasets from Africa to help in evaluating the performance of computational tools. In addition, developing standardized, optimized, and well-tested tools would not only promote the use of appropriate solutions in computational workflow design but would also facilitate the reproducibility of bioinformatics analysis. Collaborative activities within the H3ABioNet consortium are currently ongoing to provide African researchers with containerized workflows that are easy to deploy for OMICs analysis using standard operating procedures, including GWAS, 16S rRNA, RNA-seq, and human variant discovery and prioritization analysis (https://www.h3abionet.org/tools-and-services/workflows).
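As an illustration of what benchmarking against a gold standard dataset involves, the sketch below compares a variant call set with a truth set and reports precision and recall. It assumes variants can be matched exactly on chromosome, position, reference allele, and alternative allele; the variant records and function names are hypothetical, and real comparisons additionally require variant normalization and handling of genotypes.

```python
from typing import NamedTuple, Set, Tuple

# (chromosome, position, reference allele, alternative allele)
Variant = Tuple[str, int, str, str]


class BenchmarkResult(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float


def benchmark_calls(truth: Set[Variant], calls: Set[Variant]) -> BenchmarkResult:
    """Compare a call set with a gold standard using exact variant matching."""
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # present in the truth set but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BenchmarkResult(tp, fp, fn, precision, recall)


# Invented example records, for illustration only
gold_standard = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "A"),
    ("chr2", 3000, "T", "C"),
}
caller_output = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "A"),
    ("chr3", 500, "A", "T"),  # not in the gold standard
}

result = benchmark_calls(gold_standard, caller_output)
print(result)
```

Running the same comparison on gold standard sets from different populations is one simple way to expose the population-dependent error rates discussed above.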

Data storage

Many African institutions and research facilities are still lacking the appropriate technical support for data storage due to lack of funding or computing infrastructure and expertise (Mulder et al., 2017a). Thus, African researchers often have to resort to using global databases to deposit their data and promote their research. The choice of database depends on several requirements. One of the most important considerations is the controlled access that ensures data privacy and restricted usage. Other considerations include data security, data ownership and licensing, straightforward usage, and availability of technical support for data storage (Mulder et al., 2017b). The development of a centralized database for African OMICs data and research might not be an optimal choice in the short term given the multitude of options for sharing data in currently available databases, unless the benefits behind such a choice are stated clearly in terms of avoiding unethical access, improving the quality of the shared data, or facilitating its exploration among African scientists. Centralized databases can help in setting usage boundaries between the different stakeholders, which might be required to avoid unethical usage of OMICs data in the future (Stokstad, 2019).

Various registries that maintain extensive lists of databases exist, such as the registry of research data repositories (re3data.org). As of September 2019, the website reports a list of 1291 databases in life sciences. Among them, KWTRP Research Data Repository Dataverse in Kenya is the only one dedicated to hosting African OMICs data (accessed: September 20, 2019) (Pampel et al., 2013). In addition, DataMed, a search engine used to choose an appropriate database for data storage and sharing, was specifically developed for biomedical research and could be used by African scientists to store their data. DataMed operates under the FAIR principles and includes a list of 75 indexed databases (Figueiredo, 2017; Wilkinson et al., 2016). H3ABioNet also hosts the H3Africa Archive, which serves as a data coordinating center for submission of H3Africa data to public repositories. To maximize local data storage capacity, other modalities may be explored. Moreover, the increased availability of low-cost legacy computers has brought cloud computing settings to the front line. Cloud solutions can be used in several African countries that are lacking the required infrastructure and skills for data storage. However, cloud storage comes with several ethical and data security issues that need to be addressed based on each country's regulatory and ethics rules.

Data transfer

As with data storage, several infrastructure challenges impede the transfer of Big Data to, within, and out of Africa. These challenges include slow and unstable Internet connectivity; unreliable power supply; obsolete computer infrastructure across the continent, ranging from medium-scale server infrastructure to a small number of workstations running multiple operating systems (Windows, Linux) for different purposes; and a lack of centralized and secure data storage (Mulder et al., 2017a). Due to the cost and sensitive nature of human OMICs data, H3ABioNet has facilitated the use of the Globus Online application for secure and reliable data transfer across Africa. All data are encrypted for transfer. As a first step, the Network evaluated connectivity between different collaborating end points, providing infrastructure capacity where needed. Despite these advances, for areas where the Internet is too slow or unreliable, portable hard disks are still used for data transfer.

Data sharing

Sharing is a practice that concerns the entire lifecycle of data processed during a scientific study (Griffin et al., 2017). Therefore, considering data sharing as the act of making available only raw or semiprocessed data is a common misconception. Due to the genetic diversity in Africa and the variability of computational approaches, sharing metadata should also be prioritized to avoid reproducibility issues. Although an increasing number of journals request submission of data to some form of electronic database, supplementary materials in peer-reviewed articles are the most utilized form of data sharing. However, the diversity of supplementary materials, as well as their nonstandardized formats, complicates the efficient use and exploration of such data. In addition, OMICs data files are too large and often too sensitive to share through supplementary material. Furthermore, reservations to share data may be rooted in historical accounts of African exploitation. In addition, the phenotypic characterization and collection of biological samples require enormous effort from multiple stakeholders (Modjarrad et al., 2016).

Given these efforts, once data are generated, researchers are reluctant to share them immediately, because they seek maximal exploitation of the data. This is in contrast with global trends, where there is common agreement about the importance and benefit of sharing biomedical OMICs research data (Oberkampf et al., 2015). Sharing data is endorsed as the default procedure according to the latest report from the WHO in 2015 on “Developing global norms for sharing data and results during public health emergencies” (Modjarrad et al., 2016). A qualitative study aimed at surveying African researchers would be of great interest to determine additional concerns other than technical capacity. A good starting point would be to explore the outcomes of Wiley's survey of 90,000 researchers from around the world (including African countries). These results, however, are not publicly available, and the reporting consists of generic descriptions and limited infographic resources (“How and why researchers share data (and why they don't)?”, 2014).

Data standardization

In the absence of specific guidance for data sharing, researchers tend to judge intuitively what might be useful to share based on their needs. Therefore, the diversity associated with OMICs data, as well as the absence of comprehensive and practical guidelines, limits the reuse and interoperability of the shared data. Several projects in LMICs encounter difficulties around data management and documentation and are not able to provide structured metadata. Management and documentation are universal challenges for data that are stewarded and curated inadequately (Anderson et al., 2007). The standardization of research data and metadata is thus an important factor to consider during data management, as it facilitates the generation of high quality, compatible, and interoperable datasets. Data standardization supports knowledge accumulation by facilitating meta-analysis, thus producing larger integrated data with greater research potential (Oberkampf et al., 2015).

Standardized guidance for data sharing should cover the entire data lifecycle. Several guidelines and frameworks have been established, which could be applied by African researchers (Lin and Strasser, 2014). The implementation of the FAIR principles is intended to be used with hardware, algorithms, management policies, pipelines, and other research data-related tools for effective data management and sharing (Figueiredo, 2017; Wilkinson et al., 2016). To facilitate the use of existing frameworks and guidelines, H3ABioNet has aimed to raise awareness across the consortium of phenotype data collection and relevant metadata reporting standards through the development of collections, which incorporate these standards in easily implementable tools. In addition, H3ABioNet also works with H3Africa researchers and other field experts to develop novel data collection standards relevant for application in Africa. These resources span across a range of research fields, including mental health, cardiovascular disease, cancer, and infectious diseases. It also includes metadata guidelines to generate reports, as well as standards, for the collection of lifestyle and environmental factors. These standards fulfill FAIR principles and promote large-scale collaboration across Africa.
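A simple way to picture what a metadata reporting standard enforces is a completeness check applied before a dataset is shared. The sketch below is only an illustration: the required field names and the example record are hypothetical and do not reproduce any H3ABioNet or H3Africa standard.

```python
from typing import Dict, List

# Hypothetical minimal field set; real standards define their own fields.
REQUIRED_FIELDS = [
    "sample_id",
    "country",
    "ethnolinguistic_group",
    "data_type",
    "consent_code",
]


def missing_metadata(record: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


# Invented example record, for illustration only
example_record = {
    "sample_id": "S0001",
    "country": "Tunisia",
    "ethnolinguistic_group": "",  # left blank by the submitter
    "data_type": "whole genome sequencing",
}

print(missing_metadata(example_record))
```

A record would be accepted for sharing only when the returned list is empty. Using a controlled vocabulary for fields such as the ethnolinguistic group, rather than free text, is what makes records from different projects comparable.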

Socioeconomic Challenges

Economic gaps

Human resource capacity building in OMICs research plays a vital role not only in health and disease but also in building a knowledge-based economy (Dandara et al., 2014; Kumwenda et al., 2017). In 2016, from a total of US$22.3 billion spent on R&D worldwide, African countries spent only US$0.25 billion (1.1%) (Sokolov-Mladenović et al., 2016). Data from the UNESCO Institute for Statistics (access date: 22/11/2019) show alarming disparities in R&D expenditure for African countries with South Africa and Egypt accounting for 60% of all represented countries (Fig. 6A). R&D expenditure as a percentage of the Gross Domestic Product (GDP), however, shows similar values between Kenya and South Africa (0.8%). This remains below the goal of 1% established in 2007 by the African Union members (Sokolov-Mladenović et al., 2016). By comparison, most of the developed countries spend over 1.5% of their GDPs on R&D, and expenditures were suggested to be proportional to the positive impact on economic growth (Simpkin et al., 2019). However, in a time of economic crisis, R&D spending is often assigned the lowest priority compared to other issues like providing basic education and addressing health needs.

FIG. 6.

R&D economic indicators in Africa. (A) Expenditure on R&D in some African countries and (B) Number of researchers per million inhabitants. Data in (A) and (B) are extracted from UNESCO Institute for Statistics (http://data.uis.unesco.org/). (C) Based on the expenditure data, we calculated the average number of whole genomes (US$916 per genome) that could be sequenced by every 100 researchers per country or the equivalent total number of CPUs (2.5 GHz Xeon E5420) that could be purchased for every 100 researchers per country. These statistics do not consider the area of expertise of each researcher, and we assume that the proportion of researchers working in life science is the same from country to country. (D) Total number of OMICs projects per African country funded by foreign organizations for the period 2015–2018. Data are extracted from NIH World Report (https://worldreport.nih.gov/). (E) Funding contribution of each HIRO for OMICs research in Africa. CPUs, Central Processing Units; HIRO, Heads of International Research organizations; R&D, Research and Development.

Funding research in Africa depends heavily on both government and nongovernmental organizations. The private sector contribution is marginal (Simpkin et al., 2019). In the context of OMICs research, the effectiveness of biotechnology and pharmaceutical companies might be hindered by several factors, including the unstable geopolitical and economic environment, the defective technological infrastructure, and the lack of a competent workforce. Conflicting interests between the academic and private sectors may also prevent private investment in OMICs R&D in Africa. For instance, in many countries OMICs research is conducted to tackle national priorities, like combating malaria and HIV, as part of long-term policies.

The private sector, however, requires short-term outcomes that guarantee a return on investment and profitability, usually in the form of intellectual property (IP). Based on the WIPO 2019 report, only 0.6% of the total granted patents have been submitted by African countries, which clearly demonstrates the necessity to invest more in OMICs R&D. Moreover, the density of researchers per population varies across the continent. Tunisia has the highest density of researchers (1800/million inhabitants), while Tanzania has the lowest with only 18 per million (Fig. 6B).

The continent has an average of around 198 researchers per million people, which remains below the world average (“There are not enough scientists in Africa. How can we turn this around?”, 2014); however, brain drain has been noted as a significant issue on the continent (Zimbudzi, 2013). In addition, it seems that even countries with the highest density of researchers do not provide enough funding support to their researchers. Our collected data and the metrics that we have set in Figure 6C show that a group of 100 researchers in Tunisia can only provide 4 human whole genome sequences or purchase only 11 Central Processing Units. By comparison, the same values rise by 6-fold in South Africa and 17-fold in Tanzania (Fig. 6C).
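The genome-equivalent metric can be sketched as a simple calculation. The exact budget figures and normalization behind Figure 6C are not given in the text, so the function below is only one plausible reading: the budget attributable to a group of 100 researchers divided by the stated cost of US$916 per genome. The budget and headcount in the example are invented placeholders, not UNESCO data.

```python
import math

COST_PER_GENOME_USD = 916  # per-genome cost stated in the Figure 6 caption


def genomes_per_100_researchers(budget_usd: float, researchers: int,
                                cost_per_genome: float = COST_PER_GENOME_USD) -> int:
    """Whole genomes affordable with the budget share of 100 researchers.

    Assumes the budget is split evenly across researchers (an assumption
    of this sketch, not a statement of the authors' method).
    """
    if researchers <= 0:
        raise ValueError("researchers must be positive")
    budget_per_100 = budget_usd / researchers * 100
    return math.floor(budget_per_100 / cost_per_genome)


# Hypothetical inputs: US$1,000,000 shared by 2,000 researchers
print(genomes_per_100_researchers(1_000_000, 2_000))
```

Because the metric divides by the number of researchers, a country with few researchers can score higher than one with many, which is consistent with the contrast between Tanzania and Tunisia reported above.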

OMICs research in Africa depends mainly on funding from foreign institutions. The Heads of International Research organizations (HIRO) group is engaged in supporting many African countries to align their efforts with global research initiatives. This type of support includes financial aid, networking, capacity building, and collaboration capabilities. However, geographic disparities are evident. The East Africa region has the largest number of HIRO funded projects (164 projects), followed by West Africa (126 projects) and Southern Africa (90 projects) (Fig. 6D). North and Central Africa, however, are represented by only 15 and 17 projects, respectively, highlighting the disparities in funding opportunities in those regions compared to aforementioned ones.

Based on our analysis, the NIH seems to have the highest share of the HIRO funded projects (75%), while >22% of allocated funds come from European organizations, including the Medical Research Council (United Kingdom) and the European Union (Fig. 6E). In addition, many young and early career African scientists benefited from student exchange programs and funding grants offered in different parts of the globe for training and knowledge transfer. However, this remains insufficient, as there are several talented African scientists that are not getting opportunities through these programs. Therefore, to mitigate the aforementioned challenges, Africa needs to increase the number of, and incentives and prospects for, biomedical and OMICs experts by investing heavily in scientific research (de Vries et al., 2017; Sathar and Dhai, 2012). This will create an equal competitive ground for the bioeconomic pool compared with other parts of the world (de Vries et al., 2017).

Legal challenges

Several developing countries are implementing their own legislation to regulate and enforce requirements and standards related to OMICs research, especially when human participants are involved. However, whether these regulatory structures foster research accountability or prevent recurring problems related to infringement of IP rights, storage and export of samples, data sharing, and privacy is not well established (Horn, 2013). Historically, African researchers have been perceived and treated as mere data collectors, who do not receive the same level of recognition and benefits as fellow researchers in the same collaborative field abroad (Mulder et al., 2017b). Translation of OMICs research findings into useful clinical products and services involves investment in technology, patenting, and protecting IP rights. In this regard, a lack of resources and the high cost of interpreting OMICs research have given rise to exploitation of LMIC researchers due to the absence of appropriate local IP administration (Yakubu et al., 2018). This indicates that African legislation is absent, obsolete, restrictive, or difficult to navigate (Jao et al., 2015; de Vries and Munung, 2019).

The development and incorporation of regional frameworks that demonstrate conformity with local and international regulations, as well as protection of data ownership and IP, are crucial. Currently, African regulatory bodies use diverse regulation requirements. For example, regulations, guidelines, and coordination of human subject research in countries like Malawi, Nigeria, Kenya, Tanzania, Zimbabwe, and Uganda are developed by their respective national research councils or ethics committees, while in South Africa, the National Health Act (Act No. 61 of 2003) primarily governs the national ethics regulations, and Chapter 8 of the Act governs the legal aspects of using HBM (Sathar and Dhai, 2012).

In addition, the Medical Research Council and the Health Professions Council of South Africa have independently published research ethics guidelines, and IP rights and patents are covered by the Intellectual Property Rights from Publicly Financed Research and Development Act (IPR Act) (Sathar and Dhai, 2012). Notable differences arise in the definition of HBM too: Kenyan and Tanzanian guidelines refer to them as “human tissue”; Malawian guidelines define them as “genetic resources”; Ugandan national guidelines include “microorganisms” in their definition; and Nigeria defines them as “samples and biological materials.” Because the definitions of what constitutes HBMs vary, it is extremely difficult to compare the guidelines, laws, and regulations of different African countries or to draw inferences across them. In addition, many African countries require permits from one or more national agencies before HBMs are exported, such as the national regulatory authority (Lesotho, Rwanda, Nigeria, Ethiopia, South Africa, Botswana, Zambia, and Malawi); the Ministry of Health (Zambia and Cameroon); or the Medical Research of Science and Technology (Kenya, Zimbabwe, and Uganda). In a country such as South Africa, national approval for export permits is required in addition to ethics approval for data collection (Andanda and Govender, 2015). In addition, countries such as Malawi, Kenya, Zambia, and Ethiopia require local principal investigators to be associated with any research on samples pertaining to the country or data being analyzed outside the countries, while Nigeria requires the principal investigator to be affiliated with a registered institution in Nigeria with the capacity for doing the proposed study (de Vries and Munung, 2019).

Given that the last 40 years have been marked by the democratization of many African countries, biomedical research practices have to adjust to the values of equality and inclusiveness. Policy makers should provide guidance and policies to ensure the effectiveness of collaborations and joint research projects and enforce OMICs data sharing consistent with national and international laws.

Ethical challenges

Due to the rapid evolution of African OMICs research, there is an increased need to incorporate ethical standards in OMICs research to curb challenges associated with science policy, promote transparency, and facilitate collaboration among scientists from around the world (Jao et al., 2015). It is also crucial to obtain written consent from study participants and avoid stigmatization (Jao et al., 2015). Genetic ancestry testing results are often considered to be deterministic of one's ethnic identity and could be used to claim racial superiority or to justify radically enforced sociocultural actions, whereas the tests should rather be used to elucidate the complex history of human genetic diversity (de Vries and Munung, 2019).

Additional ethical concerns include privacy, confidentiality, data ownership, and appropriate models of consent, which maintain participant autonomy and address withdrawal issues, incidental findings, and in some cases language barriers (Yakubu et al., 2018). There are several forms of informed consent that support OMICs research, including broad consent for unspecified but restricted future use, narrow/specific consent, and open/blanket consent for unrestricted sharing and use. These can be distinguished based on the choices made by participants (Sheehan, 2011; de Vries and Munung, 2019). Broad consent has previously been supported by Grady et al. (2015) and Tindana and de Vries (2016), who concluded that there are no a priori reasons against the use of broad consent for genomics research in Africa.

Although the acceptability of broad consent for OMICs research in Africa is still debated on ethical grounds, it is currently adopted for many OMICs studies across the continent (de Vries and Munung, 2019). One point of debate is whether a “reconsenting process” is required if there is a deviation from the original research plan. However, research carried out by Jao et al. has suggested that reconsenting is an unnecessary inconvenience and is insensitive to family members in cases where the participants have died (Jao et al., 2015).

More recently, tiered consent has been introduced as a method to provide participants with the autonomy to participate in a study at a risk level with which they are comfortable. To this end, a framework for implementing tiered informed consent for health genomics research in Africa has been proposed, including a template for tiered consent, discussions on how tiered consent can be explained to participants, and a schema to facilitate the storage of tiered consent data (Nembaware et al., 2019). The transfer of HBMs is also considered to be a critical topic in OMICs research that requires material transfer agreements (MTAs). Several obstacles arise during the process of obtaining ethical approval from African Institutional Review Boards (IRBs) before exporting HBMs. These include lack of appropriate MTAs, inaccurate sample anonymization, and delays in obtaining ethical approval. In some instances, biological samples from ongoing international trials have been shipped with expired MTAs, and a number of new trials have started without such agreements (Sathar and Dhai, 2012).
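The consent models discussed above (narrow/specific, broad, and open/blanket) and the idea of storing each participant's chosen tier can be illustrated with a small data structure. The following is only a minimal sketch under stated assumptions: it is not the schema proposed by Nembaware et al. (2019), and the names `ConsentTier`, `ConsentRecord`, `permits`, and the field names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConsentTier(Enum):
    """Hypothetical tiers, mirroring the consent forms named in the text."""
    SPECIFIC = "specific"  # narrow/specific: the original study only
    BROAD = "broad"        # unspecified but restricted future use
    BLANKET = "blanket"    # open/blanket: unrestricted sharing and use


@dataclass
class ConsentRecord:
    """One participant's stored consent choice (illustrative fields only)."""
    participant_id: str
    tier: ConsentTier
    original_study: str
    # Research areas a BROAD-consent participant agreed to; unused otherwise.
    allowed_areas: set = field(default_factory=set)
    withdrawn: bool = False


def permits(record: ConsentRecord, study: str, area: str) -> bool:
    """Return True if the stored consent covers the requested use."""
    if record.withdrawn:
        # Withdrawal overrides every tier.
        return False
    if record.tier is ConsentTier.BLANKET:
        return True
    if study == record.original_study:
        # Every tier covers the study the participant enrolled in.
        return True
    if record.tier is ConsentTier.BROAD:
        # Broad consent covers new studies only within the agreed areas.
        return area in record.allowed_areas
    return False


# Example: a broad-consent participant enrolled in a hypothetical "study-A".
record = ConsentRecord("P001", ConsentTier.BROAD, "study-A", {"cardiovascular"})
print(permits(record, "study-B", "cardiovascular"))
print(permits(record, "study-B", "oncology"))
```

Keeping the tier and the withdrawal flag as explicit fields means a reuse request can be checked against the participant's recorded choice without contacting them again, which is the practical motivation for storing tiered consent in a structured form.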

Another major challenge related to African genomics research is whether individual genetic research results should be fed back to research participants. The Individual Findings in Genetics Research in Africa (IFGeneRA) Collaborative Centre aims to progressively build country-specific policies relating to the return of individual genetic research results. IFGeneRA combines methods from medical genetics, bioethics, genetic counseling, health economics, bioinformatics, and social science to explore these issues. However, the IFGeneRA initiative is only focusing on three African countries (Cameroon, Botswana, and South Africa) and projects on only two diseases (Fragile X and HIV/AIDS). These collaborative efforts should be extended to all African countries and should include more diseases.

In this regard, the AAS is currently creating the first guidelines to collect, store, and share research data and specimens in ways that protect study participants from exploitation and benefit African citizens. It is noteworthy that in almost all African countries, the mandatory role of the DPO is still not well recognized. DPOs should be involved in educating researchers about compliance with personal data protection regulatory frameworks, training staff involved in data processing, and conducting regular security audits. DPOs might also serve as a contact point between OMICs data users and any supervisory authorities that oversee activities related to these data.

More recently, calls for increased incorporation of humanities in science, or rather the return of the “heart” of science, have been growing worldwide to address the issues of reproducibility and productivity in science (Sarewitz, 2016). Such issues are equally relevant to the ethics of African OMICs research, requiring a paradigm shift from competitive, isolated science practice driven by technological potential to open and collaborative practice informed by holistic outcomes, which include economic, health, and social outcomes (Von Schomberg, 2020). Such shifts require the reenvisioning of scientific incentives, particularly at an academic level, as well as increased emphasis on executive functioning skills at primary, secondary, and higher learning educational institutions. The formulation of scientific studies also needs to be reevaluated, with increased focus on societal rather than technological objectives and strong ethical research components within research proposals.

Open science has been extremely topical in recent years; however, its true benefit has been further highlighted by the COVID-19 pandemic, in which open scientific practice and collaboration have led to rapid research and innovation, giving us just a hint at what can be achieved with collaborative research practice (Springer, 2020). In Africa, specifically, such success has also been illustrated by the H3Africa consortium, and this serves as a blueprint for African institutions and researchers to work together to address the health and social concerns on the continent.

Conclusions

This review highlights several important gaps in OMICs and computational biology research in Africa. Disparities between North and Sub-Saharan Africa are evident at different levels, including research priorities, funding opportunities, infrastructure, and regulatory frameworks. The lack of sustainable OMICs research activities observed in North and Central Africa needs to be addressed given the geohistorical importance of these regions. Initiatives like H3Africa are helping to build a critical mass of highly skilled individuals in the field. Although data-related challenges remain, H3ABioNet, the H3Africa bioinformatics network, has contributed significantly over the last 8 years to improving the computational biology abilities of African researchers and their access to OMICs workflows and computing infrastructure.

Decisions must be taken at the continent-wide level to set up harmonized guidelines and standards that will be used to overcome the challenges discussed in this review. The development of a clear ethnolinguistic ontology that organizes ethnic and linguistic classifications in a formal and consistent way can help to resolve the annotation problems associated with population diversity in both North and Sub-Saharan African regions. Ethical, legal, and socioeconomic challenges need to be resolved too, including those related to validating collaborations, IP, community engagement, genetic discrimination and stigmatization, and data security. In addition, training plans must consider the multidisciplinary nature of OMICs research and the modularity of the required skills, which must be adapted to a dynamic and diverse environment. The continent's investment in the scientific outputs it produces (e.g., data and software) is another important topic to explore further. Efforts to address the challenges associated with translational research endeavors should be increased.

It has previously been noted that Africa experiences difficulty sustaining and managing innovation, and some of the major barriers influencing these experiences often overlap with the data challenges discussed in this review, including economic and capacity gaps. More recently, the focus of translational research in OMICs has shifted to the implementation of pharmacogenomics in Africa with the establishment of the African Pharmacogenomics Consortium. Efforts by such consortia may very well establish a blueprint that can be followed in the future, focusing on sustaining innovation and translation of research knowledge.

In sum, the present expert review and analysis contribute to a deeper understanding of past and current challenges in systems science in Africa while also offering foresight on future innovation trajectories.

Availability of Data and Materials

All the data used in this review are available from public databases referenced in the main text with their proper citation.

Supplementary Material

Supplemental data
Supp_TableS1.docx (17.2KB, docx)

Acknowledgments

The authors thank the Databases & Resources Work Package of the H3ABioNet consortium and the Precision Medicine project members for their inputs.

Abbreviations Used

AAS: African Academy of Sciences
ASLM: African Society of Laboratory Medicine
CPUs: Central Processing Units
DELTAS: Developing Excellence in Leadership, Training, and Science in Africa
DPO: Data Protection Officer
GDP: Gross Domestic Product
GWAS: Genome-Wide Association Study
HBM: Human biological material
HIRO: Heads of International Research Organizations
IFGeneRA: Individual Findings in Genetics Research in Africa
ISBER: International Society for Biological and Environmental Repositories
IP: intellectual property
MTAs: material transfer agreements
PharmGKB: PharmacoGenomics Knowledge Base
OmicsDI: OMICs Discovery Index
LIMS: Laboratory Information Management Systems
PRIDE: Proteomics Identification Database
SAs: Supervisory Authorities
SRA: Sequence Read Archive

Authors' Contributions

Y.H. led the writing of the article and the design of its content. L.Z. and H.O. contributed to the data retrieval and exploration and the preparation of the figures. F.R., I.A., M.H., C.O., M.C., B.M.T., C.S., R.M., N.A., M.T., S.A., A.B., L.R., O.S., and O.T.B. contributed to the extraction of the relevant African data from the public databases. K.G., F.M.F., N.M., and S.K.K. proposed the project, coordinated the work, and co-led it. All authors contributed to writing the article and read and approved the final article.

Disclaimer

The content and the views presented are solely the responsibility and personal opinions of the authors and do not necessarily represent the official views of the affiliated institutions or the National Institutes of Health.

Author Disclosure Statement

No competing financial interests exist.

Funding Information

This work was supported by the National Institutes of Health (NIH) Common Fund under grant number [U24HG006941].

References

  1. Abimiku A, Mayne ES, Joloba M, Beiswanger CM, Troyer J, and Wideroff L. (2017). H3Africa Biorepository Program: Supporting genomics research on African populations by sharing high-quality biospecimens. Biopreserv Biobank 15, 99–102.
  2. Adedokun BO, Olopade CO, and Olopade OI. (2016). Building local capacity for genomics research in Africa: Recommendations from analysis of publications in Sub-Saharan Africa from 2004 to 2013. Glob Health Action 9, 31026.
  3. Adoga MP, Fatumo SA, and Agwale SM. (2014). H3Africa: A tipping point for a revolution in bioinformatics, genomics and health research in Africa. Source Code Biol Med 9, 10.
  4. Allali I, Boukhatem N, Bouguenouch L, et al. (2018). Gut microbiome of Moroccan colorectal cancer patients. Med Microbiol Immunol (Berl.) 207, 211–225.
  5. Alosaimi S, van Biljon N, Awany D, et al. (2020). Simulation of African and non-African low and high coverage whole genome sequence data to assess variant calling approaches. Brief Bioinform bbaa366.
  6. Aly AM, Adel A, El-Gendy AO, Essam TM, and Aziz RK. (2016). Gut microbiome alterations in patients with stage 4 hepatitis C. Gut Pathog 8, 42.
  7. Andanda P, and Govender S. (2015). Regulation of Biobanks in South Africa. J Law Med Ethics 43, 787–800.
  8. Anderson NR, Lee ES, Brockenbrough JS, et al. (2007). Issues in biomedical research data management and analysis: Needs and barriers. J Am Med Inform Assoc 14, 478–488.
  9. Arauna LR, Mendoza-Revilla J, Mas-Sandoval A, et al. (2017). Recent historical migrations have shaped the gene pool of Arabs and Berbers in North Africa. Mol Biol Evol 34, 318–329.
  10. Awany D, Allali I, Dalvie S, et al. (2018). Host and microbiome Genome-Wide Association Studies: Current state and challenges. Front Genet 9, 637.
  11. Brown DK, and Tastan Bishop Ö. (2018). HUMA: A platform for the analysis of genetic variation in humans. Hum Mutat 39, 40–51.
  12. Buniello A, MacArthur JAL, Cerezo M, et al. (2019). The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 47, D1005–D1012.
  13. Busby GB, Band G, Si Le Q, et al. (2016). Admixture into and within sub-Saharan Africa. eLife 5. [Epub ahead of print]; DOI: 10.7554/eLife.15266.
  14. Campbell MC, and Tishkoff SA. (2008). African genetic diversity: Implications for human demographic history, modern human origins, and complex disease mapping. Annu Rev Genomics Hum Genet 9, 403–433.
  15. Campbell MC, and Tishkoff SA. (2010). The evolution of human genetic and phenotypic variation in Africa. Curr Biol CB 20, R166–R173.
  16. Cani PD. (2018). Human gut microbiome: Hopes, threats and promises. Gut 67, 1716–1725.
  17. Capredon M, Brucato N, Tonasso L, et al. (2013). Tracing Arab-Islamic Inheritance in Madagascar: Study of the Y-chromosome and Mitochondrial DNA in the Antemoro. PLoS One 8, e80932.
  18. Cho HR, Kim HS, Park JS, et al. (2017). Construction and characterization of the Korean whole saliva proteome to determine ethnic differences in human saliva proteome. PLoS One 12, e0181765.
  19. Cho YS, Kim H, Kim H-M, et al. (2016). An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes. Nat Commun 7, 1–13.
  20. Choudhury A, Aron S, Sengupta D, Hazelhurst S, and Ramsay M. (2018). African genetic diversity provides novel insights into evolutionary history and local adaptations. Hum Mol Genet 27, R209–R218.
  21. Clough E, and Barrett T. (2016). The Gene Expression Omnibus Database. Methods Mol Biol Clifton NJ 1418, 93–110.
  22. Cunningham F, Achuthan P, Akanni W, et al. (2019). Ensembl 2019. Nucleic Acids Res 47, D745–D751.
  23. Dandara C, Huzair F, Borda-Rodriguez A, et al. (2014). H3Africa and the African Life Sciences Ecosystem: Building sustainable innovation. OMICS J Integr Biol 18, 733–739.
  24. de Vries J, Munung SN, Matimba A, et al. (2017). Regulation of genomic and biobanking research in Africa: A content analysis of ethics guidelines, policies and procedures from 22 African countries. BMC Med Ethics 18, 8.
  25. de Vries J, and Munung NS. (2019). Ethical considerations in genomic research in South Africa. S Afr Med J 109, 375–377.
  26. de Wit E, Delport W, Rugamika CE, et al. (2010). Genome-wide analysis of the structure of the South African Coloured Population in the Western Cape. Hum Genet 128, 145–153.
  27. Fadhlaoui-Zid K, Haber M, Martínez-Cruz B, Zalloua P, Elgaaied AB, and Comas D. (2013). Genome-wide and paternal diversity reveal a recent origin of human populations in North Africa. PLoS One 8, e80293.
  28. Fakunle ES, and Loring JF. (2012). Ethnically diverse pluripotent stem cells for drug development. Trends Mol Med 709–716.
  29. Fan S, Kelly DE, Beltrame MH, et al. (2019). African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations. Genome Biol 20, 82.
  30. Field D, Sansone S, DeLong EF, et al. (2010). Meeting report: Metagenomics, metadata and metaanalysis (M3) at ISMB 2010. Stand Genomic Sci 3, 232–234.
  31. Figueiredo AS. (2017). Data sharing: Convert challenges into opportunities. Front Public Health 5, 327.
  32. Fregel R, Méndez FL, Bokbot Y, et al. (2018). Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe. Proc Natl Acad Sci U S A 115, 6774–6779.
  33. Gaibar M, Novillo A, Romero-Lorca A, Esteban ME, and Fernández-Santander A. (2018). Pharmacogenetics of ugt genes in North African populations. Pharmacogenomics J 18, 609–612.
  34. Gomez F, Hirbo J, and Tishkoff SA. (2014). Genetic variation and adaptation in Africa: Implications for human evolution and disease. Cold Spring Harb Perspect Biol 6, a008524.
  35. Grady C, Eckstein L, Berkman B, et al. (2015). Broad consent for research with biological samples: Workshop conclusions. Am J Bioeth AJOB 15, 34–42.
  36. Griffin PC, Khadake J, LeMay KS, et al. (2017). Best practice data life cycle approaches for the life sciences. F1000Research 6, 1618.
  37. Guerrero S, López-Cortés A, Indacochea A, et al. (2018). Analysis of racial/ethnic representation in select basic and applied cancer research studies. Sci Rep 8, 1–8.
  38. Gurdasani D, Carstensen T, Tekola-Ayele F, et al. (2015). The African Genome Variation Project shapes medical genetics in Africa. Nature 517, 327–332.
  39. Haber M, Platt DE, Badro DA, et al. (2011). Influences of history, geography, and religion on genetic structure: The Maronites in Lebanon. Eur J Hum Genet 19, 334–340.
  40. Hadrich D. (2018). Microbiome research is becoming the key to better understanding health and nutrition. Front Genet 9. [Epub ahead of print]; DOI: 10.3389/fgene.2018.00212.
  41. Hamdi Y, Ben Rekaya M, Jingxuan S, et al. (2018). A genome wide SNP genotyping study in the Tunisian population: Specific reporting on a subset of common breast cancer risk loci. BMC Cancer 18, 1295.
  42. Hatherley R, Brown DK, Glenister M, and Tastan Bishop Ö. (2016). PRIMO: An interactive homology modeling pipeline. PLoS One 11, e0166698.
  43. Hedt-Gauthier BL, Jeufack HM, Neufeld NH, et al. (2019). Stuck in the middle: A systematic review of authorship in collaborative health research in Africa, 2014–2016. BMJ Global Health 4, e001853.
  44. Helmy M, Awad M, and Mosa KA. (2016). Limited resources of genome sequencing in developing countries: Challenges and solutions. Appl Transl Genomics 9, 15–19.
  45. Henn BM, Botigué LR, Gravel S, et al. (2012). Genomic Ancestry of North Africans Supports Back-to-Africa Migrations. PLoS Genet 8, e1002397.
  46. Horn LM. (2013). Promoting responsible research conduct in a developing world academic context. South Afr J Bioeth Law 6, 21–24.
  47. How and why researchers share data (and why they don't). (2014). https://www.wiley.com/network/researchers/licensing-and-open-access/how-and-why-researchers-share-data-and-why-they-dont Accessed October 27, 2020.
  48. Huang L, Jakobsson M, Pemberton TJ, et al. (2011). Haplotype variation and genotype imputation in African populations. Genet Epidemiol 35, 766–780.
  49. Hublin J-J, Ben-Ncer A, Bailey SE, et al. (2017). New fossils from Jebel Irhoud, Morocco and the pan-African origin of Homo sapiens. Nature 546, 289–292.
  50. Inzaule SC, Tessema SK, Kebede Y, et al. (2021). Genomic-informed pathogen surveillance in Africa: Opportunities and challenges. Lancet Infect Dis S1473-309930939-7.
  51. Jao I, Kombe F, Mwalukore S, et al. (2015). Involving research stakeholders in developing policy on sharing public health research data in Kenya: Views on fair process for informed consent, access oversight, and community engagement. J Empir Res Hum Res Ethics 10, 264–277.
  52. Karikari TK, Quansah E, and Mohamed WMY. (2015). Developing expertise in bioinformatics for biomedical research in Africa. Appl Transl Genom 6, 31–34.
  53. Kefi R, Hechmi M, Naouali C, et al. (2018). On the origin of Iberomaurusians: New data based on ancient mitochondrial DNA and phylogenetic analysis of Afalou and Taforalt populations. Mitochondrial DNA Part A 29, 147–157.
  54. Kent WJ, Sugnet CW, Furey TS, et al. (2002). The human genome browser at UCSC. Genome Res 12, 996–1006.
  55. Khan AM. (2017). Guidelines for standardizing and increasing the transparency in the reporting of biomedical research. J Thorac Dis 9, 2697–2702.
  56. Kim CX, Bailey KR, Klee GG, et al. (2010). Sex and ethnic differences in 47 candidate proteomic markers of cardiovascular disease: The Mayo Clinic proteomic markers of arteriosclerosis study. PLoS One 5, e9065.
  57. Kim D, Paggi JM, Park C, Bennett C, and Salzberg SL. (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915.
  58. Kinkorová J, and Topolčan O. (2018). Biobanks in Horizon 2020: Sustainability and attractive perspectives. EPMA J 9, 345–353.
  59. Kruse CS, Goswamy R, Raval Y, and Marawi S. (2016). Challenges and opportunities of big data in health care: A systematic review. JMIR Med Inform 4, e38.
  60. Kumwenda S, Niang EHA, Orondo PW, et al. (2017). Challenges facing young African scientists in their research careers: A qualitative exploratory study. Malawi Med J J Med Assoc Malawi 29, 1–4.
  61. Lappalainen I, Almeida-King J, Kumanduri V, et al. (2015). The European Genome-phenome Archive of human data consented for biomedical research. Nat Genet 47, 692–695.
  62. Laurance WF, Sloan S, Weng L, and Sayer JA. (2015). Estimating the environmental costs of Africa's Massive “Development Corridors.” Curr Biol CB 25, 3202–3208.
  63. Leinonen R, Sugawara H, Shumway M; International Nucleotide Sequence Database Collaboration. (2011). The sequence read archive. Nucleic Acids Res 39, D19–D21.
  64. Li S, Schlebusch C, and Jakobsson M. (2014). Genetic variation reveals large-scale population expansion and migration during the expansion of Bantu-speaking peoples. Proc Biol Sci 281, 20141448.
  65. Lin J, and Strasser C. (2014). Recommendations for the role of publishers in access to data. PLoS Biol 12, e1001975.
  66. Lorente-Galdos B, Lao O, Serra-Vidal G, et al. (2019). Whole-genome sequence analysis of a Pan African set of samples reveals archaic gene flow from an extinct basal population of modern humans into sub-Saharan populations. Genome Biol 20, 77.
  67. Lowe R, Shirley N, Bleackley M, Dolan S, and Shafee T. (2017). Transcriptomics technologies. PLoS Comput Biol 13, e1005457.
  68. Luo J, Wu M, Gopukumar D, and Zhao Y. (2016). Big data application in biomedical research and health care: A literature review. Biomed Inform Insights 8, 1–10.
  69. Marigorta UM, and Navarro A. (2013). High trans-ethnic replicability of GWAS results implies common causal variants. PLoS Genet 9, e1003566.
  70. Martin AR, Costa HA, Lappalainen T, et al. (2014). Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture. PLoS Genet 10, e1004549.
  71. Martin TC, Visconti A, Spector TD, and Falchi M. (2018). Conducting metagenomic studies in microbiology and clinical research. Appl Microbiol Biotechnol 102, 8629–8646.
  72. Mbaye R, Gebeyehu R, Hossmann S, et al. (2019). Who is telling the story? A systematic review of authorship for infectious disease research conducted in Africa, 1980–2016. BMJ Global Health 4, e001855.
  73. McCarty JR, Jiang G, Johnson ML, et al. (2017). Abstract 5281: Komen Tissue Bank donors: Genetically determined ethnicity and race. Cancer Res 77, 5281.
  74. Modjarrad K, Moorthy VS, Millett P, Gsell P-S, Roth C, and Kieny M-P. (2016). Developing global norms for sharing data and results during public health emergencies. PLoS Med 13, e1001935.
  75. Morales J, Welter D, Bowler EH, et al. (2018). A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol 19, 21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Msaidie S, Ducourneau A, Boetsch G, et al. (2011). Genetic diversity on the Comoros Islands shows early seafaring as major determinant of human biocultural evolution in the Western Indian Ocean. Eur J Hum Genet 19, 89–94 [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Mulder N, Adebamowo CA, Adebamowo SN, et al. (2017a). Genomic research data generation, analysis and sharing—Challenges in the African Setting. Data Sci J 16, 49
  78. Mulder NJ, Adebiyi E, Adebiyi M, et al. (2017b). Development of bioinformatics infrastructure for genomics research in H3Africa. Glob Heart 12, 91–98
  79. Mulder NJ, Adebiyi E, Alami R, et al. (2016). H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa. Genome Res 26, 271–277
  80. NCBI Resource Coordinators. (2017). Database Resources of the National Center for Biotechnology Information. Nucleic Acids Res 45, D12–D17
  81. Nchinda TC. (2002). Research capacity strengthening in the South. Soc Sci Med 54, 1699–1711
  82. Nedelkov D. (2008). Population proteomics: Investigation of protein diversity in human populations. Proteomics 8, 779–786
  83. Nembaware V, Johnston K, Diallo AA, et al. (2019). A framework for tiered informed consent for health genomic research in Africa. Nat Genet 51, 1566–1571
  84. Nembaware V, and Mulder N. (2019). The African Genomic Medicine Training Initiative (AGMT): Showcasing a Community and Framework Driven Genomic Medicine Training for Nurses in Africa. Front Genet 10, 1209.
  85. Nguyen TPH, Patrick CJ, Parry LJ, and Familari M. (2019). Using proteomics to advance the search for potential biomarkers for preeclampsia: A systematic review and meta-analysis. PLoS One 14, e0214671.
  86. Oberkampf H, Gojayev T, Zillner S, Zühlke D, Auer S, and Hammon M. (2015). The Semantic Web. Latest Advances and New Domains, Lecture Notes in Computer Science. Cham: Springer International Publishing, 652–667
  87. Pampel H, Vierkant P, Scholze F, et al. (2013). Making Research Data Repositories Visible: The re3data.org Registry. PLoS One 8, e78080
  88. Parkinson H, Sarkans U, Shojatalab M, et al. (2005). ArrayExpress—A public repository for microarray gene expression data at the EBI. Nucleic Acids Res 33, D553–D555
  89. Pereira V, Freire-Aradas A, Ballard D, et al. (2019). Development and validation of the EUROFORGEN NAME (North African and Middle Eastern) ancestry panel. Forensic Sci Int Genet 42, 260–267
  90. Perez-Riverol Y, Bai M, da Veiga Leprevost F, et al. (2017). Discovering and linking public ‘Omics’ datasets using the OMICs Discovery Index. Nat Biotechnol 35, 406–409
  91. Perez-Riverol Y, Csordas A, Bai J, et al. (2019). The PRIDE database and related tools and resources in 2019: Improving support for quantification data. Nucleic Acids Res 47, D442–D450
  92. Pickrell JK, Patterson N, Loh P-R, et al. (2014). Ancient west Eurasian ancestry in southern and eastern Africa. Proc Natl Acad Sci U S A 111, 2632–2637
  93. Rotimi CN, Bentley AR, Doumatey AP, Chen G, Shriner D, and Adeyemo A. (2017). The genomic landscape of African populations in health and disease. Hum Mol Genet 26, R225–R236
  94. Sánchez-Quinto F, Botigué LR, Civit S, et al. (2012). North African populations carry the signature of admixture with Neandertals. PLoS One 7, e47765.
  95. Sarewitz D. (2016). Saving Science. https://www.thenewatlantis.com/publications/saving-science Accessed February 18, 2021
  96. Sathar MA, and Dhai A. (2012). Laws, regulations and guidelines of developed countries, developing countries in Africa, and BRICS regions pertaining to the use of human biological material (HBM) in research. South Afr J Bioeth Law 5, 51–54
  97. Schlebusch CM, Malmström H, Günther T, et al. (2017). Southern African ancient genomes estimate modern human divergence to 350,000 to 260,000 years ago. Science 358, 652–655
  98. Shankar V, Gouda M, Moncivaiz J, et al. (2017). Differences in gut metabolites and microbial composition and functions between Egyptian and U.S. children are consistent with their diets. mSystems 2, e00169
  99. Sheehan M. (2011). Can broad consent be informed consent? Public Health Ethics 4, 226–235
  100. Sherman RM, Forman J, Antonescu V, et al. (2019). Assembly of a pan-genome from deep sequencing of 910 humans of African descent. Nat Genet 51, 30–35
  101. Shukla HG, Bawa PS, and Srinivasan S. (2019). hg19KIndel: Ethnicity normalized human reference genome. BMC Genomics 20, 459.
  102. Simpkin V, Namubiru-Mwaura E, Clarke L, and Mossialos E. (2019). Investing in health R&D: Where we are, what limits us, and how to make progress in Africa. BMJ Glob Health 4, e001047.
  103. Sims D, Sudbery I, Ilott NE, Heger A, and Ponting CP. (2014). Sequencing depth and coverage: Key considerations in genomic analyses. Nat Rev Genet 15, 121–132
  104. Sokolov-Mladenović S, Cvetanović S, and Mladenović I. (2016). R&D expenditure and economic growth: EU28 evidence for the period 2002–2012. Econ Res-Ekon Istraživanja 29, 1005–1020
  105. Springer S. (2020). Caring geographies: The COVID-19 interregnum and a return to mutual aid. Dial Hum Geography 10, 112–115
  106. Stokstad. (2019). Major U.K. genetics lab accused of misusing African DNA. Sci. AAAS. https://www.sciencemag.org/news/2019/10/major-uk-genetics-lab-accused-misusing-african-dna Accessed February 22, 2021
  107. Tastan Bishop Ö, Adebiyi EF, Alzohairy AM, et al. (2015). Bioinformatics education—Perspectives and challenges out of Africa. Brief Bioinform 16, 355–364
  108. There are not enough scientists in Africa. How can we turn this around? (2014). World Econ Forum. https://www.weforum.org/agenda/2017/05/scientists-are-the-key-to-africas-future Accessed October 27, 2020
  109. Thorn CF, Klein TE, and Altman RB. (2013). PharmGKB: The pharmacogenomics knowledge base. Methods Mol Biol Clifton NJ 1015, 311–320
  110. Tindana P, and de Vries J. (2016). Broad consent for genomic research and biobanking: perspectives from low- and middle-income countries. Annu Rev Genomics Hum Genet 17, 375–393
  111. Tishkoff SA, Reed FA, Friedlaender FR, et al. (2009). The genetic structure and history of Africans and African Americans. Science 324, 1035–1044
  112. Vaught J, Kelly A, and Hewitt R. (2009). A review of international biobanks and networks: Success factors and key benchmarks. Biopreserv Biobank 7, 143–150
  113. Veeramah KR, Wegmann D, Woerner A, et al. (2012). An early divergence of KhoeSan ancestors from those of other modern humans is supported by an ABC-based analysis of autosomal resequencing data. Mol Biol Evol 29, 617–630
  114. Verdu P, and Destro-Bisol G. (2012). African Pygmies, what's behind a name? Hum Biol 84, 1–10
  115. Vizcaíno JA, Deutsch EW, Wang R, et al. (2014). ProteomeXchange provides globally co-ordinated proteomics data submission and dissemination. Nat Biotechnol 32, 223–226
  116. Von Schomberg R, and Özdemir V. (2020). Full throttle: COVID-19 open science to build planetary public goods. OMICS 24, 509–511
  117. Wilkinson MD, Dumontier M, Aalbersberg IJJ, et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018.
  118. Yakubu A, Tindana P, Matimba A, et al. (2018). Model framework for governance of genomic research and biobanking in Africa—A content description. AAS Open Res 1, 13.
  119. Zimbudzi E. (2013). Stemming the impact of health professional brain drain from Africa: A systemic review of policy options. J Public Health Afr 4, e4.

Associated Data

Supplementary Materials

Supplemental data
Supp_TableS1.docx (17.2KB, docx)

Data Availability Statement

All the data used in this review are available from public databases referenced in the main text with their proper citation.


Articles from OMICS : a Journal of Integrative Biology are provided here courtesy of Mary Ann Liebert, Inc.
