Version Changes
Revised. Amendments from Version 1
In the revised manuscript, we have included a section to reflect existing bioinformatics initiatives in Africa. We have also included a discussion on the challenges and problems that can arise when considering ethical and legal issues in genomic collaborations. A new column in Table 1 has been added to include countries and institutions in Africa where specific NGS platforms are located. In Figure 2, we have inserted a plot showing the distribution of SRA data for Africa only from 2013 to date. In the supplementary Table 3 (https://doi.org/10.6084/m9.figshare.26826301.v1), we confirm that the table has pathogen data specific to Africa. The revised version also provides estimates that offer insights into the cost implication of building sustainable local bioinformatics capacity in Africa.
Abstract
The routine genomic surveillance of pathogens in diverse geographical settings and equitable data sharing are critical to inform effective infection control and therapeutic development. The coronavirus disease 2019 (COVID-19) pandemic highlighted the importance of routine genomic surveillance of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to detect emerging variants of concern. However, the majority of high-income countries sequenced >0.5% of their COVID-19 cases, unlike low- and middle-income countries. By the end of 2022, many countries around the world had managed to establish capacity for pathogen genomic surveillance. Notably, Beta and Omicron; 2 of the 5 current SARS-CoV-2 variants of concern were first discovered in Africa through an aggressive sequencing campaign led by African scientists. To sustain such infrastructure and expertise beyond this pandemic, other endemic pathogens should leverage this investment. Therefore, countries are establishing multi-pathogen genomic surveillance strategies. Here we provide a catalog of the current landscape of sequenced and publicly shared pathogens in different countries in Africa. Drawing upon our collective knowledge and expertise, we review the ever-evolving challenges and propose innovative recommendations.
Keywords: Africa, Genome sequencing, Pathogen Genomic data, Public health, Data sharing
Introduction
The global genomic surveillance strategy for pathogens with pandemic and epidemic potential, 2022–2032 was established by the World Health Organisation (WHO) to help countries develop their national genomic surveillance strategy for priority pathogens. 1 The regular collection and sharing of such data are also fundamental for monitoring and effectively responding to outbreaks and for tracking antimicrobial resistance (AMR) to inform decision-making. Furthermore, WHO emphasizes that sharing of such data is critical for preventing, detecting, and timely responding to epidemics and pandemics at all levels, and is in the interest of all Member States. 2
Africa remains increasingly prone to infectious disease threats, recording at least 140 disease outbreaks annually 2 , 3 besides the high burden of drug resistance that has been estimated to have caused at least 1.05 million deaths in 2019. 4 This calls for an urgent need to build and sustain capacity for near real-time outbreak detection, characterization, and routine pathogen genomic surveillance. Coronavirus disease (COVID-19) showed that most of the public health institutions in Africa if adequately capacitated and facilitated with high-throughput sequencing platforms, reagents, and data analytics infrastructure, can perform rapid pathogen detection, characterization, surveillance, and equitable data sharing. As of January 4, 2024, over 170,000 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from 53 out of 55 countries in Africa have been sequenced and shared via GISAID (the Global Initiative on Sharing All Influenza Data). 5 An important lesson to learn is that we need to facilitate national laboratories and local scientists to undertake routine pathogen sequencing activities within their countries while collaborating with the rest of the globe. The continent still has several pathogens of epidemic and pandemic potential that require near real-time tracking and in response, the Africa Centres for Disease Control and Prevention (Africa CDC) generated a list of priority pathogens for effective emergency preparedness and response. 6
Pathogen sequencing in Africa
Next-generation sequencing (NGS) is improving how disease outbreaks are accurately detected and investigated at an unparalleled magnitude. Most public health and research laboratories in many countries including Africa have at least one of the common NGS platforms (see Table 1). Different variables affect the choice of NGS platforms to be acquired by an institution such as the cost of the instrument, availability of after-sale service, access and cost of reagents, sequencing run time, throughput, accuracy, strategies for using NGS (targeted/panel, whole-exome, and whole-genome), depth of sequencing coverage, length of sequencing reads, single-end versus paired-end sequencing, genome size of an organism to be sequenced.
Table 1. Key specifications of common NGS throughput platforms in Africa.
Throughput sequencing categories/Data output
range (Gb: gigabyte and Tb: terabyte) |
Common equipment/Maximum Read Length (base pairs)/Maximum Raw data Output (Gb: gigabyte and Tb: terabyte) | Key applications/sequencing strategies | Selected African countries/institutions with specific NGS equipment |
---|---|---|---|
Low throughput
(1 – 100 Gb) |
iSeq 100-2×150 bp - 1.2 Gb | Targeted panel sequencing
Pathogen sequencing (microbes and viruses) 16S metagenomic sequencing |
MinION (Institut Pasteur Du Maroc, Morocco; National Public Health Reference Laboratory, Somalia; Public Health Laboratory, Chad; Center for Research on Meningitis and Schistosomiasis, CERMES, Naimey, Niger; National Reference Laboratory, Lesotho; Pasteur Institute of Bangui, Central African Republic; University of Science of Technical and Technology De Bamako, Mali; Laboratoire National de Santé Publique de Brazzaville, Republic of Congo; Laboratoire National de Biologie Clinique et de Santé Publique, Central African Republic; Institut Pasteur, Cote d'Ivoire; Jean Piaget Molecular Biology Laboratory, Guinea-Bissau; National Public Health Institute, Cape Verde; Laboratoire d'Analyses Medicales Malagasy, Madagascar, Institut National de Santé Publique, INSP, Burkina Faso; Viral Hemorrhagic Fevers Laboratory of Benin; National Reference Laboratory, São Tomé and Príncipe; Institut National de Santé Publique, INSP, Burkina Faso; Institut Pasteur Dakar, Senegal; Public Health Institute of Malawi, PHIM, Malawi; CPHRL, Sierra Leone; National Institute of Public Health, INSP, Burundi; UVRI, Uganda), iSeq 100 (Laboratoire National de Santé Publique de Brazzaville, Republic of Congo; Centre de recherche et de fomation en infectiologie de Guinee, CERFIG, Guinea; Laboratoire National de Référence, Togo; INRB, DRC; Institut Pasteur Dakar, Senegal; Public Health Laboratory, Seychelles; UVRI, Uganda), MiniSeq (Pasteur Institute of Bangui, Central African Republic; Central Health Laboratory, Mauritius; Rwanda Biomedical Centre, RBC, Rwanda; Public Health Institute of Malawi, PHIM, Malawi; CPHRL, Sierra Leone; National Institute of Public Health, INSP, Burundi), MiSeq (National Reference Laboratory, Lesotho; Jean Piaget Molecular Biology Laboratory, Guinea-Bissau; Instituto Naçional de Saúde Pública, INSP, Angola; NCDC, Nigeria; CPHL & UVRI, Uganda; EPHI, Ethiopia; CPHL, Egypt; INRB, DRC; NICD, South Africa; National Public Health Laboratory, Tanzania; ILRI & KEMRI, Kenya; NPHL & National University Research Institute, Khartoum, Sudan; Institut Pasteur d'Algérie, Algeria; Institut Pasteur Du Maroc, Morocco; Instituto Nacional de Saúde, INS, Mozambique; Noguchi Memorial Institute for Medical Research, NMIMR, Ghana; Institut Pasteur, Madagascar; Le Laboratoire National de Santé Publique, LNSP, Cameroon; Zambia National Public Health Institute, ZNPHI, Zambia; National Microbiological Reference Laboratory, NMRL, Zimbabwe; Institut Pasteur de Tunis, Tunisia; National Public Health Reference Laboratory, Liberia; NIP & UNAM, Namibia; MRC Unit, The Gambia; Centre Interdisciplinaire de Recherches Médicales de Franceville, CIRMF, Gabon; Botswana Harvard AIDS Institute Partnership, Botswana; Institut National de Santé Publique, INSP, Burkina Faso; National Institute of Public Health, INSP, Burundi; Institut Pasteur Dakar, Senegal) |
MiniSeq - 2×150 bp - 7.5 Gb | |||
DNBSEQ-E25-2×150 bp - 7.5 Gb | |||
Ion Torrent: Proton/PGM - 400 bp -10 Gb | |||
PacBio RS II - 60 kb - 10 Gb | |||
MiSeq - 2×300 bp - 15 Gb | |||
MinION - 100 kb - 50 Gb | |||
DNBSEQ-G99-2×300 bp - 96 Gb | |||
Medium throughput
(100 – 1000 Gb) |
NextSeq - 2×150 bp - 120 Gb | Small whole-genome sequencing
Targeted panel sequencing (amplicon and gene panel) Gene expression profiling with mRNA-Seq miRNA and Small RNA analysis |
GridION (Laboratoire National de Santé Publique de Brazzaville, Republic of Congo; KEMRI-Wellcome Trust Research Programme, Kenya; INRB, DRC; Rwanda Biomedical Centre, RBC, Rwanda; Institut Pasteur Dakar, Senegal), NextSeq (Public Health Institute of Malawi, PHIM, Malawi, National Public Health Laboratory, Eswatini; Institut Pasteur Du Maroc, Morocco; Noguchi Memorial Institute for Medical Research, NMIMR, Ghana; ACEGID, Nigeria; INRB, DRC; National Institute of Public Health, INSP, Burundi; EPHI, Ethiopia; Institut Pasteur Dakar, Senegal), PacBio Sequel II (ACEGID, Nigeria) |
DNBSEQ-G50RS - 2×150 bp - 150 Gb | |||
GridION - 100 kb - 250 Gb | |||
PacBio Sequel II/IIe - 60 kb - 500 Gb | |||
High throughput
(1000 – 10 000 Gb) |
DNBSEQ-G400RS - 2 × 300 bp - 1.4 Tb | Whole-genome sequencing
Whole-exome sequencing Transcriptome sequencing Shotgun metagenomics |
|
HiSeq - 2 × 150 bp - 1.5 Tb | |||
DNBSEQ-T7-2 × 150 bp - 7 Tb | |||
Ultra-high throughput
(>10 Tb) |
PromethION - 270 kb - 13.3 Tb | Ultra-high-depth whole-genome sequencing of large genomes (Human, plant and animal)
Whole-exome sequencing Whole-transcriptome sequencing Methylation sequencing Shotgun metagenomics |
NovaSeq X (ACEGID, Nigeria), NovaSeq 6000 (CERI, South Africa; Institut Pasteur Dakar, Senegal) |
NovaSeq - 2×250 bp - 16 Tb | |||
DNBSEQ-T20×2RS - 2×150 bp - 72 Tb | |||
DNBSEQ-T10×4RS - 2×150 bp - 76.8 Tb |
COVID-19 flung pathogen genome sequencing to the vanguard of accurate disease outbreak detection, characterization, surveillance of emerging variants, and real-time data sharing. But in the first two years of the pandemic, 78% of high-income settings sequenced >0.5% of their diagnosed COVID-19 cases, compared to 42% of low- and middle-income countries. 7 Following this inequality, high-income countries were encouraged to support low- and middle-income countries to improve their local sequencing capacity 7 to ensure an effective global response to the pandemic. Furthermore, a study revealed that sequencing 0.5% of the cases with a turnaround time of less than 21 days could provide a benchmark for SARS-CoV-2 routine genomic surveillance to detect emerging variants. 7 In Africa, our findings suggest that many countries have built local capacity for pathogen genomics (see Figure 1). This was possible in part because, in the early phase of the pandemic, COVID-19-related travel restrictions forced countries to consider building local capacity for molecular testing and genomic surveillance of SARS-CoV-2 variants other than relying on shipping samples to other countries. By January 4, 2024, the continent had sequenced at least 1.3% of the reported cases (see Table 2) and ranked fourth.
Table 2. Shows proportions of SARS-CoV-2 cases sequenced by each continent.
Continent | COVID-19 cases | Sequenced SARS-CoV-2 genomes in GISAID | Percentage of COVID cases sequenced & shared |
---|---|---|---|
North America | 130,243,977 | 5,781,894 | 4.40% |
Europe | 252,605,493 | 8,027,917 | 3.20% |
Oceania | 14,722,723 | 283,139 | 1.90% |
Africa | 12,857,043 | 171,881 | 1.30% |
Asia | 221,152,538 | 1,701,264 | 0.80% |
South America | 69,454,296 | 398,332 | 0.60% |
Generally, the reduced costs of genome sequencing have led to the generation of large volumes of genomic sequence data globally as seen from the shared data in the Sequence Read Archive (SRA) for NGS data at the National Center for Biotechnology Information (NCBI) (see Figure 2). The SRA was established as part of the International Nucleotide Sequence Database Collaboration (INSDC) in 2009. 9
It should be noted that more genomic data is being generated than the capacity to analyze and meaningfully interpret it. 10 , 11 This data if well utilized holds clues to improve health. This makes FAIRness (findability, accessibility, interoperability, and reusability) data sharing 12 a critical component in this ecosystem so that scientists in well-resourced settings can collaboratively and equitably derive valuable insights from this data.
Africa's urgent need to develop a robust network of skilled bioinformatics researchers
Africa, with its rich biodiversity, stands to gain most from advancements in biomedical research and healthcare. However, harnessing this potential requires a critical mass of researchers skilled in genomics and bioinformatics. These fields combine biology, computer science, and information technology to comprehensively analyze and interpret vast biological data. Building a critical mass of this exertise in Africa is imperative for leveraging the continent's biodiversity, addressing health challenges, and promoting sustainable development. Existing initiatives like AI4D Africa ( https://africa.ai4d.ai/), H3AbioNet ( https://www.h3abionet.org/), EANBiT ( https://eanbit.icipe.org/), African BioGenome Project ( https://africanbiogenome.org/) and several university training programs provide a strong foundation, but there is a need for intensified broad training in Artificial intelligence (AI), High performance computing (HPC), NGS sequencing, and bioinformatics analysis. By adopting collaborative strategies, increasing funding, and leveraging technology, the continent can cultivate a skilled workforce in these areas, driving scientific innovation and improving public health outcomes.
Bioinformatics is pivotal in modern science, underpinning advancements in genomics, and personalized medicine. For Africa, developing bioinformatics capacity is crucial due to several reasons such as; (i) Africa faces a high burden of endemic infectious diseases like malaria, HIV/AIDS, and tuberculosis, alongside raising cases of non-communicable diseases, (ii) bioinformatics enables the analysis of genomic data to understand pathogen evolution, drug resistance mechanisms, and host-pathogen interactions, thereby facilitating the development of effective treatments and vaccines, and (iii) Africa's great genetic diversity offers a unique resource broadly for understanding human evolution and disease susceptibility. Therefore, bioinformatics is essential for analyzing Africa’s complex genomic data, uncovering genetic markers, and tailoring healthcare solutions to diverse populations.
We propose the following strategies for enhancing training;
-
1.
Establish collaborative training programs through partnerships between African institutions and international institutions to deliver joint training programs. These programs can offer access to cutting-edge technologies, expert mentorship, and global research networks.
-
2.
Utilize online platforms to deliver bioinformatics courses and workshops. This approach can reach a broader audience and provide flexible learning opportunities.
-
3.
Avail funding and scholarships opportunities for bioinformatics education and research. Scholarships and grants can support students and researchers, enabling them to pursue advanced studies and attend international conferences and workshops.
-
4.
Foster collaborations between academia and industry to provide practical training, innovations and research opportunities. Industry partners can offer internships, project collaborations, and access to proprietary technologies.
Data sharing of pathogen genomic data
COVID-19 propelled two important components in effective infectious disease response; (i) near-real-time pathogen sequencing and (ii) unprecedented collaborative scale in pathogen data sharing with over 16.4 million SARS-CoV-2 genomes already publicly shared via GISAID from all over the world by January 4, 2024.
Since most of the SARS-CoV-2 sequence data was shared via GISAID, we searched the SRA/NCBI database ( https://www.ncbi.nlm.nih.gov/sra/) up until January 4, 2024, for publicly accessible sequence data for other pathogens from different countries in Africa. The search terms used were, “Pathogen [All Fields] AND Country [All Fields]”. The retrieved data included different pathogens sequenced from each country filtered by microbial taxon, BioProject, genomic library layout, NGS platform, volume of data, and submitting/sequencing institution (see Extended data: Table 3 13 ).
In Africa before the COVID-19 pandemic, sequencing was largely a sophisticated endeavor undertaken by individuals and institutions based in high-income settings in collaboration with African institutions (see Extended data: Table 3 13 ). The local institutions largely participated in sample collection and processing for shipping to institutions in well-resourced countries. This implied that the objectives of these sequencing activities were also addressing the research aims of institutions in high-income settings. The pandemic brought about a paradigm shift that allowed local public health institutions to mobilize resources to embrace the genomic revolution in public health. This has been largely achieved through the Africa Pathogen Genomics Initiative (Africa PGI), 3 which has equipped, trained, and offered technical support for African Union (AU) Member States to build sustainable genomics and bioinformatics capacity. Furthermore, the AU Commission and Africa CDC have called on governments, multi-lateral organizations, philanthropies, the private sector, and civil society organizations to support the full implementation of Africa’s New Public Health Order to drive global health security. 14 In addition to this, it is imperative to promote equitable data sharing for pathogens of epidemic and pandemic potential in Africa while discouraging post-outbreak sanctions on countries that share data.
Major emerging challenges
The price of many of the commonly used sequencing equipment widely distributed in Africa varies significantly. For instance, the MinION Mk1B costs $1,999 with no annual fee, while the MinION Mk1C (which includes a sequencer, display, and GPU/CPU) is priced at $4,900 plus a $300 annual fee. The GridION is priced at $49,955 with an additional yearly cost of $12,500. Each MinION flow cell yields between 5-20 Gb per run and costs between $475 to $900, depending on the quantity purchased. 15 , 16 Illumina platforms have a broader price range, starting from $19,900 for the iSeq 100, $49,500 for the MiniSeq, and $128,000 for the MiSeq. Higher-end models include the NextSeq 1000 at $210,000 and the NextSeq 2000 at $335,000.
A recent survey indicates that setting up a bioinformatics computing workstation in an African laboratory, particularly one involved in routine pathogen sequencing and bioinformatics analyses, is estimated to cost a minimum of $7,000. This estimate includes the cost of a desktop computer with high-end specifications such as an Intel Core i9 processor with 16 cores at 5.0 GHz, 16 GB DDR4 RAM, a 12 GB GPU, dual 4 TB M.2 SSDs, and an additional 8 TB internal SSD for storage. Additional expenses, including electricity and internet, must also be considered. The average annual salary for a skilled bioinformatician in a public health laboratory in Africa is approximately $9,600.
The cost of bioinformatics analysis for a pathogen genome on a cloud platform can vary depending on several factors, including the type of analysis (e.g., pathogen identification, virulence, and antimicrobial resistance profiling), the sequencing strategy used (such as metagenomics or amplicon-based sequencing), and the size of the pathogen genome. Excluding costs related to internet connectivity and data storage, the cost of analyzing a pathogen genome can range from $0.5 to $10 per sample, depending on the bioinformatics cloud platform service provider. This estimate typically covers essential tasks like sequence quality assessment, genome assembly, species or lineage assignment, genome annotation, and variant calling. The runtime and cost can vary depending on the software used, as different tools utilize different algorithms. More complex analyses, such as comparative genomics and the determination of antimicrobial resistance and virulence factors, may increase costs due to the need for additional public pathogen sequences and reference databases. Additionally, the choice of bioinformatics tools is influenced by factors such as the sequencing approach used, including metagenomic next-generation sequencing, whole-genome sequencing, targeted sequencing (amplicon sequencing), multiplex PCR-based NGS, and whether short or long-read sequencing technologies are employed.
The ever-growing volume of genomic sequence data coming out of laboratories in Africa has amplified interest in genomics and bioinformatics applications in public health. 17 However, majority of the workforce has limited experience in these fields largely due to the limited number of training institutions that can offer the training. It should be noted that meaningful interpretation of genomic data generated from disease outbreaks requires adequate training and collaborative efforts (within and outside the continent). Sequencing facilities have varying levels of expertise in sequencing different organisms such as viruses, bacteria, fungi, and parasites as well as different sequencing technologies (see Table 1).
There is an urgent need for public health institutions in Africa to recruit and retain competent personnel dedicated to performing bioinformatics analyses that can support evidence-informed public health decision-making. In some cases, it has been complicated to establish such job positions within institutional structures. As such resorting to training the available personnel in this field is the most reasonable option however, this training requires long periods of learning, mentorship, and hands-on practice. 18
Limited local data analytics infrastructure is another critical emerging challenge. With many institutions having access to local NGS sequencing platforms that generate large volumes of raw sequence data that require memory-intensive analytic compute and storage, there is an ever-growing need to provide access to local high-performance computing (HPC) facilities. Setting up HPC facilities requires a lot of money and creates new challenges such as increasing electricity consumption and a need to recruit a competent HPC systems administrator. Furthermore, utilization of cloud-based resources for bioinformatics analyses also requires access to stable fast internet that may be unavailable in some countries.
The need to invest in biobanking and biorepository services in Africa. These facilities archive and distribute well-characterized biospecimens for research to support the development of disease diagnostics and therapeutics. 19 Genomics activities rely on access to samples in these facilities. Many public health laboratories faced the challenge of storing thousands of samples during 2014-2016 Ebola outbreak in West Africa and the COVID-19 outbreak and realized a need to expand their biobanking capacities.
Challenges of timely access to NGS reagents and supply chain. Travel restrictions during COVID-19 highlighted the need to improve the supply chain of NGS reagents in countries that lacked or had limited access to channel partners. Routine genomic surveillance requires reliable access to affordable NGS reagents and after-sales service. It is estimated that the cost per pathogen sequence ranges from US $20–$200, with poor countries paying up to 10-fold more than high-income countries. 20
Pathogen genomics in public health in Africa requires continuous engagement among different stakeholders including NGS platform manufacturers/distributors, program/project funders, sequencing facilities, research/academic institutions, the national ministries of health, and policymakers. Each of these has a unique contribution, therefore planning and supporting multi-pathogen activities is a multi-sectoral endeavor that requires continued engagement to ensure an impactful genomic ecosystem. For example, the prohibitive cost of using onboard NGS instrument data analysis software licenses requires engagement with platform manufacturers. Cloud-based computing is an enthralling emerging infrastructure that offers an alternative infrastructure for solving traditional bioinformatics challenges in NGS data analytics and visualization. Cloud computing service models are classified into three types: Platform as a Service (PaaS), Software as a Service (SaaS), and Infrastructure as a Service (IaaS). 21 The platform is relatively easier to use than the Linux command-line-based environment however it requires a pay-as-you-go subscription that charges based on usage.
Recent years have revealed an emerging dimension of the impact of climate change on the evolution of pathogens in Africa. 22 , 23 For example, climate change and extreme weather conditions have triggered increased cholera outbreaks in Africa. 24 Research on modeling climate change and pathogen genomic data has failed to be socially inclusive, geographically balanced, and broad in terms of the disease systems studied, limiting our capacities to better understand the actual contribution of climate change on disease outbreaks. 22 Therefore, integrating climate change data in pathogen genomics is important, and leveraging this expertise through collaboration with institutions that have established capacity.
Challenges of integrating ethics and law in international collaborations
International collaboration is essential in genomic activities and global health security, fostering innovation, access to sequencing technologies, sharing knowledge, and addressing complex issues that transcend borders such as during disease outbreaks. However, such collaborations often face significant challenges, particularly when it comes to integrating ethics and law across diverse national legal systems. Ethical guidelines in Africa need adaptation to the evolving data sharing policy landscape, which increasingly emphasizes openness, storage, sharing, and fair secondary use of data. Some countries have guidelines that do not adequately address the ethical challenges these new orientations present, failing to provide accurate guidance to ethics committees and researchers. 25
Furthermore, pathogen genomics involves obtaining samples from biobanks. Several countries have established biobanks and biorepositories to support genomic activities in Africa. 19 , 25 – 28 Access to these facilities requires balancing data privacy, trust, and a need to promote scientific progress. Biobanking services involving large prospective cohorts have addressed these issues through extensive communication among collaborators, researchers, community engagement and experts. Funders' main concerns include privacy, biosamples utilization, and information access. Emphasis is often on privacy and funder’s control over biosample use. To avoid delays, it's crucial to address ethical and legal issues proactively. 29
Integrating ethics and law in international collaborations presents significant challenges due to differences in ethical standards, legal frameworks, cultural values, and enforcement mechanisms. Harmonizing data standards, establishing clear agreements, joint review mechanisms, regulatory capacity building, and fostering cultural competence are essential to navigating these complexities. Legal systems across Africa are diverse, reflecting varied political, economic, cultural, and social structures. This diversity complicates regional and international collaborations, particularly in areas like public health genomics, intellectual property, and data protection. Conflicting laws between African countries require meticulous legal analysis and negotiation. Compliance with multiple legal frameworks can be resource-intensive and complex, with liability in legal disputes further complicated by jurisdictional differences. Regulatory approval for multi-country genomics research in Africa is often cumbersome due to differing requirements and timelines, leading to project delays and increased costs. However, mechanisms established by the Africa Centres for Disease Control and Prevention (Africa CDC) have begun to address these challenges by supporting member states in strengthening their health institutions and navigating the legal complexities of disease threats.
Beyond the post-COVID-19 genomics investment in Africa and way forward
COVID-19 allowed most of the countries in Africa to build local capacity for pathogen genome sequencing. It is important to keep the momentum by leveraging this investment to fight the continent’s endemic health challenges including Malaria, Cholera, HIV/AIDS, Tuberculosis, Vaccine-preventable diseases, Viral hemorrhagic fevers (VHFs) and the growing threat of antimicrobial resistance. Therefore, as recommended by the WHO, 30 countries are establishing multi-pathogen genomic surveillance strategies that will ensure they can rapidly detect and precisely characterize pathogens, but this also requires stable funding from national governments and other funding agencies such as the Pandemic Fund.
The global economic cost of the COVID-19 pandemic was estimated at USD $16 trillion. 31 Much as countries now appreciate the benefits of routine pathogen surveillance, many in Africa require technical and financial support. The devastating human, economic, and social cost of COVID-19 highlighted the urgent need for well-coordinated action to establish stronger health systems and mobilize additional resources for pandemic prevention, preparedness, response, and serve as a platform for advocacy. The WHO coordinated efforts through the newly established Pandemic Fund, which is a collaborative partnership among donor governments, co-investor countries, foundations, civil society organizations, and international agencies providing a dedicated stream of long-term funding (see Figure 3) for critical pandemic prevention, preparedness, and response (PPR) for future pandemics. The Pandemic Fund is set to finance critical investments to strengthen pandemic prevention, preparedness, and response capacities at national, regional, and global levels, with a focus on low- and middle-income countries. 32
The Pandemic Fund is set to allocate financing where investments are most urgently needed to bolster PPR for future pandemics, addressing key capacity gaps at national, regional, and global levels. In 2023, projects that were financed prioritized strengthening comprehensive outbreak surveillance and early warning, laboratory systems, and human resources/public health workforce capacity. Overall, the Fund’s aims include: (i) supporting surveillance systems that enable timely tracking and reporting of outbreaks, (ii) creating faster and more accurate data sharing and (iii) building better laboratories so that they better assist partner countries to rapidly detect and effectively respond to infectious disease outbreaks.
Beyond analysis, reporting, and data sharing of pathogen genomic data, the continent also needs to utilize the genomics resource to improve local research and development (R&D) capacity as well as support local therapeutics and affordable diagnostic manufacturing. The Partnerships for African Vaccine Manufacturing (PAVM) has a goal to enable the African vaccine manufacturing industry to develop, produce, and supply over 60 percent of the total vaccine doses required to fight endemic diseases on the continent by 2040. 33 This offers an exceptional opportunity for enhancing local vaccine training. Rwanda is a pioneering country, hosting the first mRNA vaccine manufacturing facility in Africa, and soon, Senegal, South Africa, and Kenya will be the other African countries to have mRNA vaccine manufacturing facilities on the continent. 34
Conclusions
Timely and accurate analysis of pathogen genomes is crucial for monitoring the molecular evolution and dissemination of pathogens, improving diagnostic molecular tests and vaccines, and guiding efficient public health interventions. Therefore, there is an urgent need to sustain gains in local pathogen genomics in Africa while strengthening data analytics infrastructure and an equitable data sharing ecosystem.
Acknowledgements
G.M. acknowledges the EDCTP2 career development grant which supports the Pathogen detection in HIV-infected children and adolescents with non-malarial febrile illnesses using the metagenomic next-generation sequencing approach in Uganda (PHICAMS) project which is part of the EDCTP2 programme from the European Union (Grant number TMA2020CDF-3159). The views and opinions of the author expressed herein do not necessarily state or reflect those of EDCTP.
We are equally grateful for the support of the NIH Common Fund, through the OD/Office of Strategic Coordination (OSC) and the Fogarty International Center (FIC), NIH award number U2RTW010672. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the supporting offices.
Funding Statement
European and Developing Countries Clinical Trials Partnership -TMA2020CDF-3159 Fogarty International Center- U2RTW010672 Public Health Alliance for Genomic Epidemiology - (PHA4GE)INV-038071
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
[version 2; peer review: 2 approved
Data availability
Underlying data
No data are associated with this article.
Extended data
Zenodo: Extended data for ‘The rise of pathogen genomics in Africa’, https://doi.org/10.6084/m9.figshare.26826301.v1. 13
This project contains following extended data:
Table 3. The different pathogens that have been sequenced and publicly shared via SRA:NCBI from different countries in Africa (at least 12 Terabytes of sequence data).xlsx
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0)
References
- 1.WHO guiding principles for pathogen genome data sharing. Reference Source
- 2. Aborode AT, et al. : Impact of poor disease surveillance system on COVID-19 response in africa: Time to rethink and rebuilt. Clin Epidemiol Glob Health. 2021;12:100841. 10.1016/j.cegh.2021.100841 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Makoni M: Africa’s $100-million Pathogen Genomics Initiative. Lancet Microbe. 2020;1:e318. 10.1016/S2666-5247(20)30206-8 [DOI] [PubMed] [Google Scholar]
- 4. Sartorius B, et al. : The burden of bacterial antimicrobial resistance in the WHO African region in 2019: a cross-country systematic analysis. Lancet Glob. Health. 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Shu Y, McCauley J: GISAID: Global initiative on sharing all influenza data – from vision to reality. Eurosurveillance. 2017;22. 10.2807/1560-7917.ES.2017.22.13.30494 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Risk Ranking and Prioritization of Epidemic-Prone Diseases. Africa CDC. . Reference Source [Google Scholar]
- 7. Brito AF, et al. : Global disparities in SARS-CoV-2 genomic surveillance. Nat Commun. 2022;13:7003. 10.1038/s41467-022-33713-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Tegally H, et al. : The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance. Science. 2022;378:eabq5358. 10.1126/science.abq5358 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Katz K, et al. : The Sequence Read Archive: a decade more of explosive growth. Nucleic Acids Res. 2022;50:D387–D390. 10.1093/nar/gkab1053 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Mboowa G, Sserwadda I, Aruhomukama D: Genomics and bioinformatics capacity in Africa: no continent is left behind. Genome. 2021;64:503–513. 10.1139/gen-2020-0013 [DOI] [PubMed] [Google Scholar]
- 11. Mulder N, et al. : Genomic Research Data Generation, Analysis and Sharing – Challenges in the African Setting. Data Sci J. 2017;16:49. 10.5334/dsj-2017-049 [DOI] [Google Scholar]
- 12. Wilkinson MD, et al. : The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. 10.1038/sdata.2016.18 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Various pathogens have been sequenced by institutions within and outside Africa, with the data accessible through the Sequence Read Archive (SRA) of the NCBI from multiple African countries. figshare.2024. 10.6084/m9.figshare.26826301.v1 [DOI]
- 14. The New Public Health Order: Africa’s health security Agenda. Africa CDC. . Reference Source [Google Scholar]
- 15. Nanopore store : Devices. https://store.nanoporetech.com/devices.html
- 16. Nanopore store : Price List. https://store.nanoporetech.com/priceList.html
- 17. Onywera H, et al. : Boosting pathogen genomics and bioinformatics workforce in Africa. Lancet Infect Dis. 2023;24(2):e106–e112. 10.1016/S1473-3099(23)00394-8 [DOI] [PubMed] [Google Scholar]
- 18. Jjingo D, et al. : Bioinformatics mentorship in a resource limited setting. Brief Bioinform. 2022;23:bbab399. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Croxton T, et al. : Building blocks for better biorepositories in Africa. Genome Med. 2023;15:92. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Pronyk PM, et al. : Advancing pathogen genomics in resource-limited settings. Cell Genomics. 2023;3:100443. 10.1016/j.xgen.2023.100443 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Bani Baker Q, et al. : Comprehensive comparison of cloud-based NGS data analysis and alignment tools. Inform Med Unlocked. 2020;18:100296. 10.1016/j.imu.2020.100296 [DOI] [Google Scholar]
- 22. Van de Vuurst P, Escobar LE: Climate change and infectious disease: a review of evidence and research trends. Infect Dis. Poverty. 2023;12:1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Mora C, et al. : Over half of known human pathogenic diseases can be aggravated by climate change. Nat Clim Chang. 2022;12:869–875. 10.1038/s41558-022-01426-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Nakkazi E: Cholera outbreak in Africa. Lancet Infect Dis. 2023;23:411. 10.1016/S1473-3099(23)00149-4 [DOI] [PubMed] [Google Scholar]
- 25. Vries J, et al. : Regulation of genomic and biobanking research in Africa: a content analysis of ethics guidelines, policies and procedures from 22 African countries. BMC Med Ethics. 2017;18:8. 10.1186/s12910-016-0165-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Establishment of a Biobanking Network as a Sustainable Mechanism to Accelerate Development and Evaluation of Diagnostic Tests in Africa . Africa CDC. https://africacdc.org/download/establishment-of-a-biobanking-network-as-a-sustainable-mechanism-to-accelerate-development-and-evaluation-of-diagnostic-tests-in-africa/
- 27. Abimiku A, et al. : H3Africa Biorepository Program: Supporting Genomics Research on African Populations by Sharing High-Quality Biospecimens. Biopreserv Biobank. 2017;15:99–102. 10.1089/bio.2017.0005 [DOI] [Google Scholar]
- 28. Staunton C, Vries J: The governance of genomic biobank research in Africa: reframing the regulatory tilt. J Law Biosci. 2020;7:lsz018. 10.1093/jlb/lsz018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Klingstrom T, Bongcam-Rudloff E, Reichel J: Legal & ethical compliance when sharing biospecimen. Brief Funct Genomics. 2018;17:1–7. 10.1093/bfgp/elx008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Considerations for developing a national genomic surveillance strategy or action plan for pathogens with pandemic and epidemic potential. Reference Source
- 31. Cutler DM, Summers LH: The COVID-19 Pandemic and the $16 Trillion Virus. JAMA. 2020;324:1495–1496. 10.1001/jama.2020.19759 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. The Pandemic Fund Announces First Round of Funding to Help Countries Build Resilience to Future Pandemics. World Bank. . Reference Source [Google Scholar]
- 33. A Breakthrough for the African Vaccine Manufacturing. Africa CDC. . Reference Source [Google Scholar]
- 34. Saied AA: Building Africa’s first mRNA vaccine facility. Lancet. 2023;402:287–288. 10.1016/S0140-6736(23)01119-4 [DOI] [PubMed] [Google Scholar]