Science. 2022 Sep 15:eabq5358. doi: 10.1126/science.abq5358

The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance

Houriiyah Tegally 1,2,, James E San 1,2, Matthew Cotten 3,4, Monika Moir 1, Bryan Tegomoh 5,6, Gerald Mboowa 7, Darren P Martin 8,9, Cheryl Baxter 1,10, Arnold W Lambisia 11, Amadou Diallo 12, Daniel G Amoako 13,14, Moussa M Diagne 12, Abay Sisay 15,16, Abdel-Rahman N Zekri 17, Abdou Salam Gueye 18, Abdoul K Sangare 19, Abdoul-Salam Ouedraogo 20, Abdourahmane Sow 21, Abdualmoniem O Musa 22,23,24, Abdul K Sesay 25, Abe G Abias 26, Adam I Elzagheid 27, Adamou Lagare 28, Adedotun-Sulaiman Kemi 29, Aden Elmi Abar 30, Adeniji A Johnson 31,32, Adeola Fowotade 33,34, Adeyemi O Oluwapelumi 35,36, Adrienne A Amuri 37,38, Agnes Juru 39, Ahmed Kandeil 40, Ahmed Mostafa 40, Ahmed Rebai 41, Ahmed Sayed 42, Akano Kazeem 43,44, Aladje Balde 45,46, Alan Christoffels 7,47, Alexander J Trotter 48, Allan Campbell 49, Alpha K Keita 50,51, Amadou Kone 52, Amal Bouzid 41,53, Amal Souissi 41, Ambrose Agweyu 11, Amel Naguib 54, Ana V Gutierrez 48, Anatole Nkeshimana 55, Andrew J Page 48, Anges Yadouleton 56, Anika Vinze 57, Anise N Happi 43, Anissa Chouikha 58,59, Arash Iranzadeh 8,9, Arisha Maharaj 1, Armel L Batchi-Bouyou 60,61, Arshad Ismail 13, Augustina A Sylverken 62,63, Augustine Goba 64,65, Ayoade Femi 43,44, Ayotunde E Sijuwola 43, Baba Marycelin 66,67, Babatunde L Salako 29,32, Bamidele S Oderinde 66, Bankole Bolajoko 43, Bassirou Diarra 52, Belinda L Herring 18, Benjamin Tsofa 11, Bernard Lekana-Douki 68,69, Bernard Mvula 70, Berthe-Marie Njanpop-Lafourcade 18, Blessing T Marondera 71, Bouh Abdi Khaireh 72,73, Bourema Kouriba 19, Bright Adu 74, Brigitte Pool 75, Bronwyn McInnis 17, Cara Brook 76,77, Carolyn Williamson 9,10,78, Cassien Nduwimana 55, Catherine Anscombe 79,80, Catherine B Pratt 81, Cathrine Scheepers 13,82, Chantal G Akoua-Koffi 83,84, Charles N Agoti 11,85, Chastel M Mapanguy 60,86, Cheikh Loucoubar 12, Chika K Onwuamah 87, Chikwe Ihekweazu 88, Christian N Malaka 89, Christophe Peyrefitte 12, Chukwa Grace 43,44, Chukwuma E Omoruyi 33,34, Clotaire D Rafaï 90, 
Collins M Morang’a 91, Cyril Erameh 92, Daniel B Lule 3, Daniel J Bridges 93, Daniel Mukadi-Bamuleka 37, Danny Park 57, David A Rasmussen 94,95, David Baker 48, David J Nokes 11,96, Deogratius Ssemwanga 3,97, Derek Tshiabuila 2, Dominic S Y Amuzu 91, Dominique Goedhals 98, Donald S Grant 64,65,99, Donwilliams O Omuoyo 11, Dorcas Maruapula 100, Dorcas W Wanjohi 7, Ebenezer Foster-Nyarko 48, Eddy K Lusamaki 37,38,51, Edgar Simulundu 101, Edidah M Ong’era 11, Edith N Ngabana 37,38, Edward O Abworo 102, Edward Otieno 11, Edwin Shumba 71, Edwine Barasa 11, El Bara Ahmed 103,104, Elhadi A Ahmed 23, Emmanuel Lokilo 37, Enatha Mukantwari 105, Eromon Philomena 43, Essia Belarbi 106, Etienne Simon-Loriere 107, Etilé A Anoh 83, Eusebio Manuel 108, Fabian Leendertz 106, Fahn M Taweh 109, Fares Wasfi 58, Fatma Abdelmoula 41,110, Faustinos T Takawira 39, Fawzi Derrar 111, Fehintola V Ajogbasile 43, Florette Treurnicht 112,113, Folarin Onikepe 43,44, Francine Ntoumi 60,114, Francisca M Muyembe 37,38, Frank E Z Ragomzingba 115, Fred A Dratibi 116,117, Fred-Akintunwa Iyanu 43, Gabriel K Mbunsu 38, Gaetan Thilliez 48, Gemma L Kay 48, George O Akpede 92, Gert U van Zyl 118,119, Gordon A Awandare 91, Grace S Kpeli 120,121, Grit Schubert 106, Gugu P Maphalala 122, Hafaliana C Ranaivoson 77, Hannah E Omunakwe 123, Harris Onywera 7, Haruka Abe 124, Hela Karray 125, Hellen Nansumba 126, Henda Triki 58, Herve Albéric Adje Kadjo 127, Hesham Elgahzaly 128, Hlanai Gumbo 39, Hota Mathieu 129, Hugo Kavunga-Membo 37, Ibtihel Smeti 41, Idowu B Olawoye 43, Ifedayo M O Adetifa 88,130, Ikponmwosa Odia 92, Ilhem Boutiba Ben Boubaker 131,132, Iluoreh Ahmed Mohammad 43, Isaac Ssewanyana 126, Isatta Wurie 133, Iyaloo S Konstantinus 134, Jacqueline Wemboo Afiwa Halatoko 135, James Ayei 26, Janaki Sonoo 136, Jean-Claude C Makangara 37,38, Jean-Jacques M Tamfum 37,38, Jean-Michel Heraud 12,77, Jeffrey G Shaffer 137, Jennifer Giandhari 2, Jennifer Musyoki 11, Jerome Nkurunziza 138, Jessica N Uwanibe 43, 
Jinal N Bhiman 13,113, Jiro Yasuda 124, Joana Morais 139,140, Jocelyn Kiconco 97, John D Sandi 64,65, John Huddleston 141, John K Odoom 74, John M Morobe 11, John O Gyapong 120, John T Kayiwa 3, Johnson C Okolie 43, Joicymara S Xavier 1,142,143, Jones Gyamfi 120, Joseph F Wamala 144, Joseph H K Bonney 74, Joseph Nyandwi 55,145, Josie Everatt 13, Joweria Nakaseegu 97, Joyce M Ngoi 91, Joyce Namulondo 97, Judith U Oguzie 43,44, Julia C Andeko 68, Julius J Lutwama 3, Juma J H Mogga 144, Justin O’Grady 48, Katherine J Siddle 57, Kathleen Victoir 146, Kayode T Adeyemi 43,44, Kefentse A Tumedi 147, Kevin S Carvalho 148, Khadija Said Mohammed 11, Koussay Dellagi 146, Kunda G Musonda 149, Kwabena O Duedu 120,121, Lamia Fki-Berrajah 125, Lavanya Singh 2, Lenora M Kepler 94,95, Leon Biscornet 75, Leonardo de Oliveira Martins 48, Lucious Chabuka 150, Luicer Olubayo 8, Lul Deng Ojok 26, Lul Lojok Deng 26, Lynette I Ochola-Oyier 11, Lynn Tyers 9, Madisa Mine 151, Magalutcheemee Ramuth 136, Maha Mastouri 152,153, Mahmoud ElHefnawi 154, Maimouna Mbanne 12, Maitshwarelo I Matsheka 147, Malebogo Kebabonye 155, Mamadou Diop 12, Mambu Momoh 64,65,156, Maria da Luz Lima Mendonça 148, Marietjie Venter 157, Marietou F Paye 57, Martin Faye 12, Martin M Nyaga 158, Mathabo Mareka 159, Matoke-Muhia Damaris 160, Maureen W Mburu 11, Maximillian G Mpina 161,162,163, Michael Owusu 164, Michael R Wiley 81,165, Mirabeau Y Tatfeng 166, Mitoha Ondo’o Ayekaba 162, Mohamed Abouelhoda 167,168, Mohamed Amine Beloufa 111, Mohamed G Seadawy 169,170, Mohamed K Khalifa 171, Mooko Marethabile Matobo 159, Mouhamed Kane 12, Mounerou Salou 172, Mphaphi B Mbulawa 155, Mulenga Mwenda 93, Mushal Allam 173, My V T Phan 3, Nabil Abid 152,174, Nadine Rujeni 175,176, Nadir Abuzaid 177, Nalia Ismael 178, Nancy Elguindy 54, Ndeye Marieme Top 12, Ndongo Dia 12, Nédio Mabunda 178, Nei-yuan Hsiao 9,78, Nelson Boricó Silochi 162, Ngiambudulu M Francisco 139, Ngonda Saasa 179, Nicholas Bbosa 3, Nickson Murunga 11, Nicksy 
Gumede 18, Nicole Wolter 13,113, Nikita Sitharam 1, Nnaemeka Ndodo 88, Nnennaya A Ajayi 180, Noël Tordo 181, Nokuzola Mbhele 9, Norosoa H Razanajatovo 77, Nosamiefan Iguosadolo 43, Nwando Mba 88, Ojide C Kingsley 182, Okogbenin Sylvanus 92, Oladiji Femi 183, Olubusuyi M Adewumi 31,32, Olumade Testimony 43,44, Olusola A Ogunsanya 43, Oluwatosin Fakayode 184, Onwe E Ogah 185, Ope-Ewe Oludayo 43, Ousmane Faye 12, Pamela Smith-Lawrence 155, Pascale Ondoa 71, Patrice Combe 186, Patricia Nabisubi 187,188, Patrick Semanda 126, Paul E Oluniyi 43, Paulo Arnaldo 178, Peter Kojo Quashie 91, Peter O Okokhere 92,189, Philip Bejon 11, Philippe Dussart 77, Phillip A Bester 190, Placide K Mbala 37,38, Pontiano Kaleebu 3,97, Priscilla Abechi 43,44, Rabeh El-Shesheny 40,191, Rageema Joseph 9, Ramy Karam Aziz 192,193, René G Essomba 194,195, Reuben Ayivor-Djanie 91,120,121, Richard Njouom 196, Richard O Phillips 63, Richmond Gorman 63, Robert A Kingsley 48, Rosa Maria D E S A Neto Rodrigues 197,198, Rosemary A Audu 29, Rosina A A Carr 120,121, Saba Gargouri 125, Saber Masmoudi 41, Sacha Bootsma 144, Safietou Sankhe 12, Sahra Isse Mohamed 199, Saibu Femi 43, Salma Mhalla 132,200, Salome Hosch 161,201, Samar Kamal Kassim 128, Samar Metha 57, Sameh Trabelsi 202, Sara Hassan Agwa 128, Sarah Wambui Mwangi 7, Seydou Doumbia 52, Sheila Makiala-Mandanda 37,38, Sherihane Aryeetey 63, Shymaa S Ahmed 54, Side Mohamed Ahmed 103, Siham Elhamoumi 57, Sikhulile Moyo 100,203, Silvia Lutucuta 139, Simani Gaseitsiwe 100,203, Simbirie Jalloh 64,65, Soa Fy Andriamandimby 77, Sobajo Oguntope 43, Solène Grayo 181, Sonia Lekana-Douki 68, Sophie Prosolek 48, Soumeya Ouangraoua 204,205, Stephanie van Wyk 1, Stephen F Schaffner 57, Stephen Kanyerezi 187,188, Steve Ahuka-Mundeke 37,38, Steven Rudder 48, Sureshnee Pillay 2, Susan Nabadda 126, Sylvie Behillil 206, Sylvie L Budiaki 159, Sylvie van der Werf 206, Tapfumanei Mashe 39,207, Thabo Mohale 13, Thanh Le-Viet 48, Thirumalaisamy P Velavan 114,208, Tobias 
Schindler 161,162,201, Tongai G Maponga 118, Trevor Bedford 141,209, Ugochukwu J Anyaneji 2, Ugwu Chinedu 43,44, Upasana Ramphal 2,10,210, Uwem E George 43, Vincent Enouf 206, Vishvanath Nene 102, Vivianne Gorova 211,212, Wael H Roshdy 54, Wasim Abdul Karim 1, William K Ampofo 213, Wolfgang Preiser 118,119, Wonderful T Choga 100,214, Yahaya Ali Ahmed 18, Yajna Ramphal 1, Yaw Bediako 91,215, Yeshnee Naidoo 2, Yvan Butera 175,216,217, Zaydah R de Laurent 11; Africa Pathogen Genomics Initiative (Africa PGI), Ahmed E O Ouma 7, Anne von Gottberg 13,113, George Githinji 11,218, Matshidiso Moeti 18, Oyewale Tomori 43, Pardis C Sabeti 57, Amadou A Sall 12, Samuel O Oyola 102, Yenew K Tebeje 7, Sofonias K Tessema 7, Tulio de Oliveira 1,2,10,219,*, Christian Happi 43,44, Richard Lessells 2, John Nkengasong 7, Eduan Wilkinson 1,2,†,*
PMCID: PMC9529057  PMID: 36108049

Abstract

Investment in SARS-CoV-2 sequencing in Africa over the past year has led to a major increase in the number of sequences generated, now exceeding 100,000 genomes, used to track the pandemic on the continent. Our results show an increase in the number of African countries able to sequence domestically, and highlight that local sequencing enables faster turnaround time and more regular routine surveillance. Despite limitations of low testing proportions, findings from this genomic surveillance study underscore the heterogeneous nature of the pandemic and shed light on the distinct dispersal dynamics of Variants of Concern, particularly Alpha, Beta, Delta, and Omicron, on the continent. Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve, while the continent faces many emerging and re-emerging infectious disease threats. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century.


What started as a small cluster of pneumonia cases in Wuhan, China over two years ago ( 1 ) quickly turned into a global pandemic. Coronavirus Disease 2019 (COVID-19) is the clinical manifestation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, and by March 2022 there had been over 437 million reported cases and over 5.9 million reported deaths ( 2 ). Though Africa accounts for the lowest number of reported cases and deaths thus far, with ~11.3 million reported cases and 245 000 reported deaths as of February 2022, the continent has played an important role in shaping the scientific response to the pandemic through the implementation of genomic surveillance and the identification of two of the five variants of concern (VOCs) ( 3 , 4 ).

Since it emerged in 2019, SARS-CoV-2 has continued to evolve and adapt ( 5 ). This has led to the emergence of several viral lineages carrying mutations that confer adaptive advantages, either by increasing transmission and infection ( 6 , 7 ) or by countering the effect of neutralizing antibodies from vaccination ( 8 ) or previous infections ( 9 - 11 ). The World Health Organization (WHO) classifies certain viral lineages as variants of concern (VOCs) or variants of interest (VOIs) based on the potential impact they may have on the pandemic, with VOCs regarded as the highest risk. To date, five VOCs have been classified by the WHO, two of which were first detected on the African continent (Beta and Omicron) ( 3 , 4 , 12 ), while two more (Alpha and Delta) ( 12 , 13 ) have spread extensively on the continent in successive waves. The remaining VOC, Gamma ( 14 ), originated in Brazil and had a limited influence in Africa with only four recorded sequenced cases.

For genomic surveillance to be useful for public health responses, sampling for sequencing needs to be both spatially and temporally representative. In the case of SARS-CoV-2 in Africa, this means extending the geographic coverage of sequencing capacity to capture the dynamic genomic epidemiology in as many locations as possible. In a meta-analysis of the first 10 000 SARS-CoV-2 sequences generated in 2020 from Africa ( 15 ), several blind spots were identified with regard to genomic surveillance on the continent. Since then, much investment has been devoted to building capacity for genomic surveillance in Africa, coordinated mostly by the Africa Centers for Disease Control (Africa CDC) and the regional office of the WHO in Africa (WHO AFRO), but also provided by several national and international partners, resulting in an additional 90 000 sequences shared over the past year (April 2021 - March 2022). This makes the sequencing effort for SARS-CoV-2 a phenomenal milestone. In comparison, only 12 000 whole genome influenza sequences ( 16 ) and only ~3 700 whole genome HIV sequences ( 17 ) from Africa have been shared publicly, even though HIV has plagued the continent for decades.

Here we describe how the first 100 000 SARS-CoV-2 sequences from Africa have helped describe the pandemic on the continent, how this genomic surveillance in Africa has expanded, and how we adapted our sequencing methods to deal with an evolving virus. We also highlight the impact that genomic sequencing in Africa has had on the global public health response, particularly through the identification and early analysis of new variants. Finally, we also describe here for the first time how the Delta and Omicron variants have spread across the continent, and how their transmission dynamics were distinct from the Alpha and Beta variants that preceded them.

Results

Epidemic waves driven by variant dynamics and geography

Scaling up sequencing in Africa has provided a wealth of information on how the pandemic unfolded on the continent. The epidemic has largely been spatially heterogeneous across Africa, but most countries have experienced multiple waves of infection ( 18 - 29 ), with significant local and regional diversity in the first and to a lesser extent the second waves, followed by successive sweeps of the continent with Delta and Omicron (Fig. 1A). In all regions of the continent, different lineages and VOIs evolved and co-circulated with VOCs and in some cases, contributed considerably to epidemic waves.

Fig. 1. Epidemiological progression of the COVID-19 pandemic on the African continent.



(A) Total reported new case counts per million inhabitants in Africa (Data Source: Our World in Data; OWID; log-transformed) along with the distribution of VOCs, the Eta VOI and other lineages through time (size of circles proportional to the number of genomes sampled per month for each category). (B to F) Breakdown of reported new cases per million (Data Source: Our World in Data; OWID; log-transformed) and monthly sampling of VOCs, regional variant or lineage of interest and other lineages for three selected countries for North, Southern, West, Central and East Africa respectively. For each region, a different variant or lineage of interest is shown, relevant to that region (C.36, C.1.2, Eta, B.1.620 and A.23.1, respectively).

In North Africa (Fig. 1B and fig. S1A), B.1 lineages and Alpha dominated in the first and second waves of the pandemic and were replaced by Delta and Omicron in the third and fourth waves, respectively. Interestingly, the C.36 lineage and its C.36.3 sub-lineage dominated the epidemic in Egypt (~40% of reported infections) before July 2021, when they were replaced by Delta ( 30 ). Similarly, in Tunisia the first and second waves were associated with the B.1.160 lineage, which was replaced by Delta during the country’s third wave of infections. In southern Africa (Fig. 1C and fig. S1C), we see a similar pandemic profile with B.1 dominating the first wave, but instead of Alpha, Beta was responsible for the second wave, followed by Delta and Omicron. Another lineage that was flagged for close monitoring in the region was C.1.2, due to its mutational profile and predicted capacity for immune escape ( 31 ). However, the C.1.2 lineage did not cause many infections in the region as it was circulating at a time when Delta was dominant. In West Africa (Fig. 1D and fig. S1B), the B.1.525 lineage caused a large proportion of infections in the second and third waves, where it shared the pandemic landscape with the Alpha variant. As with other regions on the continent, these variants were later replaced by the Delta and then Omicron VOCs in successive waves. In Central Africa (Fig. 1E and fig. S1D), the B.1.620 lineage caused most of the infections between January and June 2021 ( 32 ) before systematically being replaced by Delta and then Omicron. Lastly, in East Africa (Fig. 1F and fig. S1E) the A.23.1 lineage dominated the second wave of infections in Uganda ( 33 ) and much of East Africa. In all of these regions, minor lineages such as B.1.525, C.36 and A.23.1 were eventually replaced by VOCs that emerged in later waves.

Finally, we directly compared the official recorded cases in Africa with the ongoing SARS-CoV-2 genomic surveillance (GISAID date of access 2022-03-31) for a crude estimation of variants’ contribution to cases. We observe that Delta was responsible for an epidemic wave between May and October 2021 (Fig. 1A) and had the greatest impact on the continent, with an estimated 34.2% of overall infections in Africa possibly attributed to it. Beta was responsible for an epidemic wave at the end of 2020 and beginning of 2021 (Fig. 1A), with 13.3% of infections overall attributed to it. Notably, Alpha, despite being predominant in other parts of the world at the beginning of 2021, had only minimal significance in Africa, accounting for just 4.3% of infections. At the time of writing, the Omicron VOC had contributed to 21.6% of overall sequenced infections. At this time the Omicron wave was still unfolding globally and in Africa with the expansion of several sub-lineages ( 34 ), such that its full impact is yet to be determined. However, due to increased population immunity ( 35 ) from SARS-CoV-2 infection and vaccination (fig. S2), the impact of Omicron on mortality has been lower than that of the other VOCs, as can be observed from the relatively low death rate in South Africa during the Omicron wave ( 36 ). The findings from mapping epidemiological numbers onto genomic surveillance data are reliable insofar as genomic sampling across Africa scaled proportionally with the size and timing of epidemic waves (fig. S3; b = 0.011, SE = 0.001, p < 2 × 10^−16).

This comes with the obvious caveats that testing and reporting practices have varied widely across the continent, as have genomic surveillance volumes throughout the pandemic. Countries in Africa with reported data have tested in proportions from as little as 0.1 daily tests per million population to more than 1 000 tests per million (fig. S4). Some countries have consistently tested at high proportions, for example South Africa, Botswana, Morocco and Tunisia. Incidentally, these countries have also generally reported more cases per million, providing an indication that recorded low incidence in other parts of the continent has been an underestimate due to low testing rates. However, even for these countries, epidemic numbers are certainly underrepresented and underdetected, given that in several timeframes test positivity rates were still on the higher end, approaching or exceeding 20% (fig. S4), and as concluded by seroprevalence surveys and estimates of true infection burdens in Africa ( 37 , 38 ). Findings attributing case numbers to variants must therefore be interpreted in the context of this limitation but can nevertheless provide a qualitative overview of the spatial and temporal dynamics of VOCs in relation to epidemic progression in Africa.
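The crude attribution of reported cases to variants described above can be illustrated with a minimal sketch. It assumes that, within each month, reported cases are apportioned to variants in proportion to their share of sequenced genomes; this is one plausible reading of the approach, not necessarily the exact procedure used in the study. The function name `attribute_cases` and the toy numbers are hypothetical.

```python
from collections import defaultdict


def attribute_cases(monthly_cases, monthly_genomes):
    """Crudely apportion reported cases to variants by monthly genome share.

    monthly_cases:   {month: reported case count}
    monthly_genomes: {month: {variant: number of genomes sampled}}
    Returns {variant: fraction of all attributable cases}.
    """
    attributed = defaultdict(float)
    for month, cases in monthly_cases.items():
        counts = monthly_genomes.get(month, {})
        total_genomes = sum(counts.values())
        if total_genomes == 0:
            continue  # no genomes that month, so its cases cannot be attributed
        for variant, n in counts.items():
            attributed[variant] += cases * n / total_genomes
    total = sum(attributed.values())
    return {v: c / total for v, c in attributed.items()} if total else {}


# Toy numbers invented purely for illustration (not data from the study).
cases = {"2021-06": 1000, "2021-07": 3000}
genomes = {
    "2021-06": {"Beta": 30, "Delta": 10},
    "2021-07": {"Beta": 5, "Delta": 45},
}
shares = attribute_cases(cases, genomes)
```

Because the estimate inherits every bias in both case reporting and genome sampling, its output is best read qualitatively, in line with the caveats above.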

The African regional- (table S1) and country-specific (table S2) NextStrain builds also clearly support the changing nature of the pandemic over time. From these builds we observe a strong association of B.1-like viruses circulating on the continent during the first wave. These “ancestral” lineages were subsequently replaced by the Alpha and Beta variants which dominated the pandemic landscape during the second wave, and were later replaced by the Delta and Omicron variants during the third and fourth waves.

Optimizing surveillance coverage in Africa

By mapping and comparing the locations of specimen sampling laboratories to the sequencing laboratories, a number of aspects regarding the expansion of genomic surveillance on the continent became clear. First, even though several countries in Africa started sequencing SARS-CoV-2 in the first months of the pandemic, local sequencing capacity was initially limited. However, local sequencing capabilities slowly expanded over time, particularly after the emergence of VOCs (Fig. 2A). The fact that almost half of all SARS-CoV-2 sequencing in Africa was performed using the Oxford Nanopore technology (ONT), which is relatively low-cost compared to other sequencing technologies and better adapted to modest laboratory infrastructures, illustrates one component of how this rapid scale-up of local sequencing was achieved (fig. S5). Yet, to rely only on local sequencing would have thwarted the continent’s chance at a reliable genomic surveillance program. At the time of writing, 52/55 countries in Africa had SARS-CoV-2 genomes deposited in GISAID; however, there were still 16 countries with no reported local sequencing capacity (Fig. 2A) and undoubtedly many with limited capacity to meet demand during pandemic waves.

Fig. 2. Sequencing strategies and outputs in Africa.



(A) Geographical representation of all countries (shaded in gray) and institutions (red dots) in Africa with their own on-site sequencing facilities. The inset graph shows the number of countries in Africa able to carry out sequencing locally over time. (B) Key regional sequencing hubs and networks in Africa showing countries (shaded in bright colors) and institutions (red dots) that have sequenced for other countries (shaded in corresponding light colors and linking curves) on the continent. CERI: Centre for Epidemic Response and Innovation; KRISP: KwaZulu-Natal Research Innovation and Sequencing Platform; NICD: National Institute for Communicable Diseases; KEMRI-WT: Kenya Medical Research Institute - Wellcome Trust; ILRI: International Livestock Research Institute; MRC/UVRI: Medical Research Council/Uganda Virus Research Institute; INRB: Institut National de Recherche Biomédicale; ACEGID: African Centre of Excellence for Genomics of Infectious Diseases; NMIMR: Noguchi Memorial Institute for Medical Research; MRCG: Medical Research Council Unit - The Gambia; IPD: Institut Pasteur de Dakar. (C) Geographical representation of the total number of SARS-CoV-2 whole genomes produced over the course of the pandemic in each country, as well as the proportion of those sequences that were produced locally, regionally or abroad. (D) Correlation of the proportion of COVID-19 positive cases that have been sequenced and the corresponding number of epidemiological weeks since the start of the pandemic that are represented with genomes for each African country. The color of each circle represents the number of cases and its size the number of genomes. (E) Comparison of sequencing turn-around times (lag times from sample collection to sequence submission) for the three strategies of sequencing in Africa, showing a significant difference in the means (p-value<0.0001). 
The box-and-whisker plots denote the lower quartile, median and upper quartile (box), the minimum and maximum values (whiskers), and outliers (black dots). (F) Pearson correlations of the total number of sequencing laboratories per country against key sequencing outputs.

To tackle this, three centers of excellence and various regional sequencing hubs were established to maximize resources available in a few countries to assist in genomic surveillance across the continent. This sequencing is done either as the sole source of viral genomes for those countries (e.g., Angola, South Sudan and Namibia) or concurrently with local efforts to increase capacity during resurgences (Fig. 2B). Sequencing is further supplemented by a number of countries utilizing facilities outside of Africa. Ultimately, a mix of strategies from local sequencing, collaborative resource sharing among African countries and sequencing with academic collaborators outside the continent helped close surveillance blind spots (Fig. 2C). Countries in sub-Saharan Africa, particularly in Southern and East Africa, most benefited from the regional sequencing networks, while countries in West and North Africa often partnered with collaborators outside of Africa.

The success of a pathogen genomic surveillance program relies on how representative it is of the epidemic under investigation. For SARS-CoV-2, this is often measured in terms of the percentage of reported cases sequenced and the regularity of sampling. African countries were positioned across a range of different combinations of overall proportion and frequency of genomic sampling (Fig. 2D). While the ultimate goal would be to optimize both of these parameters, a lower proportion of sampling can also be useful if frequency of sampling is maintained as high as possible. For instance, South Africa and Nigeria, which have both sequenced ~1% of cases overall, can be considered to have successful genomic surveillance programs on the basis that sampling is representative over time and has enabled the timely detection of variants (Beta, Eta, Omicron).
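The two representativeness measures discussed above, the proportion of reported cases sequenced and the number of weeks represented by at least one genome, can be sketched as follows. This is an illustrative sketch only: the function name `surveillance_metrics` is hypothetical, and ISO calendar weeks are used as a stand-in for epidemiological weeks, whose exact definition in the study is not stated here.

```python
from datetime import date


def surveillance_metrics(collection_dates, reported_cases):
    """Return (proportion of cases sequenced, number of weeks with a genome).

    collection_dates: one sampling date per sequenced genome
    reported_cases:   total reported cases for the same country and period
    """
    proportion = len(collection_dates) / reported_cases if reported_cases else 0.0
    # ISO (year, week) pairs approximate epidemiological weeks (an assumption).
    weeks = {tuple(d.isocalendar())[:2] for d in collection_dates}
    return proportion, len(weeks)


# Toy dates invented for illustration (not data from the study).
genome_dates = [
    date(2021, 1, 4),
    date(2021, 1, 6),   # same week as the first genome
    date(2021, 1, 12),
    date(2021, 2, 1),
]
proportion, weeks_covered = surveillance_metrics(genome_dates, reported_cases=400)
```

A country with a low `proportion` but a high `weeks_covered` corresponds to the situation described above, where modest but regular sampling still supports timely variant detection.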

Additionally, for genomic surveillance to be most useful for rapid public health response during a pandemic, sequencing would ideally be done in real-time or in a framework as close as possible to that. We show a general trend of decreasing sequencing turnaround time in Africa (fig. S6), particularly from a mean of 182 days between October and December 2020 to a mean of 50 days over the same period a year later, although this does come with several caveats. First, we measure sequencing turnaround time in the most accessible manner, which is by comparing the date of sampling of a specimen to the date its sequence was deposited in GISAID. Generally, the genomic data potentially informs the public health response more rapidly than reflected here, particularly when it comes to local outbreak investigations or variant detection. This analysis is also confounded by various factors such as country-to-country variation in these trends (fig. S7), delays in data sharing, and potential retrospective sequencing, particularly by countries joining sequencing efforts at later stages of the pandemic. The most critical caveat is the fact that sequencing from the most recently collected samples (e.g., over the last six months) may still be ongoing. The shortening duration between sampling and genomic data sharing is nevertheless a positive takeaway, given that this data also feeds into continental and global genomic monitoring networks. Overall, the continental average delay from specimen collection to sequence submission is 87 days, with 10 countries having an average turnaround time of less than 60 days and Botswana of less than 30 days (fig. S8).
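The turnaround measure defined above, the lag between a specimen's sampling date and the date its sequence was deposited, reduces to a date difference. The sketch below is illustrative; the function names and the decision to skip records with a negative lag (treated here as metadata errors) are assumptions rather than details taken from the study.

```python
from datetime import date
from statistics import mean


def turnaround_days(collected, submitted):
    """Days from specimen collection to sequence submission."""
    return (submitted - collected).days


def mean_turnaround(records):
    """Mean lag over (collected, submitted) pairs, skipping negative lags."""
    lags = [turnaround_days(c, s) for c, s in records]
    valid = [lag for lag in lags if lag >= 0]  # a negative lag implies a metadata error
    return mean(valid) if valid else None


# Toy records invented for illustration (not data from the study).
records = [
    (date(2021, 1, 1), date(2021, 1, 31)),   # 30 days
    (date(2021, 1, 10), date(2021, 3, 11)),  # 60 days
    (date(2021, 5, 1), date(2021, 4, 1)),    # inconsistent dates, ignored
]
average_lag = mean_turnaround(records)
```

As noted above, this lag overstates the delay before data reach local decision makers, since results are often shared before public deposition.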

Most importantly in the context of optimizing genomic surveillance, we found that the route taken to sequencing impacts the speed of data generation. Of the three frameworks we investigated, local sequencing has the fastest turnaround time (median of 51 days), followed by sequencing within regional sequencing networks in Africa (median of 93 days) and finally sequencing outsourced to countries outside Africa (median of 113 days) (Fig. 2E). This finding strongly supports the investments in local genomic surveillance, to generate timely and regular data for local and regional decision making. Finally, we show that it is beneficial in several ways for countries to undertake genomic surveillance through several sequencing laboratories, rather than centralizing efforts. For instance, we estimate strong correlations between the number of sequencing laboratories per country and the total number of genomes produced by that country (method, correlation value), the total number of epiweeks for which sequencing data was produced (method, correlation value), and importantly, sequencing turnaround time (method, correlation value) (Fig. 2F).
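The two summaries used in this comparison, a median turnaround time per sequencing strategy and a Pearson correlation between laboratory counts and a sequencing output, can be sketched as below. The helper names and all numbers are hypothetical, and the sketch does not reproduce the significance testing reported for Fig. 2E.

```python
from math import sqrt
from statistics import median


def median_by_strategy(records):
    """Median turnaround (days) per strategy from (strategy, days) pairs."""
    groups = {}
    for strategy, days in records:
        groups.setdefault(strategy, []).append(days)
    return {s: median(v) for s, v in groups.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy values invented for illustration (not data from the study).
lags = [("local", 40), ("local", 60), ("regional", 90),
        ("abroad", 100), ("abroad", 120)]
medians = median_by_strategy(lags)

labs_per_country = [1, 2, 3, 4]
genomes_per_country = [10, 20, 30, 40]
r = pearson_r(labs_per_country, genomes_per_country)
```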

With the increase in sequencing capacity on the continent, a decrease in the time taken to detect new variants was observed. For example, the Beta variant was identified in December 2020 in South Africa ( 4 ), but sampling and molecular clock analyses suggest the variant originated in September 2020. This three-month lag in detection means that a new variant, like Beta, has ample time to spread over a large geographic region prior to its detection. However, by the end of 2021, the time to detect a new variant was substantially improved. Phylogenetic and molecular clock analyses suggest that the Omicron variant originated around 9 October 2021 (95% highest posterior density, or HPD: 30 September - 20 October 2021) and the variant was described on 23 November 2021 ( 3 ). Thus, Omicron was detected within ~5 weeks from origin, compared to ~16 weeks for the Beta variant and ~10 weeks for the Alpha variant, which was detected in the UK. More importantly, the time from sequence deposition to the WHO declaring the new variant a VOC was substantially shortened to 72 hours for the Omicron variant.

To interpret insights from the described genomic surveillance in Africa, it is important to understand the context of epidemiological reporting and sampling strategies utilized for sequencing on the continent (table S3). Most countries provided daily reports of newly recorded cases, while a few provided weekly and monthly reports. For most countries, surveillance was mainly focused on the major cities, suggesting potential cryptic circulation in rural areas. We find that at the onset of the pandemic, surveillance was focused on identification of imported cases from incoming travelers or local residents returning from various countries. As community transmissions began to emerge, the focus shifted toward regular surveillance and outbreak investigations. Together, these three strategies account for the vast majority of samples generated on the continent and analyzed here. As the pandemic progressed and vaccines were made available, some countries on the continent began to explore other sampling strategies such as reinfections, environmental samples such as waste water samples, and vaccine breakthrough cases to gain new insights into the evolutionary dynamics of SARS-CoV-2. The utility of sequencing for viral evolution tracking and VOC detection in the way described above is obviously also dependent on sampling proportions, especially within sampling for regular surveillance.

The speed of SARS-CoV-2 evolution has complicated sequencing efforts. Common methods of RNA sequencing include reverse transcription followed by double stranded DNA amplification using sequence-specific primer sets ( 39 ). Ongoing SARS-CoV-2 evolution has necessitated the continual evaluation and updating of these primer sets to ensure their sustained utility during genomic surveillance efforts. Here, we examined the current set of genomes to determine aspects of the sequencing that might be improved in the future. Many of the primer sets used were designed using viral sequences from the start of the pandemic and may require updating to keep pace with evolution. Indeed, the ARTIC primer sets are currently in version 4.1 ( 40 ). The Entebbe primer set was designed mid-2020 well into the first year of the epidemic and used an algorithm and design that accommodates evolution ( 41 ).

The effects of viral evolution on sequencing patterns can be seen with low median unspecified nucleotide (N)-values (a consequence of primer dropout or low coverage at that site) observed for the first 12 months of the epidemic with an increase from October 2020 (Fig. 3A). Additional challenges appear (indicated by increasing median N values) as the virus further evolved into Delta and Omicron lineages from January 2021 onward (Fig. 3A). Examining the role of sequencing technology, it appears that the two major technologies used (Illumina and ONT) have similar gap profiles (as measured by mean N count per genome) while Ion Torrent, MGI and Sanger show reduced mean N count per genome (Fig. 3B). Likely factors for this pattern are the primers used in sequencing, with primer choice playing a key role in the quantity of gaps (Fig. 3C). The mean N count per genome also varied modestly with viral lineage (Fig. 3D). Lineages that returned no classification with Pangolin (“None”) showed the highest mean N count, suggesting that high mean N count per genome was probably the basis for failed classification. The more recent lineages Delta (e.g., AY.39, AY.75) and Omicron (BA.1.1, BA.2) also showed higher mean N count per genome, consistent with virus evolution impairing primer function. This pattern is further explored in fig. S9, where the position of gaps shows an enrichment in the genome regions after position 19 000, with frequent gaps disrupting the spike coding region.

Fig. 3. Genome gap analysis.



(A) Shows the mean N count per genome by month of submission to GISAID. The dates for the detection of important SARS-CoV-2 lineages are indicated at the top of the figure. (B) Illustrates the mean N count per genome stratified by sequencing technology. (C) Shows the mean N count per genome stratified by the sequencing primers sets used. For panels A to C, error bars indicate 95% confidence intervals. (D) Mean N count per genome by lineage. The mean N data were stratified by SARS-CoV-2 lineages to investigate lineage-specific frequency of genome gaps, an indirect measure of primer mismatch. All lineages present at least 100 times in the genome data were presented.

Fig. 4. Inferred viral dissemination patterns of VOCs within Africa.



(A) Genomic prevalence of VOCs Alpha, Beta, Delta and Omicron in Africa over time. (B) Inferred viral exchange patterns to, from and within the African continent for the four VOCs (Omicron as BA.1 and BA.2) based on case-sensitive phylogeographic inference. Introductions and viral transitions within Africa are shown as solid lines and exports from Africa as dotted lines; lines are colored by continent. The shaded areas around the lines represent the uncertainty of this analysis from ten replicates (+/− s.d.). (C) Dissemination patterns of the VOCs within Africa, from inferred ancestral state reconstructions performed on Africa-enriched datasets, annotated and colored by region in Africa. The countries of origin of viral exchange routes are also shown with dots, and the curves go from country of origin to destination country in an anti-clockwise direction.

Phylogenetic insights into the rise and spread of variants of concern in Africa

During the first wave of infections in 2020 in Africa, as was the case globally, the majority of corresponding genomes were classified as PANGO B.1 (n=2 456) or B.1.1 viruses (n=1 329). Toward the end of 2020, more distinct viral lineages started to appear. The most important of these for the African continent were: B.1.525 (n=797), B.1.1.318 (n=398) ( 42 ), B.1.1.418 (n=395), A.23.1 (n=358) ( 15 , 29 , 31 , 33 ), C.1 (n=446) ( 29 ), C.1.2 (n=300) ( 31 ), C.36 (n=305) ( 30 , 43 ), B.1.1.54 (n=287) ( 15 , 29 , 31 , 33 ), B.1.416 (n=272), B.1.177 (n=203), B.1.620 (n=138), and B.1.160 (n=61) ( 32 ) (fig. S10, A and B). Our discrete state phylogeographic inference from phylogenetic reconstruction of non-VOC African sequences and an equal number of external references revealed that African countries were primarily seeded by multiple introductions of viral lineages from abroad (mainly Europe) at the beginning of the pandemic. The observed pattern of non-VOC viral lineage movement then consistently shifted toward more intercontinental exchanges (fig. S10C). Mapping out the spatial routes of dissemination shows that various countries in all subregions of the continent acted as sources of these viral lineages at one point or another (fig. S10D). While uneven testing rates and proportions of samples sequenced on the continent may have influenced these inferences (discussed below), the results presented here are in line with the fact that the most predominant non-VOC lineages in Africa, except B.1.177, emerged and circulated widely in different sub-regions (Fig. 1).

Similar to the pandemic globally, VOCs became increasingly important in Africa toward the end of 2020. The Alpha, Beta, Delta and Omicron variants demonstrate many similarities as well as differences in the way they spread on the continent. For all these VOCs, we observe large regional monophyletic transmission clusters in each of their phylogenetic reconstructions in Africa (fig. S11). This suggests an important extent of continental dissemination within Africa. Alpha and Beta were epidemiologically important in distinct regions of the continent with Alpha primarily circulating in West, North and most of Central Africa, Beta in southern and most of East Africa, and only substantially co-circulated in a few countries such as Angola, Kenya, Comoros, Burundi and Ghana (Fig. 1 and fig. S12). However, we may not have enough resolution in the geospatial data to know how much they were truly co-circulating throughout these countries, or whether there were regional outbreaks of Alpha and Beta within these countries. In Kenya, for example, Beta was detected more in coastal regions, and Alpha more inland ( 26 , 44 ). In contrast, Delta and Omicron variants sequentially dominated the majority of infections on the entire continent shortly after their emergence (Fig. 4A and fig. S12).

The Alpha variant was first identified in December 2020 in the UK and has since spread globally. In Africa, Alpha was detected in 43 countries with evidence of community transmission, based on phylogenetic clustering, in many countries including Ghana, Nigeria, Kenya, Gabon and Angola (fig. S11). Discrete state maximum likelihood reconstruction from a globally case-sensitive genomic subsampling inferred at least 80 introductions (95% CI: 78 - 82) into Africa with the bulk of imports attributed to the US (>47%) and the UK (>25%) (Fig. 4B). Only 1% of imports into any particular African country were attributed to another African nation. Phylogeographic reconstruction enriched in African sequences revealed that of those, >85% of the intercontinental Alpha exchanges in Africa originated from West African countries (Fig. 4C). This occurred in spite of initial importations of the Alpha variant from Europe into all regions of the continent (fig. S13B), but is in line with Alpha having dominated circulation mostly in West Africa (fig. S12). In countries where Alpha was introduced but did not grow and cause an expansion of cases, this can be explained by competition with the already established Beta variant, which circulated simultaneously. The pattern of multiple introductions of Alpha into Africa and between African countries is similar to the spread of Alpha documented in the UK, Scotland and Ireland ( 45 – 47 ).

The second VOC, Beta, was identified in December 2020 in South Africa ( 4 ). However, sampling and molecular clock analyses suggest that the variant originated around September 2020 (fig. S11). At the end of 2020 and beginning of 2021, Beta was driving a second wave of infection in South Africa and quickly spread to other countries within the region. The concurrent introductions and spread of Alpha and other variants (Eta, A.23.1) in other regions of the continent may have reduced the Beta variant’s initial growth, limiting its spread to largely southern Africa, and to a lesser extent the East Africa region. Beta spread to at least 114 countries globally, including 37 countries and territories in Africa. For this variant, viral circulation and geographical exchanges occurred predominantly within the continent. Indeed, phylogeographic reconstruction from a globally case-sensitive sampling revealed that of the 810 (95% CI: 803 - 818) inferred introductions of the Beta variant into African countries, only 110 (95% CI: 105 - 115; 13%) were attributed to sources outside the continent (fig. S13C), while more than half of introductions were attributed to South Africa (63%) (Fig. 4C). This is in line with expectations as the variant originated in South Africa. Beyond southern Africa, most of the introductions back into the continent were attributed to France and other EU countries into the French overseas territories, Mayotte and Reunion, and other Francophone African countries. Africa-focused phylogeographic analysis revealed a similar spatial pattern showing southern countries as substantial sources of the variant, followed in small numbers by countries in East Africa (Fig. 4C).

The fourth VOC observed was Delta ( 13 ), which rose to prominence in April 2021 in India, where it fuelled an explosive second wave. Since its emergence, Delta has been detected in >170 countries, including 37 African countries and territories (fig. S11). Our global case-sensitive subsampled analysis infers at least 100 (95% CI: 93 - 106) introductions of the Delta variant into Africa, with the bulk attributed to India (~72%), mainland Europe (~8%), the UK (~5%), and the US (~2.5%). Viral introductions of Delta also occurred from one African country to others, in 7% of inferred introductions. From our Africa-focused phylogeographic inferences, we infer that viral dissemination of Delta within Africa was not restricted to or dominated by any particular region, unlike Alpha and Beta, but rather spread across the entire continent (Fig. 4C). Following introductions from Asia in the middle of 2021, Delta rapidly replaced the other circulating variants (Fig. 4A). For example, in southern African countries, the Delta variant rapidly displaced Beta and by June 2021 was circulating at very high (>90%) frequencies ( 48 ).

The latest VOC, Omicron, was identified and characterized in November 2021, in southern Africa ( 3 ). At the time of writing, the variant has been detected and caused waves of infections in >160 countries including 39 African countries and two overseas territories (fig. S11). Due to the genetic distance between them and their sequential epidemic expansion globally (rather than simultaneous), phylogenies were reconstructed separately for Omicron BA.1 and BA.2. Our discrete ancestral state reconstruction from a global case-sensitive sampling for Omicron BA.1 infers at least 55 (95% CI: 47 - 62) viral exports of BA.1 out of various African countries, of which 31 (95% CI: 25 - 36) were toward Europe and 8 (95% CI: 6 - 10) toward North America (Fig. 4B). Following explosive expansion of Omicron around the world, we inferred even more reintroductions of the variant back into Africa, at least 69 (95% CI: 60 - 78) from Europe and 102 (95% CI: 92 - 112) from North America (Fig. 4B). From our Africa-focused phylogeographic reconstructions, we determine that, as with Delta, routes of dissemination of this variant involved all regions of the continent spatially (Fig. 4C). Yet, ~75% of all BA.1 viral movement volume in Africa happened between southern African countries, likely due to rapid epidemic expansion in the region soon after its detection ( 3 ). Omicron BA.2’s reach in Africa was limited at the time of writing, with only 3 260 sequences from 19 countries attributed to BA.2 on GISAID (Date of access: 2022-03-31) (15% of all Omicron sequences from Africa). Our discrete ancestral state reconstruction from a global case-sensitive sampling for Omicron BA.2 infers at least 68 (95% CI: 53 - 84) viral exports out of African countries, of which the majority were toward Europe (~88%) (Fig. 4B). 
We also infer at least 99 (95% CI: 87 - 109) separate introduction or reintroduction events of BA.2 back into African countries, of which ~65% are from Europe and ~30% from Asia, primarily from India (Fig. 4B). This is consistent with India having experienced one of the earliest large BA.2 waves globally. In the context of global incidence of BA.2, this case-sensitive phylogeographic analysis revealed that only 0.01% of viral movements of this lineage globally happened from one African country to another. Our Africa-focused analysis inferred a pattern of BA.2 spatial diffusion within Africa similar to that of BA.1 (Fig. 4C). However, given that this accounted for such a small percentage of global BA.2 movements, BA.2 diffusion from one African country to another is unlikely to have had a significant impact on epidemiological expansion, compared to introductions from Asia, Europe or North America.

Globally, dissemination of the SARS-CoV-2 virus throughout the pandemic was intricately linked with human mobility patterns ( 49 – 53 ). To assess the validity of the VOC movement patterns that we infer into and within the African continent in this study, we compared viral import and export events to and from South Africa with travel to the country. In December 2020, the UK accounted for the 5th highest number of passengers entering South Africa, while the other countries among the top 9 sources of travellers were all neighboring countries in southern Africa (fig. S14A). Considering that incidence of the Alpha variant was insignificant in the region, this supports our inference of the UK contributing 60% of Alpha introductions to South Africa (fig. S15A). In March 2021, the US, Germany, the UK and India were among the top 12 sources of travellers to South Africa, behind 8 African countries (fig. S14B). During this time of Delta dissemination globally, we infer that ~90% of introductions of Delta into South Africa originated in the UK, the US and India (fig. S15B). At the end of 2021, most introductions or re-introductions of Omicron to the country came from the UK, the US or Botswana, corresponding to locations with both high Omicron incidence at the time and high numbers of passengers to South Africa (figs. S14C and S15C). These travel patterns also fit the findings that ~89%, ~70% and ~75% of Beta, Delta and Omicron exports, respectively, from South Africa to other African countries were directed to locations in southern Africa (figs. S14, D and E, and S15, D and E).

Discussion, limitations and conclusions

By April 2020, a total of 20 African countries were able to sequence the virus within their own borders. This was largely made possible by other pre-existing sequencing efforts on the continent focused on other human pathogens (e.g., HIV, TB, Ebola and H1N1). However, these efforts were quickly limited by global supply chain issues and in many countries sequencing efforts dramatically slowed down or stopped toward the end of 2020. In order to facilitate more sequencing on the continent over the course of the past year (April 2021 - March 2022) the Africa CDC and partners invested heavily to support genomic surveillance on the continent. This included the transfer of 24 new sequencing platforms (including MinION, GridION, MiSeq and NextSeq), the distribution of reagents and flow cells to support the sequencing of 100 000 positive samples, the training of >230 students and technicians in wet laboratory and bioinformatic techniques and additional grants to support 10 regional sequencing hubs. This investment has started bearing fruit and should be intensified as the virus continues to evolve, requiring the adaptation of methodologies locally on the continent to keep pace with the emergence of variants. The continued development of sequencing protocols in Africa is of crucial importance ( 41 , 54 , 55 ) given the number of variants and lineages that emerged in, and were introduced to, the continent. In Northern Africa, the SARS-CoV-2 pandemic was caused by waves of infections that were similar to those seen in Europe (first wave = B.1 descendants, second wave = Alpha, third wave = Delta and fourth wave = Omicron). In southern Africa the pattern was similar but with a Beta wave instead of an Alpha one. In East Africa, the pandemic was more complex, involving both Alpha and Beta as well as its own lineage A.23.1 before the arrival of Delta and Omicron. Central Africa experienced epidemic patterns sometimes mirroring East Africa and other times southern Africa.
In West Africa, Eta made a significant contribution to both a second wave (together with Alpha) and a third wave (together with Delta). The factors that resulted in these regional differences are not clear but could be due to differences in human mobility, founder effects, competition between lineages or the immunity induced by earlier waves in a region.

Public health benefits of such broadly inclusive genomic surveillance are manifold. The most prominent insight from this expanded genomic surveillance in Africa has been an early warning capacity for the world following the detection of new lineages and variants, most recently relevant in the detection of Omicron BA.1, BA.2, BA.3, BA.4 and BA.5 sub-variants ( 3 , 4 , 34 ). Furthermore, the reporting of local SARS-CoV-2 sequences made the epidemic more immediate to the Ministries of Health from the reporting African countries. It became clear early on that the viral evolution is global and the transmission of the virus is extremely rapid which guided mitigation strategies. The generation and the availability of local sequences also validated local diagnostics and allowed investigators to determine if nucleic acid based diagnostics in use could still detect local variants. The detection of SARS-CoV-2 in returning travellers and truck drivers indicated routes that the virus might be using to enter a country and guided early efforts to slow the virus entry and gain time to establish vaccination plans. Later the difficulty of stopping the virus at borders combined with the data that the variants were already in community circulation allowed public health officials to focus efforts and limited resources on vaccination rather than on border controls. The detection and reporting of the more recent lineages with enhanced transmission (i.e., Omicron) and the ability to bypass existing immunity is important information and an early alert to the public health officials globally that the epidemic was still proceeding. As the pandemic progresses in an evolving global context, we provide evidence that with each new variant, transmission dynamics are changing and the use of sequencing with phylogenetics could potentially alter decisions of public health measures. 
For example, the demonstrated shift away from regional dynamics of Alpha and Beta toward more global patterns with Delta and Omicron can provide insights to public health officials as they anticipate epidemic developments locally. With Omicron it became clear that although the variant expanded first in Africa, the continent ultimately had a minimal role in global dissemination, and continental expansion beyond southern Africa was most influenced by external introductions, in contrast to the Beta variant. All of these public health benefits of sequencing SARS-CoV-2 are primarily amplified, as we show in this study, if the sequencing can be conducted locally within a country, which strongly supports continued investment in pathogen sequencing on the continent.

In spite of the recent successful expansion of genomics surveillance in Africa, additional work remains necessary. Even with the Africa CDC - Africa PGI’s and other investments, there are still 16 countries with no sequencing capacity within their own borders. These countries' only option is to send samples to continental sequencing hubs or to centers outside of the continent, which increases turnaround times and limits the utility of genomic surveillance for public health decision making. Secondly, not all countries are willing to share data openly in a timely fashion for fear of being subject to travel bans or restrictions which could bring substantial economic harm. Such hesitancy has obvious potential ramifications for the future of genomic surveillance on the continent. Furthermore, with the expansion of sequencing on the continent there is a growing need for more bioinformatics support and knowledge to allow investigators to analyze and report their data in a timeframe that makes it useful for public health response. It is also clear that SARS-CoV-2 sequencing primers are not static and may require updating as the virus evolves. A number of research groups have been addressing the SARS-CoV-2 sequencing primer questions. Issues of gaps in the genomes due to missing amplicons have been discussed ( 56 , 57 ). The ARTIC primer set has gone through a number of revisions to accommodate virus evolution ( 39 , 40 ). Additional longer amplicon methods have been published ( 58 – 60 ) including methods to use a subset of ARTIC primers ( 61 ).

The patterns we describe here are of course limited to reported cases, a limitation that applies to both the phylogeographic and the epidemiological inferences. As such, the results need to be interpreted with these limitations in mind. Our primary phylogeographic inference relied on a sampling strategy considering all high quality African sequences and an equal number of external references. Though this strategy has the advantage of placing all African sequences in a phylogenetic context, it introduces a bias when applied to discrete ancestral state reconstruction as more internal nodes are inferred to be from Africa. To address this we performed an even sampling of global cases, based on reported case counts through time, to compare against our oversampled inference. The even sampling approach has the benefit that the discrete ancestral state reconstruction is not biased by uneven sampling. Comparing the two, there are obvious differences, most notably that the number of inferred introductions into Africa is proportional to sampling proportions (fig. S16), as we no longer consider all African sequences but just a small subset against a global sample. However, inferences from the two approaches correspond well with one another. For example, considering Alpha we still observed the vast majority of introductions into Africa to originate from Western Europe. Patterns of dissemination within Africa are more robustly comparable between the two, for instance that countries in West Africa were the biggest source of Alpha within the continent. High concordance between the two inference methods was also observed for the other VOCs for dispersal routes within Africa, which gives us confidence in the inferred patterns we observe here. Although we present inferences based on both oversampling and case-sensitive sampling, it is currently not possible to explore how undersampling affects the phylogeographic reconstruction due to uneven testing rates.
Additionally, the robustness of the phylogeographic inference can also be affected by the underlying methodology used. Broad consensus would favor the use of Bayesian methods for phylogeographic reconstruction, which is often considered to be the “gold standard” in the field. The main drawbacks of Bayesian methods are that they can only be applied to a relatively small number of sequences at a time (<1,000) and are extremely computationally and time intensive. Given the explosion of sequence data over the past two years, the scientific community will have to adapt or put forth new analytical methods to fully capitalize on the global sequencing efforts for SARS-CoV-2.

Despite our best attempts to consider and minimize genomic sampling bias, the accuracy of the resulting phylogenetic inferences is limited by the available epidemiological and genomic data, leading to unaccounted biases in the estimates of viral movements. This includes limited testing and subsequent sequencing in many African countries. Although the percentage of reported cases sequenced in African countries (0.01 - 10%, mean = 1.27%) is not far from global figures (0.01-16%, mean = 1.31%), testing rates and infection-to-detection ratios in Africa were some of the lowest globally ( 38 , 62 ). Together with estimates of excess mortality being as much as 20-fold more than the reported numbers in African countries ( 63 ), these are strong indications of undetected and underreported epidemic sizes in Africa, leading to undersampling of genomic data ( 62 ) and thus underestimates of viral exchange inferences in our study. Some countries with no publicly available SARS-CoV-2 sequences are by definition completely missing in our inference. This in turn means that inferred routes of viral transmission within Africa could be missing important intermediate locations, although this is potentially true around the world. Nevertheless, we believe that the viral movement inferences that we discuss in this study provide a likely qualitative description of the patterns of SARS-CoV-2 migration into, out of, and within Africa.

Finally, we should also mention uneven sequencing and reporting standards across the different laboratories on the continent - and globally, for that matter. Different groups use different measures for what constitutes a high quality sequence (e.g., 70% vs 80% sequence coverage) or use different sequencing depth coverage. This lack of standardization globally complicates the direct comparison of sequences that may have been submitted to GISAID using different criteria, further biasing any inference. Given the sheer size of SARS-CoV-2 sequencing, with ~10 million whole genome sequences shared on the GISAID database (31st March 2022), there is an urgent need for global standards with regards to sequence quality and associated metadata.

In conclusion, Africa needs to continue expanding genomic sequencing technologies on the continent in conjunction with diagnostics capabilities. This holds true not just for SARS-CoV-2 but for other emerging or re-emerging pathogens on the continent. For example, WHO announced in February 2022 the re-emergence of wild polio in Africa, while sporadic influenza H1N1, measles and Ebola outbreaks continue to occur on the continent. The Africa CDC has estimated that over 200 pathogen outbreaks are reported across the continent every year. Beyond the current pandemic, continued investment in diagnostic and sequencing capacity for these pathogens could serve the public health of the continent well into the 21st century.

Materials and methods

Ethics statement

This project relied on sequence data and associated metadata publicly shared by the GISAID data repository and adheres to the terms and conditions laid out by GISAID ( 16 ). The African samples processed in this study were obtained anonymously from material exceeding the routine diagnosis of SARS-CoV-2 in African public and private health laboratories. Individual institutional review board (IRB) references or material transfer agreements (MTAs) for countries are listed below.

Angola - (MTA - CON8260), Botswana - Genomic surveillance in Botswana was approved by the Health Research and Development Committee (Protocol HPDME 13/18/1), Egypt - Surveillance in Egypt was approved by the Research Ethics Committee of the National Research Centre (Egypt) (protocol number 14 155, dated March 22, 2020), Kenya - samples were collected under the Ministry of Health protocols as part of the national COVID-19 public health response. The whole genome sequencing study protocol was reviewed and approved by the Scientific and Ethics Review Committee (SERU) at Kenya Medical Research Institute (KEMRI), Nairobi, Kenya (SERU protocol #4035), Nigeria – (NHREC/01/01/2007), Mali - study of the sequence of SARS-CoV-2 isolates in Mali - Letter of Ethical Committee (N0-2020 /201/CE/FMPOS/FAPH of 09/17/2020), Mozambique - (MTA - CON7800), Malawi - (MTA - CON8265), South Africa - The use of South African samples for sequencing and genomic surveillance was approved by University of KwaZulu-Natal Biomedical Research Ethics Committee (ref. BREC/00001510/2020); the University of the Witwatersrand Human Research Ethics Committee (HREC) (ref. M180832); Stellenbosch University HREC (ref. N20/04/008_COVID-19); the University of the Free State Research Ethics Committee (ref. UFS-HSD2020/1860/2710) and the University of Cape Town HREC (ref. 383/2020), Tunisia - for sequences derived from sampling in Tunisia, all patients provided their informed consent to use their samples for sequencing of the viral genomes. The ethical agreement was provided to the research project ADAGE (PRFCOVID19GP2) by the Committee of protection of persons (Tunisian Ministry of Health) under the reference (CPP SUD N 0265/2020), Uganda - The use of samples and sequences from Uganda was approved by the Uganda Virus Research Institute - Research and Ethics Committee (UVRI-REC Federalwide Assurance [FWA] FWA No. 00001354, study reference - GC/127/20/04/771) and by the Uganda National Council for Science and Technology (reference number - HS936ES), and Zimbabwe - (MTA - CON8271).

Epidemiological and genomic data dynamics

We analyzed trends in daily numbers of cases of SARS-CoV-2 in Africa up to 31st March 2022 from publicly released data provided by the Our World in Data repository for the continent of Africa (https://github.com/owid/covid-19-data/tree/master/public/data) as a whole and for individual countries ( 2 ). To provide a comparable view of epidemiological dynamics over time in various countries, the variable under primary consideration for Fig. 1 was ‘new cases per million (smoothed)’. To calculate the genomic sampling proportion and frequency for each country for Fig. 2, the total number of recorded cases at 31st March was considered, as well as the total length of time for which each country has recorded cases of SARS-CoV-2.
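The two per-country quantities used for Fig. 2 can be illustrated with a minimal sketch. The exact definitions are assumptions here (percentage of reported cases with a genome, and genomes per week over the period in which the country recorded cases), and the function and parameter names are hypothetical rather than taken from the authors' code.

```python
from datetime import date

def sampling_summary(n_genomes, total_cases, first_case, cutoff=date(2022, 3, 31)):
    """Return (percent of reported cases sequenced, genomes per week).

    Assumed definitions, for illustration only:
      proportion = 100 * genomes / cumulative reported cases at the cutoff
      frequency  = genomes / weeks between the first recorded case and the cutoff
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    # Inclusive day count, converted to weeks.
    weeks = ((cutoff - first_case).days + 1) / 7
    if weeks <= 0:
        raise ValueError("first_case must not be after the cutoff")
    proportion_pct = 100 * n_genomes / total_cases
    genomes_per_week = n_genomes / weeks
    return proportion_pct, genomes_per_week

# Hypothetical country: 1 400 reported cases, 14 genomes, first case one week before cutoff.
pct, per_week = sampling_summary(14, 1400, date(2022, 3, 25))
```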

Genomic metadata was downloaded for all African entries on GISAID for the same time period (date of access: 31st March 2022). From this, information extracted from all entries for this study included: date of sampling, country of sampling, viral lineage and clade, originating laboratory, sequencing laboratory, and date of submission to the GISAID database. The geographical locations of the originating and sequencing laboratories were manually curated. Sequences originating and sequenced in the same country were defined as locally sequenced, irrespective of specific laboratory or finer location. Sequences originating in one African country and sequenced in another were defined as sequenced within regional sequencing networks. Sequences sequenced in a location not within Africa were labeled as sequenced outside Africa. Sequencing turnaround time was defined as the number of days elapsed from specimen collection to sequence submission to GISAID. Sequencing technology information for all African entries was also downloaded from GISAID on 31st March 2022.
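The three sequencing-location categories and the turnaround time defined above can be sketched as follows. This is an illustration under stated assumptions: the originating country is assumed to be African, the country set is a tiny illustrative subset, and all names are hypothetical.

```python
from datetime import date

# Illustrative subset only; a real analysis would list every African country.
AFRICAN_COUNTRIES = {"Kenya", "Uganda", "South Africa", "Nigeria"}

def classify_sequencing(origin_country, sequencing_country, african_countries=AFRICAN_COUNTRIES):
    """Label where a sample from an African country was sequenced."""
    if sequencing_country == origin_country:
        return "local"            # originated and sequenced in the same country
    if sequencing_country in african_countries:
        return "regional"         # sequenced in another African country
    return "outside Africa"       # sequenced at a location not within Africa

def turnaround_days(collection_date, submission_date):
    """Days elapsed from specimen collection to submission to the database."""
    return (submission_date - collection_date).days

label = classify_sequencing("Uganda", "South Africa")
delay = turnaround_days(date(2021, 1, 1), date(2021, 1, 31))
```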

Primer choice and sequencing outcomes

All SARS-CoV-2 genomes from African countries were retrieved from GISAID ( 16 ) for submission dates from 1 December 2019 to 31st March 2022, yielding 100 470 entries. Associated metadata for the entries were also retrieved, including collection date, submission date, country, viral strain and sequencing technology. Data on the primers used for the sequencing were requested from investigators and yielded primer data for 13 973 of the entries (~14%). The total number of N bases (positions with low sequence depth) per genome was counted, and the results were then used for genome quality analysis and visualization. Gap locations in the genomes were mapped and visualized compared to the original Wuhan strain ( 64 ).
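Counting N bases per genome is simple to reproduce. The sketch below is one possible way to do it from FASTA-formatted lines using only the Python standard library; it is not the authors' tooling, which the text does not specify, and the function name is hypothetical.

```python
def count_n_per_genome(fasta_lines):
    """Map each FASTA record name to its number of N (unspecified) bases."""
    counts = {}
    name = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record.
            name = line[1:]
            counts[name] = 0
        elif name is not None:
            # Sequence line: count N regardless of case.
            counts[name] += line.upper().count("N")
    return counts

# Toy records standing in for downloaded genomes.
example = [">genome_a", "ACGTNN", "nnAC", ">genome_b", "ACGT"]
n_counts = count_n_per_genome(example)
mean_n = sum(n_counts.values()) / len(n_counts)
```

The per-genome counts can then be grouped by month, sequencing technology, primer set or lineage to obtain the stratified means shown in Fig. 3.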

Phylogenetic investigation

All African sequences on the GISAID sequence database ( 16 ) were downloaded on the 31st of March 2022 (n=100 470). Of these, Alpha accounted for 3 851 sequences, Beta for 14 548, Delta for 35 027 and Omicron for 21 708, while 25 336 sequences were classified as non-VOCs. Prior to any phylogenetic inference we performed a quality assessment to exclude incomplete or problematic sequences as well as sequences lacking complete metadata. Briefly, all African sequences were passed through the NextClade analysis pipeline ( 65 ) in order to identify and exclude: (i) sequences missing >10% of the SARS-CoV-2 genome, (ii) sequences that deviate by >70 nucleotides from the Wuhan reference strain, (iii) sequences with >10 ambiguous bases, (iv) sequences with clustered mutations, and (v) sequences flagged with private mutations by NextClade. Additionally, Omicron variants were screened for traces of viral recombination with RDP5.23 ( 66 ) using default settings and a p-value of ≤0.05 as evidence of recombination. A large number of sequences were removed (n=57 421), with incomplete sequences (<90% genome coverage) being the biggest contributor. This produced a final African dataset of 43 049 high-quality sequences. Due to the sheer size of the dataset we opted to perform independent phylogenetic inferences on the main VOCs (Alpha, Beta, Delta and Omicron BA.1 and BA.2) that have spread on the African continent, as well as a separate inference for all non-VOC SARS-CoV-2 sequences.
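
The exclusion criteria listed above can be expressed as a simple filter. The sketch below is hedged: the `QcRecord` fields are hypothetical names chosen for illustration and do not correspond to actual NextClade output columns; only the thresholds and the sequence counts are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class QcRecord:
    # Hypothetical per-sequence QC summary; field names are illustrative only.
    coverage: float              # fraction of the genome covered, 0 to 1
    substitutions: int           # nucleotide differences from the reference
    ambiguous_bases: int
    clustered_mutations: bool
    private_mutation_flag: bool


def passes_qc(r):
    """Apply criteria (i) to (v) from the text; True means the sequence is kept."""
    return (
        r.coverage >= 0.90               # (i) missing no more than 10% of the genome
        and r.substitutions <= 70        # (ii)
        and r.ambiguous_bases <= 10      # (iii)
        and not r.clustered_mutations    # (iv)
        and not r.private_mutation_flag  # (v)
    )


# Consistency check using the counts reported in the text.
VARIANT_COUNTS = {"Alpha": 3_851, "Beta": 14_548, "Delta": 35_027,
                  "Omicron": 21_708, "non-VOC": 25_336}
TOTAL_DOWNLOADED = 100_470
REMOVED = 57_421
print(sum(VARIANT_COUNTS.values()) == TOTAL_DOWNLOADED)
print(TOTAL_DOWNLOADED - REMOVED)
```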

In order to evaluate the spread of the virus on the African continent we aligned the African datasets against a large number of globally representative sequences. Due to the oversampling of some variants or lineages we performed random downsampling while retaining the oldest two known variants from each country. Reference sequences were aligned with their African counterparts independently with NextAlign ( 65 ). Each of the alignments was then used to infer maximum likelihood (ML) tree topologies in FastTree v2.0 ( 67 ) using the General Time Reversible (GTR) model of nucleotide substitution and a total of 100 bootstrap replicates ( 68 ). The resulting ML tree topologies were first inspected in TempEst ( 69 ) to identify any sequences that deviate by more than 0.0001 from the residual mean. Following the removal of potential outliers in R with the ape package ( 70 ), the resulting ML trees were transformed into time-calibrated phylogenies in TreeTime ( 71 ) by applying a rate of 8 × 10⁻⁴ substitutions per site per year ( 72 ) in order to transform the branches into units of calendar time. Time-calibrated trees were then visualized along with associated metadata in R using ggtree ( 73 ) and other packages.
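
The downsampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: records are assumed to be `(sequence_id, country, collection_date)` tuples with ISO-formatted dates, the two earliest records per country are always kept, and the remainder is filled by uniform random sampling. The function name `downsample` and the toy records are hypothetical.

```python
import random
from collections import defaultdict


def downsample(records, n_target, seed=0):
    """Keep the two earliest records per country, then fill randomly.

    records: list of (seq_id, country, iso_date) tuples (assumed layout).
    """
    by_country = defaultdict(list)
    for rec in records:
        by_country[rec[1]].append(rec)

    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    keep = []
    for recs in by_country.values():
        keep.extend(sorted(recs, key=lambda r: r[2])[:2])

    kept_ids = {r[0] for r in keep}
    pool = [r for r in records if r[0] not in kept_ids]
    n_extra = max(0, min(len(pool), n_target - len(keep)))
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return keep + rng.sample(pool, n_extra)


toy = [(f"A{i}", "X", f"2021-01-{i + 1:02d}") for i in range(10)]
toy += [(f"B{i}", "Y", f"2021-02-{i + 1:02d}") for i in range(10)]
subset = downsample(toy, 8)
print(len(subset))
```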

We performed a basic viral dispersal analysis for each of the VOCs (excluding Gamma), as well as for the non-VOC dataset. Briefly, a migration model was fitted to each of the time-calibrated tree topologies in TreeTime, mapping the country location of sampled sequences to the external tips of the trees. The mugration model of TreeTime also infers the most likely location for internal nodes in the trees. Using a custom python script we then counted the number of state changes by iterating over each phylogeny from the root to the external tips. We counted a state change when an internal node transitioned from one country to a different country in the resulting child node or tip(s). The timing of each transition event was then recorded and serves as the estimated date of the import or export event. To infer some confidence around these estimates, we performed ten replicates for each of the datasets by random selection from the 100 bootstrap trees. Due to the high uncertainty in the inferred locations for deep internal nodes in the trees we truncated state changes to the earliest date of sampling in each dataset. All data analytics were performed using custom python and R scripts and results were visualized using the ggplot libraries ( 74 ). Such phylogeographic methods are always subject to uneven sampling through time (i.e., over the course of the pandemic) and through space (by sampling location). To address this we performed a case sensitive analysis to investigate the effects of oversampling African locations on the inferred number of viral introductions. Furthermore, in a previous analysis ( 15 ) we performed a sensitivity analysis to address some of these issues and found no substantial variations in estimates.
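
The counting of state changes can be illustrated with a small, self-contained Python sketch. It is not the custom script used in the study: the `Node` class is a hypothetical stand-in for an annotated tree node, each event is dated at the child node (an assumption), and the truncation to the earliest sampling date is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical annotated tree node: inferred country and decimal-year date.
    country: str
    date: float
    children: list = field(default_factory=list)


def count_transitions(root):
    """Walk the tree from root to tips and record country changes.

    Returns a list of (origin, destination, date) tuples, with each event
    dated at the child node (an assumption made for this sketch).
    """
    events = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if child.country != node.country:
                events.append((node.country, child.country, child.date))
            stack.append(child)
    return events


# Toy tree: one export to Kenya and one to the United Kingdom.
tree = Node("South Africa", 2020.9, [
    Node("South Africa", 2021.0, [
        Node("Kenya", 2021.1),
        Node("South Africa", 2021.2),
    ]),
    Node("United Kingdom", 2021.05, [Node("United Kingdom", 2021.3)]),
])
events = count_transitions(tree)
print(sorted(events))
```

Repeating such a count over several bootstrap trees, as described above, would give a range of import and export estimates rather than a single value.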

Case sensitive phylogeographic inference

To address the potential oversampling of African sequences relative to global references in the above-mentioned analyses we performed another phylogeographic inference on subsamples based on global case counts, to try to eliminate oversampling bias in our inference. To this end, we considered all high-quality sequences for each of the VOCs (Alpha, Beta, Delta and Omicron BA.1 and BA.2) globally over the same sampling period (until 31st of March 2022). We used subsampler (https://github.com/andersonbrito/subsampler) to generate subsamples for each variant based on globally reported cases. In short, subsampler uses a case count matrix of daily cases, along with the fasta sequences and GISAID associated metadata, to sample a user-defined number of sequences. For each VOC and for BA.1 and BA.2 we performed 10 samplings using different random number seeds in order to sample datasets of ~20 000. Once again, sampled sequences were screened for viral recombination as described above and sequences with signs of recombination were removed. Subsampler has the added advantage that it disregards poor quality sequences (e.g., <90% coverage) and sequences with missing metadata (e.g., exact date of sampling). Each dataset was then subjected to the same analytical pipeline as mentioned above to infer the viral transitions between Africa and the rest of the world.
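
The idea of sampling in proportion to reported cases can be sketched as below. This is explicitly not subsampler's implementation, which works from a time-resolved case matrix. It is a simplified, hypothetical allocation in which each location receives a share of the target proportional to its case count, capped by the number of genomes available. The function name and all numbers are illustrative.

```python
def allocate_by_cases(cases, available, n_target):
    """Allocate a sequence quota per location in proportion to reported cases.

    cases:     {location: reported case count}
    available: {location: number of genomes available}
    Quotas are capped at the number of available genomes, so the total can
    fall short of n_target.
    """
    total = sum(cases.values())
    if total <= 0:
        raise ValueError("total case count must be positive")
    quota = {}
    for location, n_cases in cases.items():
        share = round(n_target * n_cases / total)
        quota[location] = min(share, available.get(location, 0))
    return quota


# Hypothetical example: location B has fewer genomes than its share.
quota = allocate_by_cases(
    cases={"A": 600, "B": 300, "C": 100},
    available={"A": 1000, "B": 20, "C": 500},
    n_target=100,
)
print(quota)
```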

Regional and country specific NextStrain builds

In order to investigate more granular changes in lineage dynamics within a specific country or region in Africa we utilized the NextStrain pipeline (https://github.com/nextstrain/ncov) to generate the regional and country-specific builds for African countries ( 75 ). First, all sequence data and metadata were retrieved from the GISAID sequence database and filtered for Africa based on the 'region' tab, for inclusion in regional- and country-specific African builds. For country-specific builds, ~4 000 sequences from a given country were randomly selected and analyzed against ~1 000 randomly selected sequences from the Africa 'nextregions' records that do not match the focal country of interest. For region-specific builds (e.g., West Africa), ~4 000 sequences from the focal region were selected at random and analyzed against ~1 000 randomly selected sequences from the Africa 'nextregions' records that do not match the focal region of interest. The methodological pipeline for NextStrain is well documented and performs all analyses within one workflow, including filtering of sequences, alignment, tree inference, molecular clock and ancestral state reconstruction. For more information please visit https://docs.nextstrain.org/en/latest/index.html.

All region- and country-specific builds are regularly updated to keep track of the evolving pandemic on the continent. All builds are publicly available under the links provided in tables S1 and S2 as well as on the NextStrain webpage (https://nextstrain.org/sars-cov-2/#datasets).

Acknowledgments

First and foremost, we acknowledge authors in institutions in Africa and beyond who have made invaluable contributions toward specimen collection and sequencing to produce and share, via GISAID, SARS-CoV-2 genomic data. We also acknowledge the authors from the originating and submitting laboratories worldwide, who generated and shared SARS-CoV-2 sequence data, via GISAID, from other regions in the world, which were used to contextualize the African genomic data. A full list of the GISAID sequence IDs used in the current study is available in table S4.

Funding: Sequencing efforts in the African Union Member States were supported by the Africa Center for Disease Control (Africa CDC) - Africa Pathogen Genomics Initiative (Africa PGI), and the World Health Organization Regional Office for Africa (WHO AFRO) through the transfer of laboratory infrastructure, the provision of reagents and training. The Africa PGI is supported by the African Union, Centers for Disease Control and Prevention (CDC), Bill and Melinda Gates Foundation (BMGF), Illumina Inc, Oxford Nanopore Technologies (ONT) and other partners. In addition, all Institut Pasteur organizations and CERMES in Niger are part of the REPAIR COVID-19-Africa project, which is funded by the French Ministry for Europe and Foreign Affairs. KRISP and CERI are supported in part by grants from WHO, the Abbott Pandemic Defense Coalition (APDC), the National Institute of Health USA (U01 AI151698) for the United World Antivirus Research Network (UWARN) and the INFORM Africa project through IHVN (U54 TW012041), H3BioNet Africa (Grant # 2020 HTH 062), the South African Department of Science and Innovation (SA DSI) and the South African Medical Research Council (SAMRC) under the BRICS JAF #2020/049. ILRI is also supported by the German Federal Ministry for Economic Cooperation and Development (BMZ). Work conducted at ACEGID is made possible by support provided to ACEGID by a cohort of generous donors through TED’s Audacious Project, including the ELMA Foundation, MacKenzie Scott, the Skoll Foundation, and Open Philanthropy. 
Work at ACEGID was also partly supported by grants from the National Institute of Allergy and Infectious Diseases (https://www.niaid.nih.gov), NIH-H3Africa (https://h3africa.org) (U01HG007480 and U54HG007480), the World Bank (projects ACE-019 and ACE-IMPACT), the Rockefeller Foundation (Grant #2021 HTH), the Africa CDC through the African Society of Laboratory Medicine (ASLM; Grant #INV018978), the Wellcome Trust (Project 216619/Z/19/Z) and the Science for Africa Foundation. Sequencing efforts at the National Institute for Communicable Diseases (NICD) were also supported by a conditional grant from the South African National Department of Health as part of the emergency COVID-19 response; a cooperative agreement between the NICD of the National Health Laboratory Service (NHLS) and the United States Centers for Disease Control and Prevention (FAIN# U01IP001048; NU51IP000930); the South African Medical Research Council (SAMRC, project number 96838); the ASLM and the Bill and Melinda Gates Foundation grant number INV-018978; the UK Foreign, Commonwealth and Development Office and Wellcome (Grant no 221003/Z/20/Z); and the UK Department of Health and Social Care, managed by the Fleming Fund and performed under the auspices of the SEQAFRICA project. Sequencing efforts in Angola were supported through Projecto Bongola (N.º 11/MESCTI/PDCT/2020) and OGE INIS (2020/2021). Botswana’s sequencing efforts, led by the Botswana Harvard AIDS Institute Partnership, were supported by the Foundation for Innovative New Diagnostics (FINDdx); BMGF, H3ABioNet [U41HG006941], Sub-Saharan African Network for TB/HIV Research Excellence (SANTHE) and Fogarty International Center (Grant # 5D43TW009610). H3ABioNet is an initiative of the Human Heredity and Health in Africa Consortium (H3Africa) program of the African Academy of Sciences (AAS). 
HHS/NIH/National Institute of Allergy and Infectious Diseases (NIAID) (5K24AI131928-04; 5K24AI131924-04); SANTHE is a DELTAS Africa Initiative [grant # DEL-15-006]. The DELTAS Africa Initiative is an independent funding scheme of the African Academy of Sciences (AAS)’s Alliance for Accelerating Excellence in Science in Africa (AESA) and supported by the New Partnership for Africa’s Development Planning and Coordinating Agency (NEPAD Agency) with funding from the Wellcome Trust [grant #107752/Z/15/Z] and the UK government. From Brazil, Joicymara Santos Xavier was funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) - Finance Code 001. Sequencing efforts from Côte d’Ivoire were funded by the Robert Koch Institute and the German Federal Ministry of Education and Research (BMBF). Sequencing efforts in the Democratic Republic of the Congo were funded by the Bill & Melinda Gates Foundation under grant INV-018030 awarded to CBP and further supported by funding from the Africa CDC through ASLM for Accelerating SARS-CoV-2 Genomic Surveillance in Africa, the US Centre for Disease Control and Prevention (US CDC), USAMRIID, IRD/Montpellier, UCLA and SACIDS FIND. Efforts from Egypt were funded by the Egyptian Ministry of Health, the Egyptian Academy for Scientific Research and Technology (ASRT) JESOR project #3046 (Center for Genome and Microbiome Research), the Cairo University anti COVID-19 fund and the Science and Technology Development Fund (STDF), Project ID: 41907. The sequencing effort in Equatorial Guinea was supported by a public–private partnership, the Bioko Island Malaria Elimination Project, composed of the government of Equatorial Guinea Ministries of Mines and Hydrocarbons, and Health and Social Welfare, Marathon EG Production Limited, Noble Energy, Atlantic Methanol Production Company, and EG LNG. 
Analysis for the Gabon strains was supported by the Science and Technology Research Partnership for Sustainable Development (SATREPS), Japan International Cooperation Agency (JICA), and Japan Agency for Medical Research and Development (AMED) (grant number JP21jm0110013) and a grant from AMED (grant number JP21wm0225003). CIRMF (Gabon) is funded by the Gabonese Government and TOTAL Energy inc. CIRMF is a member of CANTAM supported by EDCTP. The work at WACCBIP (Ghana) was funded by a grant from the Rockefeller Foundation (2021 HTH 006), an Institut de Recherche pour le Développement (IRD) grant (ARIACOV), African Research Universities Alliance (ARUA) Vaccine Development Hubs grant with funds from Open Society Foundation, National Institute of Health Research (NIHR) (17.63.91) grants using UK aid from the UK Government for a global health research group for Genomic surveillance of malaria in West Africa (Wellcome Sanger Institute, UK) and the World Bank African Centers of Excellence Impact grant (WACCBIP-NCDs: Awandare). In addition to the funding sources from ILRI, KEMRI (Kenyan) contributions to sequencing efforts were supported in part by the National Institute for Health Research (NIHR) (project references 17/63/82 and 16/136/33) using UK aid from the UK Government to support global health research, and the UK Foreign, Commonwealth and Development Office (FCDO) and Wellcome (grant# 220985/Z/20/Z) and the Kenya Medical Research Institute Grant # KEMRI/COV/SPE/012. Contributions from Lesotho were supported by the Africa CDC, ASLM and SA NICD. Liberian efforts were funded by the Africa CDC through a subaward from the Bill and Melinda Gates Foundation, while efforts from Madagascar were funded by the French Ministry for Europe and Foreign Affairs through the REPAIR COVID-19-Africa project coordinated by the Pasteur International Network association. Sequencing from Malawi was supported by Wellcome Trust. 
Contributions from Mali were supported by Fogarty International Center and National Institute of Allergy and Infectious Diseases sections of the National Institutes of Health under Leidos-15X051, award numbers U2RTW010673 for the West African Center of Excellence for Global Health Bioinformatics Research Training and U19AI089696 and U19AI129387 for the West Africa International Center of Excellence for Malaria Research. Funding for surveillance, sampling and testing in Madagascar: World Health Organization (WHO), the US Centers for Disease Control and Prevention (US CDC: Grant#U5/IP000812-05), the United States Agency for International Development (USAID: Cooperation Agreement 72068719CA00001), the Office of the Assistant Secretary for Preparedness and Response in the U.S. Department of Health and Human Services (DHHS: grant number IDSEP190051-01-0200). Funding for sequencing: Bill & Melinda Gates Foundation (GCE/ID OPP1211841), Chan Zuckerberg Biohub, and the Innovative Genomics Institute at UC Berkeley. Mozambique acknowledges support from the Mozambican Ministry of Health and the President’s Emergency Plan for AIDS Relief (PEPFAR) through the U.S. Centers for Disease Control and Prevention (CDC) under the terms of [grant number GH002021, GH001944], and the Bill & Melinda Gates Foundation, #OPP1214435. Namibian efforts were supported by Africa CDC through a subaward from the Bill and Melinda Gates Foundation. Efforts from Niger were supported by the French Ministry for Europe and Foreign Affairs through the REPAIR COVID-19-Africa project coordinated by the Pasteur International Network association. In addition to the funding support for ACEGID already listed, Nigeria’s contributions were made possible by support from Flu Lab and a cohort of donors through the Audacious Project, a collaborative funding initiative housed at TED, including the ELMA Foundation, MacKenzie Scott, the Skoll Foundation, and Open Philanthropy. 
Efforts from the Republic of the Congo were supported by the European and Developing Countries Clinical Trials Partnership (EDCTP) IDs: PANDORA, CANTAM and German Academic Exchange Service (DAAD) IDs: PACE-UP; DAAD Project ID: 5759234. Rwanda’s contributions were made possible by funding from the African Network for improved Diagnostics, Epidemiology and Management of common Infectious Agents (ANDEMIA), granted by the German Federal Ministry of Education and Research (BMBF grants 01KA1606, 01KA2021 and 01KA2110B), and the National Institute of Health Research (NIHR) Global Health Research program (16/136/33) using UK aid from the UK Government. In addition to the South African institutions listed above, the University of Cape Town’s work was supported by the Wellcome Trust [Grant # 203135/Z/16/Z], EDCTP RADIATES (RIA2020EF-3030), the South African Department of Science and Innovation (SA DSI) and the South African Medical Research Council (SAMRC), Stellenbosch University’s contributions by the South African Medical Research Council (SA-MRC), and the University of Pretoria’s contributions by the G7 Global Health Fund and a BMBF ANDEMIA grant. Funding from the Fleming Fund supported sequencing in Sudan. The Ministry of Higher Education and Scientific Research of Tunisia provided funding for sequencing from Tunisia. UVRI (Uganda) acknowledges support from the Wellcome Trust and FCDO - Wellcome Epidemic Preparedness – Coronavirus (AFRICO19, grant agreement number 220977/Z/20/Z), from the MRC (MC_UU_1201412) and from the UK Medical Research Council (MRC/UKRI) and FCDO (DIASEQCO, grant agreement number NC_PC_19060). Research at the FredHutch institute, which supported bioinformatics analyses of sequences in the present study, was supported by the Bill and Melinda Gates Foundation (#INV-018979). 
Research support from Broad Institute colleagues was made possible by support from Flu lab and a cohort of generous donors through TED’s Audacious Project, including the ELMA Foundation, MacKenzie Scott, the Skoll Foundation, Open Philanthropy, the Howard Hughes Medical Institute and NIH (U01AI151812 and U54HG007480) (P.C.S.). Work from Quadram Institute Bioscience was funded by The Biotechnology and Biological Sciences Research Council Institute Strategic Programme Microbes in the Food Chain BB/R012504/1 and its constituent projects BBS/E/F/000PR10348, BBS/E/F/000PR10349, BBS/E/F/000PR10351, and BBS/E/F/000PR10352 and by the Quadram Institute Bioscience BBSRC funded Core Capability Grant (project number BB/CCG1860/1). Sequences generated in Zambia through PATH were funded by the BMGF and Africa CDC. The content and findings reported herein are the sole deduction, view and responsibility of the researcher/s and do not reflect the official position and sentiments of the funding agencies.

Author contributions: Conceptualization: HT, CB, SKT, TdO, RL, EW; Methodology: HT, JES, MC, BT, GM, DPM, AWL, DAR, LMK, GG, TdO, RL, EW; Genomic Data Generation: HT, JES, MC, MM, BT, GM, DPM, AWL, AD, DGA, MMD, AS, ANZ, ASG, AKS, AO, AS, AOM, AKS, AGA, AL, AK, AEA, AAJ, AF, AOO, AAA, AJ, AK, AM, AR, AS, AK, AB, AC, AJT, AC, AKK, AK, AB, AS, AA, AN, AVG, AN, AJP, AY, AV, ANH, AC, AI, AM, ALB, AI, AAS, AG, AF, AES, BM, BLS, BSO, BB, BD, BLH, BT, BL, BM, BN, BTM, BAK, BK, BA, BP, BM, CB, CW, CN, CA, CBP, CS, CGA, CNA, CMM, CL, CKO, CI, CNM, CP, CG, CEO, CDR, CMM, CE, DBL, DJB, DM, DP, DB, DJN, DS, DT, DSA, DG, DSG, DOO, DM, DWW, EF, EKL, ES, EMO, ENN, EOA, EO, ES, EB, EBA, EAA, EL, EM, EP, EB, ES, EAA, FL, FMT, FW, FA, FTT, FD, FVA, FT, FO, FN, FMM, FER, FAD, FI, GKM, GT, GLK, GOA, GUvZ, GAA, GS, GPM, HCR, HEO, HO, HA, HK, HN, HT, HAAK, HE, HG, HM, HK, IS, IBO, IMA, IO, IBB, IAM, IS, IW, ISK, JWAH, JA, JS, JCM, JMT, JH, JGS, JG, JM, JN, JNU, JNB, JY, JM, JK, JDS, JH, JKO, JMM, JOG, JTK, JCO, JSX, JG, JFW, JHB, JN, JE, JN, JMN, JN, JUO, JCA, JJL, JJHM, JO, KJS, KV, KTA, KAT, KSC, KSM, KD, KGM, KOD, LF, LS, LMK, LB, LdOM, LC, LO, LDO, LLD, LIO, LT, MM, MR, MM, ME, MM, MIM, MK, MD, MM, MdLLM, MV, MFP, MF, MMN, MM, MD, MWM, MGM, MO, MRW, MYT, MOA, MA, MAB, MGS, MKK, MMM, MK, MS, MBM, MM, MA, MVP, NA, NR, NA, NI, NE, NMT, ND, NM, NH, NBS, NMF, NS, NB, NM, NG, NW, NS, NN, NAA, NT, NM, NHR, NI, NM, OCK, OS, OF, OMA, OT, OAO, OF, OEO, O-EO, OF, PS, PO, PC, PN, PS, PEO, PA, PKQ, POO, PB, PD, PAB, PKM, PK, PA, RE, RJ, RKA, RGE, RA, RN, ROP, RG, RAK, RMND, RAA, RAC, SG, SM, SB, SS, SIM, SF, SM, SH, SKK, SM, ST, SHA, SWM, SD, SM, SA, SSA, SMA, SE, SM, SL, SG, SJ, SFA, SO, SG, SL, SP, SO, SvW, SFS, SK, SA, SR, SP, SN, SB, SLB, SvdW, TM, TM, TL, TPV, TS, TGM, TB, UJA, UC, UR, UEG, VE, VN, VG, WHR, WAK, WKA, WP, WTC, YAA, YR, YB, YN, YB, ZRdL, AEO, AvG, GG, MM, OT, PCS, AAS, SOO, YKT, SKT, TdO, CH, RL, JN, EW; Data Analysis: HT, JES, MC, MM, BT, GM, DPM, AWL, AIE, DAR, EM, GSK, 
SvW, GG, TdO, RL, EW; Funding acquisition: AEO, AvG, GG, MM, OT, AAS, SOO, YKT, SKT, TdO, CH; Project administration: GM, AD, DGA, MMD, AC, DWW, HO, SWM, AEO, AvG, GG, MM, OT, PCS, AAS, SOO, YKT, SKT, TdO, CH, RL, JN, EW; Supervision: AEO, AvG, GG, MM, OT, PCS, AAS, SOO, YKT, SKT, TdO, CH, RL, JN, EW; Writing – original draft: HT, JES, MC, GM, DPM, CB, SKT, TdO, RL, EW; Writing – review and editing: HT, JES, MC, MM, BT, GM, DPM, CB, AWL, AD, DGA, MMD, AS, ANZ, ASG, AKS, AO, AS, AOM, AKS, AIE, AL, AK, AEA, AAJ, AF, AOO, AAA, AJ, AK, AM, AR, AS, AK, AB, AC, AJT, AC, AKK, AK, AB, AS, AA, AVG, AJP, AY, AV, ANH, AC, AI, AM, ALB, AI, AAS, AG, AF, AES, BM, BLS, BSO, BB, BD, BLH, BT, BL, BM, BN, BTM, BAK, BK, BA, BP, BM, CB, CW, CA, CBP, CS, CGA, CNA, CMM, CL, CKO, CI, CNM, CP, CEO, CDR, CMM, CE, DBL, DJB, DM, DP, DB, DJN, DS, DT, DSA, DG, DSG, DOO, DM, DWW, EF, EKL, ES, EMO, ENN, EOA, EO, ES, EB, EBA, EL, EM, EP, EB, ES, EAA, EM, FL, FMT, FW, FA, FTT, FD, FVA, FT, FO, FN, FMM, FER, FAD, FI, GKM, GT, GLK, GOA, GUvZ, GAA, GSK, GS, GPM, HCR, HEO, HO, HA, HK, HN, HT, HAAK, HE, HG, HM, HK, IS, IBO, IMA, IO, IBB, IS, IW, ISK, JWAH, JA, JS, JCM, JMT, JH, JGS, JG, JM, JNU, JNB, JY, JM, JK, JDS, JH, JKO, JMM, JOG, JTK, JCO, JSX, JG, JHB, JN, JE, JN, JMN, JN, JUO, JCA, JJL, JO, KJS, KV, KTA, KAT, KSC, KSM, KD, KGM, KOD, LF, LS, LB, LdOM, LC, LO, LLD, LIO, MM, MR, MM, ME, MM, MIM, MK, MD, MM, MdLLM, MV, MFP, MF, MMN, MM, MD, MWM, MGM, MO, MRW, MYT, MOA, MA, MAB, MGS, MKK, MMM, MK, MS, MBM, MM, MVP, NA, NR, NI, NMT, ND, NM, NH, NBS, NMF, NS, NB, NM, NG, NW, NS, NN, NAA, NT, NM, NHR, NI, NM, OCK, OS, OF, OMA, OT, OAO, OF, OEO, OF, PS, PO, PC, PN, PS, PEO, PA, PKQ, POO, PB, PD, PAB, PKM, PK, PA, RE, RJ, RKA, RGE, RA, RN, ROP, RG, RAK, RAA, RAC, SG, SM, SS, SIM, SF, SM, SH, SKK, SM, ST, SHA, SWM, SD, SM, SA, SSA, SMA, SE, SM, SL, SG, SJ, SFA, SO, SG, SL, SP, SO, SvW, SFS, SK, SA, SR, SP, SN, SB, SLB, SvdW, TM, TM, TL, TPV, TS, TGM, TB, UJA, UC, UR, UEG, VE, VN, VG, WHR, WAK, WKA, WP, 
WTC, YAA, YR, YB, YN, YB, ZRdL, AEO, AvG, GG, MM, OT, PCS, AAS, SOO, YKT, SKT, TdO, CH, RL, JN, EW.

Competing interests: With the exception of Pardis Sabeti who is a co-founder of and consultant to Sherlock Biosciences and a Board Member of Danaher Corporation and who holds equity in the companies, we the authors have no conflicts of interest to declare.

Data and materials availability: All of the SARS-CoV-2 whole genome sequences analyzed in the present study are publicly available on the GISAID sequence database. A full list of the African sequences as well as global references is presented and acknowledged in table S4 and in our github repository (https://github.com/CERI-KRISP/SARS-CoV-2-epidemic-in-Africa) ( 76 ). The repository also contains all of the metadata, raw and time-scaled ML tree topologies, annotated tree topologies, as well as the data analysis and visualization scripts used here, which will allow for the independent reproduction of results. Furthermore, the repository contains all Institutional Review Board (IRB) approvals and Material Transfer Agreements (MTAs). Please refer to the Ethics Statement in the Methods section for more details.

License information: This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.

Supplementary Materials

This PDF file includes:

Figs. S1 to S16

Tables S1 to S4

Reference (77)

Other Supplementary Material for this manuscript includes the following:

MDAR Reproducibility Checklist

Tables S3 and S4

References and Notes

  • 1. Li Q., Guan X., Wu P., Wang X., Zhou L., Tong Y., Ren R., Leung K. S. M., Lau E. H. Y., Wong J. Y., Xing X., Xiang N., Wu Y., Li C., Chen Q., Li D., Liu T., Zhao J., Liu M., Tu W., Chen C., Jin L., Yang R., Wang Q., Zhou S., Wang R., Liu H., Luo Y., Liu Y., Shao G., Li H., Tao Z., Yang Y., Deng Z., Liu B., Ma Z., Zhang Y., Shi G., Lam T. T. Y., Wu J. T., Gao G. F., Cowling B. J., Yang B., Leung G. M., Feng Z., Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N. Engl. J. Med. 382, 1199–1207 (2020). 10.1056/NEJMoa2001316 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Hasell J., Mathieu E., Beltekian D., Macdonald B., Giattino C., Ortiz-Ospina E., Roser M., Ritchie H., A cross-country database of COVID-19 testing. Sci. Data 7, 345 (2020). 10.1038/s41597-020-00688-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Viana R., Moyo S., Amoako D. G., Tegally H., Scheepers C., Althaus C. L., Anyaneji U. J., Bester P. A., Boni M. F., Chand M., Choga W. T., Colquhoun R., Davids M., Deforche K., Doolabh D., du Plessis L., Engelbrecht S., Everatt J., Giandhari J., Giovanetti M., Hardie D., Hill V., Hsiao N. Y., Iranzadeh A., Ismail A., Joseph C., Joseph R., Koopile L., Kosakovsky Pond S. L., Kraemer M. U. G., Kuate-Lere L., Laguda-Akingba O., Lesetedi-Mafoko O., Lessells R. J., Lockman S., Lucaci A. G., Maharaj A., Mahlangu B., Maponga T., Mahlakwane K., Makatini Z., Marais G., Maruapula D., Masupu K., Matshaba M., Mayaphi S., Mbhele N., Mbulawa M. B., Mendes A., Mlisana K., Mnguni A., Mohale T., Moir M., Moruisi K., Mosepele M., Motsatsi G., Motswaledi M. S., Mphoyakgosi T., Msomi N., Mwangi P. N., Naidoo Y., Ntuli N., Nyaga M., Olubayo L., Pillay S., Radibe B., Ramphal Y., Ramphal U., San J. E., Scott L., Shapiro R., Singh L., Smith-Lawrence P., Stevens W., Strydom A., Subramoney K., Tebeila N., Tshiabuila D., Tsui J., van Wyk S., Weaver S., Wibmer C. K., Wilkinson E., Wolter N., Zarebski A. E., Zuze B., Goedhals D., Preiser W., Treurnicht F., Venter M., Williamson C., Pybus O. G., Bhiman J., Glass A., Martin D. P., Rambaut A., Gaseitsiwe S., von Gottberg A., de Oliveira T., Rapid epidemic expansion of the SARS-CoV-2 Omicron variant in southern Africa. Nature 603, 679–686 (2022). 10.1038/s41586-022-04411-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Tegally H., Wilkinson E., Giovanetti M., Iranzadeh A., Fonseca V., Giandhari J., Doolabh D., Pillay S., San E. J., Msomi N., Mlisana K., von Gottberg A., Walaza S., Allam M., Ismail A., Mohale T., Glass A. J., Engelbrecht S., Van Zyl G., Preiser W., Petruccione F., Sigal A., Hardie D., Marais G., Hsiao N. Y., Korsman S., Davies M. A., Tyers L., Mudau I., York D., Maslo C., Goedhals D., Abrahams S., Laguda-Akingba O., Alisoltani-Dehkordi A., Godzik A., Wibmer C. K., Sewell B. T., Lourenço J., Alcantara L. C. J., Kosakovsky Pond S. L., Weaver S., Martin D., Lessells R. J., Bhiman J. N., Williamson C., de Oliveira T., Detection of a SARS-CoV-2 variant of concern in South Africa. Nature 592, 438–443 (2021). 10.1038/s41586-021-03402-9 [DOI] [PubMed] [Google Scholar]
  • 5. Martin D. P., Weaver S., Tegally H., San J. E., Shank S. D., Wilkinson E., Lucaci A. G., Giandhari J., Naidoo S., Pillay Y., Singh L., Lessells R. J., Gupta R. K., Wertheim J. O., Nekturenko A., Murrell B., Harkins G. W., Lemey P., MacLean O. A., Robertson D. L., de Oliveira T., Kosakovsky Pond S. L., NGS-SA, COVID-19 Genomics UK (COG-UK) , The emergence and ongoing convergent evolution of the SARS-CoV-2 N501Y lineages. Cell 184, 5189–5200.e7 (2021). 10.1016/j.cell.2021.09.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Campbell F., Archer B., Laurenson-Schafer H., Jinnai Y., Konings F., Batra N., Pavlin B., Vandemaele K., Van Kerkhove M. D., Jombart T., Morgan O., le Polain de Waroux O., Increased transmissibility and global spread of SARS-CoV-2 variants of concern as at June 2021. Euro Surveill. 26, (2021). 10.2807/1560-7917.ES.2021.26.24.2100509 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Korber B., Fischer W. M., Gnanakaran S., Yoon H., Theiler J., Abfalterer W., Hengartner N., Giorgi E. E., Bhattacharya T., Foley B., Hastie K. M., Parker M. D., Partridge D. G., Evans C. M., Freeman T. M., de Silva T. I., McDanal C., Perez L. G., Tang H., Moon-Walker A., Whelan S. P., LaBranche C. C., Saphire E. O., Montefiori D. C., Angyal A., Brown R. L., Carrilero L., Green L. R., Groves D. C., Johnson K. J., Keeley A. J., Lindsey B. B., Parsons P. J., Raza M., Rowland-Jones S., Smith N., Tucker R. M., Wang D., Wyles M. D., Sheffield COVID-19 Genomics Group , Tracking changes in SARS-CoV-2 spike: Evidence that D614G increases infectivity of the COVID-19 virus. Cell 182, 812–827.e19 (2020). 10.1016/j.cell.2020.06.043 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Hacisuleyman E., Hale C., Saito Y., Blachere N. E., Bergh M., Conlon E. G., Schaefer-Babajew D. J., DaSilva J., Muecksch F., Gaebler C., Lifton R., Nussenzweig M. C., Hatziioannou T., Bieniasz P. D., Darnell R. B., Vaccine breakthrough infections with SARS-CoV-2 variants. N. Engl. J. Med. 384, 2212–2218 (2021). 10.1056/NEJMoa2105000 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Planas D., Veyer D., Baidaliuk A., Staropoli I., Guivel-Benhassine F., Rajah M. M., Planchais C., Porrot F., Robillard N., Puech J., Prot M., Gallais F., Gantner P., Velay A., Le Guen J., Kassis-Chikhani N., Edriss D., Belec L., Seve A., Courtellemont L., Péré H., Hocqueloux L., Fafi-Kremer S., Prazuck T., Mouquet H., Bruel T., Simon-Lorière E., Rey F. A., Schwartz O., Reduced sensitivity of SARS-CoV-2 variant Delta to antibody neutralization. Nature 596, 276–280 (2021). 10.1038/s41586-021-03777-9 [DOI] [PubMed] [Google Scholar]
  • 10. Yue S., Li Z., Lin Y., Yang Y., Yuan M., Pan Z., Hu L., Gao L., Zhou J., Tang J., Wang Y., Tian Q., Hao Y., Wang J., Huang Q., Xu L., Zhu B., Liu P., Deng K., Wang L., Ye L., Chen X., Sensitivity of SARS-CoV-2 variants to neutralization by convalescent sera and a VH3-30 monoclonal antibody. Front. Immunol. 12, 751584 (2021). 10.3389/fimmu.2021.751584 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Cele S., Gazy I., Jackson L., Hwa S.-H., Tegally H., Lustig G., Giandhari J., Pillay S., Wilkinson E., Naidoo Y., Karim F., Ganga Y., Khan K., Bernstein M., Balazs A. B., Gosnell B. I., Hanekom W., Moosa M. S., Lessells R. J., de Oliveira T., Sigal A., Network for Genomic Surveillance in South Africa, COMMIT-KZN Team , Escape of SARS-CoV-2 501Y.V2 from neutralization by convalescent plasma. Nature 593, 142–146 (2021). 10.1038/s41586-021-03471-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Meng B., Kemp S. A., Papa G., Datir R., Ferreira I. A. T. M., Marelli S., Harvey W. T., Lytras S., Mohamed A., Gallo G., Thakur N., Collier D. A., Mlcochova P., Duncan L. M., Carabelli A. M., Kenyon J. C., Lever A. M., De Marco A., Saliba C., Culap K., Cameroni E., Matheson N. J., Piccoli L., Corti D., James L. C., Robertson D. L., Bailey D., Gupta R. K., COVID-19 Genomics UK (COG-UK) Consortium , Recurrent emergence of SARS-CoV-2 spike deletion H69/V70 and its role in the Alpha variant B.1.1.7. Cell Rep. 35, 109292 (2021). 10.1016/j.celrep.2021.109292 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Mlcochova P., Kemp S. A., Dhar M. S., Papa G., Meng B., Ferreira I. A. T. M., Datir R., Collier D. A., Albecka A., Singh S., Pandey R., Brown J., Zhou J., Goonawardane N., Mishra S., Whittaker C., Mellan T., Marwal R., Datta M., Sengupta S., Ponnusamy K., Radhakrishnan V. S., Abdullahi A., Charles O., Chattopadhyay P., Devi P., Caputo D., Peacock T., Wattal C., Goel N., Satwik A., Vaishya R., Agarwal M., Mavousian A., Lee J. H., Bassi J., Silacci-Fegni C., Saliba C., Pinto D., Irie T., Yoshida I., Hamilton W. L., Sato K., Bhatt S., Flaxman S., James L. C., Corti D., Piccoli L., Barclay W. S., Rakshit P., Agrawal A., Gupta R. K., Indian SARS-CoV-2 Genomics Consortium (INSACOG), Genotype to Phenotype Japan (G2P-Japan) Consortium, CITIID-NIHR BioResource COVID-19 Collaboration , SARS-CoV-2 B.1.617.2 Delta variant replication and immune evasion. Nature 599, 114–119 (2021). 10.1038/s41586-021-03944-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Faria N. R., Mellan T. A., Whittaker C., Claro I. M., Candido D. D. S., Mishra S., Crispim M. A. E., Sales F. C. S., Hawryluk I., McCrone J. T., Hulswit R. J. G., Franco L. A. M., Ramundo M. S., de Jesus J. G., Andrade P. S., Coletti T. M., Ferreira G. M., Silva C. A. M., Manuli E. R., Pereira R. H. M., Peixoto P. S., Kraemer M. U. G., Gaburo N. Jr., Camilo C. D. C., Hoeltgebaum H., Souza W. M., Rocha E. C., de Souza L. M., de Pinho M. C., Araujo L. J. T., Malta F. S. V., de Lima A. B., Silva J. D. P., Zauli D. A. G., Ferreira A. C. S., Schnekenberg R. P., Laydon D. J., Walker P. G. T., Schlüter H. M., Dos Santos A. L. P., Vidal M. S., Del Caro V. S., Filho R. M. F., Dos Santos H. M., Aguiar R. S., Proença-Modena J. L., Nelson B., Hay J. A., Monod M., Miscouridou X., Coupland H., Sonabend R., Vollmer M., Gandy A., Prete C. A. Jr., Nascimento V. H., Suchard M. A., Bowden T. A., Pond S. L. K., Wu C.-H., Ratmann O., Ferguson N. M., Dye C., Loman N. J., Lemey P., Rambaut A., Fraiji N. A., Carvalho M. D. P. S. S., Pybus O. G., Flaxman S., Bhatt S., Sabino E. C., Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil. Science 372, 815–821 (2021). 10.1126/science.abh2644 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Wilkinson E., Giovanetti M., Tegally H., San J. E., Lessells R., Cuadros D., Martin D. P., Rasmussen D. A., Zekri A. N., Sangare A. K., Ouedraogo A.-S., Sesay A. K., Priscilla A., Kemi A.-S., Olubusuyi A. M., Oluwapelumi A. O. O., Hammami A., Amuri A. A., Sayed A., Ouma A. E. O., Elargoubi A., Ajayi N. A., Victoria A. F., Kazeem A., George A., Trotter A. J., Yahaya A. A., Keita A. K., Diallo A., Kone A., Souissi A., Chtourou A., Gutierrez A. V., Page A. J., Vinze A., Iranzadeh A., Lambisia A., Ismail A., Rosemary A., Sylverken A., Femi A., Ibrahimi A., Marycelin B., Oderinde B. S., Bolajoko B., Dhaala B., Herring B. L., Njanpop-Lafourcade B.-M., Kleinhans B., McInnis B., Tegomoh B., Brook C., Pratt C. B., Scheepers C., Akoua-Koffi C. G., Agoti C. N., Peyrefitte C., Daubenberger C., Morang’a C. M., Nokes D. J., Amoako D. G., Bugembe D. L., Park D., Baker D., Doolabh D., Ssemwanga D., Tshiabuila D., Bassirou D., Amuzu D. S. Y., Goedhals D., Omuoyo D. O., Maruapula D., Foster-Nyarko E., Lusamaki E. K., Simulundu E., Ong’era E. M., Ngabana E. N., Shumba E., El Fahime E., Lokilo E., Mukantwari E., Philomena E., Belarbi E., Simon-Loriere E., Anoh E. A., Leendertz F., Ajili F., Enoch F. O., Wasfi F., Abdelmoula F., Mosha F. S., Takawira F. T., Derrar F., Bouzid F., Onikepe F., Adeola F., Muyembe F. M., Tanser F., Dratibi F. A., Mbunsu G. K., Thilliez G., Kay G. L., Githinji G., van Zyl G., Awandare G. A., Schubert G., Maphalala G. P., Ranaivoson H. C., Lemriss H., Anise H., Abe H., Karray H. H., Nansumba H., Elgahzaly H. A., Gumbo H., Smeti I., Ayed I. B., Odia I., Ben Boubaker I. B., Gaaloul I., Gazy I., Mudau I., Ssewanyana I., Konstantinus I., Lekana-Douk J. B., Makangara J.-C. C., Tamfum J. M., Heraud J.-M., Shaffer J. G., Giandhari J., Li J., Yasuda J., Mends J. Q., Kiconco J., Morobe J. M., Gyapong J. O., Okolie J. C., Kayiwa J. T., Edwards J. A., Gyamfi J., Farah J., Nakaseegu J., Ngoi J. M., Namulondo J., Andeko J. C., Lutwama J. J., O’Grady J., Siddle K., Adeyemi K. T., Tumedi K. A., Said K. M., Hae-Young K., Duedu K. O., Belyamani L., Fki-Berrajah L., Singh L., Martins L. O., Tyers L., Ramuth M., Mastouri M., Aouni M., El Hefnawi M., Matsheka M. I., Kebabonye M., Diop M., Turki M., Paye M., Nyaga M. M., Mareka M., Damaris M.-M., Mburu M. W., Mpina M., Nwando M., Owusu M., Wiley M. R., Youtchou M. T., Ayekaba M. O., Abouelhoda M., Seadawy M. G., Khalifa M. K., Sekhele M., Ouadghiri M., Diagne M. M., Mwenda M., Allam M., Phan M. V. T., Abid N., Touil N., Rujeni N., Kharrat N., Ismael N., Dia N., Mabunda N., Hsiao N. Y., Silochi N. B., Nsenga N., Gumede N., Mulder N., Ndodo N., Razanajatovo N. H., Iguosadolo N., Judith O., Kingsley O. C., Sylvanus O., Peter O., Femi O., Idowu O., Testimony O., Chukwuma O. E., Ogah O. E., Onwuamah C. K., Cyril O., Faye O., Tomori O., Ondoa P., Combe P., Semanda P., Oluniyi P. E., Arnaldo P., Quashie P. K., Dussart P., Bester P. A., Mbala P. K., Ayivor-Djanie R., Njouom R., Phillips R. O., Gorman R., Kingsley R. A., Carr R. A. A., El Kabbaj S., Gargouri S., Masmoudi S., Sankhe S., Lawal S. B., Kassim S., Trabelsi S., Metha S., Kammoun S., Lemriss S., Agwa S. H. A., Calvignac-Spencer S., Schaffner S. F., Doumbia S., Mandanda S. M., Aryeetey S., Ahmed S. S., Elhamoumi S., Andriamandimby S., Tope S., Lekana-Douki S., Prosolek S., Ouangraoua S., Mundeke S. A., Rudder S., Panji S., Pillay S., Engelbrecht S., Nabadda S., Behillil S., Budiaki S. L., van der Werf S., Mashe T., Aanniz T., Mohale T., Le-Viet T., Schindler T., Anyaneji U. J., Chinedu U., Ramphal U., Jessica U., George U., Fonseca V., Enouf V., Gorova V., Roshdy W. H., Ampofo W. K., Preiser W., Choga W. T., Bediako Y., Naidoo Y., Butera Y., de Laurent Z. R., Sall A. A., Rebai A., von Gottberg A., Kouriba B., Williamson C., Bridges D. J., Chikwe I., Bhiman J. N., Mine M., Cotten M., Moyo S., Gaseitsiwe S., Saasa N., Sabeti P. C., Kaleebu P., Tebeje Y. K., Tessema S. K., Happi C., Nkengasong J., de Oliveira T., A year of genomic surveillance reveals how the SARS-CoV-2 pandemic unfolded in Africa. Science 374, 423–431 (2021). 10.1126/science.abj4336 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Shu Y., McCauley J., GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 22, 30494 (2017). 10.2807/1560-7917.ES.2017.22.13.30494 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Kuiken C., Korber B., Shafer R. W., HIV sequence databases. AIDS Rev. 5, 52–61 (2003). [PMC free article] [PubMed] [Google Scholar]
  • 18. Bugembe D. L., Kayiwa J., Phan M. V. T., Tushabe P., Balinandi S., Dhaala B., Lexow J., Mwebesa H., Aceng J., Kyobe H., Ssemwanga D., Lutwama J., Kaleebu P., Cotten M., Main routes of entry and genomic diversity of SARS-CoV-2, Uganda. Emerg. Infect. Dis. 26, 2411–2415 (2020). 10.3201/eid2610.202575 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Mashe T., Takawira F. T., de Oliveira Martins L., Gudza-Mugabe M., Chirenda J., Munyanyi M., Chaibva B. V., Tarupiwa A., Gumbo H., Juru A., Nyagupe C., Ruhanya V., Phiri I., Manangazira P., Goredema A., Danda S., Chabata I., Jonga J., Munharira R., Masunda K., Mukeredzi I., Mangwanya D., Trotter A., Le Viet T., Rudder S., Kay G., Baker D., Thilliez G., Gutierrez A. V., O’Grady J., Hove M., Mutapuri-Zinyowera S., Page A. J., Kingsley R. A., Mhlanga G., COVID-19 Genomics UK Consortium, SARS-CoV-2 Research Group , Genomic epidemiology and the role of international and regional travel in the SARS-CoV-2 epidemic in Zimbabwe: A retrospective study of routinely collected surveillance data. Lancet Glob. Health 9, e1658–e1666 (2021). 10.1016/S2214-109X(21)00434-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Chouikha A., Fares W., Laamari A., Haddad-Boubaker S., Belaiba Z., Ghedira K., Kammoun Rebai W., Ayouni K., Khedhiri M., Ben Halima S., Krichen H., Touzi H., Ben Dhifallah I., Guerfali F. Z., Atri C., Azouz S., Khamessi O., Ardhaoui M., Safer M., Ben Alaya N., Guizani I., Kefi R., Gdoura M., Triki H., Molecular epidemiology of SARS-CoV-2 in Tunisia (North Africa) through several successive waves of COVID-19. Viruses 14, 624 (2022). 10.3390/v14030624 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Ntoumi F., Mfoutou Mapanguy C. C., Tomazatos A., Pallerla S. R., Linh L. T. K., Casadei N., Angelov A., Sonnabend M., Peter S., Kremsner P. G., Velavan T. P., Genomic surveillance of SARS-CoV-2 in the Republic of Congo. Int. J. Infect. Dis. 105, 735–738 (2021). 10.1016/j.ijid.2021.03.036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Y. Butera, E. Mukantwari, M. Artesi, J. D’Arc Umuringa, Á. N. O’Toole, V. Hill, S. Rooke, S. L. Hong, S. Dellicour, O. Majyambere, S. Bontems, B. Boujemla, J. Quick, P. C. Resende, N. Loman, E. Umumararungu, A. Kabanda, M. M. Murindahabi, P. Tuyisenge, M. Gashegu, N. Rujeni, Genomic sequencing of SARS-CoV-2 in Rwanda: Evolution and regional dynamics. medRxiv 2021.04.02.21254839 [Preprint] (2021). 10.1101/2021.04.02.21254839 [DOI]
  • 23. Agoti C. N., Githinji G., Mohammed K. S., Lambisia A. W., de Laurent Z. R., Mburu M. W., Ong’era E. M., Morobe J. M., Otieno E., Abdou Azali H., Said Abdallah K., Diarra A., Yahaya A. A., Borus P., Gumede Moeletsi N., Fred Athanasius D., Tsofa B., Bejon P., Nokes D. J., Ochola-Oyier L. I., Detection of SARS-CoV-2 variant 501Y.V2 in Comoros Islands in January 2021. Wellcome Open Res. 6, 192 (2021). 10.12688/wellcomeopenres.16889.1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Morobe J. M., Pool B., Marie L., Didon D., Lambisia A. W., Makori T., Mohammed K. S., de Laurent Z. R., Ndwiga L., Mburu M. W., Moraa E., Murunga N., Musyoki J., Mwacharo J., Nyamako L., Riako D., Ephnatus P., Gambo F., Naimani J., Namulondo J., Tembo S. Z., Ogendi E., Balde T., Dratibi F. A., Yahaya A. A., Gumede N., Achilla R. A., Borus P. K., Wanjohi D. W., Tessema S. K., Mwangangi J., Bejon P., Nokes D. J., Ochola-Oyier L. I., Githinji G., Biscornet L., Agoti C. N., Genomic Epidemiology of SARS-CoV-2 in Seychelles, 2020-2021. Viruses 14, 1318 (2022). 10.3390/v14061318 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Morang’a C. M., Ngoi J. M., Gyamfi J., Amuzu D. S. Y., Nuertey B. D., Soglo P. M., Appiah V., Asante I. A., Owusu-Oduro P., Armoo S., Adu-Gyasi D., Amoako N., Oliver-Commey J., Owusu M., Sylverken A., Fenteng E. D., M’cormack V. V., Tei-Maya F., Quansah E. B., Ayivor-Djanie R., Amoako E. K., Ogbe I. T., Yemi B. K., Osei-Wusu I., Mettle D. N. A., Saiid S., Tapela K., Dzabeng F., Magnussen V., Quaye J., Opurum P. C., Carr R. A., Ababio P. T., Abass A. K., Akoriyea S. K., Amoako E., Kumi-Ansah F., Boakye O. D., Mibut D. K., Odoom T., Ofori-Boadu L., Allegye-Cudjoe E., Dassah S., Asoala V., Asante K. P., Phillips R. O., Osei-Atweneboana M. Y., Gyapong J. O., Kuma-Aboagye P., Ampofo W. K., Duedu K. O., Ndam N. T., Bediako Y., Quashie P. K., Amenga-Etego L. N., Awandare G. A., Genetic diversity of SARS-CoV-2 infections in Ghana from 2020-2021. Nat. Commun. 13, 2494 (2022). 10.1038/s41467-022-30219-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Agoti C. N., Ochola-Oyier L. I., Dellicour S., Mohammed K. S., Lambisia A. W., de Laurent Z. R., Morobe J. M., Mburu M. W., Omuoyo D. O., Ongera E. M., Ndwiga L., Maitha E., Kitole B., Suleiman T., Mwakinangu M., Nyambu J. K., Otieno J., Salim B., Musyoki J., Murunga N., Otieno E., Kiiru J. N., Kasera K., Amoth P., Mwangangi M., Aman R., Kinyanjui S., Warimwe G., Phan M., Agweyu A., Cotten M., Barasa E., Tsofa B., Nokes D. J., Bejon P., Githinji G., Transmission networks of SARS-CoV-2 in Coastal Kenya during the first two waves: A retrospective genomic study. eLife 11, e71703 (2022). 10.7554/eLife.71703 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Brand S. P. C., Ojal J., Aziza R., Were V., Okiro E. A., Kombe I. K., Mburu C., Ogero M., Agweyu A., Warimwe G. M., Nyagwange J., Karanja H., Gitonga J. N., Mugo D., Uyoga S., Adetifa I. M. O., Scott J. A. G., Otieno E., Murunga N., Otiende M., Ochola-Oyier L. I., Agoti C. N., Githinji G., Kasera K., Amoth P., Mwangangi M., Aman R., Ng’ang’a W., Tsofa B., Bejon P., Keeling M. J., Nokes D. J., Barasa E., COVID-19 transmission dynamics underlying epidemic waves in Kenya. Science 374, 989–994 (2021). 10.1126/science.abk0414 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Githinji G., de Laurent Z. R., Mohammed K. S., Omuoyo D. O., Macharia P. M., Morobe J. M., Otieno E., Kinyanjui S. M., Agweyu A., Maitha E., Kitole B., Suleiman T., Mwakinangu M., Nyambu J., Otieno J., Salim B., Kasera K., Kiiru J., Aman R., Barasa E., Warimwe G., Bejon P., Tsofa B., Ochola-Oyier L. I., Nokes D. J., Agoti C. N., Tracking the introduction and spread of SARS-CoV-2 in coastal Kenya. Nat. Commun. 12, 4809 (2021). 10.1038/s41467-021-25137-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Tegally H., Wilkinson E., Lessells R. J., Giandhari J., Pillay S., Msomi N., Mlisana K., Bhiman J. N., von Gottberg A., Walaza S., Fonseca V., Allam M., Ismail A., Glass A. J., Engelbrecht S., Van Zyl G., Preiser W., Williamson C., Petruccione F., Sigal A., Gazy I., Hardie D., Hsiao N. Y., Martin D., York D., Goedhals D., San E. J., Giovanetti M., Lourenço J., Alcantara L. C. J., de Oliveira T., Sixteen novel lineages of SARS-CoV-2 in South Africa. Nat. Med. 27, 440–446 (2021). 10.1038/s41591-021-01255-3 [DOI] [PubMed] [Google Scholar]
  • 30. W. H. Roshdy, M. K. Khalifa, J. E. San, H. Tegally, E. Wilkinson, S. Showky, D. P. Martin, M. Moir, A. Naguib, N. Elguindy, M. R. Gomaa, M. Fahim, H. A. Elsood, A. A. Mohsen, R. Galal, M. Hassany, R. J. Lessells, A. A. Al Karmalawy, R. E. L. Shesheny, A. M. Kandeil, T. de Oliveira, SARS-CoV-2 Genetic diversity and lineage dynamics of in Egypt. medRxiv 2022.01.05.22268646 [Preprint] (2022). 10.1101/2022.01.05.22268646 [DOI]
  • 31. Scheepers C., Everatt J., Amoako D. G., Tegally H., Wibmer C. K., Mnguni A., Ismail A., Mahlangu B., Lambson B. E., Martin D. P., Wilkinson E., San J. E., Giandhari J., Manamela N., Ntuli N., Kgagudi P., Cele S., Richardson S. I., Pillay S., Mohale T., Ramphal U., Naidoo Y., Khumalo Z. T., Kwatra G., Gray G., Bekker L.-G., Madhi S. A., Baillie V., Van Voorhis W. C., Treurnicht F. K., Venter M., Mlisana K., Wolter N., Sigal A., Williamson C., Hsiao N. Y., Msomi N., Maponga T., Preiser W., Makatini Z., Lessells R., Moore P. L., de Oliveira T., von Gottberg A., Bhiman J. N., Emergence and phenotypic characterization of the global SARS-CoV-2 C.1.2 lineage. Nat. Commun. 13, 1976 (2022). 10.1038/s41467-022-29579-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Dudas G., Hong S. L., Potter B. I., Calvignac-Spencer S., Niatou-Singa F. S., Tombolomako T. B., Fuh-Neba T., Vickos U., Ulrich M., Leendertz F. H., Khan K., Huber C., Watts A., Olendraitė I., Snijder J., Wijnant K. N., Bonvin A. M. J. J., Martres P., Behillil S., Ayouba A., Maidadi M. F., Djomsi D. M., Godwe C., Butel C., Šimaitis A., Gabrielaitė M., Katėnaitė M., Norvilas R., Raugaitė L., Koyaweda G. W., Kandou J. K., Jonikas R., Nasvytienė I., Žemeckienė Ž., Gečys D., Tamušauskaitė K., Norkienė M., Vasiliūnaitė E., Žiogienė D., Timinskas A., Šukys M., Šarauskas M., Alzbutas G., Aziza A. A., Lusamaki E. K., Cigolo J. M., Mawete F. M., Lofiko E. L., Kingebeni P. M., Tamfum J. M., Belizaire M. R. D., Essomba R. G., Assoumou M. C. O., Mboringong A. B., Dieng A. B., Juozapaitė D., Hosch S., Obama J., Ayekaba M. O., Naumovas D., Pautienius A., Rafaï C. D., Vitkauskienė A., Ugenskienė R., Gedvilaitė A., Čereškevičius D., Lesauskaitė V., Žemaitis L., Griškevičius L., Baele G., Emergence and spread of SARS-CoV-2 lineage B.1.620 with variant of concern-like mutations and deletions. Nat. Commun. 12, 5769 (2021). 10.1038/s41467-021-26055-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Bugembe D. L., Phan M. V. T., Ssewanyana I., Semanda P., Nansumba H., Dhaala B., Nabadda S., O’Toole Á. N., Rambaut A., Kaleebu P., Cotten M., Emergence and spread of a SARS-CoV-2 lineage A variant (A.23.1) with altered spike protein in Uganda. Nat. Microbiol. 6, 1094–1101 (2021). 10.1038/s41564-021-00933-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Tegally H., Moir M., Everatt J., Giovanetti M., Scheepers C., Wilkinson E., Subramoney K., Makatini Z., Moyo S., Amoako D. G., Baxter C., Althaus C. L., Anyaneji U. J., Kekana D., Viana R., Giandhari J., Lessells R. J., Maponga T., Maruapula D., Choga W., Matshaba M., Mbulawa M. B., Msomi N., Naidoo Y., Pillay S., Sanko T. J., San J. E., Scott L., Singh L., Magini N. A., Smith-Lawrence P., Stevens W., Dor G., Tshiabuila D., Wolter N., Preiser W., Treurnicht F. K., Venter M., Chiloane G., McIntyre C., O’Toole A., Ruis C., Peacock T. P., Roemer C., Kosakovsky Pond S. L., Williamson C., Pybus O. G., Bhiman J. N., Glass A., Martin D. P., Jackson B., Rambaut A., Laguda-Akingba O., Gaseitsiwe S., von Gottberg A., de Oliveira T., NGS-SA Consortium , Emergence of SARS-CoV-2 Omicron lineages BA.4 and BA.5 in South Africa. Nat. Med. (2022). 10.1038/s41591-022-01911-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Madhi S. A., Kwatra G., Myers J. E., Jassat W., Dhar N., Mukendi C. K., Nana A. J., Blumberg L., Welch R., Ngorima-Mabhena N., Mutevedzi P. C., Population immunity and Covid-19 severity with omicron variant in South Africa. N. Engl. J. Med. 386, 1314–1326 (2022). 10.1056/NEJMoa2119658 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Wolter N., Jassat W., Walaza S., Welch R., Moultrie H., Groome M., Amoako D. G., Everatt J., Bhiman J. N., Scheepers C., Tebeila N., Chiwandire N., du Plessis M., Govender N., Ismail A., Glass A., Mlisana K., Stevens W., Treurnicht F. K., Makatini Z., Hsiao N. Y., Parboosing R., Wadula J., Hussey H., Davies M. A., Boulle A., von Gottberg A., Cohen C., Early assessment of the clinical severity of the SARS-CoV-2 omicron variant in South Africa: A data linkage study. Lancet 399, 437–446 (2022). 10.1016/S0140-6736(22)00017-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Lewis H. C., Ware H., Whelan M., Subissi L., Li Z., Ma X., Nardone A., Valenciano M., Cheng B., Noel K., Cao C., Yanes-Lane M., Herring B. L., Talisuna A., Ngoy N., Balde T., Clifton D., Van Kerkhove M. D., Buckeridge D., Bobrovitz N., Okeibunor J., Arora R. K., Bergeri I., UNITY Studies Collaborator Group , SARS-CoV-2 infection in Africa: A systematic review and meta-analysis of standardised seroprevalence studies, from January 2020 to December 2021. BMJ Glob. Health 7, e008793 (2022). 10.1136/bmjgh-2022-008793 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Barber R. M., Sorensen R. J. D., Pigott D. M., Bisignano C., Carter A., Amlag J. O., Collins J. K., Abbafati C., Adolph C., Allorant A., Aravkin A. Y., Bang-Jensen B. L., Castro E., Chakrabarti S., Cogen R. M., Combs E., Comfort H., Cooperrider K., Dai X., Daoud F., Deen A., Earl L., Erickson M., Ewald S. B., Ferrari A. J., Flaxman A. D., Frostad J. J., Fullman N., Giles J. R., Guo G., He J., Helak M., Hulland E. N., Huntley B. M., Lazzar-Atwood A., LeGrand K. E., Lim S. S., Lindstrom A., Linebarger E., Lozano R., Magistro B., Malta D. C., Månsson J., Mantilla Herrera A. M., Mokdad A. H., Monasta L., Naghavi M., Nomura S., Odell C. M., Olana L. T., Ostroff S. M., Pasovic M., Pease S. A., Reiner R. C. Jr., Reinke G., Ribeiro A. L. P., Santomauro D. F., Sholokhov A., Spurlock E. E., Syailendrawati R., Topor-Madry R., Vo A. T., Vos T., Walcott R., Walker A., Wiens K. E., Wiysonge C. S., Worku N. A., Zheng P., Hay S. I., Gakidou E., Murray C. J. L., COVID-19 Cumulative Infection Collaborators , Estimating global, regional, and national daily and cumulative infections with SARS-CoV-2 through Nov 14, 2021: A statistical analysis. Lancet 399, 2351–2380 (2022). 10.1016/S0140-6736(22)00484-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. J. Quick, nCoV-2019 sequencing protocol v3 (LoCost) (2020).
  • 40. J. R. Tyson, P. James, D. Stoddart, N. Sparks, A. Wickenhagen, G. Hall, J. H. Choi, H. Lapointe, K. Kamelian, A. D. Smith, N. Prystajecky, I. Goodfellow, S. J. Wilson, R. Harrigan, T. P. Snutch, N. J. Loman, J. Quick, Improvements to the ARTIC multiplex PCR method for SARS-CoV-2 genome sequencing using nanopore. bioRxiv 2020.09.04.283077 [Preprint] (2020). 10.1101/2020.09.04.283077 [DOI]
  • 41. Cotten M., Bugembe D. L., Kaleebu P., Phan M. V. T., Alternate primers for whole-genome SARS-CoV-2 sequencing. Virus Evol. 7, veab006 (2021). 10.1093/ve/veab006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. H. Tegally, M. Ramuth, D. Amoako, C. Scheepers, E. Wilkinson, M. Giovanetti, R. J. Lessells, J. Giandhari, A. Ismail, D. Martin, E. J. San, M. Crawford, R. S. Daniels, R. Harvey, S. Bahadoor, J. Sonoo, M. Timol, L. Veerapa-Mangroo, A. von Gottberg, J. Bhiman, T. de Oliveira, S. Manraj, A novel and expanding SARS-CoV-2 variant, B.1.1.318, dominates infections in Mauritius. medRxiv 2021.06.16.21259017 [Preprint] (2021). 10.1101/2021.06.16.21259017 [DOI]
  • 43. Zekri A. N., Bahnasy A. A., Hafez M. M., Hassan Z. K., Ahmed O. S., Soliman H. K., El-Sisi E. R., Dine M. H. S. E., Solimane M. S., Latife L. S. A., Seadawy M. G., Elsafty A. S., Abouelhoda M., Characterization of the SARS-CoV-2 genomes in Egypt in first and second waves of infection. Sci. Rep. 11, 21632 (2021). 10.1038/s41598-021-99014-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Nasimiyu C., Matoke-Muhia D., Rono G. K., Osoro E., Obado D. O., Mwangi J. M., Mwikwabe N., Thiong’o K., Dawa J., Ngere I., Gachohi J., Kariuki S., Amukoye E., Mureithi M., Ngere P., Amoth P., Were I., Makayotto L., Nene V., Abworo E. O., Njenga M. K., Seifert S. N., Oyola S. O., Imported SARS-COV-2 variants of concern drove spread of infections across Kenya during the second year of the pandemic. COVID 2, 586–598 (2022). 10.3390/covid2050044 35262086 [DOI] [Google Scholar]
  • 45. Kraemer M. U. G., Hill V., Ruis C., Dellicour S., Bajaj S., McCrone J. T., Baele G., Parag K. V., Battle A. L., Gutierrez B., Jackson B., Colquhoun R., O’Toole Á., Klein B., Vespignani A., Volz E., Faria N. R., Aanensen D. M., Loman N. J., du Plessis L., Cauchemez S., Rambaut A., Scarpino S. V., Pybus O. G., COVID-19 Genomics UK (COG-UK) Consortium , Spatiotemporal invasion dynamics of SARS-CoV-2 lineage B.1.1.7 emergence. Science 373, 889–895 (2021). 10.1126/science.abj0113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. S. J. Lycett, J. Hughes, M. P. McHugh, A. da Silva Filipe, R. Dewar, L. Lu, T. Doherty, A. Shepherd, R. Inward, G. Rossi, D. Balaz, R. R. Kao, S. Rooke, S. Cotton, M. D. Gallagher, C. B. Lopez, Á. O’Toole, E. Scher, V. Hill, J. T. McCrone, R. M. Colquhoun, B. Jackson, T. C. Williams, K. A. Williamson, N. Johnson, K. Smollett, D. Mair, S. Carmichael, L. Tong, J. Nichols, K. Brunker, J. G. Shepherd, K. Li, E. Aranday-Cortes, Y. A. Parr, A. Broos, K. Nomikou, S. E. McDonald, M. Niebel, P. Asamaphan, I. Starinskij, N. Jesudason, R. Shah, V. B. Sreenu, T. Stanton, S. Shaaban, A. MacLean, M. Woolhouse, R. Gunson, K. Templeton, E. C. Thomson, A. Rambaut, M. T. G. Holden, D. L. Robertson, COVID-19 Genomics UK (COG-UK) Consortium, Epidemic waves of COVID-19 in Scotland: a genomic perspective on the impact of the introduction and relaxation of lockdown on SARS-CoV-2. medRxiv 2021.01.08.20248677 [Preprint] (2021). 10.1101/2021.01.08.20248677 [DOI]
  • 47. P. W. G. Mallon, F. Crispie, G. Gonzalez, W. Tinago, A. A. Garcia Leon, M. McCabe, E. de Barra, O. Yousif, J. S. Lambert, C. J. Walsh, J. G. Kenny, E. Feeney, M. Carr, P. Doran, P. D. Cotter, Whole-genome sequencing of SARS-CoV-2 in the Republic of Ireland during waves 1 and 2 of the pandemic. medRxiv 2021.02.09.21251402 [Preprint] (2021). 10.1101/2021.02.09.21251402 [DOI]
  • 48. H. Tegally, E. Wilkinson, C. L. Althaus, M. Giovanetti, J. E. San, J. Giandhari, S. Pillay, Y. Naidoo, U. Ramphal, N. Msomi, K. Mlisana, D. G. Amoako, J. Everatt, T. Mohale, A. Nguni, B. Mahlangu, N. Ntuli, Z. T. Khumalo, Z. Makatini, N. Wolter, C. Scheepers, A. Ismail, D. Doolabh, R. Joseph, A. Strydom, A. Mendes, M. Davis, S. H. Mayaphi, Y. Ramphal, A. Maharaj, W. A. Karim, D. Tshiabuila, U. J. Anyaneji, L. Singh, S. Engelbrecht, V. Fonseca, K. Marais, S. Korsman, D. Hardie, N. Hsiao, T. Maponga, G. van Zyl, G. Marais, A. Iranzadeh, D. Martin, L. C. J. Alcantara, P. A. Bester, M. M. Nyaga, K. Subramoney, F. K. Treurnicht, M. Venter, D. Goedhals, W. Preiser, J. N. Bhiman, A. von Gottberg, C. Williamson, R. J. Lessells, T. de Oliveira, Rapid replacement of the Beta variant by the Delta variant in South Africa. medRxiv 2021.09.23.21264018 [Preprint] (2021). 10.1101/2021.09.23.21264018 [DOI]
  • 49. Chang S., Pierson E., Koh P. W., Gerardin J., Redbird B., Grusky D., Leskovec J., Mobility network models of COVID-19 explain inequities and inform reopening. Nature 589, 82–87 (2021). 10.1038/s41586-020-2923-3 [DOI] [PubMed] [Google Scholar]
  • 50. Chinazzi M., Davis J. T., Ajelli M., Gioannini C., Litvinova M., Merler S., Pastore Y Piontti A., Mu K., Rossi L., Sun K., Viboud C., Xiong X., Yu H., Halloran M. E., Longini I. M. Jr., Vespignani A., The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science 368, 395–400 (2020). 10.1126/science.aba9757 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Kraemer M. U. G., Yang C.-H., Gutierrez B., Wu C.-H., Klein B., Pigott D. M., du Plessis L., Faria N. R., Li R., Hanage W. P., Brownstein J. S., Layan M., Vespignani A., Tian H., Dye C., Pybus O. G., Scarpino S. V., Open COVID-19 Data Working Group , The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368, 493–497 (2020). 10.1126/science.abb4218 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Nouvellet P., Bhatia S., Cori A., Ainslie K. E. C., Baguelin M., Bhatt S., Boonyasiri A., Brazeau N. F., Cattarino L., Cooper L. V., Coupland H., Cucunuba Z. M., Cuomo-Dannenburg G., Dighe A., Djaafara B. A., Dorigatti I., Eales O. D., van Elsland S. L., Nascimento F. F., FitzJohn R. G., Gaythorpe K. A. M., Geidelberg L., Green W. D., Hamlet A., Hauck K., Hinsley W., Imai N., Jeffrey B., Knock E., Laydon D. J., Lees J. A., Mangal T., Mellan T. A., Nedjati-Gilani G., Parag K. V., Pons-Salort M., Ragonnet-Cronin M., Riley S., Unwin H. J. T., Verity R., Vollmer M. A. C., Volz E., Walker P. G. T., Walters C. E., Wang H., Watson O. J., Whittaker C., Whittles L. K., Xi X., Ferguson N. M., Donnelly C. A., Reduction in mobility and COVID-19 transmission. Nat. Commun. 12, 1090 (2021). 10.1038/s41467-021-21358-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Xiong C., Hu S., Yang M., Luo W., Zhang L., Mobile device data reveal the dynamics in a positive relationship between human mobility and COVID-19 infections. Proc. Natl. Acad. Sci. U.S.A. 117, 27087–27089 (2020). 10.1073/pnas.2010836117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Pillay S., Giandhari J., Tegally H., Wilkinson E., Chimukangara B., Lessells R., Moosa Y., Mattison S., Gazy I., Fish M., Singh L., Khanyile K. S., San J. E., Fonseca V., Giovanetti M., Alcantara L. C. Jr., de Oliveira T., Whole genome sequencing of SARS-CoV-2: Adapting Illumina protocols for quick and accurate outbreak investigation during a pandemic. Genes 11, 949 (2020). 10.3390/genes11080949 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Singh L., San J. E., Tegally H., Brzoska P. M., Anyaneji U. J., Wilkinson E., Clark L., Giandhari J., Pillay S., Lessells R. J., Martin D. P., Furtado M., Kiran A. M., de Oliveira T., Targeted Sanger sequencing to recover key mutations in SARS-CoV-2 variant genome assemblies produced by next-generation sequencing. Microb. Genom. 8, (2022). 10.1099/mgen.0.000774 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Page A. J., Mather A. E., Le-Viet T., Meader E. J., Alikhan N.-F., Kay G. L., de Oliveira Martins L., Aydin A., Baker D. J., Trotter A. J., Rudder S., Tedim A. P., Kolyva A., Stanley R., Yasir M., Diaz M., Potter W., Stuart C., Meadows L., Bell A., Gutierrez A. V., Thomson N. M., Adriaenssens E. M., Swingler T., Gilroy R. A. J., Griffith L., Sethi D. K., Aggarwal D., Brown C. S., Davidson R. K., Kingsley R. A., Bedford L., Coupland L. J., Charles I. G., Elumogo N., Wain J., Prakash R., Webber M. A., Smith S. J. L., Chand M., Dervisevic S., O’Grady J., COVID-19 Genomics UK (COG-UK) Consortium , Large-scale sequencing of SARS-CoV-2 genomes from one region allows detailed epidemiology and enables local outbreak management. Microb. Genom. 7, (2021). 10.1099/mgen.0.000589 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. N. De Maio, C. Walker, R. Borges, L. Weilguny, G. Slodkowicz, N. Goldman, Issues with SARS-CoV-2 sequencing data, Virological.org (2020); https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473.
  • 58. Freed N. E., Vlková M., Faisal M. B., Silander O. K., Rapid and inexpensive whole-genome sequencing of SARS-CoV-2 using 1200 bp tiled amplicons and Oxford Nanopore Rapid Barcoding. Biol. Methods Protoc. 5, bpaa014 (2020). 10.1093/biomethods/bpaa014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Eden J.-S., Rockett R., Carter I., Rahman H., de Ligt J., Hadfield J., Storey M., Ren X., Tulloch R., Basile K., Wells J., Byun R., Gilroy N., O’Sullivan M. V., Sintchenko V., Chen S. C., Maddocks S., Sorrell T. C., Holmes E. C., Dwyer D. E., Kok J., 2019-nCoV Study Group , An emergent clade of SARS-CoV-2 linked to returned travellers from Iran. Virus Evol. 6, veaa027 (2020). 10.1093/ve/veaa027 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Gonzalez-Reiche A. S., Hernandez M. M., Sullivan M. J., Ciferri B., Alshammary H., Obla A., Fabre S., Kleiner G., Polanco J., Khan Z., Alburquerque B., van de Guchte A., Dutta J., Francoeur N., Melo B. S., Oussenko I., Deikus G., Soto J., Sridhar S. H., Wang Y.-C., Twyman K., Kasarskis A., Altman D. R., Smith M., Sebra R., Aberg J., Krammer F., García-Sastre A., Luksza M., Patel G., Paniz-Mondolfi A., Gitman M., Sordillo E. M., Simon V., van Bakel H., Introductions and early spread of SARS-CoV-2 in the New York City area. Science 369, 297–301 (2020). 10.1126/science.abc1917 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Itokawa K., Sekizuka T., Hashino M., Tanaka R., Kuroda M., Disentangling primer interactions improves SARS-CoV-2 genome sequencing by multiplex tiling PCR. PLOS ONE 15, e0239403 (2020). 10.1371/journal.pone.0239403 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. A. X. Han, A. Toporowski, J. A. Sacks, M. Perkins, S. Briand, M. van Kerkhove, E. Hannay, S. Carmona, B. Rodriguez, E. Parker, B. E. Nichols, C. A. Russell, Low testing rates limit the ability of genomic surveillance programs to monitor SARS-CoV-2 variants: a mathematical modelling study. medRxiv 2022.05.20.22275319 [Preprint] (2022); 10.1101/2022.05.20.22275319 [DOI]
  • 63. Wang H., Paulson K. R., Pease S. A., Watson S., Comfort H., Zheng P., Aravkin A. Y., Bisignano C., Barber R. M., Alam T., Fuller J. E., May E. A., Jones D. P., Frisch M. E., Abbafati C., Adolph C., Allorant A., Amlag J. O., Bang-Jensen B., Bertolacci G. J., Bloom S. S., Carter A., Castro E., Chakrabarti S., Chattopadhyay J., Cogen R. M., Collins J. K., Cooperrider K., Dai X., Dangel W. J., Daoud F., Dapper C., Deen A., Duncan B. B., Erickson M., Ewald S. B., Fedosseeva T., Ferrari A. J., Frostad J. J., Fullman N., Gallagher J., Gamkrelidze A., Guo G., He J., Helak M., Henry N. J., Hulland E. N., Huntley B. M., Kereselidze M., Lazzar-Atwood A., LeGrand K. E., Lindstrom A., Linebarger E., Lotufo P. A., Lozano R., Magistro B., Malta D. C., Månsson J., Mantilla Herrera A. M., Marinho F., Mirkuzie A. H., Misganaw A. T., Monasta L., Naik P., Nomura S., O’Brien E. G., O’Halloran J. K., Olana L. T., Ostroff S. M., Penberthy L., Reiner R. C. Jr., Reinke G., Ribeiro A. L. P., Santomauro D. F., Schmidt M. I., Shaw D. H., Sheena B. S., Sholokhov A., Skhvitaridze N., Sorensen R. J. D., Spurlock E. E., Syailendrawati R., Topor-Madry R., Troeger C. E., Walcott R., Walker A., Wiysonge C. S., Worku N. A., Zigler B., Pigott D. M., Naghavi M., Mokdad A. H., Lim S. S., Hay S. I., Gakidou E., Murray C. J. L., COVID-19 Excess Mortality Collaborators, Estimating excess mortality due to the COVID-19 pandemic: A systematic analysis of COVID-19-related mortality, 2020-21. Lancet 399, 1513–1536 (2022). 10.1016/S0140-6736(21)02796-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Wu F., Zhao S., Yu B., Chen Y.-M., Wang W., Song Z.-G., Hu Y., Tao Z.-W., Tian J.-H., Pei Y.-Y., Yuan M.-L., Zhang Y.-L., Dai F.-H., Liu Y., Wang Q.-M., Zheng J.-J., Xu L., Holmes E. C., Zhang Y.-Z., A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269 (2020). 10.1038/s41586-020-2008-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Aksamentov I., Roemer C., Hodcroft E., Neher R., Nextclade: Clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 6, 3773 (2021). 10.21105/joss.03773 [DOI] [Google Scholar]
  • 66. Martin D. P., Varsani A., Roumagnac P., Botha G., Maslamoney S., Schwab T., Kelz Z., Kumar V., Murrell B., RDP5: A computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. Virus Evol. 7, veaa087 (2020). 10.1093/ve/veaa087 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Price M. N., Dehal P. S., Arkin A. P., FastTree 2—Approximately maximum-likelihood trees for large alignments. PLOS ONE 5, e9490 (2010). 10.1371/journal.pone.0009490 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Felsenstein J., Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39, 783–791 (1985). 10.1111/j.1558-5646.1985.tb00420.x [DOI] [PubMed] [Google Scholar]
  • 69. Rambaut A., Lam T. T., Max Carvalho L., Pybus O. G., Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). Virus Evol. 2, vew007 (2016). 10.1093/ve/vew007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Popescu A.-A., Huber K. T., Paradis E., ape 3.0: New tools for distance-based phylogenetics and evolutionary analysis in R. Bioinformatics 28, 1536–1537 (2012). 10.1093/bioinformatics/bts184 [DOI] [PubMed] [Google Scholar]
  • 71. Sagulenko P., Puller V., Neher R. A., TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 4, vex042 (2018). 10.1093/ve/vex042 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Wang S., Xu X., Wei C., Li S., Zhao J., Zheng Y., Liu X., Zeng X., Yuan W., Peng S., Molecular evolutionary characteristics of SARS-CoV-2 emerging in the United States. J. Med. Virol. 94, 310–317 (2022). 10.1002/jmv.27331 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Yu G., Using ggtree to visualize data on tree-like structures. Curr. Protoc. Bioinformatics 69, e96 (2020). 10.1002/cpbi.96 [DOI] [PubMed] [Google Scholar]
  • 74. Wickham H., ggplot2. Wiley Interdiscip. Rev. Comput. Stat. 3, 180–185 (2011). 10.1002/wics.147 [DOI] [Google Scholar]
  • 75. Hadfield J., Megill C., Bell S. M., Huddleston J., Potter B., Callender C., Sagulenko P., Bedford T., Neher R. A., Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018). 10.1093/bioinformatics/bty407 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. S. E. James, CERI-KRISP/SARS-CoV-2-epidemic-in-Africa: Expanding Africa SARS-CoV-2 sequencing capacity in a fast evolving pandemic analysis. Zenodo (2022); 10.5281/zenodo.7006806 [DOI]
  • 77. Ward T., Johnsen A., Understanding an evolving pandemic: An analysis of the clinical time delay distributions of COVID-19 in the United Kingdom. PLOS ONE 16, e0257978 (2021). 10.1371/journal.pone.0257978 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Citations

  1. S. E. James, CERI-KRISP/SARS-CoV-2-epidemic-in-Africa: Expanding Africa SARS-CoV-2 sequencing capacity in a fast evolving pandemic analysis. Zenodo (2022); 10.5281/zenodo.7006806 [DOI]

Supplementary Materials

Figs. S1 to S16

Tables S1 to S4

Reference (77)

MDAR Reproducibility Checklist

Tables S3 and S4


Articles from Science (New York, N.Y.) are provided here courtesy of American Association for the Advancement of Science