Lancet Regional Health - Americas. 2021 Aug 20;2:100039. doi: 10.1016/j.lana.2021.100039

Estimation of all-cause excess mortality by age-specific mortality patterns for countries with incomplete vital statistics: a population-based study of the case of Peru during the first wave of the COVID-19 pandemic

Lucas Sempé a, Peter Lloyd-Sherlock b, Ramón Martínez c, Shah Ebrahim d, Martin McKee e, Enrique Acosta f
PMCID: PMC8507430  PMID: 34693394

Summary

Background

All-cause excess mortality is a comprehensive measure of the combined direct and indirect effects of COVID-19 on mortality. Estimates are usually derived from Civil Registration and Vital Statistics (CRVS) systems, but these do not include non-registered deaths, which may be affected by changes in vital registration coverage over time.

Methods

Our analytical framework and empirical strategy account for registered mortality and under-registration. This provides a better estimate of the actual mortality impact of the first wave of the COVID-19 pandemic in Peru. We use population and crude mortality rate projections from Peru's National Institute of Statistics and Information (INEI, in Spanish), individual-level registered COVID-19 deaths from the Ministry of Health (MoH), and individual-level registered deaths by region and age since 2017 from the National Electronic Deaths Register (SINADEF, in Spanish).

We develop a novel framework combining different estimates and using quasi-Poisson models to estimate total excess mortality across regions and age groups. Also, we use logistic mixed-effects models to estimate the coverage of the new SINADEF system.

Findings

We estimate that registered mortality underestimates national mortality by 37•1% (95% CI 23% - 48•5%) across 26 regions and nine age groups. We estimate total all-cause excess mortality during the period of analysis at 173,099 (95% CI 153,669 - 187,488), of which 108,943 (95% CI 96,507 - 118,261) were captured by the vital registration system. Deaths at age 60 and over accounted for 74•1% (95% CI 73•9% - 74•7%) of total excess deaths, and there were fewer deaths than expected in younger age groups. Lima region, on the Pacific coast and including the national capital, accounts for the highest share of excess deaths, 87,781 (95% CI 82,294 - 92,504), while at the opposite end the regions of Apurimac and Huancavelica account for fewer than 300 excess deaths.

Interpretation

Estimating excess mortality in low- and middle-income countries (LMICs) such as Peru must take under-registration of mortality into account. Combining demographic trends with data from administrative registries reduces uncertainty and measurement errors. In countries like Peru, this is likely to produce significantly higher estimates of excess mortality than studies that do not take these effects into account.

Funding

None.

Keywords: COVID-19, Excess mortality, CRVS, Peru, Age-group mortality


Research in context.

Evidence before this study

We searched PubMed, Google Scholar, medRxiv, and SocArXiv for studies published up to May 27, 2020, using the keywords “excess mortality” and “under-registration” or “sub registration,” combined with “coronavirus” or “SARS-CoV-2” or “COVID-19.” We found studies estimating cumulative mortality in high-income countries in Europe and North America solely based on official death counts. We found studies computing overall COVID-19 mortality for a small number of LMICs. Prior research shows a significant percentage of under-registration of deaths in LMICs.

Added‐value of this study

To our knowledge, we provide the first estimate of excess mortality associated with COVID-19 that accounts for both registered and unregistered deaths in a LMIC. We develop an analytical strategy to address common challenges faced by LMICs, such as low completion rates of death certificates, missing data, and inconsistency and variability of data across regions and age groups. We show our method is robust for small samples, including for subnational regions and specific age groups.

Implications of all the available evidence

Our approach shows the importance of accounting for unregistered deaths based on demographic trends to generate robust estimates of excess mortality associated with COVID-19. It suggests that previous reports of COVID-19 related mortality in Peru were substantial underestimates.


1. Introduction

Monitoring mortality is an essential part of the public health response to the COVID-19 pandemic. In many countries, COVID-19 mortality monitoring has been hindered by failures to capture all deaths or to accurately attribute the cause of death [1]. Disentangling the contribution of COVID-19 to overall mortality is especially challenging, as many people who die from COVID-19 also have other conditions, such as cardiovascular disease and diabetes [2]. Some countries apply different cut-off times between a positive COVID-19 test and death when attributing mortality to this cause [3]. Also, the pandemic has led to large numbers of deaths not directly attributable to COVID-19 (either exclusively or in part), due to effects such as reduced access to treatment for other conditions [4,5]. Conversely, there is evidence that lockdowns have sometimes reduced expected rates of mortality from causes such as road traffic injuries and homicides [6].

In the absence of good data on these different effects, robust estimation of all-cause excess mortality offers the most complete and reliable approach for gauging the overall impact of the pandemic on a defined population over a fixed period [7]. Excess mortality refers to the number of additional deaths occurring over a time period when specific conditions apply (in this case, the presence of COVID-19), compared to the number of deaths we might reasonably expect over the same period based on historical data. It captures deaths directly attributed to COVID-19 and those resulting from other consequences of the pandemic, to provide an estimate of the overall mortality effect [8].

Excess mortality estimates based on registered deaths have been computed for high-income countries by surveillance agencies [9,10], academia [11], [12], [13], [14] and the media [15], [16], [17], [18]. However, these data are often incomplete or inaccurate, especially in many low and middle-income countries (LMICs). The Global Burden of Disease (GBD) study estimates that only 64% of global deaths were registered in 2015 [19]. In most LMICs, responsibility for mortality data is divided between different national and subnational agencies [20]. This can cause extended delays in national reporting and discrepancies between different sources [12,21]. Disaggregation of summary data by different geographical areas or demographic groups is usually very limited [22]. These shortcomings explain the lack of published studies of excess mortality in LMICs [23,24].

We deploy an array of analytical tools to address these issues and we apply them to the case of Peru during the first wave of the pandemic. Peru is a geographically diverse country with three natural regions: a narrow dry coast in the west, the Andean region, and a large, sparsely populated rainforest region to the east. Peru is divided into 26 administrative regions (Figure 1). Peru's National Institute of Statistics and Information (INEI) estimated the national population was around 32.8 million in 2020, with a growth rate of around 1% per year. The population aged 65 and over has increased rapidly over the last two decades, with yearly growth rates of over 2.5%. By contrast, the growth rate of younger age groups has been falling, with an absolute decline in the population aged under 20 from 2010 onwards [35].

Figure 1. Natural and administrative regions - Peru.

Between 18 March and December 31, 2020, Peru reported 36,036 deaths directly caused by COVID-19. This official figure only includes cases with positive COVID-19 test results, but rates of testing have been low compared to other Latin American countries [25], [26]. Also, many tests have used low sensitivity devices, potentially generating false negatives [27]. A separate study undertaken in association with the government applies a wider set of criteria to identify COVID-19 deaths [30]. Controversially, it categorises any death in households where at least one person tested positive for COVID-19 as a “potential COVID-19 death”, even if the deceased was not tested. This generates an estimate of 89,844 potential COVID-19 deaths from March 1st to November 30th, 2020: equivalent to one of the highest per capita death rates in the world.

The large differences between these two estimates, which cover similar periods of time, reflect the level of uncertainty about COVID-19 mortality in Peru. Estimates of all-cause excess deaths may offer a more reliable indicator of the pandemic's impact, but gaps in death registration impede robust comparisons over time. Also, studies from other LMICs show registration coverage often varies significantly between sub-regions and age groups, hindering comparisons at these levels [32,33]. In 2017, Peru adopted a new online register, placing anonymised individual-level data on mortality and COVID-19 in the public domain [34]. Coverage of the new system has increased over time, but it remains lower than for the paper-based one [35]. Also, registration rates are not reported at the subnational level, hindering analysis of geographical and age patterns. Consequently, comparisons over time must take increased registration into account.

To resolve these challenges, we develop an analytical framework and empirical strategy, which deal with both registered and unregistered mortality, changes in registration coverage over time and variations in coverage between regions and age groups. This adds value to the methods applied by other studies of excess mortality, which focus exclusively on registered deaths [31], [32], [33]. We test the feasibility of this method by applying it to Peru during 2020 and consider its potential suitability for LMICs with similar gaps and inconsistencies in mortality registration.

2. Methods

2.1. Data

We use data from two Ministry of Health (MoH) mortality registration systems. Anonymised individual-level registered deaths from the National Electronic Deaths Register (SINADEF, in Spanish) from 2017 to 2020 are divided into 26 regions and nine age groups [36]. In 2017 Peru began to shift from a paper-based system to an electronic one, which led to a growth in coverage (from 98,552 reported deaths in 2017 to 114,449 in 2019). SINADEF sought to strengthen the mortality information system by improving coverage and quality, and by standardising registration processes. Data inclusion in the public database is delayed by processing and checking procedures, such as the inclusion of manual death certificates, quality control and the correction of errors in death certificates [34]. Coverage of SINADEF was estimated to be 74.0% of total deaths in 2018 [35]. However, coverage varies among regions, impeding comparisons of spatial and temporal trends. Additionally, we use anonymised individual-level registered deaths by region and age group from MoH COVID-19 tracking systems [63]. Both sets of data include date of death, age, sex, date of birth and administrative region, and were collected on May 30, 2021. The SINADEF data set consists of 573,054 observations, with an average age of 66•35 (s.d. 23•46). The MoH COVID-19 data set consists of 36,036 deaths, with an average age of 65•96 (s.d. 14•66). The higher proportion of male deaths matches patterns in other countries, which have been attributed to immunological differences [13], [14]. We also use INEI population and mortality projections for recent years, by region and age group. These projections are based on decennial censuses (the most recent in 2017) and annual health and population surveys [35].

Figure 2 shows weekly registered crude death rates by region for 2017 to 2020. There are clear differences between regions: Moquegua (coast) and Arequipa (Andes) show the highest rates, while Amazonas (in Amazonia) and Ayacucho (Andes) show the lowest. All regions other than Apurimac saw an initial peak at some point between week 20 and week 40. All regions then saw significant declines, although a second peak occurred in Lima and Pasco in weeks 50 to 52. We also observe that weekly mortality growth varied across time. We restrict our excess mortality analysis to those weeks where a significant change can be detected for each region, using 0•60% as a threshold criterion of weekly change [37] (vertical lines in Figure 2).

Figure 2. Crude death rates during 2017–2020 by Region - Peru. Grey lines indicate deaths during years 2017 to 2019 and red lines during 2020. Vertical dashed lines indicate the starting point of the first pandemic wave in each region.

2.2. Excess mortality methods

Figure 3 presents our approach to estimate excess mortality. This can be decomposed into three terms: (i) excess SINADEF registered deaths; (ii) unregistered excess deaths; and (iii) unregistered COVID-19 deaths. Figure 4 summarises our empirical strategy based on data sources used (squares), statistical analysis performed (diamonds) and outputs (circles).

Figure 3. Conceptual representation of empirical strategy. Grey lines indicate deaths during years 2017 to 2019 and red lines during 2020. Vertical dashed lines indicate the starting point of the first pandemic wave in Peru at the national level.

Figure 4. Flowchart: Data, analysis and outputs.

To estimate the first term, excess SINADEF registered deaths, $\widehat{\text{ExcessReg}}$, we first fit quasi-Poisson regressions, which address the over-dispersed nature of the data, to weekly deaths, Deaths, as follows:

$$\log(\text{Deaths}) = \beta_0 + \beta_1\,\text{COVID19} + \text{Fourier series} + \beta_2\,\epsilon_{t-1} + \beta_k\,\phi_k(\text{Week}) + \log(\text{Pop}) + \epsilon_t \tag{1}$$

where we fit a natural cubic B-spline function $\phi_k(\text{Week})$ on weeks and Fourier series (with yearly periods and 6 pairs of sines and cosines) to address long-term trends and seasonality [38]. Additionally, we use lagged residuals $\epsilon_{t-1}$ [39] and the log of the population in 2020, $\log(\text{Pop})$, as an offset, and $\epsilon_t$ is the error term. Finally, we compute a dichotomous variable COVID19 starting in the week corresponding to the pandemic onset for each region (Figure 2). For those $\beta_1$ that are statistically significant (p<.05), we calculate the relative risk RR and compute registered excess mortality as a population attributable fraction (PAF) [40], as follows:

$$\widehat{\text{ExcessReg}} = \frac{RR - 1}{RR} \times n \tag{2}$$

where n is the weekly number of deaths and represents a fraction of the total mortality of the period where COVID19=1 in equation (1).
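
As a minimal sketch of equation (2), assuming the relative risk is obtained as RR = exp(β1) from the log-link model in equation (1) (the text does not spell this step out), the registered excess could be computed as below. The function name and the input values are illustrative and are not taken from the study.

```python
import math


def registered_excess(beta1: float, weekly_deaths: list[float]) -> float:
    """Sum (RR - 1) / RR * n over the weeks where COVID19 = 1."""
    rr = math.exp(beta1)      # assumed: RR is the exponentiated coefficient
    paf = (rr - 1.0) / rr     # population attributable fraction
    return sum(paf * n for n in weekly_deaths)


# Hypothetical inputs: RR = 2 implies that half of the deaths are excess
example = registered_excess(beta1=math.log(2.0), weekly_deaths=[100, 120, 80])
print(round(example, 1))  # 150.0
```

With RR = 2 the attributable fraction is 0.5, so half of the 300 deaths in the three hypothetical weeks are counted as excess.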

The second term, unregistered excess deaths, $\widehat{\text{ExcessNotreg}}$, estimates mortality delayed or absent in official registers and is computed as a fraction of $\widehat{\text{ExcessReg}}$. We start by predicting SINADEF completeness, $\widehat{\text{Regcomplete}}$, by modelling random-effects logistic regressions [41] for each region j using data for 2017 and 2019, represented by Year. We exploit variability in INEI mortality rates, population aged 60 years and over and rurality to address potential differences in terms of registration completeness, using the equation:

$$\text{logit}(\text{Regcomplete}_j) = \beta_0 + \beta_1\,\text{RegCDR} + \beta_2\,\text{RegCDR}^2 + \beta_3\,\text{complete}_{<5} + \beta_4\,P_{60+} + \beta_5 \log({}_5q_0) + \beta_6\,\text{LPG} + \beta_7\,\text{Year} + \epsilon + \gamma_j \tag{3}$$

where RegCDR and $\text{RegCDR}^2$ are the Crude Death Rate and its square, $\text{complete}_{<5}$ indicates registration coverage of child mortality (a relevant proxy measure of overall mortality registration in a population), $\log({}_5q_0)$ is the logarithm of the under-five mortality rate and $P_{60+}$ represents the fraction of the population aged 60 years and over based on INEI projections. LPG is the share of households that use liquefied petroleum gas for cooking. This is an acceptable proxy for rurality in Peru, as 81•8% of rural households primarily use solid fuel compared to 9•8% of urban ones, which typically cook with liquefied petroleum gas [42]. Rurality is an important factor in explaining delay and under-registration of deaths, as the system requires access to the internet and computers. Additionally, $\epsilon$ is the error term and $\gamma_j$ is the region-level random effect. $\widehat{\text{Regcomplete}}_j$ is computed using the inverse logit of the predicted values, which provides a regionally adjusted SINADEF completeness estimate. We use regional completeness estimates for the year 2019 in all cases but Lambayeque, where we average the estimates for both years to reflect the significant differences between them.
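
The back-transformation from the logit scale to a completeness proportion can be sketched as follows. This is only an illustration of the inverse logit step: the fixed-effect predictor and the region-level effect are hypothetical placeholder values, since the fitted coefficients are not reproduced in this section.

```python
import math


def inverse_logit(x: float) -> float:
    """Map a value on the logit scale back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_completeness(fixed_part: float, region_effect: float) -> float:
    """Completeness for region j: inverse logit of the fixed-effects
    linear predictor plus the region-level random effect gamma_j."""
    return inverse_logit(fixed_part + region_effect)


# Hypothetical values on the logit scale, not estimates from the study
print(round(predicted_completeness(fixed_part=0.5, region_effect=-0.5), 2))  # 0.5
```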

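Finally, unregistered excess deaths are based on $\widehat{\text{ExcessReg}}$ from equation (2) and $\widehat{\text{Regcomplete}}$ from equation (3), as follows:

$$\widehat{\text{ExcessNotreg}} = \widehat{\text{ExcessReg}} \times \left(\frac{1}{\widehat{\text{Regcomplete}}_j} - 1\right) \tag{4}$$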
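
Equation (4) scales registered excess deaths by the odds of a death going unregistered. A minimal sketch, with an illustrative function name and hypothetical inputs:

```python
def unregistered_excess(excess_reg: float, completeness: float) -> float:
    """Equation (4): registered excess times (1 / completeness - 1)."""
    if completeness <= 0.0:
        raise ValueError("completeness must be a positive proportion")
    return excess_reg * (1.0 / completeness - 1.0)


# Hypothetical example: with 50% completeness, every registered excess
# death implies one unregistered excess death
print(unregistered_excess(excess_reg=100.0, completeness=0.5))  # 100.0
```

A completeness estimate above 1 would give a negative value here, which is the situation the text handles by setting the unregistered term to zero.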

The last term, unregistered COVID-19 deaths, $\hat{\mu}_{\text{DeathsCOVID19notreg}}$, is computed to correct for situations where the proportion of cumulative cases of reported COVID-19 deaths for that period exceeds SINADEF values. This occurs mostly in younger age groups. The calculation is conditional on registered excess deaths being lower than officially registered COVID-19 deaths for each region and specific age group as follows:

$$\hat{\mu}_{\text{DeathsCOVID19notreg}} = \begin{cases} \hat{\mu}_{\text{DeathsCOVID19Reg}} - \hat{\mu}_{\text{ExcessReg}} & \text{if } \hat{\mu}_{\text{DeathsCOVID19Reg}} > \hat{\mu}_{\text{ExcessReg}}, \\ 0 & \text{if } \hat{\mu}_{\text{DeathsCOVID19Reg}} < \hat{\mu}_{\text{ExcessReg}}. \end{cases} \tag{5}$$

Finally, we estimate total excess deaths for different scenarios.

A first scenario is when there is no solid evidence suggesting under-registration of deaths for some regions or age groups and, therefore, no scope to expand registration over time. This usually occurs in areas and age groups with very small populations. In those cases, $\widehat{\text{ExcessNotreg}}$ is set to 0 to avoid adding negative values to the sum. A second scenario relates to younger age groups who in some regions have not been significantly affected by COVID-19 mortality. This case occurs when $\beta_1$ in equation (1) is not statistically significant and therefore we set $\widehat{\text{ExcessReg}} = 0$. A third scenario is when some groups have fewer deaths than expected, due to effects such as reduced road traffic injuries. In these groups $\widehat{\text{ExcessReg}} < 0$ and $\widehat{\text{ExcessNotreg}} = 0$ are included in the final estimate. A fourth scenario is when models underestimate the official number of deaths, such that $\text{COVIDReg} > \widehat{\text{ExcessReg}}$. In this case, we use the former as registered deaths. Equation (6) summarises the estimate of $\widehat{\text{Excess}}_{T,\,\text{min/mean/max}}$ as follows:

$$\widehat{\text{Excess}}_{T,\,\text{min/mean/max}} = \widehat{\text{ExcessReg}}_{\text{min/mean/max}} + \widehat{\text{ExcessNotreg}} + \hat{\mu}_{\text{DeathsCOVID19notreg}} \tag{6}$$
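
Equations (5) and (6) can be sketched for a single region and age group as follows. This is a literal reading of the two equations plus the first scenario (a negative unregistered term is set to zero). The function names and the example values are hypothetical.

```python
def covid_not_registered(covid_reg: float, excess_reg: float) -> float:
    """Equation (5): reported COVID-19 deaths beyond the registered excess."""
    return covid_reg - excess_reg if covid_reg > excess_reg else 0.0


def total_excess(excess_reg: float, excess_notreg: float, covid_reg: float) -> float:
    """Equation (6): sum of the three terms for one region and age group."""
    # First scenario in the text: never add a negative unregistered term
    excess_notreg = max(excess_notreg, 0.0)
    return excess_reg + excess_notreg + covid_not_registered(covid_reg, excess_reg)


# Hypothetical cells
print(total_excess(excess_reg=100.0, excess_notreg=50.0, covid_reg=80.0))   # 150.0
print(total_excess(excess_reg=100.0, excess_notreg=-5.0, covid_reg=120.0))  # 120.0
```

In the second call the reported COVID-19 deaths exceed the registered excess by 20, so the total equals the reported COVID-19 count, in line with the fourth scenario.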

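Finally, we estimate total mortality during 2020 by adding $\widehat{\text{Excess}}_{T,\,\text{min/mean/max}}$ from equation (6) and SINADEF expected deaths for 2020, $\widehat{\text{SINADEFNotexcess}}$, adjusted by $\widehat{\text{Regcomplete}}$ from equation (3), as follows:

$$\widehat{\text{TotalMortality}|2020} = \widehat{\text{Excess}}_{T,\,\text{min/mean/max}} + \left(\widehat{\text{SINADEFNotexcess}} \times \frac{1}{\widehat{\text{Regcomplete}}_j}\right) \tag{7}$$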
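
A short sketch of equation (7), reading the adjustment as division of expected registered deaths by the completeness proportion. The function name and the first set of inputs are hypothetical. The final lines use the national totals reported in the Results to back out the size of the second term.

```python
def total_mortality_2020(total_excess: float,
                         sinadef_expected: float,
                         completeness: float) -> float:
    """Equation (7): excess deaths plus expected registered deaths
    inflated by 1 / completeness."""
    if completeness <= 0.0:
        raise ValueError("completeness must be a positive proportion")
    return total_excess + sinadef_expected / completeness


# Hypothetical inputs
print(total_mortality_2020(total_excess=100.0, sinadef_expected=50.0, completeness=0.5))  # 200.0

# Reported national totals: total deaths minus total excess gives the
# completeness-adjusted expected deaths implied by equation (7)
implied_adjusted_expected = 334_043 - 173_099
print(implied_adjusted_expected)  # 160944
```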

To address the relative level of mortality in 2020, we compute all-cause age-standardised death rates per 1,000 people derived from the estimated total excess deaths. We apply direct standardization methods [43] using INEI population estimates by region and age group for 2020 as the standard population [44].
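
Direct standardisation weights each age-specific death rate by the share of that age group in the standard population. A minimal sketch under that textbook definition follows. The function name and the two-group example are hypothetical and are not INEI figures.

```python
def age_standardised_rate(deaths: list[float],
                          population: list[float],
                          standard_population: list[float],
                          per: float = 1000.0) -> float:
    """Directly standardised death rate per `per` people."""
    if not (len(deaths) == len(population) == len(standard_population)):
        raise ValueError("all inputs must have one entry per age group")
    total_standard = sum(standard_population)
    # Expected deaths if the standard population had these age-specific rates
    expected = sum((d / p) * s
                   for d, p, s in zip(deaths, population, standard_population))
    return per * expected / total_standard


# Two hypothetical age groups: rates of 10 and 40 per 1,000,
# with standard-population weights of 75% and 25%
rate = age_standardised_rate([10, 40], [1000, 1000], [3000, 1000])
print(round(rate, 2))  # 17.5
```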

As a robustness check for excess mortality estimates, we estimate a mortality baseline for each age group fitting either a Generalised Linear Model with Poisson or Negative binomial distributions, depending on the data's over-dispersion. The model includes natural splines and sinusoidal components to account for secular changes and seasonality in mortality, as well as interpolated weekly exposures to control for changes in age structure over time. 95% prediction intervals were estimated using 2,000 bootstrapping iterations. Excess mortality is computed as the difference between observed mortality and the baseline, only including weeks in which observed mortality was above the upper prediction interval.
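
The final step of this robustness check, summing observed minus baseline deaths only in weeks where observed deaths exceed the upper prediction bound, can be sketched as below. The baseline and bounds are taken as given inputs (the model fitting and bootstrapping are not shown), and all names and values are hypothetical.

```python
def excess_above_upper_bound(observed: list[float],
                             baseline: list[float],
                             upper: list[float]) -> float:
    """Sum (observed - baseline) over weeks where observed > upper bound."""
    if not (len(observed) == len(baseline) == len(upper)):
        raise ValueError("all inputs must have one entry per week")
    return sum(o - b for o, b, u in zip(observed, baseline, upper) if o > u)


# Hypothetical three-week series: only the second week exceeds its upper bound
print(excess_above_upper_bound(observed=[100, 150, 90],
                               baseline=[95, 100, 100],
                               upper=[110, 120, 115]))  # 50
```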

Role of the funding source: There was no funding for this study.

3. Results

Estimates of completeness of SINADEF registration derived from our logistic regression model fit the data according to marginal and conditional r2 and Root Mean Square of Errors parameters. Model fit and goodness-of-fit are presented in Appendix 1.

As a data validity test, we compare our estimations with the available data in the CRVS system for 2017 and 2019 (Figure 5). We find strong correlations between both estimations across years (r(25) = .81 and .87, p< .01, two-tailed). The CRVS-based estimation shows several values (to the right of the vertical lines) where SINADEF registration is larger than CRVS, especially when completeness is at the higher end. For 2019, while the CRVS system estimates 159,706 deaths, we predict 189,991, which is similar to INEI estimations (188,043). This is consistent with current estimations of CRVS completeness gaps [45].

Figure 5. Comparison of completeness methods. Each dot represents a region, the blue line depicts the estimation of the logit regression, and the vertical dashed line marks full completeness of the records (100%).

Logistic regression based on SINADEF shows an outlier value for Lambayeque (bottom of right panel), which reflects a consistent lack of registration during 2019. We address those anomalies by choosing the highest registration completeness proportion for each region. Figure 6 shows important variations in regional completeness rates: Amazonas and Loreto (in Amazonia), and Pasco and Cajamarca (in the Andes) show estimated completeness below 50%, while Ica (coast) and Madre De Dios (Peru's least populated region, in Amazonia) appear to have complete registration (Appendix 1).

Figure 6. Estimated SINADEF completeness of death registration before the pandemic - Peru.

Figure 7 shows observed and predicted values for registered excess death rates by region for the age group over 80. Arequipa, Madre De Dios and Moquegua are the regions with the highest peaks. These values show both seasonality and growth. This, along with the high number of statistically significant β1 parameters across regressions, indicates the model fits the data. As such, the variable COVID19 correctly identifies a significant change in mortality patterns across time. See Appendix 1 for statistics, goodness-of-fit analysis and predictions for absolute deaths for all models.

Figure 7. Predicted registered excess death rates per 1,000 people by Region for age group over 80 - Peru.

Table 1 summarises our estimates of excess mortality. Registered excess mortality is estimated to be 108,943 (95% CI 96,507 - 118,261), of which 37,725 are reported as COVID-19 deaths. Unregistered excess mortality is estimated to be 62,933 deaths (95% CI 55,940 - 68,005), making up 58•9% (95% CI 46% - 79•4%) of our estimate of total registered and unregistered deaths. The third term, unregistered COVID-19 deaths, adds 1,222 deaths corresponding to cases where reported COVID-19 deaths exceed our estimate of adjusted registered excess mortality. Combining all these terms, our estimate of total excess deaths during 2020 is 173,099 (95% CI 153,669 - 187,488) and our estimate of total deaths for 2020 is 334,043 (95% CI 300,147 - 367,743). This is 73•8% (95% CI 56•2% - 91•3%) more than the number of deaths projected by INEI for 2020 (192,215) [46], and 14•3% (95% CI 13•5 - 14•5) higher than estimations based on CRVS data in 2019. See Appendix 1 for results and comparison with robustness analysis.

Table 1. Summary of estimations, Peru, 2020.

| Terms | Estimates (95% CI) |
|---|---|
| COVID-19 deaths (MoH) | 1,222 |
| Registered excess deaths | 108,943 (96,507 – 118,261) |
| SINADEF completeness registration | 62•9% (51% – 77%) |
| Total excess mortality | 173,099 (153,669 – 187,488) |
| Total estimated deaths in 2020 | 334,043 (300,147 – 367,743) |
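
The headline figures can be cross-checked with simple arithmetic. The sketch below uses only point estimates reported in the text and in Table 1; the variable names are illustrative.

```python
# Point estimates as reported in the Results
registered_excess = 108_943
unregistered_excess = 62_933
unregistered_covid = 1_222
total_excess_reported = 173_099
total_deaths_2020 = 334_043
inei_projection_2020 = 192_215

# The three terms should add up to the reported total (up to rounding)
summed = registered_excess + unregistered_excess + unregistered_covid
print(summed, total_excess_reported - summed)  # 173098 1

# Estimated total deaths relative to the INEI projection for 2020
pct_above = 100 * (total_deaths_2020 / inei_projection_2020 - 1)
print(round(pct_above, 1))  # 73.8
```

The one-death gap between the summed terms and the reported total is consistent with rounding of the individual estimates.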

Table 2 shows estimates by region. Lima, which includes the capital, accounts for 87,781 (95% CI 82,294 – 92,504) total excess deaths, and Apurimac and Huancavelica show the lowest numbers. Coastal regions have the highest values of unregistered excess deaths, with Lima accounting for 36,856 additional deaths, followed by Piura (7,928) and Lambayeque (3,540). The highest under-reported COVID-19 mortality occurs in the Andean regions of Ayacucho (192) and Apurimac (150), along with Ucayali, in Amazonia (117).

Table 2. Estimated total excess deaths by region

| Region | Total excess (95% CI) | Registered excess (95% CI) |
|---|---|---|
| AMAZONAS | 87,781 (82,294 – 92,504) | 50,925 (47,738 – 53,666) |
| ANCASH | 15,661 (14,575 – 16,465) | 7,733 (7,196 – 8,131) |
| APURIMAC | 8,091 (7,152 – 8,819) | 6,502 (5,742 – 7,082) |
| AREQUIPA | 7,144 (6,536 – 7,610) | 6,215 (5,680 – 6,617) |
| AYACUCHO | 5,962 (5,252 – 6,386) | 2,784 (2,475 – 2,984) |
| CAJAMARCA | 5,616 (4,873 – 6,207) | 5,273 (4,564 – 5,829) |
| CALLAO | 5,345 (4,012 – 6,291) | 1,806 (1,369 – 2,129) |
| CUSCO | 4,964 (4,782 – 5,074) | 4,437 (4,268 – 4,531) |
| HUANCAVELICA | 4,252 (3,691 – 4,678) | 4,226 (3,659 – 4,645) |
| HUANUCO | 4,155 (3,408 – 4,756) | 3,490 (2,851 – 3,999) |
| ICA | 4,071 (3,289 – 4,661) | 2,526 (2,037 – 2,891) |
| JUNIN | 3,534 (2,775 – 4,081) | 1,870 (1,447 – 2,162) |
| LA LIBERTAD | 3,096 (2,255 – 3,759) | 2,526 (1,823 – 3,067) |
| LAMBAYEQUE | 2,841 (2,307 – 3,166) | 1,645 (1,152 – 1,932) |
| LIMA | 2,272 (1,566 – 2,823) | 1,811 (1,237 – 2,253) |
| LORETO | 1,623 (639 – 2,236) | 960 (440 – 1,330) |
| MADRE DE DIOS | 1,160 (893 – 1,326) | 984 (753 – 1,120) |
| MOQUEGUA | 1,087 (884 – 1,205) | 889 (718 – 982) |
| PASCO | 958 (715 – 1,112) | 755 (558 – 873) |
| PIURA | 891 (464 – 1,145) | 353 (181 – 482) |
| PUNO | 873 (424 – 1,016) | 264 (170 – 314) |
| SAN MARTIN | 658 (174 – 894) | 261 (68 – 363) |
| TACNA | 500 (319 – 607) | 492 (284 – 591) |
| TUMBES | 307 (209 – 376) | 154 (77 – 207) |
| UCAYALI | 260 (182 – 292) | 62 (18 – 81) |

Table 3 presents excess mortality estimates by age group. Excess deaths among people aged 60 years and over account for 74•1% of total excess mortality. There was negative registered excess mortality for the youngest age groups.

Table 3. Estimated total excess deaths by age group

| Age range | Total excess (95% CI) | Registered excess (95% CI) |
|---|---|---|
| < 10 | 209 (141 – 259) | -53 (-262 – 64) |
| 10-19 | 221 (134 – 272) | 91 (20 – 131) |
| 20-29 | 1,407 (774 – 1,826) | 732 (354 – 980) |
| 30-39 | 4,261 (2,899 – 5,138) | 2,592 (1,698 – 3,151) |
| 40-49 | 12,697 (11,168 – 13,739) | 8,009 (6,995 – 8,693) |
| 50-59 | 26,114 (23,687 – 27,763) | 16,440 (14,939 – 17,493) |
| 60-69 | 42,528 (39,316 – 44,858) | 26,719 (24,719 – 28,200) |
| 70-79 | 43,260 (38,903 – 46,469) | 27,356 (24,702 – 29,402) |
| > 79 | 42,401 (36,648 – 47,164) | 27,057 (23,341 – 30,147) |
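
The share of excess deaths at ages 60 and over can be recomputed directly from the point estimates in Table 3. The sketch below does only that; the dictionary and variable names are illustrative.

```python
# Point estimates from Table 3: (total excess, registered excess)
by_age = {
    "<10": (209, -53),
    "10-19": (221, 91),
    "20-29": (1_407, 732),
    "30-39": (4_261, 2_592),
    "40-49": (12_697, 8_009),
    "50-59": (26_114, 16_440),
    "60-69": (42_528, 26_719),
    "70-79": (43_260, 27_356),
    ">79": (42_401, 27_057),
}

total_excess = sum(total for total, _ in by_age.values())
registered_excess = sum(reg for _, reg in by_age.values())
excess_60_plus = sum(by_age[age][0] for age in ("60-69", "70-79", ">79"))
share_60_plus = 100 * excess_60_plus / total_excess

print(total_excess, registered_excess)  # 173098 108943
print(round(share_60_plus, 1))          # 74.1
```

The age-group totals add up to the national estimates in Table 1, up to a one-death rounding difference in total excess.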

Figure 8 shows all-cause age-standardised death rates. Callao, Piura, and Lima have the highest rates, ranging from 10•3 to 13•9 (CI 95% ranging from 10•1 to 13•9). Lambayeque, Cusco, and Amazonas show the lowest rates ranging from 4•44 to 7•98 (CI 95% ranging from 4•33 to 8•26). Figure 8 also shows differences in deaths per 1,000 compared to INEI's projections for 2020. There is excess mortality in all regions but Amazonas and Cusco.

Figure 8. Estimated age-standardised mortality rates and differences with INEI projections.

4. Discussion

Most published studies of excess COVID-19 mortality fall into two categories. Some provide estimates for countries where mortality data are relatively complete and reliable [47], [48], [49], [50]. As such, they do not apply specific methods to address data gaps. To our knowledge, previous studies of excess COVID-19 mortality for countries with less complete data have not taken unregistered deaths into account [51], [52], [53], [54], [55], [56].

Our method generates an estimate of all-cause excess mortality of 173,099 (95% CI 153,669 - 187,488). If true, this is the highest per capita rate of excess mortality yet reported for any country during the pandemic. Available research indicates that this high death rate resulted from a combination of causes, including a weak and under-resourced health system, national and local policy failures and a wider context of poverty, inequality and precarious employment [27], [28].

Our estimate of all-cause excess mortality is 93% (95% CI 71% - 109%) higher than official reports of total COVID-19 deaths for the same period [30]. This is a much larger differential than those reported by studies in high-income countries, where death registration coverage is more complete. For example, separate studies of the US report differentials of 28% and 33%, respectively [11,12]. However, under-registration of deaths does not fully account for this differential: our estimate of registered all-cause excess mortality was still 21% (95% CI 7% - 32%) higher than official figures.
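
These differentials can be reproduced from the point estimates, assuming the official comparison figure is the 89,844 potential COVID-19 deaths cited in the Introduction from the same source [30]. That pairing is our reading of the text rather than something stated explicitly. The variable names are illustrative.

```python
official_covid_deaths = 89_844   # assumed comparator, from the study cited as [30]
total_excess = 173_099           # registered plus unregistered excess deaths
registered_excess = 108_943      # registered excess deaths only

pct_total = 100 * (total_excess / official_covid_deaths - 1)
pct_registered = 100 * (registered_excess / official_covid_deaths - 1)

print(round(pct_total))       # 93
print(round(pct_registered))  # 21
```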

There are a number of other national estimates of excess mortality for Peru during the pandemic, based on different methodologies. Quevedo et al compare registered deaths between April and June of 2020 to corresponding periods for 2017-19, but do not take non-registered deaths or improvements in data registration into account [56]. This produces an estimate of 36,322 excess deaths. Applying our approach to the same time period gives an estimate of 34,155 (CI 95% 30,257 - 37,178) registered excess deaths, increasing to 63,598 (CI 95% 52,703 - 67,213) when non-registered deaths are included.

A separate study applies similar methods to Quevedo et al to compare 1 January to 12 July 2020 to corresponding periods in 2017-19 [57]. This produces an estimate of 46,863 excess reported deaths. Applying our approach to the same period gives 41,440 registered excess deaths (CI 95% 36,449 - 45,300), increasing to 72,692 (CI 95% 65,104 – 77,163) when non-registered deaths are included. This second study includes 2,000 excess deaths that occurred before March 2020, and which cannot therefore be attributed to the COVID-19 pandemic. A third study of Lima metropolitan region compares the number of non-violent deaths between the first 24 weeks of 2019 and 2020 and finds 20,093 excess deaths [58]. This is larger than our own estimate of excess registered deaths in Lima over the same period, 12,513 (CI 95% 11,493 – 13,408), but very close to our total estimate when unregistered deaths are also included, 21,738 (CI 95% 16,892 - 21,738). Together, these studies demonstrate the value of the more comprehensive methods applied in our study.

No other published study for Peru provides data disaggregated by age groups. Studies for other countries show excess mortality tends to be higher among older people. A study of registered excess mortality in six Brazilian cities reports that people aged 60 and over accounted for 71•1% of the total [53]. An analysis of European countries reports that 91% of excess COVID-19 deaths occurred among people aged 65 or more [9]. Similarly, despite making up a relatively low share of Peru's population (12•5% in 2020), people aged 60 or more accounted for 74•1% of our estimated excess mortality. By contrast, negative excess mortality among Peruvians aged under 10 probably reflected the impact of measures taken to restrict the spread of SARS-CoV-2: an effect reported in other countries [59]. These measures appear to have reduced deaths from causes such as unintentional injuries, road accidents and street violence, as well as the transmission of other airborne viruses, including influenza and respiratory infections common among younger children.

No other published study for Peru takes geographical variations in mortality registration into account in order to produce robust estimates disaggregated by sub-region. Our finding that unregistered mortality tends to be higher in poorer regions is consistent with studies of other LMICs [60]. This will lead to larger under-estimations of mortality in poorer regions in studies based purely on registered deaths. We estimate that all-cause excess mortality ranged from 0•55 per 1,000 people in Apurimac to 8•27 in Lima. Our estimates of regional variations for excess mortality in Peru are in line with a separate study, which posits they may be in part attributable to the effect of altitude on COVID-19 case fatality [56]. Geographical variations in COVID-19 transmission may also have contributed to regional patterns of mortality. Specific parts of the country, including the capital city Lima and Iquitos (in Amazonia), have seen especially high rates of reported seroprevalence of anti-SARS-CoV-2 antibodies [28], [29]. However, this may in part be an artefact of geographical variations in COVID-19 testing rates.

Our analysis is limited by data availability and robustness. We do not take into account potential growth in SINADEF registration during 2020, because we lack the robust historical data needed to separate such growth from demographic changes and other unknown exogenous shocks. Lambayeque is an example of poor data reliability, which makes it difficult to estimate both registered and unregistered excess mortality.

We also make several assumptions to simplify our analysis. We assume comparison between years is not invalidated by specific time-bound mortality events, such as additional disease outbreaks or other external shocks. We found no evidence of such events for the period in question. Our estimates of registration completeness assume no variation across age groups and assume a linear progression in registration, which may not be the case. Additionally, our estimates are based on provisional data, which remain incomplete and may subsequently be revised. While most death certificates are entered into the online database in real-time, those produced manually take additional time to process and be included. Also, these delays vary across regions. We assume that the effect of these delays is negligible, as we are limiting our analysis to deaths that occurred up to the end of 2020. Although some of the authors discussed the SINADEF and MoH data processes and characteristics with Peruvian government officials and regional experts, this does not constitute an independent validation of these data sources. Finally, we present a conservative scenario, allowing for both negative and positive excess mortality.

There is an evident need for robust estimates of the direct and indirect mortality effects of the COVID-19 pandemic. To date, much of the data for LMICs rely on officially registered deaths. Inaccurate attribution of cause of death can, to some degree, be resolved by generating excess mortality estimates based on comparisons of all-cause mortality over time. However, because of the substantial under-registration of deaths in many LMICs, it is also crucial to adjust excess estimates to include deaths that were not officially registered.

This paper develops and applies a method to obtain robust excess mortality estimates for both registered and unregistered deaths. These estimates suggest that official data for Peru substantially under-represent the overall mortality impact of the first pandemic wave, and that this gap is much greater than those reported for high-income countries.
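The size of this gap can be checked at the aggregate level from the two national totals reported in the Summary (173,099 total excess deaths, of which 108,943 were registered). The sketch below is only back-of-the-envelope arithmetic, not the paper's method, which estimates completeness by region and age group with statistical models. The function names are hypothetical, and the scale-up step assumes a single completeness value applies uniformly.

```python
def registered_share(registered_excess: float, total_excess: float) -> float:
    """Fraction of total excess deaths captured by the registration system."""
    if total_excess <= 0:
        raise ValueError("total_excess must be positive")
    return registered_excess / total_excess


def scale_up(registered_excess: float, completeness: float) -> float:
    """Inflate registered excess deaths by an assumed completeness (0 < c <= 1).

    Assumption: one completeness value applies uniformly, which is a
    simplification of the region- and age-specific estimates in the paper.
    """
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return registered_excess / completeness


# National point estimates reported in the Summary.
TOTAL_EXCESS = 173_099
REGISTERED_EXCESS = 108_943

share = registered_share(REGISTERED_EXCESS, TOTAL_EXCESS)
gap = 1 - share  # fraction of excess deaths missed by registration

print(f"{gap:.1%}")  # 37.1%
print(round(scale_up(REGISTERED_EXCESS, share)))  # 173099
```

On these point estimates, a little over a third of excess deaths fall outside the registration system, which is in line with the under-registration figure of 37•1% given in the Summary.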

It is plausible that the degree of underestimation of excess mortality in other countries with low-quality mortality data will be comparable to that in Peru. The framework, tools, and analytical strategy applied in this paper may therefore be useful for generating similar estimates for these countries [61]. In Colombia, for example, data quality is similar to Peru's and a recent reform of its CRVS system is likely to have boosted registration rates in recent years [62]. There is an urgent need to extend this research, in order to assess the true toll of the COVID-19 pandemic, both globally and especially in LMICs.

Acknowledgments

Contributors

LS conceived and initiated the study, did the statistical analysis and visualisations and drafted the manuscript.

PLS conceived and supervised the study, drafted the manuscript and led the editing process of the manuscript.

MM and SE supervised the study and reviewed and edited the draft.

RM contributed to the data curation and methodology process and reviewed & edited the draft.

EA contributed to data curation, methodology and statistical analysis.

LS and EA had access to all the data. All authors approved the manuscript and are responsible for the decision to submit it for publication.

Declaration of interest

RM is a staff member of the Pan American Health Organization. The author alone is responsible for the views expressed in this publication, and they do not necessarily represent the decisions or policies of the Pan American Health Organization.

All other authors declare no competing interests.

Data sharing

The dataset used in the analysis is available for sharing on the website: https://github.com/lsempe77/excess

Acknowledgements

We are grateful to the editors and reviewers, who helped to improve this study.

Footnotes

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.lana.2021.100039.

Appendix. Supplementary materials

mmc1.pdf (1.4MB, pdf)
mmc2.docx (15.8KB, docx)

References


Articles from Lancet Regional Health - Americas are provided here courtesy of Elsevier
