Abstract
Background
There is empirical evidence that measured postpartum blood loss has a lognormal distribution. This feature can be used to analyze events of the type ‘blood loss greater than a certain cutoff point’ using a lognormal approach, which takes into account all the quantitative observations, as opposed to dichotomizing the variable blood loss volume into two categories. This lognormal approach uses all the information contained in the data and is expected to provide more efficient estimates of proportions and relative risk when comparing treatments to prevent postpartum haemorrhage. As a consequence, sample size can be reduced in clinical trials, while keeping the statistical precision requirements.
Methods
The authors illustrate how a lognormal approach can be used in this situation, using data from a clinical trial and the event ‘blood loss greater than 1000 mL’.
Results
Estimates of the proportions of this event for each treatment, and relative risks obtained with this method are presented and compared with the standard estimates obtained by dichotomizing measured blood loss volume. An example of how the blood loss distributions of two treatments can be compared is also presented. Different scenarios of the sample size needed to compare two treatments or interventions are presented to illustrate how with the lognormal approach the size of a clinical trial can be reduced.
Conclusions
A distributional approach for postpartum blood loss using the lognormal distribution fitted to the data results in more precise estimates of risks of events and relative risks, compared to the use of binomial proportions of events. It also results in reduced required sample size for clinical trials.
Trial registration
This paper reports a secondary analysis for a trial that was registered at clinicaltrials.gov (NCT00781066).
Keywords: Postpartum blood loss, Postpartum haemorrhage, Severe postpartum haemorrhage, Lognormal distribution
Background
Choosing an adequate statistical technique for analyzing a continuous variable depends on knowledge of its distribution. Many variables in biology and medicine follow the normal distribution, and standard statistical techniques can be applied to compare means. When the distribution is not normal, however, those techniques are no longer appropriate and a transformation is often used to normalize the distribution. This has been the case with postpartum blood loss, for which a logarithmic transformation has been used to compare medians [1], based on the observation that the blood loss distribution is positively skewed, like the lognormal distribution. There are also often physical or biological justifications for a variable to have a specific distribution. The lognormal distribution arises as the result of many small independent multiplicative effects. It is a simple model that applies to many problems, such as body and tumor mass (weight) [2] and blood pressure [3]. Simple histograms of blood loss reveal right-skewed distributions that resemble the lognormal distribution, and a more in-depth, formal statistical analysis showed that the lognormal distribution indeed fits the blood loss distribution very well [4].
In the case of postpartum blood loss, there is interest in comparing events of the type ‘blood loss beyond a certain cutoff point’ because the loss of large amounts of blood postpartum can lead to severe maternal morbidity and mortality [5, 6]. There has therefore been a concern to find efficient treatments or interventions to prevent postpartum haemorrhage (PPH), defined as blood loss of 500 mL or more within 24 h after birth, and severe PPH (sPPH), defined as blood loss of 1000 mL or more within 24 h after birth [7].
Measured blood loss is thus dichotomized by means of an indicator variable for blood loss at or above a certain cutoff point. The estimation approach used so far has been to compute the sample proportion of women with blood loss equal to or above the cutoff point, that is, a binomial proportion. However, categorizing a dependent variable results in a loss of power to detect true effects, and the loss is substantial if the distribution is highly skewed, if few categories are used, or both [8].
We have shown empirically, elsewhere [4], that the distribution of postpartum blood loss volume is lognormal, using data provided by the authors from three trials that compared two drugs [1, 9] or two management procedures for the third stage of labour [10], and one observational study [11]. We used this finding to propose an analysis approach based on the lognormal distribution, resulting in more efficient estimates of proportions and relative risk and in a reduction of the sample size needed in clinical trials that compare proportions between treatments [4]. In this paper we illustrate how this approach (denoted ‘the lognormal approach’) can be used to analyze data from one of these trials, the Althabe et al. trial [1].
Methods
Descriptive histograms by treatment were constructed to give a first view of the distributions.
The lognormal approach that we propose uses measured blood loss observations without categorizing this variable. It consists of the following steps:
The procedure starts by fitting a three-parameter lognormal distribution [12] to the data. The parameters of the lognormal distribution are estimated by maximum likelihood.
Goodness of fit is assessed using probability plots, in which the quantiles of the fitted lognormal distribution are plotted against the observed blood loss values. If the fit is good, the points fall on a straight line. The probability plot is also used to detect the presence of outliers and to assess the quality of the data.
Once a lognormal distribution is considered to be a good fit to the data, the fit is visualized by plotting the cumulative distribution function for the observed data (‘empirical cumulative distribution function’) together with the fitted lognormal cumulative distribution function or, alternatively, its complement, denoted here as the survival function. The survival function gives the probability of having blood loss greater than a particular value v. It is R(v) = 1 − F(v), where F(v) is the cumulative distribution function.
The probability of an event of the type ‘blood loss greater than a cutoff point’ is just the survival function at the cutoff point. For example, the proportion of sPPH is the survival function evaluated at 1000 mL.
Comparison of proportions between treatments and computation of relative risks with confidence intervals are done using bootstrap techniques [13]. We generated 1000 bootstrap samples for each treatment. The estimates of the proportions of sPPH and PPH are then computed for each bootstrap sample, for each treatment. The two tables of bootstrap estimates are matched by row (sample) and the relative risks computed. The 95% confidence interval is then obtained from the 2.5th and 97.5th percentiles of the distribution of the 1000 bootstrapped relative risks.
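The fitting, survival-function and bootstrap steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the original analysis was run in JMP, the volumes below are simulated rather than trial data, and the helper names (`fit_thln`, `p_exceed`, `bootstrap_rr`) and the starting values passed to the optimizer are hypothetical choices made for this sketch.

```python
import numpy as np
from scipy import stats

CUTOFF_ML = 1000.0  # sPPH cutoff used in the paper


def fit_thln(x):
    """ML fit of a threshold lognormal; returns (sigma, threshold, exp(mu))."""
    x = np.asarray(x, dtype=float)
    loc0 = 0.5 * x.min()            # crude starting threshold (assumption)
    logs = np.log(x - loc0)
    return stats.lognorm.fit(x, logs.std(), loc=loc0, scale=np.exp(logs.mean()))


def p_exceed(x, cutoff=CUTOFF_ML):
    """Fitted survival function evaluated at the cutoff."""
    sigma, threshold, scale = fit_thln(x)
    return stats.lognorm.sf(cutoff, sigma, loc=threshold, scale=scale)


def bootstrap_rr(x_new, x_ref, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for RR = P(new > cutoff) / P(ref > cutoff)."""
    rng = np.random.default_rng(seed)
    rr = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each arm independently, then pair the b-th replicates.
        p_new = p_exceed(rng.choice(x_new, size=len(x_new), replace=True))
        p_ref = p_exceed(rng.choice(x_ref, size=len(x_ref), replace=True))
        rr[b] = p_new / p_ref
    lower, upper = np.percentile(rr, [2.5, 97.5])
    return p_exceed(x_new) / p_exceed(x_ref), lower, upper


# Simulated volumes in mL, NOT the trial data.
rng = np.random.default_rng(1)
hands_off = 55 + rng.lognormal(mean=5.57, sigma=0.72, size=98)
cct = 60 + rng.lognormal(mean=5.37, sigma=0.80, size=101)

# 100 replicates keeps the demo quick; the paper used 1000.
rr_hat, rr_lo, rr_hi = bootstrap_rr(cct, hands_off, n_boot=100)
print(f"RR = {rr_hat:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
```

In `scipy.stats.lognorm` the shape parameter plays the role of the scale parameter on the log scale, `loc` is the threshold, and `scale` is the exponential of the location parameter.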
To illustrate the gain in precision for this data, we estimated the proportions and relative risk in the standard way, with 95% confidence intervals, denoted the binomial approach. For the proportions, we calculated the width of the 95% confidence intervals for both approaches, the lognormal approach and the binomial approach, and calculated their ratio as a quantification of the gain in precision. For the relative risk we applied a similar procedure but on the relative scale.
We also illustrate tests of hypothesis using the two approaches. For the test of equality of the sPPH proportions between the two treatments, using the binomial approach, we report the Pearson chi-square statistic and p-value. To test the equality of the distributions for the two treatments using the lognormal fits, we proceed sequentially: first we test the model with equal scale parameters versus the full model (both location and scale parameters possibly different). If the null hypothesis with equal scale parameters is not rejected, we test the model with equal location parameters against the model with possible different location parameters and conclude about whether the distributions differ at, say, 5% level of significance.
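The sequential likelihood ratio procedure can be illustrated with a simplified sketch. It assumes the threshold is zero (or already subtracted), so that the log volumes are normal and the maximized likelihoods have closed forms; the paper's own tests were run on the three-parameter fits in JMP, and the function names and simulated data here are hypothetical.

```python
import numpy as np
from scipy import stats


def normal_loglik(resid_sq_sum, n):
    """Maximized normal log-likelihood given the residual sum of squares."""
    var_hat = resid_sq_sum / n
    return -0.5 * n * (np.log(2 * np.pi * var_hat) + 1)


def sequential_lr_tests(y1, y2):
    """LR tests on log volumes: equal scale first, then equal location."""
    n1, n2 = len(y1), len(y2)
    ss1 = np.sum((y1 - y1.mean()) ** 2)
    ss2 = np.sum((y2 - y2.mean()) ** 2)
    pooled = np.concatenate([y1, y2])
    ss_null = np.sum((pooled - pooled.mean()) ** 2)

    ll_full = normal_loglik(ss1, n1) + normal_loglik(ss2, n2)  # own location, own scale
    ll_loc = normal_loglik(ss1 + ss2, n1 + n2)                 # own location, common scale
    ll_null = normal_loglik(ss_null, n1 + n2)                  # common location and scale

    lr_scale = 2 * (ll_full - ll_loc)
    lr_loc = 2 * (ll_loc - ll_null)
    return {
        "scale": (lr_scale, stats.chi2.sf(lr_scale, df=1)),
        "location": (lr_loc, stats.chi2.sf(lr_loc, df=1)),
    }


# Simulated log volumes, NOT the trial data.
rng = np.random.default_rng(2)
log_a = rng.normal(5.57, 0.72, size=98)
log_b = rng.normal(5.37, 0.80, size=101)
results = sequential_lr_tests(log_a, log_b)
for name, (lr, p) in results.items():
    print(f"{name}: LR chi-square = {lr:.3f}, p = {p:.3f}")
```

The location test is only interpreted if the scale test does not reject, mirroring the order described above.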
All the computations for fitting distributions and obtaining estimates were done with JMP® 13 software. [14]
Computations for sample size calculations were done with SAS® software version 9.4 (PROC POWER procedure) [15].
Results
Descriptive histograms of blood loss volume
In Fig. 1 we show histograms of the frequency distributions of blood loss for the Althabe et al. trial [1], by treatment. The distributions are right-skewed, but the histograms alone do not allow us to identify the statistical distribution that generated the data.
Fitting a lognormal distribution
A three-parameter lognormal distribution (threshold lognormal, abbreviated as THLN) was fitted to the data by maximum likelihood. The third parameter was added because it improved the fit compared to the two-parameter lognormal distribution. The estimated parameters are shown in Table 1, with their standard errors and the 95% confidence intervals.
Table 1.
Treatment | Parameter | Estimate | Std Error | 95% CI |
---|---|---|---|---|
Hands-off | location | 5.57 | 0.141 | 5.30 to 5.85 |
Hands-off | scale | 0.72 | 0.101 | 0.52 to 0.92 |
Hands-off | threshold | 55.14 | 24.474 | 7.18 to 103.11 |
CCT | location | 5.37 | 0.132 | 5.11 to 5.63 |
CCT | scale | 0.80 | 0.101 | 0.60 to 1.00 |
CCT | threshold | 62.88 | 16.414 | 30.71 to 95.05 |
The goodness of fit of the THLN distribution to the data can be visualized in the lognormal probability plot of Fig. 2. The probabilities from the fitted lognormal distribution and the data points are on a straight line for values above 50 mL, thereby showing that the fit of the THLN distribution to the data is very good above 50 mL and providing evidence that the lognormal distribution is appropriate to the blood loss volume distribution. No outliers were detected. Only one treatment is shown in Fig. 2 for illustration purposes, because the two treatments had a very similar behaviour.
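The idea behind the probability plot can be sketched as follows: if the volumes follow a threshold lognormal distribution, then the logarithm of (volume minus threshold) plotted against standard normal quantiles should be close to a straight line. The data below are simulated, the threshold is simply taken from Table 1 and treated as known, and `scipy.stats.probplot` stands in for the JMP plot used in the paper.

```python
import numpy as np
from scipy import stats

THRESHOLD_ML = 55.14  # hands-off threshold estimate from Table 1, treated as known

# Simulated volumes in mL, NOT the trial data.
rng = np.random.default_rng(0)
volumes = THRESHOLD_ML + rng.lognormal(mean=5.57, sigma=0.72, size=98)

# Ordered log(volume - threshold) against standard normal quantiles.
(theo_q, ordered_logs), (slope, intercept, r) = stats.probplot(
    np.log(volumes - THRESHOLD_ML), dist="norm"
)

# A correlation r near 1 indicates points close to a straight line;
# slope and intercept approximate the scale and location parameters.
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Points that fall well away from the line, especially in the tails, would flag possible outliers or data-quality problems.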
In Fig. 3 we show the survival function for the ‘hands-off’ treatment (n = 98), where the data points are represented by black dots and the pointwise 95% confidence intervals by blue lines. Figure 3 also shows the fit of a three-parameter lognormal distribution (THLN), in a superimposed red line, with 95% confidence band (red area). In Fig. 3 we can see, for example, that the probability of having 500 mL or more of blood loss is about 0.20. The lognormal distribution fits the data points very well. The 95% confidence intervals from the lognormal fit are narrower than the ones for the binomial estimates.
Estimates of proportions and relative risk
Estimation of proportions using number of events divided by the total number in each treatment will be denoted the binomial approach. Estimation of proportions using the survival function at the relevant cut-off point will be denoted by the lognormal approach. We can see in Fig. 3 that the number of black dots above 1000 mL is 5, resulting in an estimated proportion by the binomial approach, of 5/98 = 0.051, or 5.1%, for the hands-off treatment. The estimate of the proportion of sPPH, based on the fitted survival function, is 0.038 (0.017 to 0.078). It is interesting to note that, of the five points in excess of 1000, three are close to the cut-off point. They could easily have slipped, by chance, to close values to the left of the cut-off, thereby sharply lowering the binomial estimate. Such a change would barely affect the lognormal estimate, since the fitted curve closely follows the entire cumulative empirical distribution and takes into account all the data.
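As a check on how the lognormal estimate is obtained, the survival function can be evaluated directly from the parameter estimates in Table 1. The sketch assumes the usual threshold-lognormal parameterization, in which the logarithm of (volume minus threshold) is normal with mean equal to the location parameter and standard deviation equal to the scale parameter; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def thln_survival(v, location, scale, threshold):
    """P(blood loss > v) under a threshold lognormal distribution."""
    z = (math.log(v - threshold) - location) / scale
    return 1.0 - NormalDist().cdf(z)


# Parameter estimates from Table 1.
p_hands_off = thln_survival(1000, location=5.57, scale=0.72, threshold=55.14)
p_cct = thln_survival(1000, location=5.37, scale=0.80, threshold=62.88)

print(f"hands-off: {p_hands_off:.3f}, CCT: {p_cct:.3f}")
```

With the rounded Table 1 values, the two probabilities come out close to the lognormal estimates of 0.038 and 0.033 reported for the two treatments.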
Table 2 shows the estimated proportions of sPPH for the Althabe et al. trial by the binomial approach, as reported in the published trial results [1] and by the lognormal approach, per treatment group.
Table 2.
Approach | Treatment | n/N | Proportion (95% CI) | Width of the 95% CI | Width ratio lognormal vs binomial (%) |
---|---|---|---|---|---|
Binomial | Hands-off | 5/98 | 0.051 (0.022 to 0.114) | 0.092 | – |
Binomial | CCT | 3/101 | 0.030 (0.010 to 0.084) | 0.074 | – |
Lognormal | Hands-off | – | 0.038 (0.017 to 0.078) | 0.062 | 67 |
Lognormal | CCT | – | 0.033 (0.014 to 0.069) | 0.055 | 75 |
As can be appreciated in Fig. 3, and as Table 2 also shows, the 95% confidence intervals for the lognormal approach are narrower than those for the binomial approach. For the hands-off treatment, for example, the width of the confidence interval for the binomial estimate is 0.092, whereas that for the lognormal estimate is 0.062, about two thirds of the former. The width ratio of the 95% confidence intervals is 67% for the hands-off treatment and 75% for the controlled cord traction (CCT) treatment.
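The text does not say which interval method was used for the binomial proportions. As an inference only, the limits reported in Table 2 are consistent with Wilson score intervals, which the following sketch computes from the event counts; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(events, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = events / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


ho_lo, ho_hi = wilson_interval(5, 98)     # hands-off
cct_lo, cct_hi = wilson_interval(3, 101)  # CCT

print(f"hands-off: {ho_lo:.3f} to {ho_hi:.3f}, width {ho_hi - ho_lo:.3f}")
print(f"CCT:       {cct_lo:.3f} to {cct_hi:.3f}, width {cct_hi - cct_lo:.3f}")
```

Dividing the lognormal widths in Table 2 by these binomial widths gives the width ratios discussed above.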
For the lognormal distribution, the relative risk, shown in Table 3, with the 95% confidence intervals, is estimated by the bootstrap technique, using 1000 bootstrap samples. Note that the confidence intervals for the RR are wide because this is a small trial that was designed to compare median blood loss as the main outcome.
Table 3.
Approach | RR (95% CI) CCT vs Hands-off | Ratio upper/lower limit of the 95% CI | Log scale width ratio lognormal vs binomial (%) |
---|---|---|---|
Binomial | 0.58 (0.14 to 2.37) | 16.9 | – |
Lognormal | 0.86 (0.22 to 2.62) | 11.9 | 70.4 |
From the comparison of the 95% confidence limits for both approaches, we obtained a log-scale width ratio of 70.4% for the lognormal approach in relation to the binomial approach, so that the gain in precision when using the lognormal instead of the binomial approach, is about 30%.
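The precision figures in Table 3 can be traced with a little arithmetic on the reported confidence limits. The sketch below is an interpretation of how the 70.4% appears to have been obtained, namely as the quotient of the two upper/lower limit ratios; it also computes the quotient of the interval widths measured on the log scale for comparison.

```python
import math

# Reported 95% confidence limits for the relative risk (Table 3).
binomial_ci = (0.14, 2.37)
lognormal_ci = (0.22, 2.62)

ratio_binomial = binomial_ci[1] / binomial_ci[0]     # upper / lower
ratio_lognormal = lognormal_ci[1] / lognormal_ci[0]

# Quotient of the two upper/lower ratios.
relative_width = ratio_lognormal / ratio_binomial

# Quotient of the interval widths on the log scale, for comparison.
log_width_ratio = math.log(ratio_lognormal) / math.log(ratio_binomial)

print(f"upper/lower: binomial {ratio_binomial:.1f}, lognormal {ratio_lognormal:.1f}")
print(f"quotient of ratios: {relative_width:.3f}")
print(f"quotient of log-scale widths: {log_width_ratio:.3f}")
```

With the rounded limits, the quotient of ratios is about 0.70, in line with the reported 70.4%, whereas the quotient of the log-scale widths is closer to 0.88. Which of the two best expresses the gain in precision is a matter of definition.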
Tests of hypothesis
The test of equality of sPPH proportions by the binomial method can be based on a 2 × 2 table, giving the Pearson chi-square = 0.591 with p-value = 0.4440.
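The 2 × 2 test can be rerun from the counts in Table 2. The sketch uses `scipy.stats.chi2_contingency` without continuity correction, which is an assumption about how the original statistic was computed, and also evaluates the likelihood ratio version of the statistic for comparison.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: hands-off, CCT. Columns: sPPH, no sPPH (counts from Table 2).
table = np.array([[5, 98 - 5],
                  [3, 101 - 3]])

pearson_stat, pearson_p, dof, _ = chi2_contingency(table, correction=False)
lr_stat, lr_p, _, _ = chi2_contingency(
    table, correction=False, lambda_="log-likelihood"
)

print(f"Pearson chi-square = {pearson_stat:.3f}, p = {pearson_p:.4f}")
print(f"Likelihood ratio chi-square = {lr_stat:.3f}, p = {lr_p:.4f}")
```

On these counts the uncorrected Pearson statistic is about 0.586 with a p-value of about 0.444, while the likelihood ratio statistic is about 0.591. The reported pair (0.591, 0.4440) may therefore combine the two versions, although either way the conclusion is unchanged.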
The lognormal tests of the hypothesis that the two distributions are the same are shown in Table 4. The comparison of the scale parameters shows that they are not significantly different (p-value = 0.7381); there is therefore no evidence that the scale parameters of the two treatments differ. The location parameters are not significantly different either (p-value = 0.1506). Hence we conclude that there is no evidence against the equality of the distributions.
Table 4.
Models compared | Likelihood ratio Chi-square | DF | p-value |
---|---|---|---|
No effect vs. location | 2.066 | 1 | 0.15 |
Location vs. location and scale | 0.112 | 1 | 0.74 |
The p-value for the final test, comparing the location parameters, is about 0.15 for the lognormal approach, compared with 0.44 for the binomial test. The lognormal approach seems more sensitive.
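The p-values in Table 4 follow from referring each likelihood ratio statistic to a chi-square distribution with one degree of freedom, which can be verified with a short sketch (the variable names are illustrative).

```python
from scipy.stats import chi2

# Likelihood ratio chi-square statistics from Table 4, each with 1 degree of freedom.
p_location = chi2.sf(2.066, df=1)  # no effect vs. location
p_scale = chi2.sf(0.112, df=1)     # location vs. location and scale

print(f"location: p = {p_location:.4f}")
print(f"scale:    p = {p_scale:.4f}")
```

Small differences in the last decimal place relative to the text are to be expected, because the statistics in Table 4 are rounded to three decimals.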
Figure 4 shows the two distributions side by side on a lognormal probability plot. The joint confidence intervals overlap entirely, suggesting that the two distributions are very similar.
Sample size: an example
We present as an example different scenarios of sample size calculation using the two approaches, the binomial and the lognormal one.
For the binomial approach, we assume that the proportion of sPPH is 0.015, 0.02 or 0.025 in the current treatment. We also assume that a new preventive therapy is considered worthwhile if the relative risk of sPPH of the new therapy with respect to the current one is no larger than 0.70 or 0.80. We calculated the sample size for the binomial response based on the likelihood ratio chi-square one-sided test for the relative risk statistic.
For the lognormal approach, we used the well-known result that if a variable has a lognormal distribution, its logarithm has a normal distribution. Our sample size computations were therefore based on the transformation of the volumes to their logarithms. For a scenario consisting of a given proportion of sPPH in the current treatment, say p1, and a relative risk RR = p2/p1, we computed the corresponding risk of the competing treatment, p2. For the standard deviation on the log scale, we used the scale parameter obtained from the analysis of three clinical trials, which was in all cases close to s = 0.7 [4]. The value of the mean m (on the log scale) can be readily computed for each value of p and for each scenario, from an equation derived from the proportion p of sPPH:

$$m = \ln(1000) - z_{1-p}\, s$$

where $z_{1-p}$ is the (1 − p) quantile (or (1 − p) × 100% percentile) of the standard normal distribution, with p taking the values p1 and p2 for a particular scenario. With the two means computed from p1 and p2 using the equation above, the (fixed) standard deviation and the power requirement, the computation of the sample size is straightforward: it is done as a comparison of means of the log-transformed blood loss variable. SAS PROC POWER with TWOSAMPLEMEANS was used for the computations.
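The logic of the calculation can be sketched as follows. If the logarithm of blood loss is normal with mean m and standard deviation s, then requiring P(loss > 1000) = p gives m = ln(1000) − s·z, with z the (1 − p) quantile of the standard normal distribution. The sketch then applies the textbook normal-approximation formula for comparing two means. That formula is a stand-in chosen here for illustration, not the PROC POWER computation used in the paper, so it is not expected to reproduce Table 5 exactly; the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_mean_for_risk(p, s=0.7, cutoff=1000.0):
    """Mean of log blood loss for which P(loss > cutoff) equals p."""
    return math.log(cutoff) - s * STD_NORMAL.inv_cdf(1.0 - p)


def approx_total_n(p1, rr, s=0.7, alpha=0.05, power=0.80):
    """Total sample size, one-sided test, normal approximation (assumption)."""
    m1 = log_mean_for_risk(p1, s)
    m2 = log_mean_for_risk(p1 * rr, s)
    z_sum = STD_NORMAL.inv_cdf(1 - alpha) + STD_NORMAL.inv_cdf(power)
    n_per_group = math.ceil(2 * (s * z_sum / (m1 - m2)) ** 2)
    return 2 * n_per_group


m_control = log_mean_for_risk(0.015)
m_new = log_mean_for_risk(0.015 * 0.70)
n_total = approx_total_n(p1=0.015, rr=0.70)
print(f"m1 = {m_control:.3f}, m2 = {m_new:.3f}, approximate total n = {n_total}")
```

A lower risk of sPPH corresponds to a lower mean on the log scale, so the difference between the two means is positive whenever the relative risk is below 1.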
The total sample sizes for the two approaches, for a power of 80%, in a one-sided 5% significance test, are shown in Table 5. The difference in required sample size is enormous, as expected, because the lognormal approach is using more of the information in the sample.
Table 5.
Scenario | Assumed sPPH rate for control (%) | RR | Total sample size with the binomial approach | Total sample size with the lognormal approach |
---|---|---|---|---|
1 | 1.5 | 0.70 | 15,294 | 1178 |
2 | 1.5 | 0.80 | 36,522 | 3068 |
3 | 2.0 | 0.70 | 11,422 | 1080 |
4 | 2.0 | 0.80 | 27,266 | 2814 |
5 | 2.5 | 0.70 | 9098 | 1002 |
6 | 2.5 | 0.80 | 21,714 | 2618 |
Discussion
We have illustrated how to apply an analysis technique for blood loss volume data based on the lognormal distribution, without categorizing this response variable, for a small trial [1], verifying first that the blood loss volume indeed follows a lognormal distribution. Data from two large trials [9, 10] showed the same pattern, as we reported elsewhere [4]; for these two trials the fit was also very good. We have also fitted a lognormal distribution to blood loss data using the reported percentiles from an observational study [11]. In all cases, we have found empirical evidence that the blood loss volume has a lognormal distribution and can be described by a variant of this family of distributions, the three-parameter (threshold) lognormal distribution [12]. A lognormal distribution with its specific parameters characterizes several physical and biological phenomena [16], and can be described by means of a physical model as a multiplicative sequence of losses.
Furthermore, the estimated location and scale parameters from all four of the studies analyzed were very similar [4]. The studies were conducted in different places and times, suggesting that the lognormal distribution fits postpartum blood loss data universally. The stability of the parameters found in the analysis of blood loss data across studies may well be a characteristic of postpartum blood loss that can be further explored.
The available analysis technique to compare proportions of an event of the type ‘blood loss above a certain cut-off point’ between treatments or interventions to prevent this event, has been to estimate the two binomial proportions of sPPH. The categorization of blood loss volume in two categories entails loss of information contained in objectively measured weight or volume data, with a resulting loss of power in tests of hypothesis and a decrease in the precision of the estimates of proportions and relative risks [8]. This fact, together with the low prevalence of rare events when the cut-off point is a high value of blood loss volume, for example 1000 mL, results in very large size of trials needed to compare this event between treatments or interventions.
We proposed a lognormal approach of analysis of postpartum haemorrhage trials aiming to compare events of the type ‘blood loss greater than a certain cutoff point’ between treatments. [4] We illustrate here this approach, consisting of fitting a lognormal type of distribution to blood loss data, which involves estimating the parameters that define the distribution. Once the distribution function of blood loss volume, or its complement, the survival function, is defined by its parameters, the proportion of sPPH, for example, is just the survival distribution at the point 1000. To compare treatments using relative risk, we estimated its confidence interval with bootstrap techniques.
An application of using the lognormal model for the distribution of the blood loss volume is a substantial reduction of sample size of clinical trials, while keeping the statistical power and precision requirements. Using the lognormal approach that we propose, based on fitting a lognormal distribution to the postpartum blood loss data, the objectives of a trial can be attained with smaller sample sizes and reduced cost, through an improvement in the efficiency of the estimation methods. With the lognormal approach, there is a trade-off between the simplicity offered by the binomial approach and the possibility of reducing the size of a trial.
Methods similar to those described in this paper can be used with other variables that have a lognormal distribution, such as blood pressure [3] in the estimation of hypertension. The reason why this approach has not been used in the past is that it is computer-intensive. Nowadays, with improved computing power, this is not a problem, although fitting a lognormal distribution and calculating confidence intervals for the relative risk by bootstrapping require statistical expertise.
As an additional bonus to the use of the lognormal approach based on fitting a lognormal distribution to blood loss data, it is possible to test other hypotheses of interest, such as the equality of medians or any other percentile, or even compare the entire distributions between two treatments or interventions.
Conclusions
We illustrated how a lognormal approach based on fitting a lognormal distribution to the data can be applied to measured blood loss volume data of a trial. We found that the precision of the estimates of proportions of the event ‘blood loss greater than 1000 mL’ and its comparison between treatments improved compared to the standard methods based on dichotomizing the blood loss variable. We also illustrate how the lognormal approach can be used to compare the distribution parameters for two treatments. When analyzing data using this lognormal approach, sample size of trials can be reduced.
Acknowledgements
The authors thank the reviewers for valuable comments.
Funding
Publication charges for this supplement were funded by the University of British Columbia PRE-EMPT (Pre-eclampsia/Eclampsia, Monitoring, Prevention and Treatment) initiative supported by the Bill & Melinda Gates Foundation.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
About this supplement
This article has been published as part of Reproductive Health Volume 15 Supplement 1, 2018: Improving pregnancy outcomes - Proceedings of the 2nd International Conference on Maternal and Newborn Health: Translating Research Evidence to Practice. The full contents of the supplement will be available online at https://reproductive-health-journal.biomedcentral.com/articles/supplements/volume-15-supplement-1.
Authors’ contributions
GP: conceptualization, writing (original draft, editing and review). JFC: conceptualization, statistical analysis, writing (original draft, editing and review). FA: writing (editing and review), provided the data. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Gilda Piaggio, Email: gilda.piaggio@gmail.com.
José Ferreira de Carvalho, Email: josef.carvalho@gmail.com.
Fernando Althabe, Email: falthabe@gmail.com.
References
- 1. Althabe F, Aleman A, Tomasso G, Gibbons L, Vitureira G, Belizan JM, et al. A pilot randomized controlled trial of controlled cord traction to reduce postpartum blood loss. Int J Gynaecol Obstet. 2009;107(1):4–7. doi: 10.1016/j.ijgo.2009.05.021.
- 2. Spratt JS Jr. The lognormal frequency distribution and human cancer. J Surg Res. 1969;9(3):151–157. doi: 10.1016/0022-4804(69)90046-8.
- 3. Makuch RW, Freeman DH Jr, Johnson MF. Justification for the lognormal distribution as a model for blood pressure. J Chronic Dis. 1979;32(3):245–250. doi: 10.1016/0021-9681(79)90070-5.
- 4. Carvalho JF, Piaggio G, Wojdyla D, Widmer M, Gülmezoglu AM. Distribution of postpartum blood loss volume: modeling, estimation and application to clinical trials. 2018; Submitted.
- 5. Say L, Chou D, Gemmill A, Tuncalp O, Moller AB, Daniels J, et al. Global causes of maternal death: a WHO systematic analysis. Lancet Glob Health. 2014;2(6):e323–e333. doi: 10.1016/S2214-109X(14)70227-X.
- 6. Khan KS, Wojdyla D, Say L, Gulmezoglu AM, Van Look PF. WHO analysis of causes of maternal death: a systematic review. Lancet. 2006;367(9516):1066–1074. doi: 10.1016/S0140-6736(06)68397-9.
- 7. WHO. Recommendations for the prevention and treatment of postpartum haemorrhage. World Health Organization. 2012.
- 8. Taylor AB, West SG, Aiken LS. Loss of power in logistic, ordinal logistic, and probit regression when an outcome variable is coarsely categorized. Educ Psychol Meas. 2006;66:228–239. doi: 10.1177/0013164405278580.
- 9. Gulmezoglu AM, Lumbiganon P, Landoulsi S, Widmer M, Abdel-Aleem H, Festin M, et al. Active management of the third stage of labour with and without controlled cord traction: a randomised, controlled, non-inferiority trial. Lancet. 2012;379(9827):1721–1727. doi: 10.1016/S0140-6736(12)60206-2.
- 10. Gulmezoglu AM, Villar J, Ngoc NT, Piaggio G, Carroli G, Adetoro L, et al. WHO multicentre randomised trial of misoprostol in the management of the third stage of labour. Lancet. 2001;358(9283):689–695. doi: 10.1016/S0140-6736(01)05835-4.
- 11. Bamberg C, Niepraschk-von Dollen K, Mickley L, Henkelmann A, Hinkson L, Kaufner L, et al. Evaluation of measured postpartum blood loss after vaginal delivery using a collector bag in relation to postpartum hemorrhage management strategies: a prospective observational study. J Perinat Med. 2016;44(4):433–439. doi: 10.1515/jpm-2015-0200.
- 12. Meeker WQ, Escobar LA. Statistical methods for reliability data. John Wiley & Sons, Inc.; 1998.
- 13. Efron B, Tibshirani R. Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statist Sci. 1986;1(1):54–75. doi: 10.1214/ss/1177013815.
- 14. JMP®, Version 13. SAS Institute Inc., Cary, NC, 2017.
- 15. SAS/STAT software, Version 9.4 for Linux. SAS Institute Inc., Cary, NC, USA; 2012.
- 16. Limpert E, Stahel WA, Abbt M. Log-normal distributions across the sciences: keys and clues. Bioscience. 2001;51(5):341–352. doi: 10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2.