Analysis of Parasite and Other Skewed Counts

Neal Alexander

doi:10.1111/j.1365-3156.2012.02987.x

Author manuscript; available in PMC: 2012 Oct 11.

Published in final edited form as: Trop Med Int Health. 2012 Jun;17(6):684–693. doi: 10.1111/j.1365-3156.2012.02987.x

PMCID: PMC3468795 EMSID: UKMS50039 PMID: 22943299

Abstract

Objective

To review methods for the statistical analysis of parasite and other skewed count data.

Methods

Statistical methods for skewed count data are described and compared, with reference to those used over a ten year period of Tropical Medicine and International Health. Two parasitological datasets are used for illustration.

Results

Ninety papers were identified, 89 with descriptive and 60 with inferential analysis. A lack of clarity is noted in identifying measures of location, in particular the Williams and geometric mean. The different measures are compared, emphasizing the legitimacy of the arithmetic mean for skewed data. In the published papers, the t test and related methods were often used on untransformed data, which is likely to be invalid. Several approaches to inferential analysis are described, emphasizing 1) non-parametric methods, while noting that they are not simply comparisons of medians, and 2) generalized linear modelling, in particular with the negative binomial distribution. Additional methods, such as the bootstrap, with potential for greater use are described.

Conclusions

Clarity is recommended when describing transformations and measures of location. It is suggested that non-parametric methods and generalized linear models are likely to be sufficient for most analyses.

Keywords: Statistical Data Analysis; Parasitology; Statistics, Nonparametric; Regression Analysis

82, 83, 84... 87 weevils today, Mr. Victor.

That’s above average, but the trend is down.

In a wartime detention camp, a boy counts the vermin in his food rations (from the film Empire of the Sun, based on the autobiographical novel by JG Ballard).

Introduction

Counting items such as parasites results in non-negative integer data: 0, 1, 2, etc. The aim of this paper is to review methods for statistical analysis of such data in medical research. The focus will be on parasite counts although most of the same methods are applicable to other count data such as numbers of insects, plaques or disease episodes.

Descriptive analysis of such data usually includes summary numbers which aim to convey average values of the data. In statistics these are called ‘measures of location’. For count data even these simple measures prove problematic. This is because the data are often skewed: their ‘long tail’ (Mumpower and McClelland 2002) means that a small proportion of people can, for example, harbour a large proportion of the parasites. In basic statistics it is often taught that such skewness precludes the use of the arithmetic mean because it is ‘overly’ influenced by the high values (Kirkwood and Sterne 2003). A possible alternative is to analyse the logarithms of the data instead of the original values. However, for count data, this is only feasible in the absence of zeros, because the logarithm of zero is not a finite number. This has led to controversy over what measures of location are appropriate, and to basic terms such as ‘geometric mean’ being used inconsistently.

Much of this controversy results from a reliance on statistical methods which use the normal (Gaussian) distribution, whose symmetry usually precludes a good fit to count data. However, this dependence is now largely outdated, due to the availability of approaches based on skewed distributions such as the negative binomial (Anderson and May 1991).

The current paper aims to review the statistical methods currently being used for such data, and describe their advantages and disadvantages. It will concentrate on the most common kinds of analyses, leaving aside more specialized ones such as spatial patterns, repeatability and reproducibility. Descriptive methods will be addressed first, then inferential methods, the latter being those which compare counts between groups or correlate them with other variables.

Methods and Results

Descriptive methods

Measures of location: median and means

Location is “the notion of central or ‘typical value’ in a sample distribution” (Everitt 1995). The most commonly used measures of location are the median and various types of mean. The median is the middle value of the sorted data. If the number of values is even, then there are two middle values, and the convention is to define the median as the value half way between them. The median is zero if more than half of the values are zero.

By default ‘the mean’ usually refers to the arithmetic mean (the sum of the values divided by n, the number of values), although there are other types of mean which may be useful for count data. In particular, the geometric mean is obtained by multiplying together all the data values, then taking the nth root (Borowski and Borwein 1989, Everitt 1995). The geometric mean is always less than the arithmetic mean, unless all values in the dataset are identical, in which case the two are equal (Borowski and Borwein 1989). The geometric mean is zero whenever any of the individual values is zero. This makes it rather meaningless for datasets which include zeros, e.g. from uninfected people.

The geometric mean is related to logarithmic transformation of the data. If there are no zero values, then the geometric mean equals the exponential of the mean of the log-values. Any base can be used for the logarithm, provided the same base is used for the exponentiation, e.g. 10 to the power of the mean of the log₁₀-values. This suggests the following modification of the geometric mean to accommodate zero values: add 1 to all the data values, then take the geometric mean, then subtract 1 again. This is known as the Williams mean (Williams 1937). There are also other types of mean. For example, the square mean root is the square of the mean of the square root-transformed data. This measure has applications in meta-analysis (Bushman and Wang 1995) and in assessing the agreement of counts between observers or methods (Alexander et al. 2007). Other powers can also be used: in general these are called power means or algebraic means (Bonferroni 1950).
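The measures of location defined above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example counts are hypothetical and do not come from either example dataset.

```python
import math


def arithmetic_mean(counts):
    """Sum of the values divided by the number of values."""
    return sum(counts) / len(counts)


def geometric_mean(counts):
    """nth root of the product; zero as soon as any count is zero."""
    if any(c == 0 for c in counts):
        return 0.0
    return math.exp(sum(math.log(c) for c in counts) / len(counts))


def williams_mean(counts):
    """Add 1, take the geometric mean, then subtract 1 again."""
    return math.exp(sum(math.log(c + 1) for c in counts) / len(counts)) - 1


def square_mean_root(counts):
    """Square of the mean of the square-root-transformed values."""
    return (sum(math.sqrt(c) for c in counts) / len(counts)) ** 2


# Hypothetical counts, including two uninfected people (zeros).
counts = [0, 0, 3, 8, 24, 120]

for name, func in [
    ("arithmetic mean", arithmetic_mean),
    ("geometric mean", geometric_mean),
    ("Williams mean", williams_mean),
    ("square mean root", square_mean_root),
]:
    print(f"{name}: {func(counts):.2f}")
```

With these counts the geometric mean is zero because of the two zero values, while the Williams mean is positive but well below the arithmetic mean, which illustrates why the two should not be reported under the same name.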

Confusion between geometric and Williams means

In the medical research literature, the Williams mean is sometimes called the geometric mean, as in the onchocerciasis community microfilarial load (CMFL) (Remme et al. 1986). Nevertheless, the two are not equal. One disadvantage of the Williams mean is its dependence on the choice of units. For example, if skin snips weigh 2 mg, the CMFL per snip is not double the CMFL per mg, although both units are used in the literature (Marshall et al. 1986, Remme et al. 1986). This scale dependence can be avoided by adding and subtracting a value with units, e.g. 1/mg, rather than a dimensionless 1 (Alexander et al. 2005).
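The unit dependence can be checked numerically. The sketch below assumes a hypothetical set of microfilarial counts per 2 mg snip; the function name `shifted_geometric_mean` and its `delta` argument are illustrative choices, not notation from the cited papers.

```python
import math


def shifted_geometric_mean(values, delta=1.0):
    """Geometric mean of (x + delta), minus delta; delta=1 is the Williams mean."""
    mean_log = sum(math.log(v + delta) for v in values) / len(values)
    return math.exp(mean_log) - delta


SNIP_WEIGHT_MG = 2.0  # assumed snip weight, as in the example in the text

# Hypothetical counts per snip, and the same data expressed per mg.
per_snip = [0, 0, 4, 10, 60]
per_mg = [c / SNIP_WEIGHT_MG for c in per_snip]

# Adding a dimensionless 1 in both units: the results are not in the ratio 2:1.
williams_per_snip = shifted_geometric_mean(per_snip)
williams_per_mg = shifted_geometric_mean(per_mg)

# Adding 1 per mg in both units (which is 2 per snip): the ratio is exactly 2:1.
scaled_per_snip = shifted_geometric_mean(per_snip, delta=1.0 * SNIP_WEIGHT_MG)
scaled_per_mg = shifted_geometric_mean(per_mg, delta=1.0)

print(williams_per_snip, 2 * williams_per_mg)
print(scaled_per_snip, 2 * scaled_per_mg)
```

The second pair agrees because multiplying every value and the added constant by the same factor multiplies the whole expression by that factor.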

The geometric mean is not a clever way to estimate the arithmetic mean

Fulford explains that the Williams mean (he calls it the geometric mean) should not be used to estimate the arithmetic mean (Fulford 1994). Similarly, Dobson et al. (2009) define helminth vaccine efficacy as a fold reduction in arithmetic mean egg count, then show that this is not well estimated by a ratio of Williams means. The points made by these papers are valid, although they may seem rather tautological. The different types of mean differ mathematically, so it should not be a surprise that one of them cannot properly estimate another. Their values are no more commensurable than, for example, weight-for-height and weight-for-age in nutrition.

Choice of measure of location: arithmetic versus geometric mean

Compared to the arithmetic mean, the geometric mean is said to be ‘not overly influenced by the very large values in a skewed distribution’ (Kirkwood and Sterne 2003). How much is ‘overly’? We can try to answer this question objectively by going back to whichever health outcomes motivated the study.

A clear-cut example would be noise pollution. Human perception of sound is approximately logarithmic in that orders of magnitude change in sound pressure (measured in Pascals) are perceived as approximately uniform increments on a linear scale. This is known as Weber’s law(Wojtczak and Viemeister 2008) and is why sound volumes are expressed in decibels (log-ratio of pressure over baseline). Hence, averaging is more aligned with human perception if done on the logarithmic scale: geometric mean of Pascals or arithmetic mean of decibels(Olayinka and Abdullahi 2009). This reasoning is independent of the statistical distribution of the data: even if the decibel values were skewed, their arithmetic mean would still be more relevant than the geometric mean.

The choice of measure of location may not always be so straightforward. If we consider malaria parasite density, then we could imagine either the arithmetic or the geometric mean having more interest, depending on the purpose of the study. If the intensity of transmission is of interest, it may be reasonable to assume that this is proportional to parasite density, in which case the arithmetic mean would be relevant. This would correspond to the use of the annual transmission potential (ATP) in filarial diseases, which is based on the arithmetic mean number of filariae per mosquito(Bockarie et al. 1996). If, by contrast, the objective of the study relates to number of clinical cases, the geometric mean may be more appropriate because the relation of parasite density to the probability of being a case (as opposed to asymptomatic) is non-linear, the slope reducing at higher densities(Smith et al. 1994). In this case, we would need to exclude non-parasitaemic people from the calculation of the geometric mean, although this is not problematic for this purpose because, by definition, they cannot have malaria.

In other situations another measure of location, such as the median, may be preferable. The general point is to ‘deconstruct’ the problem (Hand 1994) so that the choice of measure is subordinated to the study’s objectives, with purely statistical concerns, for example achieving a normal distribution, being secondary.

Measures of dispersion

The most commonly used measures of dispersion are probably the standard deviation, range, and interquartile range. The standard deviation is the square root of the variance, which is the average squared deviation from the mean. More specifically, the standard deviation equals $\sqrt{\sum_{i} (x_{i} - \bar{x})^{2} / (n - 1)}$, where $\sum_{i}$ means the sum over the data values $x_{i}$, and $\bar{x}$ is the sample mean. The range is the interval between the minimum and maximum values in the data. This tends to increase with larger sample sizes. Percentiles are values below which a certain percentage of the data lie, after they have been sorted into ascending order. The median, for example, is the 50th percentile. The interquartile range is the interval between the 25th and 75th percentiles: this interval contains the middle half of the data.

Since the distributions of count data are usually asymmetric, using the standard deviation for error bars, or in ± notation about the mean, is not usually helpful. This is because the resulting interval will often imply negative values whereas, of course, for count data the lowest possible value is zero. For example, looking at the mean and standard deviation of the hookworm egg counts in Table 1, quoting ‘126 ± 322’ would be rather meaningless. The interquartile range would be one preferable option. Similarly, error bars based on standard deviations are likely to go below zero, as in, for example, the figure showing schistosomiasis faecal eggs per gram by age in one of the papers in the review described below (Seto et al. 2007). Again, plotting the interquartile range, or superimposing box and whisker plots (Kirkwood and Sterne 2003), would be better options.
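The contrast between ‘mean ± standard deviation’ and the interquartile range can be seen on a small skewed sample. The counts below are hypothetical, and `statistics.quantiles` uses one of several common conventions for interpolating percentiles, so other software may give slightly different quartiles.

```python
import statistics

# Hypothetical skewed counts, sorted for readability.
counts = [0, 0, 1, 2, 5, 9, 22, 40, 103, 480]

mean = statistics.mean(counts)
sd = statistics.stdev(counts)  # uses the n - 1 denominator
q1, median, q3 = statistics.quantiles(counts, n=4)

# The lower end of 'mean - SD' is an impossible value for a count.
lower = mean - sd

print(f"mean ± SD: {mean:.1f} ± {sd:.1f} (lower end {lower:.1f})")
print(f"median {median}, interquartile range {q1} to {q3}")
print(f"range {min(counts)} to {max(counts)}")
```

Here the lower end of the ± interval is negative, whereas the quartiles necessarily stay within the observed range of the data.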

Table 1.

Descriptive statistics of example datasets.

	Hookworm eggs	Plasmodium falciparum asexual blood stages
Sample size (male:female)	1237 (603:634)	477 (247:230)
Mean age in years (range)	26 (0-95)	4.9 (1-10)
Mass or volume of each sample	two Kato-Katz faecal slides, totalling approximately 1/12g	1 high power field of a thick blood film, approximately 0.001-0.0025 μL (Warrell and Gilles 2002)
Percent positive	76	100^a
Mean	126	139
Median	22	74
Geometric mean	0^b	59
Williams mean	18	61
Range	0 – 4,803	1 - 4,941
Inter-quartile range	1 – 103	23 – 181
Standard deviation	322	266
Reference	(Cundill et al. 2011)	(Dunyo et al. 2006)

^a A positive malaria slide was an entry criterion for the study.

^b The geometric mean is necessarily zero because the dataset contains zero values (i.e. egg-negative people).

For count data, the greatest utility of the standard deviation or variance may be to assess how homogeneous a distribution the data have. The Poisson distribution results from homogeneous processes, which are the simplest models for count data. The mean and variance of the Poisson distribution are equal whereas, if the variance of data is considerably greater than the mean, there is said to be overdispersion (Elliott 1977, Mwangi et al. 2008). The variance of the negative binomial distribution, for example, equals μ + μ²/k, where μ is the mean and k is a positive-valued parameter. For small values of k the variance is much greater than the mean, reflecting a high degree of overdispersion. Such distributions are sometimes also called ‘contagious’ — even if no infectious disease is involved — because they reflect clustering in the data(Elliott 1977). These two situations — homogeneity and overdispersion — are illustrated in Figure 1. It is also possible for the variance to be less than the mean. This occurs when distances between events tend to be similar, with both smaller and larger distances being rare. This is rare in practice but would be the case, for example, for spatial events on an even grid.
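As a rough worked example, the relation variance = μ + μ²/k can be rearranged to give a method-of-moments value for k from a mean and variance. The sketch below applies this to the rounded hookworm summary statistics in Table 1; it is an illustration of the formula, not the estimate from a fitted model, and the function names are hypothetical.

```python
def variance_to_mean_ratio(mean, variance):
    """Close to 1 for Poisson-like data, much greater than 1 under overdispersion."""
    return variance / mean


def negative_binomial_variance(mean, k):
    """Variance of the negative binomial: mu + mu^2 / k."""
    return mean + mean ** 2 / k


def moment_estimate_k(mean, variance):
    """Solve variance = mu + mu^2 / k for k (requires variance > mean)."""
    if variance <= mean:
        raise ValueError("no overdispersion: variance does not exceed the mean")
    return mean ** 2 / (variance - mean)


# Rounded summary statistics for hookworm egg counts from Table 1.
hookworm_mean = 126.0
hookworm_sd = 322.0
hookworm_variance = hookworm_sd ** 2

ratio = variance_to_mean_ratio(hookworm_mean, hookworm_variance)
k_hat = moment_estimate_k(hookworm_mean, hookworm_variance)

print(f"variance/mean = {ratio:.0f}, moment estimate of k = {k_hat:.3f}")
```

The variance is several hundred times the mean and the implied k is well below 1, consistent with a high degree of overdispersion. Maximum likelihood estimates of k, as produced by negative binomial regression software, will generally differ from this moment-based value.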

Figure 1. Simulated spatial data showing homogeneous and clustered processes in the left and right panels, respectively. The dashed lines divide each area into a 5×5 grid. For the homogeneous process, the variance and mean of the 25 counts are similar: 3.8 and 4.0 respectively, consistent with a Poisson distribution. For the clustered process, the variance is much larger than the mean: 19.3 versus 3.8, indicating overdispersion. Although these data are spatial, the same principle applies to counts per person, for example. The data were generated using the ‘rpoispp’ and ‘rMatClust’ functions of the ‘spatstat’ package in the R software.

Example datasets

Hookworm eggs

These are baseline data from a longitudinal study of hookworm in Minas Gerais state, Brazil(Cundill et al. 2011). Table 1 shows summary statistics for the total egg count from two Kato-Katz thick smears, each pair being prepared from a single stool sample. Those with missing data on age, sex, or either egg count have been excluded.

Plasmodium falciparum asexual blood stages

Children participating in a trial of anti-malarial drugs had a thick blood film read microscopically(Dunyo et al. 2006). The parasite counts from a single high power field are shown in Table 1.

Literature review

Statistical methods used for parasite count data were reviewed over ten years of Tropical Medicine and International Health. The following search was done in the PubMed online database:

(parasit* OR malaria* OR helmint* OR filar*) AND (count* OR intensit* OR densit*) AND trop med int health

For the years 2001-2010 (volumes 6-15), the search returned 156 papers. They were retained in the review if they were found to include either i) descriptive analysis of parasite count data, or ii) inferential analysis in which count data constituted an outcome variable. Papers using correlation coefficients — which do not distinguish between outcome and explanatory variables — were also included under the second category. Of the 156 papers, 90 were found to include such analyses: 89 descriptive and 60 inferential.

Methods for descriptive and inferential analysis are shown in Table 2 and Table 3. The former includes measures of location, of which the geometric mean was the most commonly used (51% of papers). In some papers, the Williams mean was calculated but presented as the geometric mean. It is likely that there were other instances of this which could not be unambiguously identified from the text. In other words, the Williams mean category (13%) was difficult, in practice, to separate from the geometric mean. The arithmetic mean was also commonly used (35%) and the median less so (7%).

Table 2.

Descriptive measures of location for parasite count data used in 89 papers in Tropical Medicine and International Health, 2001-2010.

Descriptive measure of location	Number of papers (%)^a
Arithmetic mean	31 (35)
Geometric mean^b, or arithmetic mean of logarithms
zeros absent	25 (28)
zeros present	7 (8)
unclear whether zeros present or not	13 (15)
Williams mean or similar^b,^c	12 (13)
Median	6 (7)
Prevalence by category of infection intensity	20 (22)
Other^d	1 (1)

^a The percentages add to more than 100 because some papers used more than one measure.

^b The term ‘geometric mean’ is sometimes used in these papers to refer to the Williams mean. Unambiguous examples of this, e.g. evidenced by an equation in the paper, are included under ‘Williams mean or similar’. However, it is likely that more of the ‘geometric means’ are, in fact, Williams means, especially where the data included zeros.

^c Williams mean = (geometric mean of (x + δ)) − δ, with δ = 1. Here, ‘similar’ measures include those which: used values of δ other than 1; added 1 without then subtracting it; or took the mean of log(x + 1) without then exponentiating.

^d The arithmetic mean restricted to parasite-positive individuals.

Table 3.

Inferential statistical methods used for parasite count data in 60 papers in Tropical Medicine and International Health, 2001-2010.

Inferential Method	Number of papers (%)^a
Distribution-based:
Normal (Gaussian)^b
on untransformed data	6 (10)
after transformation of the data: logarithmic	9 (15)
other	8 (13)
unclear what, if any, transformation was used	9 (15)
Negative binomial	6 (10)
Poisson with allowance for overdispersion	1 (2)
Non-parametric	14 (23)
Other^c	8 (13)
Unclear	7 (12)

^a The percentages add to more than 100 because some papers used more than one method.

^b Including t test, analysis of variance (ANOVA), and ordinary least squares regression.

^c Comprising: ratios of parasite densities as response variable (2); χ² test for trend on infection categories (1); normal distribution-based analysis of village-level indices (1); inference on overlap of confidence intervals (1); Wagstaff index of aggregation of parasites among people (1); logistic regression of high versus low parasite intensity (1); review citing previous analysis (1).

Inferential analysis

t test and related methods

The t test compares means between two samples. Although it assumes that the samples are drawn from normal distributions with equal variances, the results are surprisingly robust to departure from these assumptions (Heeren and d’Agostino 1987). Nevertheless, it is liable to break down when ‘skew is severe or when population variances and sample sizes both differ’ (Stonehouse and Forrester 1998, Boneau 1960). At least one of these two circumstances is likely to pertain with count data. Skewness has already been mentioned, while a difference in means implies a difference in variances if the data are actually drawn from, for example, a Poisson or negative binomial distribution. Hence the t test cannot be recommended for untransformed parasite data. The same caveats apply to regression and analysis of variance, because these are all mathematically similar. Nevertheless, such analyses were done on untransformed data in at least 10% of the papers in the literature review (Table 3).

The performance of these methods may be improved by first transforming the data. In particular, if there are no zeros in the data, then a logarithmic transformation may suffice. Moreover, the t test and related techniques will then yield ratios of geometric means and so can easily be interpreted. Such methods were used in at least 15% of the papers in Table 3. As noted above, their robustness means that the normal distribution does not need to be a perfect fit for the results to be reliable. Figure 2 shows a histogram of malaria parasite densities from Dunyo et al. (2006) on a log scale. The shape is similar to that of the superimposed fitted normal curve, suggesting that the t test and related methods are likely to be applicable to the log-transformed data.
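To see why a t test on log-transformed data yields a ratio of geometric means, the following sketch computes the pooled two-sample t statistic on the log scale and back-transforms the difference in mean logs. The two groups are hypothetical zero-free counts, and the function name is illustrative. A p-value would additionally require the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def log_scale_comparison(group_a, group_b):
    """Two-sample t statistic (pooled variance) on natural-log-transformed data."""
    logs_a = [math.log(x) for x in group_a]  # fails if any value is zero
    logs_b = [math.log(x) for x in group_b]
    n_a, n_b = len(logs_a), len(logs_b)
    diff = statistics.mean(logs_a) - statistics.mean(logs_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(logs_a)
        + (n_b - 1) * statistics.variance(logs_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return {
        # exp(difference in mean logs) is the ratio of geometric means
        "ratio_of_geometric_means": math.exp(diff),
        "t_statistic": diff / se,
        "degrees_of_freedom": n_a + n_b - 2,
    }


# Hypothetical parasite densities with no zeros.
treated = [12, 30, 45, 80, 150]
control = [40, 95, 160, 310, 700]

result = log_scale_comparison(treated, control)
print(result)
```

The back-transformed difference equals the geometric mean of the first group divided by that of the second, whichever logarithm base is used, provided the exponentiation uses the same base.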

Figure 2. Histogram, on a log scale, of numbers of *Plasmodium falciparum* asexual blood stages on one high power field of a thick blood film in a clinical trial (Dunyo et al. 2006). The dashed line is the fitted normal distribution.

The logarithmic transformation is not applicable, however, in the presence of zeros. Various other options are available although, in most cases, they lack the same ease of interpretation. One option is to add 1, or another value, before taking logarithms. Table 3 shows that at least 13% of papers in the literature review used a transformation other than the simple logarithmic, while in 15% it was not clear what, if any, transformation was used. For negative binomial distributions, the value k/2 can be added instead of 1 before taking logarithms, or an inverse hyperbolic sine transformation can be used (Beall 1942, Anscombe 1948, Laubscher 1961, Elliott 1977). Compared with the basic logarithmic transformation, these alternatives have at least two disadvantages. With few exceptions, the results cannot be interpreted as easily, and the transformation ‘cannot remove the clump’ of zeros, if present, and so the data will not be normalized (Hallstrom 2010). In general, if zeros are present, then other techniques, in particular generalized linear models (see below), are likely to be preferable to normal-theory methods on transformed data (O’Hara and Kotze 2010).

Non-parametric methods

These methods generally use the ranks of the data rather than the original values. This means that, for example, a single data value much greater than the others will not greatly affect the results as it would for a t test. One of the simplest non-parametric methods is the Mann-Whitney test, which compares data from two independent samples. In the literature review, non-parametric methods were used in 23% of those papers which used any inferential method (Table 3). Nevertheless, they do have some disadvantages:

If the sample size is low, and the distribution of the data is close to normal, then they are likely to have considerably lower power than parametric methods such as the t test (Bland 1995). In these rather limited circumstances, a parametric method is likely to be preferable.

Contrary to common conception they cannot, in general, be interpreted as comparisons of medians (Hart 2001). For example, the Mann-Whitney test (also known as the Wilcoxon rank-sum test) assesses whether two distribution functions are unequal for at least one value (Conover 1980). The test can only be interpreted as a comparison of medians if the samples are drawn from populations whose distributions have the same shape, i.e., when represented graphically, they can be laid exactly on top of each other simply by shifting one along the horizontal axis by an amount Δ: a ‘shift alternative’ (Bauer 1972). This is unlikely to apply to count data but, if it did, it is likely that a t test would have been applicable in the first place (Bland 1995). When software provides a confidence interval, it is not necessarily for the difference in medians.

They are not generally capable of complex analyses, in particular those with multiple explanatory (predictor) variables.

Overall, non-parametric methods are suitable for simple analyses of skewed count data, although they are not as foolproof as they may appear.
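The dependence on ranks rather than on the original values can be illustrated by computing the Mann-Whitney U statistic directly. This is a minimal sketch with hypothetical data and hypothetical helper names; it returns only the statistic, not a p-value, and tied values are given the average of the ranks they span.

```python
def midranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: its rank sum in the pooled data, minus the minimum possible."""
    ranks = midranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[: len(sample_a)])
    return rank_sum_a - len(sample_a) * (len(sample_a) + 1) / 2


# Hypothetical counts in two independent groups.
group_a = [0, 2, 5, 9, 40]
group_b = [0, 0, 1, 3, 4]

# The same data with the largest value made a hundred times larger.
group_a_outlier = [0, 2, 5, 9, 4000]

u_original = mann_whitney_u(group_a, group_b)
u_with_outlier = mann_whitney_u(group_a_outlier, group_b)
print(u_original, u_with_outlier)
```

The statistic is unchanged by the extreme value because its rank is unchanged, whereas the arithmetic mean of that group, and hence a t test, would be strongly affected.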

Generalized linear models

We have seen techniques which assume that the distribution of the data is normal (Gaussian) and others which do not depend on it having any particular form. Another option is to find a distribution which does fit the data adequately: this is done in generalized linear modelling. This can be thought of as a kind of regression in which:

The distribution of the data is not necessarily normal. For count data the Poisson or negative binomial, for example, may be suitable.

We model a function of the mean, not necessarily the mean itself. This function is called the link function and, for present purposes, is likely to be the logarithm. It is conventional here for statistical packages to use the natural (base e) logarithm.

Parasite counts are usually overdispersed relative to Poisson, requiring a distribution such as the negative binomial which can accommodate this. If a Poisson distribution is used regardless, the results can be misleading (Barker and Cadwell 2008). However, it is possible to use the ‘sandwich’ estimator (e.g. with the ‘vce(robust)’ option in Stata) or ‘quasi-likelihood’ models, both of which effectively multiply the standard errors by a factor estimated from the data (White 1980, Sileshi 2006, Noe et al. 2010), but which do not fully characterize a specific distribution for the data.
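The idea of scaling standard errors by a factor estimated from the data can be sketched for the simplest case, a quasi-Poisson model with an intercept only. The counts are hypothetical, and the dispersion estimate used here (Pearson chi-squared divided by the residual degrees of freedom) is one common choice rather than the only one; the sandwich estimator is computed differently and is not shown.

```python
import math
import statistics

# Hypothetical overdispersed counts.
counts = [0, 0, 1, 2, 5, 9, 22, 40, 103, 480]
n = len(counts)
mean = statistics.mean(counts)

# Pearson chi-squared for an intercept-only Poisson model, whose fitted value
# is the sample mean for every observation.
pearson_chi2 = sum((y - mean) ** 2 / mean for y in counts)
dispersion = pearson_chi2 / (n - 1)  # one parameter estimated

# Standard error of the log mean under a Poisson model, then scaled.
poisson_se_log_mean = 1 / math.sqrt(n * mean)
quasi_se_log_mean = math.sqrt(dispersion) * poisson_se_log_mean

print(f"dispersion factor = {dispersion:.1f}")
print(f"SE of log mean: Poisson {poisson_se_log_mean:.3f}, scaled {quasi_se_log_mean:.3f}")
```

For data like these the Poisson standard error is far too small, and the scaled version is wider by the square root of the dispersion factor. The point estimate of the mean is unchanged.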

Once the distribution has been chosen, a generalized linear model is fitted by maximizing the likelihood — i.e. the probability of the data, considered as a function of the model parameters (Everitt 1995) — over possible values of the regression coefficients. A couple of points are worth stressing.

First, it is the fitted mean, not the data, which is transformed in this approach. Hence, a logarithmic link function models the logarithm of the arithmetic mean, not the geometric mean as with an ordinary regression on the log-transformed data.
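The distinction can be written out for a single explanatory variable x. The notation below (E for expectation, GM for the population geometric mean, primes for the coefficients of the second model) is introduced here for illustration and is not from the original text.

```latex
% Generalized linear model with log link: the mean is transformed
\log \mathrm{E}[Y \mid x] = \alpha + \beta x
\quad\Rightarrow\quad
e^{\beta} = \frac{\mathrm{E}[Y \mid x + 1]}{\mathrm{E}[Y \mid x]}

% Ordinary regression on log-transformed data (no zeros): the data are transformed
\mathrm{E}[\log Y \mid x] = \alpha' + \beta' x
\quad\Rightarrow\quad
e^{\beta'} = \frac{\mathrm{GM}[Y \mid x + 1]}{\mathrm{GM}[Y \mid x]},
\qquad \mathrm{GM}[Y \mid x] = \exp\left(\mathrm{E}[\log Y \mid x]\right)
```

Since the mean of the logarithms is never greater than the logarithm of the mean, the two models describe different quantities, and their coefficients need not agree.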

Secondly, the data must have values which are feasible for the chosen distribution. In the case of the Poisson or negative binomial distributions, for example, which apply to whole-number (integer) data, we must model the actual counts. If we have, for example, a set of parasite counts based on varying numbers of replicates, e.g. Kato-Katz slides, we should not input the average values to the GLM, since these are not necessarily whole numbers. Rather, the log number of replicates should be included as an offset in the model. This is an explanatory variable with its regression coefficient constrained to equal 1. In the example of multiple slides per person, if there is a single explanatory variable x with coefficient β, then we model the log of the mean total count per person, μ, over n slides as $\log_e(\mu) = \alpha + \beta x + \log_e(n)$, so $\log_e(\mu/n) = \alpha + \beta x$. We can then obtain exp(β) as the ratio of average counts per slide associated with a unit increase in x, even though we specified the total count as the response variable. Similarly, when analysing the incidence of events when the follow-up time varies between people, using log-time as an offset enables the analysis to yield rate ratios. Figure 3 shows mean hookworm egg count fitted as a function of age. As with other regression models, different forms of dependence can be included (e.g. polynomial), as can multiple explanatory variables.
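The effect of an offset can be checked by hand in a deliberately simple case. The sketch below assumes a Poisson model (rather than negative binomial) with a single binary explanatory variable, for which the maximum likelihood estimates have a closed form: the fitted rate in each group is the total count divided by the total number of slides. The records and function names are hypothetical.

```python
import math

# Hypothetical records: (total egg count, number of slides read, x = 0 or 1).
records = [
    (14, 2, 0), (3, 1, 0), (20, 3, 0),
    (30, 2, 1), (9, 1, 1), (51, 3, 1),
]


def fit_poisson_with_offset(records):
    """Closed-form MLE for log(mu) = alpha + beta * x + log(n), with binary x."""
    totals = {0: [0, 0], 1: [0, 0]}  # x -> [sum of counts, sum of slides]
    for count, n_slides, x in records:
        totals[x][0] += count
        totals[x][1] += n_slides
    rate_0 = totals[0][0] / totals[0][1]
    rate_1 = totals[1][0] / totals[1][1]
    return math.log(rate_0), math.log(rate_1 / rate_0)


def poisson_score(records, alpha, beta):
    """Derivatives of the Poisson log-likelihood; both are zero at the MLE."""
    d_alpha = d_beta = 0.0
    for count, n_slides, x in records:
        mu = math.exp(alpha + beta * x + math.log(n_slides))
        d_alpha += count - mu
        d_beta += x * (count - mu)
    return d_alpha, d_beta


alpha_hat, beta_hat = fit_poisson_with_offset(records)
score = poisson_score(records, alpha_hat, beta_hat)

print(f"exp(beta) = {math.exp(beta_hat):.3f} (ratio of mean counts per slide)")
print(f"score at the estimates: {score}")
```

The response is the whole-number total count throughout, yet exp(β) is a ratio of counts per slide. With continuous explanatory variables, or a negative binomial distribution, there is no such closed form and the model would be fitted iteratively by statistical software.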

Figure 3. Negative binomial regression of faecal egg count on age, the sloped solid line showing the fitted values. The × symbols are the means by age group, with vertical solid lines showing the 95% confidence intervals. Although age groups are shown, the fitted model uses the exact age for each person. The curved dashed lines are the pointwise 95% confidence intervals based on the whole dataset.

O’Hara & Kotze compared GLMs with transformation-based methods and found that the latter perform poorly — in terms of bias and sampling error — unless dispersion is small and the mean counts large(O’Hara and Kotze 2010). Goodness of fit measures for generalized linear models, and negative binomial regression in particular, include checking that the residual deviance is similar to the residual degrees of freedom (Hilbe 2007), and plotting the residuals against fitted values to check that they have random scatter with constant variance (McCullagh and Nelder 1983). Chi-squared tests can also be done to compare the numbers in various categories of the outcome variable with those expected under the given distribution. However, it should be borne in mind that, with a large sample size, a departure from the assumed distribution may be large enough to be detected by the goodness-of-fit test, but too small to affect materially the results of the main analysis. The present author tends to rely on visual assessment of the overall distribution as shown in Figure 4.

Figure 4. Histogram of numbers of hookworm eggs counted on two Kato-Katz slides per person in the baseline survey of a prospective study in Brazil. The axis labels show the original values although the scale has been eighth-root transformed. The dashed line is the fitted negative binomial distribution and the dotted line the fitted Poisson.

Other distributions for skewed data include versions of the Poisson and negative binomial in which the proportion of zeros is defined by additional parameters. These are called zero-inflated distributions. The zero-inflated negative binomial may be a better fit than the negative binomial in some cases (Walker et al. 2009). If zeros were not recorded in the dataset then a zero-truncated distribution may be suitable, while if counting stopped at a maximum value (for example, no more than 500 Ascaris eggs were counted per slide by Cundill et al. 2011) then censored distributions are available (Hilbe 2007). There are yet other distributions which have been little used (Grenfell et al. 1990, Shoukri et al. 2004, Hoshino 2005), if at all (Dobbie 2001, Massé and Theodorescu 2005, Shmueli et al. 2005), for parasite data.
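As an illustration of how an additional parameter governs the zeros, the zero-inflated Poisson distribution can be written as follows. The symbols π (the extra probability of a zero) and λ (the Poisson mean) are introduced here for illustration; the zero-inflated negative binomial has the same structure with the negative binomial probabilities in place of the Poisson ones.

```latex
P(Y = 0) = \pi + (1 - \pi)\, e^{-\lambda},
\qquad
P(Y = y) = (1 - \pi)\, \frac{e^{-\lambda} \lambda^{y}}{y!} \quad (y = 1, 2, \ldots),
\qquad
\mathrm{E}[Y] = (1 - \pi)\, \lambda
```

Setting π = 0 recovers the ordinary Poisson distribution, so the standard distribution is a special case of the zero-inflated one.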

Other methods with potential for greater use

There are several statistical methods which have not been described above but which have scope for greater use with count data. The bootstrap was notable by its absence in the articles reviewed. This approach is based on re-sampling the data and observing the sampling variation in any outcome of choice, such as a difference or ratio in medians or means(Efron and Tibshirani 1993). For simple comparisons this would seem to have some advantages over non-parametric methods. However, it is not recommended for small sample sizes because the discreteness of the sampling distribution may make the inference unreliable.
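A percentile bootstrap for a ratio of arithmetic means might be sketched as below. The data, the number of resamples and the fixed seed are all arbitrary choices made for illustration, and the simple percentile interval is only one of several bootstrap intervals described by Efron and Tibshirani.

```python
import random


def bootstrap_ratio_of_means(group_a, group_b, n_resamples=2000, seed=1):
    """Percentile 95% interval for mean(group_a) / mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ratios = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        mean_b = sum(resample_b) / len(resample_b)
        if mean_b == 0:
            continue  # ratio undefined for this resample
        ratios.append((sum(resample_a) / len(resample_a)) / mean_b)
    ratios.sort()
    lower = ratios[int(0.025 * len(ratios))]
    upper = ratios[int(0.975 * len(ratios)) - 1]
    return lower, upper


# Hypothetical counts in two groups.
treated = [0, 0, 1, 3, 4, 7, 12, 20, 35, 90]
control = [2, 5, 8, 15, 22, 40, 60, 95, 180, 400]

point_estimate = (sum(treated) / len(treated)) / (sum(control) / len(control))
ci_lower, ci_upper = bootstrap_ratio_of_means(treated, control)
print(f"ratio of means {point_estimate:.2f}, 95% interval {ci_lower:.2f} to {ci_upper:.2f}")
```

With groups as small as these, the caveat in the text applies: the resampled ratios take relatively few distinct values, so the interval should be treated with caution.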

The median or other percentile of the counts can be modelled as a function of explanatory variables using quantile regression(Yu et al. 2003). Transformation of the counts may be advisable but, since the usual functions maintain rank order (they are monotonic), this will not affect the interpretation of the results.

In many cases the negative (zero) instances may warrant different treatment, whether for purely statistical reasons or due to clinical or scientific distinctions between them and the positive group. There is a well-developed literature on models which treat these two groups distinctly within a single overall analysis. These may be referred to as two-part, mixture, or bivariate models (Moulton et al. 2002, Lachenbruch 2002, Hallstrom 2010). When the proportion of zeros is modelled as a function of covariates, then zero-inflated distributions are in this category. The results for the zero and non-zero parts of the model can be presented separately or, depending on the kind of model, it may be possible to synthesize them to report the overall mean (Burton et al. 2003).
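In the simplest descriptive case, with no covariates, the two parts combine into the overall mean by multiplication: the proportion positive times the arithmetic mean among the positives. The counts below are hypothetical, and models with covariates combine the parts in more elaborate ways.

```python
# Hypothetical counts, including negative (zero) individuals.
counts = [0, 0, 0, 2, 6, 15, 40, 130]

positives = [c for c in counts if c > 0]

# Part 1: proportion positive. Part 2: mean among the positives.
proportion_positive = len(positives) / len(counts)
mean_among_positives = sum(positives) / len(positives)

# The two parts recombine to give the overall arithmetic mean.
overall_mean = proportion_positive * mean_among_positives

print(f"proportion positive = {proportion_positive:.3f}")
print(f"mean among positives = {mean_among_positives:.1f}")
print(f"overall mean = {overall_mean:.3f}")
```

This identity holds for the arithmetic mean only; the median and the geometric mean do not decompose in this way.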

Finally, when no single parameter, such as the median or mean, is found to be an adequate summary of the data (Montresor 2007), infection intensity classes (e.g. negative, mild, moderate, heavy) can be analysed as ordered categories (Agresti 1999).

Discussion

The statistical challenges of count data result not so much from extreme skewness per se as from its combination with zero values. If there are no zero values then analysis of log-transformed data by common methods such as the t test may be applicable, yielding results in terms of geometric means. However, when zeros are present, the log transformation is not readily applicable, and the geometric mean is zero, complicating the choice of measure of location for count data.

The Williams mean is an attempt to retrieve the geometric mean by applying it to the counts plus one, then subtracting one again. This approach is commonly used in analysis of parasite counts, although more for expediency than for clinical or biological relevance. Moreover, since it is sometimes misleadingly called the geometric mean, it is often difficult to decide from the text of a paper what was actually done.
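The three measures of location discussed here can be compared directly. The following sketch, using hypothetical counts and helper names chosen for illustration, computes the arithmetic mean, the geometric mean (which is zero as soon as any count is zero) and the Williams mean as defined above: the geometric mean of the counts plus one, minus one.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of non-negative values; zero if any value is zero."""
    if any(v == 0 for v in values):
        return 0.0
    return math.exp(statistics.fmean([math.log(v) for v in values]))


def williams_mean(values):
    """Geometric mean of (value + 1), with one subtracted afterwards."""
    return math.exp(statistics.fmean([math.log(v + 1) for v in values])) - 1


# Hypothetical skewed counts including zeros (illustrative values only).
counts = [0, 0, 0, 1, 3, 7, 99]

arithmetic = statistics.fmean(counts)
geometric = geometric_mean(counts)
williams = williams_mean(counts)

print(f"arithmetic {arithmetic:.2f}, geometric {geometric:.2f}, Williams {williams:.2f}")
```

On skewed data such as these, the Williams mean falls well below the arithmetic mean, which is one reason why reporting exactly which measure was used matters.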

There is some resistance to using the arithmetic mean as a measure of location, given that basic statistics teaching usually brands it unsuitable for skewed data. Nevertheless, in some circumstances the mean is the most relevant outcome measure. In health economics, for example, the mean cost per patient is proportional to the total cost, and for that reason it is more relevant than other measures such as the median (Barber and Thompson 2000). This should not, and need not, be trumped by concerns over difficulty of analysis. Similarly, in parasitology, the arithmetic mean per person may be of interest because, for example, the total community burden in terms of parasite numbers is proportional to the arithmetic mean. Methods are available which model the arithmetic mean while accommodating skewness in the data. The median can also be used, although it will be zero if the prevalence is less than 50%. Non-parametric methods are commonly said to assess differences between medians, although this is not generally the case. It is possible to do sample size calculations and statistical analysis in terms of arithmetic means via suitable distributions such as the negative binomial (Alexander et al. 2011).
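Two of the statements above can be illustrated with a short calculation on hypothetical counts: with a prevalence of 40% the median is zero, while the arithmetic mean multiplied by the number of people recovers the total burden.

```python
import statistics

# Hypothetical community of 10 people, 4 of them infected (illustrative values only).
counts = [0, 0, 0, 0, 0, 0, 5, 18, 40, 137]

prevalence = sum(1 for c in counts if c > 0) / len(counts)
median_count = statistics.median(counts)
arithmetic_mean = statistics.fmean(counts)

# The total burden is the arithmetic mean scaled by the number of people.
total_burden = arithmetic_mean * len(counts)

print(prevalence, median_count, arithmetic_mean, total_burden)
```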

Choosing the most relevant measure of location will affect how any further analysis is done. Given the powerful statistical software now readily available, it should be possible to choose a method not solely on statistical considerations — finding one whose assumptions are justified — but based on how closely it responds to the original research question (Hand 1994).

The current paper emphasizes generalized linear models (GLMs), in particular with the negative binomial distribution. Fuller explications of GLMs are available elsewhere (Gardner et al. 1995, Wilson and Grenfell 1997, Coxe et al. 2009, McElduff et al. 2010). The negative binomial often provides a good fit to parasite data (Anderson and May 1991), although other distributions may be better for particular datasets (Walker et al. 2009). One review of analysis methods for entomological data recommended negative binomial and quasi-likelihood approaches (Sileshi 2006), while another of aquatic organisms favoured the latter (Noe et al. 2010). One concern with the negative binomial is that, empirically, the dispersion parameter (k) is often found to increase with the mean (Alexander et al. 2011), yet basic models assume it to be constant. On the other hand, a simulation study has found likelihood-based analysis to be robust to this (Aban et al. 2008). There are also alternative parameterizations of the negative binomial with different variance-mean dependence (Hilbe 2007). GLM techniques for taking account of clustering can also be used in cohort studies with multiple disease episodes or repeated measures per person, although the issue of time dependence is likely to arise and this is beyond the scope of the current paper.
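For reference, the variance-mean relationships at issue can be written out as follows. This assumes the parameterization in terms of the mean and k that is usual in parasitology; the symbols \mu and \delta are notation introduced here for illustration.

```latex
% Usual negative binomial with mean \mu and dispersion parameter k:
% the variance is quadratic in the mean.
\operatorname{Var}(Y) = \mu + \frac{\mu^{2}}{k}

% As k \to \infty the extra term vanishes and the Poisson variance is recovered.
\lim_{k \to \infty} \operatorname{Var}(Y) = \mu

% One alternative parameterization: variance linear in the mean,
% with \delta > 0 a constant overdispersion factor.
\operatorname{Var}(Y) = \mu\,(1 + \delta)
```

Under the first form, a constant k fixes how the variance grows with the mean, which is why an empirical tendency for k to increase with the mean is a potential concern.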

When describing any numerical information there is a trade-off to be made between the conciseness of any summary measure and the information it conveys. For parasite data, the case has been made for summarizing intensity in the form of categories rather than any single value (Montresor 2007). This is indeed more informative, but less concise. For some purposes, such as clinical trials, a single measure per arm, such as a mean, is likely to be preferred. If categories are used in clinical trials, they will probably be further summarized into a single measure, such as the proportion with heavy infection.

Acknowledgements

This work was supported financially by United Kingdom Medical Research Council grant number G7508177 to the Tropical Epidemiology Group, and by the Human Hookworm Vaccine Initiative (HHVI) of the Sabin Vaccine Institute, which receives funding from the Bill and Melinda Gates Foundation. The studies which generated the hookworm and malaria datasets presented here were supported respectively by 1) the HHVI and 2) the Wellcome Trust (Project 061910), the Gates Malaria Partnership, and the United Kingdom Medical Research Council. I am grateful to Christian Bottomley, Paul Milligan and an anonymous referee for useful comments on the manuscript.

References

Aban IB, Cutter GR, Mavinga N. Inferences and Power Analysis Concerning Two Negative Binomial Distributions with An Application to MRI Lesion Counts Data. Comput Stat Data Anal. 2008;53:820–833. doi: 10.1016/j.csda.2008.07.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
Agresti A. Modelling ordered categorical data: recent advances and future challenges. Statistics in Medicine. 1999;18:2191–2208. doi: 10.1002/(sici)1097-0258(19990915/30)18:17/18<2191::aid-sim249>3.0.co;2-m. [DOI] [PubMed] [Google Scholar]
Alexander N, Bethony J, Corrêa-Oliveira R, Rodrigues LC, Hotez P, Brooker S. Repeatability of paired counts. Statistics in Medicine. 2007;26:3566–3577. doi: 10.1002/sim.2724. [DOI] [PubMed] [Google Scholar]
Alexander N, Cundill B, Sabatelli L, et al. Selection and quantification of infection endpoints for trials of vaccines against intestinal helminths. Vaccine. 2011;29:3686–94. doi: 10.1016/j.vaccine.2011.03.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
Alexander ND, Solomon AW, Holland MJ, et al. An index of community ocular Chlamydia trachomatis load for control of trachoma. Trans R Soc Trop Med Hyg. 2005;99:175–7. doi: 10.1016/j.trstmh.2004.05.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
Anderson RM, May RM. Infectious Diseases of Humans: Dynamics and Control. 1st edn Oxford University Press; Oxford: 1991. [Google Scholar]
Anscombe FJ. The transformation of Poisson, binomial and negative-binomial data. Biometrika. 1948;35:246–254. [Google Scholar]
Barber JA, Thompson SG. Analysis of cost data in randomized trials: an application of the non-parametric bootstrap. Statistics in Medicine. 2000;19:3219–3236. doi: 10.1002/1097-0258(20001215)19:23<3219::aid-sim623>3.0.co;2-p. [DOI] [PubMed] [Google Scholar]
Barker L, Cadwell BL. An analysis of eight 95 per cent confidence intervals for a ratio of Poisson parameters when events are rare. Statistics in Medicine. 2008;27:4030–4037. doi: 10.1002/sim.3234. [DOI] [PubMed] [Google Scholar]
Bauer DF. Constructing confidence sets using rank statistics. Journal of the American Statistical Association. 1972;67:687–690. [Google Scholar]
Beall G. The transformation of data from entomological field experiments so that the analysis of variance becomes applicable. Biometrika. 1942;32:243–262. [Google Scholar]
Bland M. An Introduction to Medical Statistics. 2nd edn Oxford University Press; Oxford: 1995. [Google Scholar]
Bockarie M, Kazura J, Alexander N, et al. Transmission dynamics of Wuchereria bancrofti in East Sepik Province, Papua New Guinea. American Journal of Tropical Medicine and Hygiene. 1996;54:577–581. doi: 10.4269/ajtmh.1996.54.577. [DOI] [PubMed] [Google Scholar]
Boneau CA. The effects of violations of assumptions underlying the t test. Psychological Bulletin. 1960;57:49–64. doi: 10.1037/h0041412. [DOI] [PubMed] [Google Scholar]
Bonferroni CE. Sulle medie multiple di potenze [On multiple algebraic means] Bollettino dell’Unione Matematica Italiana, serie 3. 1950;5:267–270. [Google Scholar]
Borowski EJ, Borwein JM. Dictionary of Mathematics. 1st edn HarperCollins Publishers; London: 1989. [Google Scholar]
Burton MJ, Holland MJ, Faal N, et al. Which members of a community need antibiotics to control trachoma? Conjunctival Chlamydia trachomatis infection load in Gambian villages. Investigative Ophthalmology and Visual Science. 2003;44:4215–4222. doi: 10.1167/iovs.03-0107. [DOI] [PubMed] [Google Scholar]
Bushman BJ, Wang MC. A procedure for combining sample correlation coefficients and vote counts to obtain an estimate and a confidence interval for the population correlation coefficient. Psychological Bulletin. 1995;117:530–546. [Google Scholar]
Conover WJ. Practical Nonparametric Statistics. 2nd edn John Wiley & Sons; New York: 1980. [Google Scholar]
Coxe S, West SG, Aiken LS. The analysis of count data: a gentle introduction to Poisson regression and its alternatives. J Pers Assess. 2009;91:121–36. doi: 10.1080/00223890802634175. [DOI] [PubMed] [Google Scholar]
Cundill B, Alexander N, Bethony J, Diemert D, Pullan RL, Brooker S. Rates and intensity of re-infection with human helminths after treatment and the influence of individual, household, and environmental factors in a Brazilian community. Parasitology. 2011;138:1406–16. doi: 10.1017/S0031182011001132. [DOI] [PMC free article] [PubMed] [Google Scholar]
Dobbie MJ. Modelling correlated zero-inflated count data. Australian National University; Canberra: 2001. [Google Scholar]
Dobson RJ, Sangster NC, Besier RB, Woodgate RG. Geometric means provide a biased efficacy result when conducting a faecal egg count reduction test (FECRT) Vet Parasitol. 2009;161:162–7. doi: 10.1016/j.vetpar.2008.12.007. [DOI] [PubMed] [Google Scholar]
Dunyo S, Ord R, Hallett R, et al. Randomised trial of chloroquine/sulphadoxine-pyrimethamine in Gambian children with malaria: impact against multidrug-resistant P. falciparum. PLoS Clin Trials. 2006;1:e14. doi: 10.1371/journal.pctr.0010014. [DOI] [PMC free article] [PubMed] [Google Scholar]
Efron B, Tibshirani R. An Introduction to the Bootstrap. 1st edn Chapman and Hall; New York: 1993. [Google Scholar]
Elliott JM. Some Methods for the Statistical Analysis of Samples of Benthic Invertebrates. 2nd edn Freshwater Biological Association; Ambleside: 1977. [Google Scholar]
Everitt B. Cambridge Dictionary of Statistics in the Medical Sciences. Cambridge University Press; Cambridge: 1995. [Google Scholar]
Fulford AJC. Dispersion and bias: can we trust geometric means? Parasitology Today. 1994;10:446–448. doi: 10.1016/0169-4758(94)90181-3. [DOI] [PubMed] [Google Scholar]
Gardner W, Mulvey EP, Shaw EC. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin. 1995;118:392–404. doi: 10.1037/0033-2909.118.3.392. [DOI] [PubMed] [Google Scholar]
Grenfell BT, Das PK, Rajagopalan PK, Bundy DAP. Frequency distribution of lymphatic filariasis microfilariae in human populations: population processes and statistical estimation. Parasitology. 1990;101:417–427. doi: 10.1017/s0031182000060613. [DOI] [PubMed] [Google Scholar]
Hallstrom AP. A modified Wilcoxon test for non-negative distributions with a clump of zeros. Stat Med. 2010;29:391–400. doi: 10.1002/sim.3785. [DOI] [PubMed] [Google Scholar]
Hand DJ. Deconstructing statistical questions (with discussion) Journal of the Royal Statistical Society Series A (Statistics in Society) 1994;157:317–356. [Google Scholar]
Hart A. Mann-Whitney test is not just a test of medians: differences in spread can be important. BMJ. 2001;323:391–3. doi: 10.1136/bmj.323.7309.391. [DOI] [PMC free article] [PubMed] [Google Scholar]
Heeren T, d’Agostino R. Robustness of the two-independent samples t-test when applied to ordinal scale data. Statistics in Medicine. 1987;6:79–90. doi: 10.1002/sim.4780060110. [DOI] [PubMed] [Google Scholar]
Hilbe JM. Negative Binomial Regression. 1st edn Cambridge University Press; Cambridge: 2007. [Google Scholar]
Hoshino N. Engen’s extended negative binomial model revisited. Annals of the Institute of Statistical Mathematics. 2005;57:369–387. [Google Scholar]
Kirkwood BR, Sterne JAC. Essentials of Medical Statistics. 2nd edn Blackwell Scientific Publications; Oxford: 2003. [Google Scholar]
Lachenbruch PA. Analysis of data with excess zeros. Statistical Methods in Medical Research. 2002;11:297–302. doi: 10.1191/0962280202sm289ra. [DOI] [PubMed] [Google Scholar]
Laubscher NF. On stabilizing the binomial and negative binomial variance. Journal of the American Statistical Association. 1961;56:143–150. [Google Scholar]
Marshall TF, Anderson J, Fuglsang H. The incidence of eye lesions and visual impairment in onchocerciasis in relationship to the intensity of infection. Transactions of the Royal Society of Tropical Medicine and Hygiene. 1986;80:426–434. doi: 10.1016/0035-9203(86)90333-0. [DOI] [PubMed] [Google Scholar]
Massé JC, Theodorescu R. Neyman type A distribution revisited. Statistica Neerlandica. 2005;59:206–213. [Google Scholar]
McCullagh P, Nelder JA. Generalized Linear Models. 1st edn Chapman and Hall; London: 1983. [Google Scholar]
McElduff F, Cortina-Borja M, Chan S-K, Wade A. When t-tests or Wilcoxon-Mann-Whitney tests won’t do. Advances in Physiology Education. 2010;34:128–133. doi: 10.1152/advan.00017.2010. [DOI] [PubMed] [Google Scholar]
Montresor A. Arithmetic or geometric means of eggs per gram are not appropriate indicators to estimate the impact of control measures in helminth infections. Trans R Soc Trop Med Hyg. 2007;101:773–6. doi: 10.1016/j.trstmh.2007.04.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
Moulton LH, Curriero FC, Barroso PF. Mixture models for quantitative HIV RNA data. Statistical Methods in Medical Research. 2002;11:317–325. doi: 10.1191/0962280202sm292ra. [DOI] [PubMed] [Google Scholar]
Mumpower JL, McClelland G. Measurement error, skewness, and risk analysis: coping with the long tail of the distribution. Risk Anal. 2002;22:277–90. doi: 10.1111/0272-4332.00027. [DOI] [PubMed] [Google Scholar]
Mwangi TW, Fegan G, Williams TN, Kinyanjui SM, Snow RW, Marsh K. Evidence for over-dispersion in the distribution of clinical malaria episodes in children. PLoS ONE. 2008;3:e2196. doi: 10.1371/journal.pone.0002196. [DOI] [PMC free article] [PubMed] [Google Scholar]
Noe DA, Bailer AJ, Noble RB. Comparing methods for analyzing overdispersed count data in aquatic toxicology. Environ Toxicol Chem. 2010;29:212–9. doi: 10.1002/etc.2. [DOI] [PubMed] [Google Scholar]
O’Hara RB, Kotze DJ. Do not log-transform count data. Methods in Ecology & Evolution. 2010;1:118–122. [Google Scholar]
Olayinka OS, Abdullahi SA. An overview of industrial employees’ exposure to noise in sundry processing and manufacturing industries in Ilorin metropolis, Nigeria. Ind Health. 2009;47:123–33. doi: 10.2486/indhealth.47.123. [DOI] [PubMed] [Google Scholar]
Remme J, Ba O, Dadzie KY, Karam M. A force-of-infection model for onchocerciasis and its applications in the epidemiological evaluation of the Onchocerciasis Control Programme in the Volta River basin area. Bulletin of the World Health Organization. 1986;64:667–681. [PMC free article] [PubMed] [Google Scholar]
Seto EY, Lee YJ, Liang S, Zhong B. Individual and village-level study of water contact patterns and Schistosoma japonicum infection in mountainous rural China. Trop Med Int Health. 2007;12:1199–209. doi: 10.1111/j.1365-3156.2007.01903.x. [DOI] [PubMed] [Google Scholar]
Shmueli G, Minka T, Kadane JB, Borle S, Boatwright PB. A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Journal of the Royal Statistical Society: Series C (Applied Statistics) 2005;54:127–142. [Google Scholar]
Shoukri MM, Asyali MH, VanDorp R, Kelton D. The Poisson inverse Gaussian regression model in the analysis of clustered counts data. Journal of Data Science. 2004;2:17–32. [Google Scholar]
Sileshi G. Selecting the right statistical model for analysis of insect count data by using information theoretic measures. Bull Entomol Res. 2006;96:479–88. [PubMed] [Google Scholar]
Smith T, Armstrong Schellenberg J, Hayes R. Attributable fraction estimates and case definitions for malaria in endemic areas. Statistics in Medicine. 1994;13:2345–2358. doi: 10.1002/sim.4780132206. [DOI] [PubMed] [Google Scholar]
Stonehouse JM, Forrester GJ. Robustness of the t and U tests under combined assumption violations. Journal of Applied Statistics. 1998;25:63–74. [Google Scholar]
Walker M, Hall A, Anderson RM, Basanez MG. Density-dependent effects on the weight of female Ascaris lumbricoides infections of humans and its impact on patterns of egg production. Parasit Vectors. 2009;2:11. doi: 10.1186/1756-3305-2-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
Warrell DM, Gilles HM, editors. Essential Malariology. Arnold; London: 2002. [Google Scholar]
White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817–838. [Google Scholar]
Williams CB. The use of logarithms in the interpretation of certain entomological problems. Annals of Applied Biology. 1937;24:404–414. [Google Scholar]
Wilson K, Grenfell BT. Generalized linear modelling for parasitologists. Parasitology Today. 1997;13:33–38. doi: 10.1016/s0169-4758(96)40009-6. [DOI] [PubMed] [Google Scholar]
Wojtczak M, Viemeister NF. Perception of suprathreshold amplitude modulation and intensity increments: Weber’s law revisited. J Acoust Soc Am. 2008;123:2220–36. doi: 10.1121/1.2839889. [DOI] [PMC free article] [PubMed] [Google Scholar]
Yu K, Lu Z, Stander J. Quantile regression: applications and current research areas. Statistician. 2003;52:331–350. [Google Scholar]

Papers Included in the Literature Review

Alker AP, Kazadi WM, Kutelemeni AK, Bloland PB, Tshefu AK, Meshnick SR. dhfr and dhps genotype and sulfadoxine-pyrimethamine treatment failure in children with falciparum malaria in the Democratic Republic of Congo. Trop Med Int Health. 2008;13:1384–91. doi: 10.1111/j.1365-3156.2008.02150.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
Allam AF, Kader O, Zaki A, Shehab AY, Farag HF. Assessing the marginal error in diagnosis and cure of Schistosoma mansoni in areas of low endemicity using Percoll and PCR techniques. Trop Med Int Health. 2009;14:316–21. doi: 10.1111/j.1365-3156.2009.02225.x. [DOI] [PubMed] [Google Scholar]
Appawu MA, Dadzie SK, Baffoe-Wilmot A, Wilson MD. Lymphatic filariasis in Ghana: entomological investigation of transmission dynamics and intensity in communities served by irrigation systems in the Upper East Region of Ghana. Trop Med Int Health. 2001;6:511–6. doi: 10.1046/j.1365-3156.2001.00737.x. [DOI] [PubMed] [Google Scholar]
Audibert M, Mathonnat J, Henry MC. Malaria and property accumulation in rice production systems in the savannah zone of Cote d’Ivoire. Trop Med Int Health. 2003;8:471–83. doi: 10.1046/j.1365-3156.2003.01051.x. [DOI] [PubMed] [Google Scholar]
Avila JC, Villaroel R, Marquino W, Zegarra J, Mollinedo R, Ruebush TK. Efficacy of mefloquine and mefloquine-artesunate for the treatment of uncomplicated Plasmodium falciparum malaria in the Amazon region of Bolivia. Trop Med Int Health. 2004;9:217–21. doi: 10.1046/j.1365-3156.2003.01184.x. [DOI] [PubMed] [Google Scholar]
Berhe N, Geitung JT, Medhin G, Gundersen SG. Large scale evaluation of WHO’s ultrasonographic staging system of schistosomal periportal fibrosis in Ethiopia. Trop Med Int Health. 2006;11:1286–94. doi: 10.1111/j.1365-3156.2006.01665.x. [DOI] [PubMed] [Google Scholar]
Bethony J, Williams JT, Kloos H, et al. Exposure to Schistosoma mansoni infection in a rural area in Brazil. II: household risk factors. Trop Med Int Health. 2001;6:136–45. doi: 10.1046/j.1365-3156.2001.00685.x. [DOI] [PubMed] [Google Scholar]
Black CL, Steinauer ML, Mwinzi PN, Evan Secor W, Karanja DM, Colley DG. Impact of intense, longitudinal retreatment with praziquantel on cure rates of schistosomiasis mansoni in a cohort of occupationally exposed adults in western Kenya. Trop Med Int Health. 2009;14:450–7. doi: 10.1111/j.1365-3156.2009.02234.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
Boisier P, Ramarokoto CE, Ravoniarimbinina P, Rabarijaona L, Ravaoalimalala VE. Geographic differences in hepatosplenic complications of schistosomiasis mansoni and explanatory factors of morbidity. Trop Med Int Health. 2001;6:699–706. doi: 10.1046/j.1365-3156.2001.00781.x. [DOI] [PubMed] [Google Scholar]
Bonnet S, Paul RE, Gouagna C, et al. Level and dynamics of malaria transmission and morbidity in an equatorial area of South Cameroon. Trop Med Int Health. 2002;7:249–56. doi: 10.1046/j.1365-3156.2002.00861.x. [DOI] [PubMed] [Google Scholar]
Bosompem KM, Bentum IA, Otchere J, et al. Infant schistosomiasis in Ghana: a survey in an irrigation community. Trop Med Int Health. 2004;9:917–22. doi: 10.1111/j.1365-3156.2004.01282.x. [DOI] [PubMed] [Google Scholar]
Bousema JT, Gouagna LC, Meutstege AM, et al. Treatment failure of pyrimethamine-sulphadoxine and induction of Plasmodium falciparum gametocytaemia in children in western Kenya. Trop Med Int Health. 2003;8:427–30. doi: 10.1046/j.1365-3156.2003.01047.x. [DOI] [PubMed] [Google Scholar]
Cox SE, Staalsoe T, Arthur P, et al. Maternal vitamin A supplementation and immunity to malaria in pregnancy in Ghanaian primigravids. Trop Med Int Health. 2005;10:1286–97. doi: 10.1111/j.1365-3156.2005.01515.x. [DOI] [PubMed] [Google Scholar]
Cramer JP, Nussler AK, Ehrhardt S, et al. Age-dependent effect of plasma nitric oxide on parasite density in Ghanaian children with severe malaria. Trop Med Int Health. 2005;10:672–80. doi: 10.1111/j.1365-3156.2005.01438.x. [DOI] [PubMed] [Google Scholar]
Critchley J, Addiss D, Ejere H, Gamble C, Garner P, Gelband H. Albendazole for the control and elimination of lymphatic filariasis: systematic review. Trop Med Int Health. 2005;10:818–25. doi: 10.1111/j.1365-3156.2005.01458.x. [DOI] [PubMed] [Google Scholar]
Cunin P, Tchuem Tchuente LA, Poste B, Djibrilla K, Martin PM. Interactions between Schistosoma haematobium and Schistosoma mansoni in humans in north Cameroon. Trop Med Int Health. 2003;8:1110–7. doi: 10.1046/j.1360-2276.2003.01139.x. [DOI] [PubMed] [Google Scholar]
Drakeley CJ, Jawara M, Targett GA, et al. Addition of artesunate to chloroquine for treatment of Plasmodium falciparum malaria in Gambian children causes a significant but short-lived reduction in infectiousness for mosquitoes. Trop Med Int Health. 2004;9:53–61. doi: 10.1046/j.1365-3156.2003.01169.x. [DOI] [PubMed] [Google Scholar]
Durrani N, Leslie T, Rahim S, Graham K, Ahmad F, Rowland M. Efficacy of combination therapy with artesunate plus amodiaquine compared to monotherapy with chloroquine, amodiaquine or sulfadoxine-pyrimethamine for treatment of uncomplicated Plasmodium falciparum in Afghanistan. Trop Med Int Health. 2005;10:521–9. doi: 10.1111/j.1365-3156.2005.01429.x. [DOI] [PubMed] [Google Scholar]
Ensink JH, Mahmood T, Dalsgaard A. Wastewater-irrigated vegetables: market handling versus irrigation water quality. Trop Med Int Health. 2007;12(Suppl 2):2–7. doi: 10.1111/j.1365-3156.2007.01935.x. [DOI] [PubMed] [Google Scholar]
Esteban JG, Gonzalez C, Bargues MD, et al. High fascioliasis infection in children linked to a man-made irrigation zone in Peru. Trop Med Int Health. 2002;7:339–48. doi: 10.1046/j.1365-3156.2002.00870.x. [DOI] [PubMed] [Google Scholar]
Falade C, Mokuolu O, Okafor H, et al. Epidemiology of congenital malaria in Nigeria: a multi-centre study. Trop Med Int Health. 2007;12:1279–87. doi: 10.1111/j.1365-3156.2007.01931.x. [DOI] [PubMed] [Google Scholar]
Fanello CI, Karema C, van Doren W, Rwagacondo CE, D’Alessandro U. Tolerability of amodiaquine and sulphadoxine-pyrimethamine, alone or in combination for the treatment of uncomplicated Plasmodium falciparum malaria in Rwandan adults. Trop Med Int Health. 2006;11:589–96. doi: 10.1111/j.1365-3156.2006.01610.x. [DOI] [PubMed] [Google Scholar]
Feldmeier H, Helling-Giese G, Poggensee G. Unreliability of PAP smears to diagnose female genital schistosomiasis. Trop Med Int Health. 2001;6:31–3. doi: 10.1046/j.1365-3156.2001.00647.x. [DOI] [PubMed] [Google Scholar]
Fleming FM, Brooker S, Geiger SM, et al. Synergistic associations between hookworm and other helminth species in a rural community in Brazil. Trop Med Int Health. 2006;11:56–64. doi: 10.1111/j.1365-3156.2005.01541.x. [DOI] [PubMed] [Google Scholar]
Gilgen DD, Mascie-Taylor CG, Rosetta LL. Intestinal helminth infections, anaemia and labour productivity of female tea pluckers in Bangladesh. Trop Med Int Health. 2001;6:449–57. doi: 10.1046/j.1365-3156.2001.00729.x. [DOI] [PubMed] [Google Scholar]
Gryseels B, Mbaye A, De Vlas SJ, et al. Are poor responses to praziquantel for the treatment of Schistosoma mansoni infections in Senegal due to resistance? An overview of the evidence. Trop Med Int Health. 2001;6:864–73. doi: 10.1046/j.1365-3156.2001.00811.x. [DOI] [PubMed] [Google Scholar]
Hanelt B, Steinauer ML, Mwangi IN, et al. A new approach to characterize populations of Schistosoma mansoni from humans: development and assessment of microsatellite analysis of pooled miracidia. Trop Med Int Health. 2009;14:322–31. doi: 10.1111/j.1365-3156.2009.02226.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
Henry MC, Rogier C, Nzeyimana I, et al. Inland valley rice production systems and malaria infection and disease in the savannah of Cote d’Ivoire. Trop Med Int Health. 2003;8:449–58. doi: 10.1046/j.1365-3156.2003.01053.x. [DOI] [PubMed] [Google Scholar]
Idro R, Aloyo J, Mayende L, Bitarakwate E, John CC, Kivumbi GW. Severe malaria in children in areas with low, moderate and high transmission intensity in Uganda. Trop Med Int Health. 2006;11:115–24. doi: 10.1111/j.1365-3156.2005.01518.x. [DOI] [PubMed] [Google Scholar]
Jardim-Botelho A, Raff S, Rodrigues Rde A, et al. Hookworm, Ascaris lumbricoides infection and polyparasitism associated with poor cognitive performance in Brazilian schoolchildren. Trop Med Int Health. 2008;13:994–1004. doi: 10.1111/j.1365-3156.2008.02103.x. [DOI] [PubMed] [Google Scholar]
Jukes MC, Nokes CA, Alcock KJ, et al. Heavy schistosomiasis associated with poor short-term memory and slower reaction times in Tanzanian schoolchildren. Trop Med Int Health. 2002;7:104–17. doi: 10.1046/j.1365-3156.2002.00843.x. [DOI] [PubMed] [Google Scholar]
Kabatereine NB, Brooker S, Tukahebwa EM, Kazibwe F, Onapa AW. Epidemiology and geography of Schistosoma mansoni in Uganda: implications for planning control. Trop Med Int Health. 2004;9:372–80. doi: 10.1046/j.1365-3156.2003.01176.x. [DOI] [PubMed] [Google Scholar]
Keraita B, Konradsen F, Drechsel P, Abaidoo RC. Effect of low-cost irrigation methods on microbial contamination of lettuce irrigated with untreated wastewater. Trop Med Int Health. 2007;12(Suppl 2):15–22. doi: 10.1111/j.1365-3156.2007.01937.x. [DOI] [PubMed] [Google Scholar]
Keraita B, Konradsen F, Drechsel P, Abaidoo RC. Reducing microbial contamination on wastewater-irrigated lettuce by cessation of irrigation before harvesting. Trop Med Int Health. 2007;12(Suppl 2):8–14. doi: 10.1111/j.1365-3156.2007.01936.x. [DOI] [PubMed] [Google Scholar]
Kongs A, Marks G, Verle P, Van der Stuyft P. The unreliability of the Kato-Katz technique limits its usefulness for evaluating S. mansoni infections. Trop Med Int Health. 2001;6:163–9. doi: 10.1046/j.1365-3156.2001.00687.x. [DOI] [PubMed] [Google Scholar]
Larocque R, Casapia M, Gotuzzo E, et al. A double-blind randomized controlled trial of antenatal mebendazole to reduce low birthweight in a hookworm-endemic area of Peru. Trop Med Int Health. 2006;11:1485–95. doi: 10.1111/j.1365-3156.2006.01706.x. [DOI] [PubMed] [Google Scholar]
Luoba AI, Wenzel Geissler P, Estambale B, et al. Earth-eating and reinfection with intestinal helminths among pregnant and lactating women in western Kenya. Trop Med Int Health. 2005;10:220–7. doi: 10.1111/j.1365-3156.2004.01380.x. [DOI] [PubMed] [Google Scholar]
Maguire JD, Tuti S, Sismadi P, et al. Endemic coastal malaria in the Thousand Islands District, near Jakarta, Indonesia. Trop Med Int Health. 2005;10:489–96. doi: 10.1111/j.1365-3156.2005.01402.x. [DOI] [PubMed] [Google Scholar]
Mani TR, Rajendran R, Munirathinam A, et al. Efficacy of co-administration of albendazole and diethylcarbamazine against geohelminthiases: a study from South India. Trop Med Int Health. 2002;7:541–8. doi: 10.1046/j.1365-3156.2002.00894.x. [DOI] [PubMed] [Google Scholar]
Mani TR, Rajendran R, Sunish IP, et al. Effectiveness of two annual, single-dose mass drug administrations of diethylcarbamazine alone or in combination with albendazole on soil-transmitted helminthiasis in filariasis elimination programme. Trop Med Int Health. 2004;9:1030–5. doi: 10.1111/j.1365-3156.2004.01298.x. [DOI] [PubMed] [Google Scholar]
Marchant T, Schellenberg JA, Edgar T, et al. Socially marketed insecticide-treated nets improve malaria and anaemia in pregnancy in southern Tanzania. Trop Med Int Health. 2002;7:149–58. doi: 10.1046/j.1365-3156.2002.00840.x. [DOI] [PubMed] [Google Scholar]
Mas J, Ascaso C, Escaramis G, et al. Reduction in the prevalence and intensity of infection in Onchocerca volvulus microfilariae according to ethnicity and community after 8 years of ivermectin treatment on the island of Bioko, Equatorial Guinea. Trop Med Int Health. 2006;11:1082–91. doi: 10.1111/j.1365-3156.2006.01650.x. [DOI] [PubMed] [Google Scholar]
Mayor A, Saute F, Aponte JJ, et al. Plasmodium falciparum multiple infections in Mozambique, its relation to other malariological indices and to prospective risk of malaria morbidity. Trop Med Int Health. 2003;8:3–11. doi: 10.1046/j.1365-3156.2003.00968.x. [DOI] [PubMed] [Google Scholar]
Mens P, Spieker N, Omar S, Heijnen M, Schallig H, Kager PA. Is molecular biology the best alternative for diagnosis of malaria to microscopy? A comparison between microscopy, antigen detection and molecular tests in rural Kenya and urban Tanzania. Trop Med Int Health. 2007;12:238–44. doi: 10.1111/j.1365-3156.2006.01779.x. [DOI] [PubMed] [Google Scholar]
Meyrowitsch DW, Simonsen PE. Efficacy of DEC against Ascaris and hookworm infections in schoolchildren. Trop Med Int Health. 2001;6:739–42. doi: 10.1046/j.1365-3156.2001.00766.x. [DOI] [PubMed] [Google Scholar]
Mockenhaupt FP, Eggelte TA, Till H, Bienzle U. Plasmodium falciparum pfcrt and pfmdr1 polymorphisms are associated with the pfdhfr N108 pyrimethamine-resistance mutation in isolates from Ghana. Trop Med Int Health. 2001;6:749–55. doi: 10.1046/j.1365-3156.2001.00792.x. [DOI] [PubMed] [Google Scholar]
Mockenhaupt FP, Ehrhardt S, Dzisi SY, et al. A randomized, placebo-controlled, double-blind trial on sulfadoxine-pyrimethamine alone or combined with artesunate or amodiaquine in uncomplicated malaria. Trop Med Int Health. 2005;10:512–20. doi: 10.1111/j.1365-3156.2005.01427.x. [DOI] [PubMed] [Google Scholar]
Montresor A, Ramsan M, Chwaya HM, et al. School enrollment in Zanzibar linked to children’s age and helminth infections. Trop Med Int Health. 2001;6:227–31. doi: 10.1046/j.1365-3156.2001.00686.x. [DOI] [PubMed] [Google Scholar]
Montresor A, Zin TT, Padmasiri E, Allen H, Savioli L. Soil-transmitted helminthiasis in Myanmar and approximate costs for countrywide control. Trop Med Int Health. 2004;9:1012–5. doi: 10.1111/j.1365-3156.2004.01297.x. [DOI] [PubMed] [Google Scholar]
Mugittu K, Adjuik M, Snounou G, et al. Molecular genotyping to distinguish between recrudescents and new infections in treatment trials of Plasmodium falciparum malaria conducted in Sub-Saharan Africa: adjustment of parasitological outcomes and assessment of genotyping effectiveness. Trop Med Int Health. 2006;11:1350–9. doi: 10.1111/j.1365-3156.2006.01688.x. [DOI] [PubMed] [Google Scholar]
Muhumuza S, Kitimbo G, Oryema-Lalobo M, Nuwaha F. Association between socio economic status and schistosomiasis infection in Jinja District, Uganda. Trop Med Int Health. 2009;14:612–9. doi: 10.1111/j.1365-3156.2009.02273.x. [DOI] [PubMed] [Google Scholar]
Naus CW, Booth M, Jones FM, et al. The relationship between age, sex, egg-count and specific antibody responses against Schistosoma mansoni antigens in a Ugandan fishing community. Trop Med Int Health. 2003;8:561–8. doi: 10.1046/j.1365-3156.2003.01056.x. [DOI] [PubMed] [Google Scholar]
Nebie I, Tiono AB, Diallo DA, et al. Do antibody responses to malaria vaccine candidates influenced by the level of malaria transmission protect from malaria? Trop Med Int Health. 2008;13:229–37. doi: 10.1111/j.1365-3156.2007.01994.x. [DOI] [PubMed] [Google Scholar]
N’Goran EK, Utzinger J, N’Guessan AN, et al. Reinfection with Schistosoma haematobium following school-based chemotherapy with praziquantel in four highly endemic villages in Cote d’Ivoire. Trop Med Int Health. 2001;6:817–25. doi: 10.1046/j.1365-3156.2001.00785.x. [DOI] [PubMed] [Google Scholar]
Nguansangiam S, Day NP, Hien TT, et al. A quantitative ultrastructural study of renal pathology in fatal Plasmodium falciparum malaria. Trop Med Int Health. 2007;12:1037–50. doi: 10.1111/j.1365-3156.2007.01881.x. [DOI] [PubMed] [Google Scholar]
Nguyen TH, Nguyen VD, Murrell D, Dalsgaard A. Occurrence and species distribution of fishborne zoonotic trematodes in wastewater-fed aquaculture in northern Vietnam. Trop Med Int Health. 2007;12(Suppl 2):66–72. doi: 10.1111/j.1365-3156.2007.01943.x. [DOI] [PubMed] [Google Scholar]
Njama-Meya D, Kamya MR, Dorsey G. Asymptomatic parasitaemia as a risk factor for symptomatic malaria in a cohort of Ugandan children. Trop Med Int Health. 2004;9:862–8. doi: 10.1111/j.1365-3156.2004.01277.x. [DOI] [PubMed] [Google Scholar]
Oladejo SO, Ofoezie IE. Unabated schistosomiasis transmission in Erinle River Dam, Osun State, Nigeria: evidence of neglect of environmental effects of development projects. Trop Med Int Health. 2006;11:843–50. doi: 10.1111/j.1365-3156.2006.01628.x. [DOI] [PubMed] [Google Scholar]
Ouedraogo A, Tiono AB, Diarra A, Nebie IO, Konate AT, Sirima SB. The effects of a pre-season treatment with effective antimalarials on subsequent malaria morbidity in under five-year-old children living in high and seasonal malaria transmission area of Burkina Faso. Trop Med Int Health. 2010;15:1315–21. doi: 10.1111/j.1365-3156.2010.02618.x. [DOI] [PubMed] [Google Scholar]
Owusu-Agyei S, Binka F, Koram K, et al. Does radical cure of asymptomatic Plasmodium falciparum place adults in endemic areas at increased risk of recurrent symptomatic malaria? Trop Med Int Health. 2002;7:599–603. doi: 10.1046/j.1365-3156.2002.00902.x. [DOI] [PubMed] [Google Scholar]
Owusu-Agyei S, Smith T, Beck HP, Amenga-Etego L, Felger I. Molecular epidemiology of Plasmodium falciparum infections among asymptomatic inhabitants of a holoendemic malarious area in northern Ghana. Trop Med Int Health. 2002;7:421–8. doi: 10.1046/j.1365-3156.2002.00881.x. [DOI] [PubMed] [Google Scholar]
Peyerl-Hoffmann G, Jelinek T, Kilian A, Kabagambe G, Metzger WG, von Sonnenburg F. Genetic diversity of Plasmodium falciparum and its relationship to parasite density in an area with different malaria endemicities in West Uganda. Trop Med Int Health. 2001;6:607–13. doi: 10.1046/j.1365-3156.2001.00761.x. [DOI] [PubMed] [Google Scholar]
Pfeiffer K, Some F, Muller O, et al. Clinical diagnosis of malaria and the risk of chloroquine self-medication in rural health centres in Burkina Faso. Trop Med Int Health. 2008;13:418–26. doi: 10.1111/j.1365-3156.2008.02017.x. [DOI] [PubMed] [Google Scholar]
Pischke S, Buttner DW, Liebau E, Fischer P. An internal control for the detection of Onchocerca volvulus DNA by PCR-ELISA and rapid detection of specific PCR products by DNA Detection Test Strips. Trop Med Int Health. 2002;7:526–31. doi: 10.1046/j.1365-3156.2002.00890.x. [DOI] [PubMed] [Google Scholar]
Polman K, Stelma FF, De Vlas SJ, et al. Dynamics of egg counts and circulating antigen levels in a recent Schistosoma mansoni focus in northern Senegal. Trop Med Int Health. 2001;6:538–44. doi: 10.1046/j.1365-3156.2001.00742.x. [DOI] [PubMed] [Google Scholar]
Rajendran R, Sunish IP, Mani TR, et al. Community-based study to assess the efficacy of DEC plus ALB against DEC alone on bancroftian filarial infection in endemic areas in Tamil Nadu, south India. Trop Med Int Health. 2006;11:851–61. doi: 10.1111/j.1365-3156.2006.01625.x. [DOI] [PubMed] [Google Scholar]
Ramaiah KD, Das PK, Vanamail P, Pani SP. The impact of six rounds of single-dose mass administration of diethylcarbamazine or ivermectin on the transmission of Wuchereria bancrofti by Culex quinquefasciatus and its implications for lymphatic filariasis elimination programmes. Trop Med Int Health. 2003;8:1082–92. doi: 10.1046/j.1360-2276.2003.01138.x. [DOI] [PubMed] [Google Scholar]
Ramaiah KD, Vanamail P, Pani SP, Yuvaraj J, Das PK. The effect of six rounds of single dose mass treatment with diethylcarbamazine or ivermectin on Wuchereria bancrofti infection and its implications for lymphatic filariasis elimination. Trop Med Int Health. 2002;7:767–74. doi: 10.1046/j.1365-3156.2002.00935.x. [DOI] [PubMed] [Google Scholar]
Ranjit MR, Das A, Chhotray GP, Das BP, Das BN, Acharya AS. The PfCRT (K76T) point mutation favours clone multiplicity and disease severity in Plasmodium falciparum infection. Trop Med Int Health. 2004;9:857–61. doi: 10.1111/j.1365-3156.2004.01286.x. [DOI] [PubMed] [Google Scholar]
Raso G, Utzinger J, Silue KD, et al. Disparities in parasitic infections, perceived ill health and access to health care among poorer and less poor schoolchildren of rural Cote d’Ivoire. Trop Med Int Health. 2005;10:42–57. doi: 10.1111/j.1365-3156.2004.01352.x. [DOI] [PubMed] [Google Scholar]
Saathoff E, Olsen A, Kvalsvig JD, Appleton CC, Sharp B, Kleinschmidt I. Ecological covariates of Ascaris lumbricoides infection in schoolchildren from rural KwaZulu-Natal, South Africa. Trop Med Int Health. 2005;10:412–22. doi: 10.1111/j.1365-3156.2005.01406.x. [DOI] [PubMed] [Google Scholar]
Saute F, Menendez C, Mayor A, et al. Malaria in pregnancy in rural Mozambique: the role of parity, submicroscopic and multiple Plasmodium falciparum infections. Trop Med Int Health. 2002;7:19–28. doi: 10.1046/j.1365-3156.2002.00831.x. [DOI] [PubMed] [Google Scholar]
Scott JT, Diakhate M, Vereecken K, et al. Human water contacts patterns in Schistosoma mansoni epidemic foci in northern Senegal change according to age, sex and place of residence, but are not related to intensity of infection. Trop Med Int Health. 2003;8:100–8. doi: 10.1046/j.1365-3156.2003.00993.x. [DOI] [PubMed] [Google Scholar]
Seto EY, Lee YJ, Liang S, Zhong B. Individual and village-level study of water contact patterns and Schistosoma japonicum infection in mountainous rural China. Trop Med Int Health. 2007;12:1199–209. doi: 10.1111/j.1365-3156.2007.01903.x. [DOI] [PubMed] [Google Scholar]
Shekalaghe SA, Bousema JT, Kunei KK, et al. Submicroscopic Plasmodium falciparum gametocyte carriage is common in an area of low and seasonal transmission in Tanzania. Trop Med Int Health. 2007;12:547–53. doi: 10.1111/j.1365-3156.2007.01821.x. [DOI] [PubMed] [Google Scholar]
Sowunmi A, Fateye BA. Plasmodium falciparum gametocytaemia in Nigerian children: before, during and after treatment with antimalarial drugs. Trop Med Int Health. 2003;8:783–92. doi: 10.1046/j.1365-3156.2003.01093.x. [DOI] [PubMed] [Google Scholar]
Sunish IP, Rajendran R, Mani TR, et al. Transmission intensity index to monitor filariasis infection pressure in vectors for the evaluation of filariasis elimination programmes. Trop Med Int Health. 2003;8:812–9. doi: 10.1046/j.1365-3156.2003.01109.x. [DOI] [PubMed] [Google Scholar]
Sunish IP, Rajendran R, Mani TR, et al. Resurgence in filarial transmission after withdrawal of mass drug administration and the relationship between antigenaemia and microfilaraemia--a longitudinal study. Trop Med Int Health. 2002;7:59–69. doi: 10.1046/j.1365-3156.2002.00828.x. [DOI] [PubMed] [Google Scholar]
Supali T, Ismid IS, Ruckert P, Fischer P. Treatment of Brugia timori and Wuchereria bancrofti infections in Indonesia using DEC or a combination of DEC and albendazole: adverse reactions and short-term effects on microfilariae. Trop Med Int Health. 2002;7:894–901. doi: 10.1046/j.1365-3156.2002.00921.x. [DOI] [PubMed] [Google Scholar]
Tchuem Tchuente LA, Behnke JM, Gilbert FS, Southgate VR, Vercruysse J. Polyparasitism with Schistosoma haematobium and soil-transmitted helminth infections among school children in Loum, Cameroon. Trop Med Int Health. 2003;8:975–86. doi: 10.1046/j.1360-2276.2003.01120.x. [DOI] [PubMed] [Google Scholar]
Traub RJ, Robertson ID, Irwin P, Mencke N, Andrew Thompson RC. The prevalence, intensities and risk factors associated with geohelminth infection in tea-growing communities of Assam, India. Trop Med Int Health. 2004;9:688–701. doi: 10.1111/j.1365-3156.2004.01252.x. [DOI] [PubMed] [Google Scholar]
Van den Bossche P, Ky-Zerbo A, Brandt J, Marcotty T, Geerts S, De Deken R. Transmissibility of Trypanosoma brucei during its development in cattle. Trop Med Int Health. 2005;10:833–9. doi: 10.1111/j.1365-3156.2005.01467.x. [DOI] [PubMed] [Google Scholar]
Vennervald BJ, Kenty L, Butterworth AE, et al. Detailed clinical and ultrasound examination of children and adolescents in a Schistosoma mansoni endemic area in Kenya: hepatosplenic disease in the absence of portal fibrosis. Trop Med Int Health. 2004;9:461–70. doi: 10.1111/j.1365-3156.2004.01215.x. [DOI] [PubMed] [Google Scholar]
Vereecken K, Naus CW, Polman K, et al. Associations between specific antibody responses and resistance to reinfection in a Senegalese population recently exposed to Schistosoma mansoni. Trop Med Int Health. 2007;12:431–44. doi: 10.1111/j.1365-3156.2006.01805.x. [DOI] [PubMed] [Google Scholar]
Villamor E, Fataki MR, Mbise RL, Fawzi WW. Malaria parasitaemia in relation to HIV status and vitamin A supplementation among pre-school children. Trop Med Int Health. 2003;8:1051–61. doi: 10.1046/j.1360-2276.2003.01134.x. [DOI] [PubMed] [Google Scholar]
Vuong TA, Nguyen TT, Klank LT, Phung DC, Dalsgaard A. Faecal and protozoan parasite contamination of water spinach (Ipomoea aquatica) cultivated in urban wastewater in Phnom Penh, Cambodia. Trop Med Int Health. 2007;12(Suppl 2):73–81. doi: 10.1111/j.1365-3156.2007.01944.x. [DOI] [PubMed] [Google Scholar]
Wanji S, Tendongfor N, Esum ME, Enyong P. Chrysops silacea biting densities and transmission potential in an endemic area of human loiasis in south-west Cameroon. Trop Med Int Health. 2002;7:371–7. doi: 10.1046/j.1365-3156.2002.00845.x. [DOI] [PubMed] [Google Scholar]
Wilson S, Vennervald BJ, Kadzo H, et al. Hepatosplenomegaly in Kenyan schoolchildren: exacerbation by concurrent chronic exposure to malaria and Schistosoma mansoni infection. Trop Med Int Health. 2007;12:1442–9. doi: 10.1111/j.1365-3156.2007.01950.x. [DOI] [PubMed] [Google Scholar]
Yanze MF, Duru C, Jacob M, Bastide JM, Lankeuh M. Rapid therapeutic response onset of a new pharmaceutical form of chloroquine phosphate 300 mg: effervescent tablets. Trop Med Int Health. 2001;6:196–201. doi: 10.1046/j.1365-3156.2001.00681.x. [DOI] [PubMed] [Google Scholar]
Ziem JB, Magnussen P, Olsen A, Horton J, Asigri VL, Polderman AM. Impact of repeated mass treatment on human Oesophagostomum and hookworm infections in northern Ghana. Trop Med Int Health. 2006;11:1764–72. doi: 10.1111/j.1365-3156.2006.01729.x. [DOI] [PubMed] [Google Scholar]

[R1] Aban IB, Cutter GR, Mavinga N. Inferences and Power Analysis Concerning Two Negative Binomial Distributions with An Application to MRI Lesion Counts Data. Comput Stat Data Anal. 2008;53:820–833. doi: 10.1016/j.csda.2008.07.034. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] Agresti A. Modelling ordered categorical data: recent advances and future challenges. Statistics in Medicine. 1999;18:2191–2208. doi: 10.1002/(sici)1097-0258(19990915/30)18:17/18<2191::aid-sim249>3.0.co;2-m. [DOI] [PubMed] [Google Scholar]

[R3] Alexander N, Bethony J, Corrêa-Oliveira R, Rodrigues LC, Hotez P, Brooker S. Repeatability of paired counts. Statistics in Medicine. 2007;26:3566–3577. doi: 10.1002/sim.2724. [DOI] [PubMed] [Google Scholar]

[R4] Alexander N, Cundill B, Sabatelli L, et al. Selection and quantification of infection endpoints for trials of vaccines against intestinal helminths. Vaccine. 2011;29:3686–94. doi: 10.1016/j.vaccine.2011.03.026. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] Alexander ND, Solomon AW, Holland MJ, et al. An index of community ocular Chlamydia trachomatis load for control of trachoma. Trans R Soc Trop Med Hyg. 2005;99:175–7. doi: 10.1016/j.trstmh.2004.05.003. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] Anderson RM, May RM. Infectious Diseases of Humans: Dynamics and Control. 1st edn Oxford University Press; Oxford: 1991. [Google Scholar]

[R7] Anscombe FJ. The transformation of Poisson, binomial and negative-binomial data. Biometrika. 1948;35:246–254. [Google Scholar]

[R8] Barber JA, Thompson SG. Analysis of cost data in randomized trials: an application of the non-parametric bootstrap. Statistics in Medicine. 2000;19:3219–3236. doi: 10.1002/1097-0258(20001215)19:23<3219::aid-sim623>3.0.co;2-p. [DOI] [PubMed] [Google Scholar]

[R9] Barker L, Cadwell BL. An analysis of eight 95 per cent confidence intervals for a ratio of Poisson parameters when events are rare. Statistics in Medicine. 2008;27:4030–4037. doi: 10.1002/sim.3234. [DOI] [PubMed] [Google Scholar]

[R10] Bauer DF. Constructing confidence sets using rank statistics. Journal of the American Statistical Association. 1972;67:687–690. [Google Scholar]

[R11] Beall G. The transformation of data from entomological field experiments so that the analysis of variance becomes applicable. Biometrika. 1942;32:243–262. [Google Scholar]

[R12] Bland M. An Introduction to Medical Statistics. 2nd edn Oxford University Press; Oxford: 1995. [Google Scholar]

[R13] Bockarie M, Kazura J, Alexander N, et al. Transmission dynamics of Wuchereria bancrofti in East Sepik Province, Papua New Guinea. American Journal of Tropical Medicine and Hygiene. 1996;54:577–581. doi: 10.4269/ajtmh.1996.54.577. [DOI] [PubMed] [Google Scholar]

[R14] Boneau CA. The effects of violations of assumptions underlying the t test. Psychological Bulletin. 1960;57:49–64. doi: 10.1037/h0041412. [DOI] [PubMed] [Google Scholar]

[R15] Bonferroni CE. Sulle medie multiple di potenze [On multiple algebraic means] Bollettino dell’Unione Matematica Italiana, serie 3. 1950;5:267–270. [Google Scholar]

[R16] Borowski EJ, Borwein JM. Dictionary of Mathematics. 1st edn HarperCollins Publishers; London: 1989. [Google Scholar]

[R17] Burton MJ, Holland MJ, Faal N, et al. Which members of a community need antibiotics to control trachoma? Conjunctival Chlamydia trachomatis infection load in Gambian villages. Investigative Ophthalmology and Visual Science. 2003;44:4215–4222. doi: 10.1167/iovs.03-0107. [DOI] [PubMed] [Google Scholar]

[R18] Bushman BJ, Wang MC. A procedure for combining sample correlation coefficients and vote counts to obtain an estimate and a confidence interval for the population correlation coefficient. Psychological Bulletin. 1995;117:530–546. [Google Scholar]

[R19] Conover WJ. Practical Nonparametric Statistics. 2nd edn John Wiley & Sons; New York: 1980. [Google Scholar]

[R20] Coxe S, West SG, Aiken LS. The analysis of count data: a gentle introduction to Poisson regression and its alternatives. J Pers Assess. 2009;91:121–36. doi: 10.1080/00223890802634175. [DOI] [PubMed] [Google Scholar]

[R21] Cundill B, Alexander N, Bethony J, Diemert D, Pullan RL, Brooker S. Rates and intensity of re-infection with human helminths after treatment and the influence of individual, household, and environmental factors in a Brazilian community. Parasitology. 2011;138:1406–16. doi: 10.1017/S0031182011001132. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R22] Dobbie MJ. Modelling correlated zero-inflated count data. Australian National University; Canberra: 2001. [Google Scholar]

[R23] Dobson RJ, Sangster NC, Besier RB, Woodgate RG. Geometric means provide a biased efficacy result when conducting a faecal egg count reduction test (FECRT) Vet Parasitol. 2009;161:162–7. doi: 10.1016/j.vetpar.2008.12.007. [DOI] [PubMed] [Google Scholar]

[R24] Dunyo S, Ord R, Hallett R, et al. Randomised trial of chloroquine/sulphadoxine-pyrimethamine in Gambian children with malaria: impact against multidrug-resistant P. falciparum. PLoS Clin Trials. 2006;1:e14. doi: 10.1371/journal.pctr.0010014. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] Efron B, Tibshirani R. An Introduction to the Bootstrap. 1st edn Chapman and Hall; New York: 1993. [Google Scholar]

[R26] Elliott JM. Some Methods for the Statistical Analysis of Samples of Benthic Invertebrates. 2nd edn Freshwater Biological Association; Ambleside: 1977. [Google Scholar]

[R27] Everitt B. Cambridge Dictionary of Statistics in the Medical Sciences. Cambridge University Press; Cambridge: 1995. [Google Scholar]

[R28] Fulford AJC. Dispersion and bias: can we trust geometric means? Parasitology Today. 1994;10:446–448. doi: 10.1016/0169-4758(94)90181-3. [DOI] [PubMed] [Google Scholar]

[R29] Gardner W, Mulvey EP, Shaw EC. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin. 1995;118:392–404. doi: 10.1037/0033-2909.118.3.392. [DOI] [PubMed] [Google Scholar]

[R30] Grenfell BT, Das PK, Rajagopalan PK, Bundy DAP. Frequency distribution of lymphatic filariasis microfilariae in human populations: population processes and statistical estimation. Parasitology. 1990;101:417–427. doi: 10.1017/s0031182000060613. [DOI] [PubMed] [Google Scholar]

[R31] Hallstrom AP. A modified Wilcoxon test for non-negative distributions with a clump of zeros. Stat Med. 2010;29:391–400. doi: 10.1002/sim.3785. [DOI] [PubMed] [Google Scholar]

[R32] Hand DJ. Deconstructing statistical questions (with discussion) Journal of the Royal Statistical Society Series A (Statistics in Society) 1994;157:317–356. [Google Scholar]

[R33] Hart A. Mann-Whitney test is not just a test of medians: differences in spread can be important. BMJ. 2001;323:391–3. doi: 10.1136/bmj.323.7309.391. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R34] Heeren T, d’Agostino R. Robustness of the two-independent samples t-test when applied to ordinal scale data. Statistics in Medicine. 1987;6:79–90. doi: 10.1002/sim.4780060110. [DOI] [PubMed] [Google Scholar]

[R35] Hilbe JM. Negative Binomial Regression. 1st edn Cambridge University Press; Cambridge: 2007. [Google Scholar]

[R36] Hoshino N. Engen’s extended negative binomial model revisited. Annals of the Institute of Statistical Mathematics. 2005;57:369–387. [Google Scholar]

[R37] Kirkwood BR, Sterne JAC. Essentials of Medical Statistics. 2nd edn Blackwell Scientific Publications; Oxford: 2003. [Google Scholar]

[R38] Lachenbruch PA. Analysis of data with excess zeros. Statistical Methods in Medical Research. 2002;11:297–302. doi: 10.1191/0962280202sm289ra. [DOI] [PubMed] [Google Scholar]

[R39] Laubscher NF. On stabilizing the binomial and negative binomial variance. Journal of the American Statistical Association. 1961;56:143–150. [Google Scholar]

[R40] Marshall TF, Anderson J, Fuglsang H. The incidence of eye lesions and visual impairment in onchocerciasis in relationship to the intensity of infection. Transactions of the Royal Society of Tropical Medicine and Hygiene. 1986;80:426–434. doi: 10.1016/0035-9203(86)90333-0. [DOI] [PubMed] [Google Scholar]

[R41] Massé JC, Theodorescu R. Neyman type A distribution revisited. Statistica Neerlandica. 2005;59:206–213. [Google Scholar]

[R42] McCullagh P, Nelder JA. Generalized Linear Models. 1st edn Chapman and Hall; London: 1983. [Google Scholar]

[R43] McElduff F, Cortina-Borja M, Chan S-K, Wade A. When t-tests or Wilcoxon-Mann-Whitney tests won’t do. Advances in Physiology Education. 2010;34:128–133. doi: 10.1152/advan.00017.2010. [DOI] [PubMed] [Google Scholar]

[R44] Montresor A. Arithmetic or geometric means of eggs per gram are not appropriate indicators to estimate the impact of control measures in helminth infections. Trans R Soc Trop Med Hyg. 2007;101:773–6. doi: 10.1016/j.trstmh.2007.04.008. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R45] Moulton LH, Curriero FC, Barroso PF. Mixture models for quantitative HIV RNA data. Statistical Methods in Medical Research. 2002;11:317–325. doi: 10.1191/0962280202sm292ra. [DOI] [PubMed] [Google Scholar]

[R46] Mumpower JL, McClelland G. Measurement error, skewness, and risk analysis: coping with the long tail of the distribution. Risk Anal. 2002;22:277–90. doi: 10.1111/0272-4332.00027. [DOI] [PubMed] [Google Scholar]

[R47] Mwangi TW, Fegan G, Williams TN, Kinyanjui SM, Snow RW, Marsh K. Evidence for over-dispersion in the distribution of clinical malaria episodes in children. PLoS ONE. 2008;3:e2196. doi: 10.1371/journal.pone.0002196. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R48] Noe DA, Bailer AJ, Noble RB. Comparing methods for analyzing overdispersed count data in aquatic toxicology. Environ Toxicol Chem. 2010;29:212–9. doi: 10.1002/etc.2. [DOI] [PubMed] [Google Scholar]

[R49] O’Hara RB, Kotze DJ. Do not log-transform count data. Methods in Ecology & Evolution. 2010;1:118–122. [Google Scholar]

[R50] Olayinka OS, Abdullahi SA. An overview of industrial employees’ exposure to noise in sundry processing and manufacturing industries in Ilorin metropolis, Nigeria. Ind Health. 2009;47:123–33. doi: 10.2486/indhealth.47.123. [DOI] [PubMed] [Google Scholar]

[R51] Remme J, Ba O, Dadzie KY, Karam M. A force-of-infection model for onchocerciasis and its applications in the epidemiological evaluation of the Onchocerciasis Control Programme in the Volta River basin area. Bulletin of the World Health Organization. 1986;64:667–681. [PMC free article] [PubMed] [Google Scholar]

[R52] Seto EY, Lee YJ, Liang S, Zhong B. Individual and village-level study of water contact patterns and Schistosoma japonicum infection in mountainous rural China. Trop Med Int Health. 2007;12:1199–209. doi: 10.1111/j.1365-3156.2007.01903.x. [DOI] [PubMed] [Google Scholar]

[R53] Shmueli G, Minka T, Kadane JB, Borle S, Boatwright PB. A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Journal of the Royal Statistical Society: Series C (Applied Statistics) 2005;54:127–142. [Google Scholar]

[R54] Shoukri MM, Asyali MH, VanDorp R, Kelton D. The Poisson inverse Gaussian regression model in the analysis of clustered counts data. Journal of Data Science. 2004;2:17–32. [Google Scholar]

[R55] Sileshi G. Selecting the right statistical model for analysis of insect count data by using information theoretic measures. Bull Entomol Res. 2006;96:479–88. [PubMed] [Google Scholar]

[R56] Smith T, Armstrong Schellenberg J, Hayes R. Attributable fraction estimates and case definitions for malaria in endemic areas. Statistics in Medicine. 1994;13:2345–2358. doi: 10.1002/sim.4780132206. [DOI] [PubMed] [Google Scholar]

[R57] Stonehouse JM, Forrester GJ. Robustness of the t and U tests under combined assumption violations. Journal of Applied Statistics. 1998;25:63–74. [Google Scholar]

[R58] Walker M, Hall A, Anderson RM, Basanez MG. Density-dependent effects on the weight of female Ascaris lumbricoides infections of humans and its impact on patterns of egg production. Parasit Vectors. 2009;2:11. doi: 10.1186/1756-3305-2-11. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R59] Warrell DA, Gilles HM, editors. Essential Malariology. Arnold; London: 2002. [Google Scholar]

[R60] White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817–838. [Google Scholar]

[R61] Williams CB. The use of logarithms in the interpretation of certain entomological problems. Annals of Applied Biology. 1937;24:404–414. [Google Scholar]

[R62] Wilson K, Grenfell BT. Generalized linear modelling for parasitologists. Parasitology Today. 1997;13:33–38. doi: 10.1016/s0169-4758(96)40009-6. [DOI] [PubMed] [Google Scholar]

[R63] Wojtczak M, Viemeister NF. Perception of suprathreshold amplitude modulation and intensity increments: Weber’s law revisited. J Acoust Soc Am. 2008;123:2220–36. doi: 10.1121/1.2839889. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R64] Yu K, Lu Z, Stander J. Quantile regression: applications and current research areas. Statistician. 2003;52:331–350. [Google Scholar]
