In ophthalmic research we have a problem. The collection of data for a project usually involves examining and recording observations on eyes. People generally have two eyes and we often examine both. The problem then comes in the analysis.
Do we include all our data and talk about eyes?
Do we look at individuals?
How should we deal with our data?
Does it really matter?
Yes, it does matter!
There are a variety of ways people analyse their data. The method employed should depend on the question being asked, the data collected and the nature of the condition being studied.
The Question Being Asked
Does the question relate to events or observations purely at an ocular level? Examples are trauma, or the effect of corneal opacity on the ability to diagnose cataract.
Does the question also include events or observations which relate to the individual? For example, diet, systemic disease (diabetes, hypertension, malaria) or social factors. These examples are obvious. A less obvious example of something that relates to the individual is the response of an optic disc to a given level of intraocular pressure. This may be affected by the connective tissue make-up and vascular system, both of which relate to individuals.
If the question is purely at an ocular level then there is no problem. Analyse your data using eyes.
If the question includes events or observations which relate to the individual then the method of analysis depends on the nature of the condition being studied.
The Data Collected
If information on only one eye per person has been collected then there is no problem; analyse your data using the eyes (which also represent individuals). If information on both eyes has been collected on everyone in the study, then you need to consider the nature of the condition being studied before analysis.
A potentially serious problem arises when information has been collected on one eye in some people and on both eyes in others. In this situation it is generally safer to analyse the data of only one eye per person.
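As a minimal sketch of reducing a mixed dataset to one eye per person, the Python below keeps a single record for each individual. The record layout (person, eye, value), the example readings and the choice to pick the eye at random are all assumptions made for illustration; the text above does not say how the eye should be chosen.

```python
import random

def one_eye_per_person(records, seed=0):
    """Return exactly one (person_id, eye, value) record per person.

    `records` is a list of (person_id, eye, value) tuples, with `eye`
    given as "R" or "L". When both eyes were examined, one of them is
    picked at random (an illustrative assumption, not a rule from the text).
    """
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    by_person = {}
    for person_id, eye, value in records:
        by_person.setdefault(person_id, []).append((eye, value))

    selected = []
    for person_id, eyes in by_person.items():
        eye, value = rng.choice(eyes)  # only one option if one eye was examined
        selected.append((person_id, eye, value))
    return selected

# Hypothetical readings: persons A and C had both eyes examined, B only one.
records = [
    ("A", "R", 16), ("A", "L", 17),
    ("B", "R", 21),
    ("C", "R", 14), ("C", "L", 15),
]
sample = one_eye_per_person(records)
print(len(sample))  # one record for each of the three people
```

Choosing the eye by a fixed rule decided in advance (for example, always the right eye) would serve the same purpose; the point is that each person contributes a single observation.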
The Nature of the Condition Being Studied
Some cases are obvious. If your study concerns visual disability then clearly the results from both eyes are needed to show how disabled the individual is and you analyse your data at the level of the individual. The same is true for squint.
The condition you are studying may hardly ever affect both eyes in an individual. An example of this is choroidal melanoma, which occurs in only one eye in 99% of cases. Other examples are corneal herpes simplex infection in the immunocompetent, and severe ocular trauma (unilateral in 98% of cases). In these cases it is appropriate to analyse at the level of the individual.
At the other extreme are conditions such as blepharitis, which almost always affects both eyes (proportion bilateral 95%). This means that whatever you find in the right eye is almost bound to be the same as in the left eye (near-perfect correlation). The result is that there is little point in collecting data on both eyes. Why not save effort and just use one eye per individual? Certainly that is the way you should analyse your data!
The majority of ocular conditions lie between these two extremes.
If you know the intraocular pressure (IOP) in the right eye of patient A then you can make an educated guess at the intraocular pressure in the left eye of patient A. You may not be correct because the IOP is not perfectly correlated between eyes, but you have a reasonable chance of being correct. There is more chance of being correct than if you take the IOP in patient A's right eye and try to predict the IOP in the left eye of patient B!
Routine statistical analyses rely on all data points being independent of each other. This means that knowing the first data point tells you nothing about the second. As the IOP example above shows, this does not hold for two eyes of the same person. Patient A's right eye and patient B's left eye are independent. Patient A's right eye and patient A's left eye are not independent.
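The difference between eyes of the same person and eyes of different people can be illustrated with a small simulation. This is only a sketch: the IOP values are invented, and the assumed model (a person-level pressure shared by both eyes, plus independent eye-level variation, with the means and spreads shown) is chosen purely for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)  # fixed seed so the run is reproducible
n_people = 2000

right, left = [], []
for _ in range(n_people):
    # Assumed model: each person has an underlying IOP (mmHg) shared by
    # both eyes, and each eye adds its own independent variation.
    person_level = rng.gauss(16.0, 3.0)
    right.append(person_level + rng.gauss(0.0, 1.5))
    left.append(person_level + rng.gauss(0.0, 1.5))

# Right and left eye of the same person.
same_person = pearson(right, left)

# Right eye of one person paired with the left eye of the next person.
other_person = pearson(right, left[1:] + left[:1])

print("same person:     ", round(same_person, 2))
print("different people:", round(other_person, 2))
```

Under these assumptions the same-person correlation comes out high and the different-person correlation close to zero, which is the pattern described above.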
Clearly a simple answer is to use the data of only one eye per person. This is sound and safe statistically but in many instances wastes data which may be important. The analysis of data is often aimed at estimates of effect or descriptions of distributions. These are expressed as figures with confidence intervals. The ideal would be to include the whole population; the figure obtained would then not be an estimate but the real value. Studies are done, however, on samples of populations. The bigger the sample, the more precise the estimate of effect and the tighter (narrower) the confidence intervals.
Suppose both eyes of 20 people have been examined. Forty eyes represent a bigger sample size than 20 people! To use only 20 eyes in the analysis is a waste. To use all 40 eyes as if they were independent may give a falsely high degree of precision. Special techniques exist to make use of all the data that have been collected in these instances. In this example, these techniques give an effective sample size somewhere between 20 and 40. The more correlated the results are between right and left eyes, the nearer the sample size gets to 20. The less correlated the results are between right and left eyes, the nearer the sample size gets to 40.
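One common way to put a number on this is the design-effect approximation used for clustered data, in which the effective sample size is the number of eyes divided by 1 + (m - 1) × rho, where m is the number of eyes per person and rho is the correlation between the two eyes of the same person. The sketch below applies it to the 20-person example. This formula is an assumption introduced here for illustration; it is not necessarily the technique the text has in mind, and the function name is hypothetical.

```python
def effective_sample_size(n_people, rho, eyes_per_person=2):
    """Approximate effective sample size when each person contributes
    `eyes_per_person` correlated observations.

    `rho` is the correlation between eyes of the same person (0 to 1).
    Uses the design-effect approximation 1 + (m - 1) * rho, which is an
    illustrative assumption rather than a method named in the text.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    design_effect = 1 + (eyes_per_person - 1) * rho
    return n_people * eyes_per_person / design_effect

for rho in (0.0, 0.5, 1.0):
    print(rho, round(effective_sample_size(20, rho), 1))
# 0.0 40.0   (uncorrelated eyes: all 40 eyes count)
# 0.5 26.7
# 1.0 20.0   (perfectly correlated eyes: only 20 people count)
```

The two extremes reproduce the limits described above: uncorrelated eyes behave like 40 independent observations, and perfectly correlated eyes add nothing beyond the 20 people.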
We recommend discussion with a statistician to help in both research planning and analysis of data.