The quality of web-based information on the treatment of depression

 K.M. Griffiths & H. Christensen.

THE AUSTRALIAN NATIONAL UNIVERSITY
Centre for Mental Health Research
CANBERRA ACT 0200
AUSTRALIA

TELEPHONE: +612 6249 2741
FACSIMILE: +612 6249 0733

EMAIL: CMHR@anu.edu.au
WEB: www.anu.edu.au/cmhr


LIST OF ALTERATIONS TO PAPER

A. EDITORIAL COMMITTEE. (Carnall, Booth & Williams)

1. Description of search

The following amendments were made:

(a) 1st sentence, final para, p.1, of the introduction was amended to read:

The current study aimed to survey web sites which a ‘typical’ user might access when searching for information on depression.

(b) The 1st para of the Selection of sites subsection of the Methods section now contains a more detailed description of how the search was conducted and our confidence in the methodology.

Two search engines, DirectHit (www.directhit.com) and MetaCrawler (www.go2net.com/search.html), were used to identify potential sites for the survey.
DirectHit returns 10 ‘popular’ sites for a query based on analyses of previous user activity (primarily the frequency of ‘clickthroughs’ from a result list).
MetaCrawler integrates the results for a query from a number of well-known web search engines, including AltaVista, Excite, Infoseek, Lycos, WebCrawler, Yahoo, LookSmart, Thunderstone and Mining Co.
Sites not relevant to depression, sites no longer active and one site concerned solely with Seasonal Affective Disorder were excluded. All other sites identified by DirectHit (n=9) and the highest ranked sites from the MetaCrawler search (n=11) were selected for analysis. A standalone book imported from a third-party website was rated separately.
The usefulness of DirectHit and MetaCrawler in generating popular sites has not been the subject of formal independent evaluation. However, in the absence of any other suitable search engine tools, the list of sites yielded by the above search methodology provided the best available approximation to a list of the depression sites most commonly encountered by the ‘typical’ user.

DirectHit claims to identify the top 10 popular sites for a topic by analysing the results from the activity of millions of users. However, as we now acknowledge, there is no independent, authoritative evidence concerning the usefulness of DirectHit, MetaCrawler (or any other engine) in producing lists of the most popular sites for a search. This is symptomatic of a more general lack of independent high quality evaluation of search engine characteristics (e.g., there is very little evidence concerning the effectiveness with which search engines identify relevant documents; see Gordon & Pathak, 1999; Hawking, Craswell, Bailey & Griffiths, 2000).

2. The status of AHCPR

A more detailed description of the AHCPR guidelines has been inserted in the Methods section at the end of the Guideline score subsection (new p. 6), as follows:

The AHCPR depression guidelines are one of a set of widely disseminated US Federal guidelines developed according to the principles outlined in the US Institute of Medicine’s guidelines for developing evidence-based clinical practice guidelines.[13a] The guidelines were developed by a multidisciplinary panel from systematic reviews of the scientific evidence, and underwent an extensive scientific and field review process. Meta-analyses of randomised controlled trials used modified intent-to-treat analyses. All panel members, a methodologist, 28 scientific reviewers and 73 organizations were involved in the development process.

We believe it is now clearer that the guidelines are evidence-based. To our knowledge, no studies have systematically investigated whether the guidelines are ‘widely accepted’ but we have now indicated that the guidelines have been disseminated widely throughout the United States and have undergone an extensive consultation process prior to release.

3. Generalisability

The issues raised in this point have been addressed as follows:

(a) Sentence 2, Discussion and Conclusions. This sentence has been modified so that it now refers only to depression information; i.e., ‘There is a need to improve the accuracy and coverage of web-information ...’ has been changed to ‘There is a need to improve the accuracy and coverage of web-based depression information ...’

(b) Methods, Quality of content, Guideline score, after sentence 1. An additional sentence has been inserted to make it clear that each item on the rating scale corresponded to a guideline statement from the practice guidelines. (A statement later in the paragraph indicates that the guideline score was computed for each site by summing the number of items on the scale for which the site information was concordant with the guidelines.)

(c) The latter half of the second paragraph in the Discussion and Conclusions. This section has been altered and a new 3rd paragraph inserted to address the issue of generalisability and ease of conversion of guidelines into a set of evaluation criteria. The second and third paragraphs now read:

The current findings raise questions about the usefulness of specific Silberg et al accountability criteria as indicators of quality and suggest that further investigation of indicators of quality is warranted. Particular site characteristics (such as ownership by an organisation or the existence of a professional editorial board) are likely to prove more useful indicators of content quality than disclosure of information per se. The results of the current study also suggest that the number of different types of interventions mentioned may be a predictor of site quality, as may the citation of scientific evidence in support of treatment recommendations. The critical question is whether attributes that are indicators of quality depression sites are valid indicators of the quality of other types of health-related sites. The current methodology could be applied to address this question and to identify those attributes which are common predictors of quality across different subject areas. The methodology lends itself to replication in different subject areas since any systematically produced set of guideline statements can serve directly as a set of rating scale items when evaluating concordance between web site and guideline information.
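The guideline-score computation described in 3(b) can be sketched as follows. This is a minimal, hypothetical illustration; the item names and ratings below are made up for the example and are not taken from the paper or the AHCPR guidelines.

```python
# Hypothetical rating-scale items: each item corresponds to one guideline
# statement, marked True where the site's information is concordant with it.
site_ratings = {
    "antidepressants_effective_for_major_depression": True,
    "psychological_treatment_effective": True,
    "continuation_treatment_recommended": False,
    "ect_indicated_for_severe_depression": True,
}

# The guideline score for a site is the count of concordant items.
guideline_score = sum(site_ratings.values())
```

Cumulating concordant items in this way gives each site a single score that can be compared across sites and related to site characteristics.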

4. The changes made in response to point 3, also address the issue raised in point 4. In addition, the following sentence was added at the end of the Abstract:

The study presents a potentially valuable methodology for exploring these issues in a diverse range of health fields.

5. Organisation

The headings Site characteristics, Quality of content, and Accountability now appear in the same order in the Abstract, Methods and Results sections. (In particular, the organisation of the Abstract has been changed, the ‘Sources of help’ heading has been deleted in the Results section, and consistent headings are used throughout.)

B. STATISTICAL REVIEW (Campbell)

1. Confidence intervals.

The method for calculating confidence intervals is now described and the level of confidence interval reported.

(a) Non-parametric confidence intervals for differences were computed for those main findings where Mann-Whitney procedures had been applied to the data, using the procedure outlined by Gardner & Altman (p. 74) for the two-sample, unpaired case, and are reported in the text of the Results.

(b) The confidence intervals associated with single samples were removed and, where appropriate, ranges were added (consistent with the suggestion addressed under ‘Presentation of data in Table 3’ below).
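The procedure in (a) can be illustrated with a short sketch. The code below is our own minimal implementation of the standard non-parametric confidence interval for the difference between two unpaired samples (the ordered pairwise-differences method, with the rank chosen from the normal approximation to the Mann-Whitney distribution, as described in Gardner & Altman); the function name and sample data are illustrative, not taken from the study.

```python
import math
from itertools import product

def mann_whitney_ci(a, b, z=1.96):
    """Non-parametric 95% CI for the difference (a - b) between two
    unpaired samples, based on the ordered pairwise differences."""
    n1, n2 = len(a), len(b)
    diffs = sorted(x - y for x, y in product(a, b))  # all n1 * n2 differences
    # Rank K from the normal approximation to the Mann-Whitney distribution
    k = math.ceil(n1 * n2 / 2 - z * math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12))
    k = max(k, 1)
    return diffs[k - 1], diffs[n1 * n2 - k]  # (lower, upper) limits

# Illustrative site scores for two groups (made-up data, not from the study)
group_a = [14, 17, 12, 19, 15, 16]
group_b = [10, 13, 9, 12, 11, 14]
low, high = mann_whitney_ci(group_a, group_b)  # -> (1, 7)
```

The interval endpoints are themselves observed pairwise differences, which is why the method pairs naturally with the Mann-Whitney test already used in the analysis.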

2. Agreement between judges

p. 5, Global score, sentence 3.

(a) The term ‘significant’ has been replaced with the phrase ‘moderately high’.

(b) The mean difference in ratings, and the associated standard deviation of the difference were included in the sentence.

3. Emphasis on p-values

The emphasis on p-values has been decreased. Differences in scores and confidence intervals are now reported in the Results section for data analysed using the Mann-Whitney procedures. Percentages for each group are quoted for results based on nominal data. Test statistics are also reported.

4. Ownership type/structure

The following definition of Organisation has now been included in the footnote of Table 3.

aCommercial, consumer or other organised group

This should clarify the difference between ownership type and ownership structure when read in conjunction with the footnotes relating to ownership type.

5. Table 3. Single judge

Reference to the p-value for the single judge has been omitted from Table 3 and the text.

6. Presentation of data in Table 3.

As suggested the following changes have been made:

(a) The standard deviations have now been omitted and the overall mean and range for all sites have been included in the final row.

(b) Difference in scores and 95% confidence intervals have been calculated and quoted in text for important results (as noted in 1 and 3 above).

C. ARTICLE REVIEW (Haynes).

1. Methods of assessing popularity.

This matter is addressed as noted above in A. EDITORIAL COMMITTEE, Item 1.

2. Site details to enable replication

A more detailed description of the method used for site selection has now been included (see A. Editorial Committee, Item 1). This should permit the study to be replicated by other researchers. Individual web sites typically change (and even disappear) over time so that precisely replicating a survey of sites across time in the absence of archival information is not possible. However, we will supply URLs of individual sites to interested researchers if they contact us.

3. Statement re dietary supplements/herbs

The statement has now been clarified as follows:

The clause ‘the percentage of sites reporting side effects of herbal or dietary supplements was consistent with the overall level of reporting of these types of treatment ....’ has been replaced with ‘and most of the sites which mentioned herbal or dietary supplements included some discussion of the side effects of supplements or herbs.’

References:

Gordon M & Pathak P. Finding information on the world wide web: The retrieval effectiveness of search engines. Information Processing and Management 1999;35:141-180.

Hawking D, Craswell N, Bailey P & Griffiths K. Measuring the quality of public search engines. In Proceedings of the Search Engines Meeting, 10th April, Boston, 2000. http://www.infonortics.com/