Hannah Brown investigates the use and abuse of journal rankings
George Lundberg spent the early 1980s lamenting the loss of his journal's once great reputation. JAMA (the Journal of the American Medical Association), which he had taken over in 1982, had been in decline since its peak of popularity in the 1960s. And a new set of rankings that pitted medical journals against each other on the basis of article citations now seemed to confirm that JAMA was a long way behind the best. To make his editorship successful, Dr Lundberg needed a recovery strategy.
So, while other medical journals continued to dismiss as an irrelevance their citation rankings—labelled “impact factor” by the data crunching company that devised and compiled the system—Dr Lundberg seized the opportunity to make them work in JAMA's favour. Recognising that impact factors were derived from citations, Dr Lundberg reasoned that chasing high profile authors and institutions could help boost JAMA's rank and, therefore, its reputation. He instructed his editorial team to seek out studies that had the potential to become staple references in other papers and try to woo the authors into submitting to JAMA. “We were looking for prestige,” Dr Lundberg recalls.
At the time the strategy was implemented, JAMA had a lot of ground to make up in the impact factor stakes. “When we started, JAMA and the BMJ were roughly similar at around four, the Lancet was higher, and NEJM [New England Journal of Medicine] and Annals [of Internal Medicine] were higher still,” Dr Lundberg explains. “But then JAMA started rising and it never stopped,” he says. Over several years, Dr Lundberg successfully raised the journal's impact factor to around 11, while those of the Annals of Internal Medicine and the BMJ rose only slightly in the same time.
Since Dr Lundberg took the decision to embrace impact factors in the 1980s, these indices have grown into something of an obsession among editors of medical journals. Editorial strategies designed to get the best impact factor results by chopping, mixing, and categorising content in different ways have become the norm. But Dr Lundberg—who says his dedication to impact factors extended only as far as getting a respectable, rather than an outstanding, number—believes the now central importance of this ranking to many editors has distorted the fundamental character of their journals, forcing them to focus more and more on citations and less and less on readers.
According to Dr Lundberg, research shows little correlation between papers that are cited a lot and those that are considered landmark articles by panels of experts decades later. So medical journals that aim to pull in only those papers likely to be highly cited—at the expense of potentially less citeable but important work—may be doing science, and their readers, a disservice in the long run.
Whether the popularity of impact factors has itself distorted editorial decisions during the past decade's frenzy has become a well rehearsed debate. But such concerns, including the fact that a bad paper may be cited because of its infamous errors and that a journal's rank has no bearing on the quality of the individual papers it publishes, have not stopped this neat metric from capturing a growing army of devotees outside journal publishing. The impact factor now has a worrying influence not just on publication of papers but on the science behind them too.
Attracted by an apparently simple measure of quality, academic employers, funding bodies, and even governments have begun using the impact factor of journals in which researchers most frequently publish to guide decisions on appointments, grant allocations, and science policy. This trend has been particularly noticeable in the UK, where impact factors have been used heavily in the research assessment exercise, a regular evaluation of research activity that determines the allocation of part of the higher education budget. One consequence has been to make universities prioritise laboratory based life sciences that produce research published in the highest impact factor journals, causing substantial damage to the clinical research base. Impact factors, it seems, have a lot to answer for.
Counting citations
So how did a simple calculation become so influential? The impact factor was first proposed in the early 1960s by information scientist Eugene Garfield, now chairman emeritus of the multinational information company Thomson Scientific. It was conceived as a way to make better use of the reams of data that resulted from his Science Citation Index, set up in the 1950s to track the “subsequent history” of scientific ideas through their citations in future publications.
With the hundreds of thousands of references from scientific journals that Dr Garfield and his team at the Institute for Scientific Information (ISI) collected and categorised for their index, they were able to analyse the publication histories of individual authors, identify papers that caught the imagination of other scientists, and, importantly for publishing, rank journals according to their talent for picking popular papers.
Although initial efforts at journal rankings simply totted up the numbers of mentions each publication received in the reference lists of future papers, Dr Garfield quickly realised that this method favoured journals that published a lot but did not necessarily pick the best studies. He suggested that dividing the number of times a journal is cited by the number of articles that it publishes would eliminate the bias towards big journals and produce a meaningful measure of the importance of a journal—the impact of an average paper published.
In 1975, ISI started publishing an annual summary of citations in journals including the impact factor calculation, primarily as an aid for librarians making budget decisions who needed to choose the most cost effective journals to buy. The process involved loading the references from each published paper on to the science citation index database and then, to get the impact factor for each journal, adding up the numbers of citations published in all journals in the current year to articles published in the journal of interest over the two previous years and dividing that total by the number of “scholarly” items published in the previous two years. The result was a number that quantified the average number of citations accrued by a paper published in a particular journal during a given year—the impact factor.
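Put as a worked formula, with a generic year Y standing in for any particular edition of the rankings, the calculation described above amounts to:

\[
\text{impact factor for year } Y = \frac{\text{citations made during year } Y \text{ to items the journal published in years } Y-1 \text{ and } Y-2}{\text{number of scholarly items the journal published in years } Y-1 \text{ and } Y-2}
\]

So a hypothetical journal whose output from the two preceding years attracts 400 citations this year, having published 100 scholarly items over those two years, would record an impact factor of 4.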
Three decades later, an almost identical system underlies the Journal Citation Reports still produced by ISI, which is now subsumed by Thomson Scientific. Rather than ranking just the 152 top journals Dr Garfield began with, ISI now produces yearly impact factor lists, grouped by specialty, for the 6088 journals in their science citation index, which is growing by an astonishing 200 journals every year.
Inclusion in the index is something of a badge of honour for new journals, which must pass ISI's stringent assessment procedure before being incorporated. Suitable candidates have to meet basic publishing standards and have a fairly good chance of influencing the scientific record. “We take a look at what they have been able to do since the beginning of the year and whether the journal can attract authors that make an impact. If it passes that test we go on to quantitative analysis,” says James Testa, senior director of editorial development for Journal Citation Reports.
But whereas the theory hasn't changed in 40 years, the mechanics of the calculation have. ISI has to take into account changes in the nature of scientific publishing from print only to an increasing proportion of electronic publications. “We index everything from print to direct feed to FTP files,” says Marie McVeigh, senior manager of Journal Citation Reports. And a lot of work goes into keeping up with the journals' changing editorial content. “It's six months of pretty non-stop work,” she says. “We have begun the first preparatory steps for year 2006 now and we'll be publishing [this year's impact factors] in mid to late June.”
For ISI, one of the most difficult aspects of the indexing process is deciding which articles from each journal should count as part of the scholarly record and should therefore be added into the denominator for calculating the impact factor. Many scientific journals—and medical journals are particularly bad offenders in this respect—publish an eclectic mix of article types that marry journalism with research, narrative reviews with clinical cases. Editorial policy changes that create new sections, alter numbers of references, or reorganise article types are made with what seems like—at least from ISI's perspective—dizzying frequency. All of them can affect the eventual impact factor.
David Tempest, associate director of research academic relations for the scientific publisher Elsevier, which publishes the Lancet, says the denominator is a difficult thing for ISI to get right. “BMJ, JAMA, and the Lancet might not have the same article types, and ISI has to work out what should be included,” he explains.
But whereas in the 1970s journals were uninterested enough in their rankings to let ISI do its calculations unimpeded—"they ignored them", says Dr Garfield—editors and publishers are now active participants, helping ISI make sure their numbers are correct at every step of the way. Tempest says he and his colleagues count the number of scholarly articles in Elsevier's journals to highlight any possible misclassifications by ISI. "What we try to do is work with ISI to get the citable items, the denominator, to be as accurate as possible. Things like news items and conference listings don't get a lot of citations, so they are seen as non-citable by ISI. We work together to get the best outcome for journals", he explains.
But for many journal editors, particularly those outside the big publishing houses, checking on the accuracy of ISI's indexing of their own journal's content is no easy task. The first difficulty is ascertaining from ISI which articles have been counted as “citable,” and therefore contribute to the denominator in the impact factor calculation. Getting these data can be, according to Mabel Chew, formerly deputy editor of the Medical Journal of Australia and now a BMJ associate editor, a tortuous process. Even in cases where there have been obvious errors—such as the erroneous classification of news articles published by CMAJ during the 1990s as citable items, which caused the journal's impact factor to drop significantly—ISI takes months to respond to editors' queries. Dr Chew believes the process could be made fairer if ISI committed to transparency about its indexing process, enabling journal editors to see for themselves why changes in their impact factors are occurring. “ISI could make public its policies on the steps it takes to determine whether something is considered a citable item or not and say these are the steps we take when we come across a funny article type”, she says. “They could be more transparent about how they do things.”
Working the system
This system of negotiations—or, as ISI's Ms McVeigh prefers, "discussions or clarifications"—has made journals far more cognisant of how editorial decisions can affect impact factors. As well as monitoring cases in which ISI gets it wrong, editors are using this knowledge to their advantage. By keeping the number of scholarly articles as small as possible, journals can maximise their ranking. "Every time you get a number you get people working out how to make it work to their advantage", admits Dr Lundberg. Several artefacts can influence a publication's ranking in journal lists. Review articles or letters are generally cited more than research papers, so boosting review content can make journals perform better in the ranking. Inclusion of news articles, editorials, and media reviews, which ISI counts as "non-source" items, can win a journal citations without increasing the denominator. And journals can, of course, deliberately try to inflate self citations by asking authors to reference papers in their journal.
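A crude arithmetical sketch, with invented figures rather than any real journal's data, shows why these classification battles matter:

\[
\frac{1200 \text{ citations}}{400 \text{ citable items}} = 3.0 \qquad \text{versus} \qquad \frac{1200 \text{ citations}}{250 \text{ citable items}} = 4.8
\]

Nothing about the journal's content needs to change: persuading the indexers that 150 news stories and editorials are non-source items, while any citations those pieces attract still count in the numerator, is enough to lift the headline figure by almost two points.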
“There are ridiculous things that people do to boost their impact factors,” says Dr Garfield. “There were one or two German journals that listed all the articles that had appeared in the journal in the past year, and that increased the citation count by enough to boost the impact up a notch,” he says. But Dr Garfield thinks that although these strategies can force small increases in impact factor, since the index is essentially a measure of quality, “the best thing the publisher can do is to publish good articles.” The striking stability of the impact factor rankings over time supports Dr Garfield's view. “The same set of journals tends to appear top year in year out,” he says. “Nature and Science are not ‘Johnny come latelys'; they have always been at the top and they will remain there.”
Journals' minor manipulation of content in their jostle for better ranking positions is not the issue that causes most concern, however. Despite the fact that the index has now existed for 30 years, there remains a worrying lack of awareness about the scientific uses to which impact factors can appropriately be applied—and the situations in which they are completely inappropriate. This ignorance about what the impact factor can and cannot do has persisted while journals' increasing tendency to tout their numbers on promotional material has helped disseminate the concept to wider audiences. Dr Lundberg suggests the impact factor's meteoric rise is simply a question of nomenclature: "Because the impact factor has that word 'impact' it has got in people's head that this is something that is really important," he says.
When used properly, that is, to describe the use of scientific information by other scientists within a particular field, the impact factor is a useful and powerful measure. But, as Dr Garfield emphasises, its only real value is in assessing the relative importance of papers published in one journal compared with those published in another of similar content. It is not an absolute measure and should not be used for comparing journals from different fields. Michael Mabe, chief executive of the International Association of Scientific, Technical, and Medical Publishers, explains: "There is a common misunderstanding that the actual impact factor has meaning, but it doesn't. In fundamental life sciences, for example, a typical impact factor is 3 or 4 while in maths it is 0.4. But you wouldn't assume that mathematicians are eight times more stupid than life scientists, would you?"
Distorting influence?
For these reasons, the trend towards use of impact factors to guide decisions on research funding is worrying. “People are looking at it, studying it, using it in ways that it really shouldn't be used,” Mr Mabe says. In the UK, many universities' obsession with selectively encouraging research that achieves publication in high impact factor journals—a result of a heavy reliance on impact factors within the research assessment exercise—has, according to Michael Rees, who chairs the BMA's medical academic staff committee, introduced a bias against important fields in which few journals boast an exceptional figure.
Universities trying to second guess the research assessment exercise focus on exactly the kind of cross-specialty comparison of impact factors that Dr Garfield and Mr Mabe caution against. Academic medicine has been particularly badly affected. There has been a haemorrhage of clinical academic staff from universities during the past 10 years—a period that mirrors the existence of the research assessment exercise—and wide ranging cuts in the specialist teaching available in medical schools, with some subjects now completely absent. Professor Rees says 1000 members of staff have been lost from medical schools, most of them clinical researchers. He attributes this damaging decline to the fact that papers reporting laboratory based research get published in journals with generally higher impact factors than their clinical counterparts, so universities selectively return those sorts of papers for departmental evaluations in the research assessment exercise, and funding for clinical investigation decreases as a result.
Professor Rees believes that because impact factors reflect only the immediate response of research communities to a journal's content, they are not wholly suitable for judging clinical research, whose true impact can take a decade or more to emerge. The next research assessment exercise, planned for 2008, will be the first to deliberately reduce the contribution of impact factors, and Professor Rees hopes it will reverse the downward spiral in academic clinical research. However, a recently concluded consultation on the shape of research assessment after 2008 indicates that bibliometrics (although not necessarily the impact factor) might in future play an even greater part in decisions, as universities demand less bureaucratic ways of assessing research quality.
According to Dr Garfield, use of the impact factor as a general surrogate to aid decision making is not necessarily bad. “It is perfectly OK to use impact data in a general way. I always like to point out 20 years ago when the Soros Foundation had to make quick judgments on who to give grants to in the Russian Federation. They would give priority to scientists that had published in a journal with an impact factor above a certain number,” he says. “It was a good measure. . . It is the mindless use of citation data and impact factors that gets people upset.”
But why this particular measure? ISI's Web of Science database can be used as a starting point to calculate plenty of alternative bibliometrics that are better aids to decision making in various circumstances. The Hirsch index, for instance, which ISI also calculates, is a good way of assessing the impact of individual researchers' work by analysing the distribution of citations of all their work. And the Journal Performance Indicator, which is like the impact factor but excludes citations to non-scholarly articles, gives a better indication of long term performance of journals. This would theoretically better suit ranking of clinical journals, whose research publications may take years to filter through into practice, than the impact factor, which favours the short timeline from publication to impact in basic life sciences journals. Both measures, however, are languishing in relative obscurity among the many bibliometric calculations that have failed to catch academics' and editors' imaginations. “There is only so much ISI can do to make people aware of all these databases,” says Dr Garfield. “The impact factor is available and known whereas the others not everybody gets.” It comes down to the fundamental problem that people want a simple, easy to calculate number to do their comparisons. Complicated maths is just not so appealing.
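Not that these alternatives involve forbidding mathematics. As a purely illustrative sketch, the standard definition of the Hirsch index, the largest number h such that h of a researcher's papers have each been cited at least h times, can be written in a few lines of Python; this is the generic definition, not ISI's own implementation:

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)  # most cited papers first
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break
    return h

# Invented example: a researcher whose five papers have been cited
# 25, 8, 5, 3, and 1 times has an h index of 3 (three papers with
# at least three citations each).
print(h_index([25, 8, 5, 3, 1]))  # prints 3
```

Unlike a journal average, the result depends on the whole distribution of an individual's citations, which is why the measure is better suited to assessing researchers than journals.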
In both publishing and science, the impact factor's ubiquity has definitely distorted priorities during the past 10 years, concur Mr Mabe and Professor Rees. And a side effect of this change has been that many medical journals have dispensed with their traditional measures of success, such as subscriber numbers and readership. “If you want something read by the clinical community you would want to go to the most widely read journal, the impact factor doesn't mean anything,” says Dr Garfield.
But is this change a bad one? What journals, editors, and funders should really be prioritising, reckons Dr Lundberg, is what matters most to them. “It all depends on the goals of the journal and what the publisher wants,” he explains. “You set plans for what you are trying to achieve and you measure against those plans. If the publisher's goal is to attract authors to communicate with others in their field, then the impact factor is a good measure to use. But if the goal is to earn money by selling subscriptions, then it is irrelevant.” One thing he is sure about is that the impact factor will not wane anytime soon. “Everyone loves a number”, he says.
Competing interests: None declared.