Nature Immunology has an interesting editorial today, entitled “Ball and Chain”; it asks the very pertinent question:
The classic impact factor is outmoded. Is there an alternative for assessing both a researcher’s productivity and a journal’s quality?
I have had the exact same impression for a while now, though I was afraid to say it out loud except among friends, for fear of committing scientific blasphemy: I do not think Impact Factors, originally intended as a citation-based metric of a peer-reviewed journal’s popularity, are, or should be, what they have been hyped up to represent: a proxy metric for assessing a researcher’s productivity and potential, one that often influences employment, promotion, tenure, and even funding.
The Editors at Nature Immunology consider this latter usage as inappropriate, going as far as stating that:
The use of this outmoded metric to assess a scientist’s productivity and a journal’s rank has become a ball and chain for both researchers and editors alike.
To jog everyone’s memory: What is the Impact Factor? It is an artificial metric drummed up by the Institute for Scientific Information (ISI, now part of Thomson Reuters, makers of reference-management software such as EndNote and Reference Manager) that purports to assess a journal’s impact on the scientific community. The Impact Factor of a journal for a given year (published each Fall) is the number of citations received that year by papers the journal published in the previous two years, divided by the total number of citable items published in the journal during those two years. Wikipedia explains it well, thusly:
For example, if a journal has an impact factor of 3 in 2008, then its papers published in 2006 and 2007 received 3 citations each on average. The 2008 impact factor of a journal would be calculated as follows:
A = the number of times articles published in 2006 and 2007 were cited by indexed journals during 2008
B = the total number of “citable items” published by that journal in 2006 and 2007 (“Citable items” are usually articles, reviews, proceedings, or notes; not editorials or Letters-to-the-Editor.)
2008 impact factor = A/B
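The A/B calculation above can be sketched in a few lines of Python; the journal figures below are hypothetical, chosen only so the result matches the Wikipedia illustration of an impact factor of 3.

```python
def impact_factor(citations_to_prior_two_years: int, citable_items: int) -> float:
    """Two-year impact factor: A / B, where A is the number of citations
    received this year by articles published in the previous two years,
    and B is the number of 'citable items' published in those two years."""
    return citations_to_prior_two_years / citable_items

# Hypothetical journal: its 2006-2007 papers drew 600 citations in 2008 (A),
# and it published 200 citable items across 2006-2007 (B).
A = 600
B = 200
print(impact_factor(A, B))  # 3.0 -- i.e., 3 citations per paper on average
```

Note that B counts only “citable items”; as the NI Editors point out below, what counts as citable is decided by the indexer, not the journal.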
The NI Editors point out several problems with this approach.
- Impact factors, having citations as the numerator (A), are discipline-dependent, with relatively ‘hot’ sub-disciplines receiving far more citations than others. Comparing all journals within a broad group, or even within a particular sub-discipline, on this single parameter is therefore flawed.
- Thomson Scientific arbitrarily determines the denominator (B) of a journal’s impact factor, but the process by which articles are deemed citable is not transparent. The NI Editors noted that their essays, which are written in a journalistic, rather than scholarly, style and which lack an abstract or complete citation, were now considered a part of the total citable items for this journal.
- Impact Factor calculation, being based on the mean number of citations per year, is skewed by papers that receive huge numbers of citations. The example cited is that of Acta Crystallographica Section A, whose impact factor rose more than 20-fold in 2009, to 49.926, because of a single paper that was cited more than 6,600 times. The calculation does not correct for this skew; indeed, in its report (available as a PDF), the International Mathematical Union has criticized the use of the arithmetic mean for evaluating citations, because citation distributions are highly skewed and far from Normal.
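The outlier problem in that last point is easy to demonstrate. Below is a toy journal with hypothetical citation counts: nine ordinarily cited papers plus one cited 6,600 times (echoing the Acta Crystallographica Section A case). The mean, which is what the Impact Factor uses, bears no resemblance to the typical paper; the median does.

```python
import statistics

# Hypothetical per-paper citation counts for one journal in one year:
# nine modestly cited papers and one massive outlier.
citations = [2, 3, 1, 4, 2, 0, 3, 5, 2, 6600]

mean_citations = statistics.mean(citations)      # what the Impact Factor reflects
median_citations = statistics.median(citations)  # robust to the outlier

print(mean_citations)    # 662.2 -- dominated by the single blockbuster paper
print(median_citations)  # 2.5  -- the typical paper is barely cited
```

One heavily cited paper thus inflates the journal-level average by two orders of magnitude, which is exactly why the IMU report argues against summarizing citation distributions with an arithmetic mean.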
The NI Editors bring up another important caveat of the Impact Factor system:
A much greater bone of contention is that citations to retracted articles are not excluded from calculation of the impact factor. Because of the nature of research, such papers are often highly cited as many researchers publish articles that refute the findings in these papers.
By this metric, therefore, The Lancet’s Impact Factor will remain high as long as people continue to cite Andrew Wakefield’s retracted and bogus 1998 paper linking the MMR vaccine to autism.
In addition, the real impact of a particular work or study may not figure in the Impact Factor calculation in the short term, since the calculation for a given year is restricted to the previous two years. Early ideas, published in relatively low Impact Factor journals, may not become popular or be considered trendy until a few years have passed. In a similar vein, when an idea becomes “hot” (such as Th17, mentioned by the NI Editors, in recent times), review articles on the idea often outnumber the actual research articles, which also skews the Impact Factor calculation; journals often use this technique to bolster their Impact Factors.
Given the large number of caveats associated with the Impact Factor calculation, it is surprising that this metric is still used to determine the importance of individual publications, or to evaluate an individual researcher based on whether or not they have published in high Impact Factor journals. It stands to reason that only a small proportion of the articles published in a journal contributes to its Impact Factor. Therefore, continued use of this metric may do a disservice to a body of work and its researcher (by diminishing the importance of the work based on the journal’s Impact Factor), and to the scientific community as well (by under-emphasizing key research areas through averaging). By the same token, the Impact Factor system may artificially elevate the importance of underwhelming research simply because it appears in a high Impact Factor journal. The NI Editorial makes the bold and welcome statement that the journal…
[…] would like to deemphasize the importance of this metric, especially when the validity of the impact factor, its possible manipulation and its misuse have been highlighted by many different quarters of the scientific community.
It is noteworthy that the European Association of Science Editors, in a 2008 statement on “Inappropriate Use of Impact Factors”, has recommended that journal impact factors be “used only – and cautiously – for measuring and comparing the influence of entire journals, but not for the assessment of single papers, and certainly not for the assessment of researchers or research programmes either directly or as a surrogate.” Sadly, although an overhaul of the system for evaluating a researcher’s professional output and a journal’s importance in the scientific community is desirable and clearly needed, no unified standard metric exists for achieving this; the Impact Factor system, therefore, continues to be in wide use. But if the scientific community, including the top-tier journals, comes together to develop a more evolved alternative, eschewing the Impact Factor metric, perhaps it would benefit the community as a whole in the long term.
UPDATE: As I mentioned above, a lot of water has flowed under the bridge since. I indicated my yearning for the Open Source, Open Access publishing model. Open Access was lauded and derided in various outlets. A journalist from the premier journal Science raised a stink with a so-called sting operation on Open Access journals across the world, a study that many have called deeply flawed. The pièce de résistance came in the form of Nobel Laureate Randy Schekman, who publicly denounced the Big Three prestigious journals, Nature, Cell, and Science, vowing thenceforth to avoid what he termed ‘luxury journals’. Scientist and co-founder of PLOS, Michael Eisen, recounted his own experience, providing many insights for the way forward.
The debate, however, rages on, especially since, in a ‘Publish or Perish’ world, it is hard to wean oneself away from the glamor and glitz of the luxury journals, which can make or break a young scientist’s career. But things are looking up in that respect. As an example, and to round off this update, I leave you with this excellent parody “In PLOS” of Lady Gaga’s “Applause”, which I chanced upon on YouTube. Enjoy!