This post was originally published on September 20, 2010, and received some interesting comments. Three and a half years later, many of the larger issues are still valid, although there has been a drift towards improvement within the scientific community, thanks in no small part to the Open Access publishing movement. I decided to update this post today with an excellent video I found on YouTube (see the end of the post) ~ February 18, 2014.
Nature Immunology has an interesting editorial today, entitled “Ball and Chain”; it asks the very pertinent question:
The classic impact factor is outmoded. Is there an alternative for assessing both a researcher’s productivity and a journal’s quality?
I have had the exact same impression for a while now, though I was afraid to say it out loud except amongst friends, for fear of committing scientific blasphemy: I do not think Impact Factors (originally intended as a metric of a peer-reviewed journal’s popularity, based on citations) are, or should be, what they have come to represent: a proxy metric for a researcher’s productivity and potential that often influences employment, promotion, tenure, and even funding.
The Editors at Nature Immunology consider this latter usage inappropriate, going so far as to state that:
The use of this outmoded metric to assess a scientist’s productivity and a journal’s rank has become a ball and chain for both researchers and editors alike.
To jog everyone’s memory: what is the Impact Factor? It is an artificial metric drummed up by the Institute for Scientific Information (ISI, now part of Thomson Reuters, the creators of referencing software such as EndNote and Reference Manager) that purports to assess a journal’s impact in the scientific community. The Impact Factor of a journal for any given year (published in the Fall) is the number of citations received in that year by papers published in that journal during the previous two years, divided by the total number of citable items published in the journal during those two years. Wikipedia explains it well, thusly:
For example, if a journal has an impact factor of 3 in 2008, then its papers published in 2006 and 2007 received, on average, 3 citations each in 2008. The 2008 impact factor of that journal would be calculated as follows:
A = the number of times articles published in 2006 and 2007 were cited by indexed journals during 2008
B = the total number of “citable items” published by that journal in 2006 and 2007 (“Citable items” are usually articles, reviews, proceedings, or notes; not editorials or Letters-to-the-Editor.)
2008 impact factor = A/B
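To make the arithmetic concrete, here is a minimal sketch in Python of that A/B calculation; the journal and the citation and item counts below are invented purely for illustration.

```python
# A minimal sketch of the two-year Impact Factor arithmetic described above.
# All numbers are hypothetical, purely for illustration.

def impact_factor(citations_in_year, citable_items_prev_two_years):
    """Impact Factor = A / B, where
    A = citations received in the target year to items published in the
        previous two years, and
    B = citable items published in those two years."""
    return citations_in_year / citable_items_prev_two_years

# Hypothetical journal: 1,200 citations received in 2008 to its 2006-2007
# papers, of which 400 were counted as citable items.
A = 1200
B = 400
print(f"2008 impact factor = {impact_factor(A, B):.3f}")  # -> 3.000
```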
The NI Editors point out several problems with this approach.
- Impact factors, having citations as the numerator (A), are discipline-dependent, with relatively ‘hot’ sub-disciplines receiving many more citations than others. Comparing all journals within a broad group, or even within a particular sub-discipline, on this single parameter is therefore flawed.
- Thomson Scientific arbitrarily determines the denominator (B) of a journal’s impact factor, and the process by which articles are deemed citable is not transparent. The NI Editors noted that their essays, which are written in a journalistic rather than scholarly style and which lack an abstract or complete citations, are now counted among the total citable items for the journal.
- Impact Factor calculation, being based on the arithmetic mean of citations per citable item, is skewed by papers that receive huge numbers of citations (see the toy sketch below). The example cited is that of Acta Crystallographica Section A, whose impact factor rose more than 20-fold in 2009, to 49.926, because of one paper that was cited more than 6,600 times. The calculation does not correct for this skew; in fact, in its report (available as a PDF), the International Mathematical Union has criticized the use of the arithmetic mean for evaluating citations, because citation counts do not follow a normal distribution.
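To see why a mean-based metric is so fragile, here is a toy sketch in Python with an invented citation distribution that loosely echoes the Acta Crystallographica case: one blockbuster paper drags the mean far away from what a ‘typical’ paper in the set actually receives.

```python
# Toy illustration of how a single highly cited paper skews a mean-based
# metric; the citation counts below are invented.
from statistics import mean, median

citations = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6600]  # nine ordinary papers + one blockbuster

print(f"mean citations   = {mean(citations):.1f}")    # 662.1, dominated by the outlier
print(f"median citations = {median(citations):.1f}")  # 2.5, closer to a 'typical' paper
```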
The NI Editors bring up another important caveat of the Impact Factor system:
A much greater bone of contention is that citations to retracted articles are not excluded from calculation of the impact factor. Because of the nature of research, such papers are often highly cited as many researchers publish articles that refute the findings in these papers.
By this metric, therefore, The Lancet’s Impact Factor stands to remain high as long as people continue to cite Andrew Wakefield’s bogus 1998 paper linking the MMR vaccine to autism, a paper that has since been retracted.
In addition, the real impact of a particular work or study may not figure in the Impact Factor calculation in the short term, since the calculation for a given year is restricted to the previous two years. Early ideas, published in relatively low-Impact Factor journals, may not become popular or be considered trendy until a few years have passed. In a similar vein, when an idea becomes “hot” (such as Th17, mentioned by the NI Editors, in recent times), the review articles on the idea often outnumber the actual research articles, which also inflates the Impact Factor calculation; journals often use this technique to bolster their Impact Factors.
Given the large number of caveats associated with the Impact Factor calculation, it is surprising that this metric is still used to determine the importance of individual publications, or to evaluate an individual researcher based on whether or not he or she has published in those journals. It stands to reason that only a small proportion of the articles published in a journal contributes to its Impact Factor. Continued use of this metric may therefore do a disservice to a body of work and to the researcher (by diminishing the importance of the work based on the journal’s Impact Factor), as well as to the scientific community (by under-emphasizing key research areas because of the averaging inherent in the calculation). By the same token, the Impact Factor system may artificially elevate the importance of underwhelming research that makes it into a higher-Impact Factor journal. The NI Editorial makes the bold and welcome statement that the journal…
[…] would like to deemphasize the importance of this metric, especially when the validity of the impact factor, its possible manipulation and its misuse have been highlighted by many different quarters of the scientific community.
It is noteworthy that the European Association of Science Editors, in a 2008 statement on “Inappropriate Use of Impact Factors”, has recommended that journal impact factors be “used only – and cautiously – for measuring and comparing the influence of entire journals, but not for the assessment of single papers, and certainly not for the assessment of researchers or research programmes either directly or as a surrogate.” Sadly, although an overhaul of the system for evaluating a researcher’s professional output and a journal’s importance in the scientific community is desirable and clearly needed, no unified standard metric exists for achieving this; the Impact Factor system, therefore, continues to be in wide use. But if the scientific community, including the top-tier journals, comes together to develop a more evolved alternative, eschewing the Impact Factor metric, perhaps it would benefit the community as a whole in the long term.
UPDATE: As I mentioned above, a lot of water has flowed under the bridge since. I indicated my yearning for the Open Source, Open Access publishing model. Open Access was lauded and derided in various outlets. A journalist from the premier journal Science raised a stink with a so-called sting operation on Open Access journals across the world, an exercise that many have called deeply flawed. The pièce de résistance came in the form of Nobel Laureate Randy Schekman, who publicly denounced the Big Three of the prestigious journals, Nature, Cell, and Science, vowing thenceforth to avoid what he termed ‘luxury journals’. Scientist and PLOS co-founder Michael Eisen recounted his own experience, providing a lot of insight into the way forward.
The debate, however, rages on, especially since, in a ‘Publish or Perish’ world, it is hard to wean oneself away from the glamor and glitz of the luxury journals, which can make or break a young scientist’s career. But things are looking up in that respect. As an example, and to round off this update, I leave you with “In PLOS”, an excellent parody of Lady Gaga’s “Applause” that I chanced upon on YouTube. Enjoy!
I’m glad people are starting to query the use of the impact factor to evaluate individuals. Here in the UK it is distorting science, with some universities prohibiting their researchers from publishing in journals with low IF.
I’ve blogged about this at: http://tiny.cc/uzbqb
Nice blog, and like almost every comment/article/blog I’ve ever read on impact factors, you correctly point out the flaws in using IFs in any sort of evaluation of individuals or institutions. What surprises me is that, despite years of protestation and IF resistance, there are still those who believe them to be important.
Or are there? We hear many anecdotal accounts of interview panels, grant reviewers, RAE panels, etc., who tot up IFs in order to make elementary quantitative comparisons, but it’s difficult to find anyone who will admit to this. I think what this debate needs is clear, unequivocal evidence that IFs are being used in decision-making. If you can provide links to such evidence, it would be much appreciated.
Nicely done. I do see people chat and debate about the IF issue, but it has taken root deep inside our heads and become a herd belief. That’s a pity.
A very interesting post, Kausik. I don’t know the answers, but I think Dorothy is very much on the right track.
I have never understood myself how such a measure can be used to evaluate a scientist’s talent and/or potential. But, unfortunately, it is understandable that some people prefer to use such arbitrary metrics to carry out their evaluation tasks. I mean, otherwise one would actually have to find out what the applicants have done or even — gasp — talk science with them for a while. No, one number will suffice.
I work in Finland, where surprisingly many senior scientists base their view of a paper or an individual researcher on the journal impact factor. This line of thinking is then propagated to the next generation of researchers. Yet there is something very basic in this: the sense of reward people get when they successfully struggle and fight to get their paper accepted into a high impact factor journal. It’s as if one gets the intellectual reward from publishing papers rather than from actually finding out new things and breaking new ground in science.
Whew! Finally found some time to respond. A big thanks to all who have written in. I guess the question boils down to this: whether, in place of a single, grossly imperfect, obviously flawed metric, it is possible to evolve one or more meaningful, objective and rational systems for evaluating the popularity of a journal and its scientific contribution. I don’t know. There are alternative systems of ranking (mentioned in the Wikipedia article I linked to above, as well as briefly in the NI Editorial), such as Google’s PageRank, the Eigenfactor, and the SCImago Journal Rank. None of these systems is perfect, or widely used (which reflects, among other things, the time warp called Impact Factor that most people are stuck in). One elegant alternative has been suggested by PLoS, which floated the idea that research articles are best judged on their own merit, and not by the journal they are published in; PLoS calls this system ‘Article-Level Metrics’. Is it perfect? No. Will it gain popularity? Who knows. But perhaps this is a better way of assessing the impact of the work of individual researchers, rather than holding them hostage to an antiquated system invented by a single company whose business interests lie in referencing software and citation data mining. In time, article-level metrics may even address the issues with the H-index, a metric proposed as an alternative assessment of individual researchers’ performance, in which the quantity of research articles may overshadow their quality.
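For anyone unfamiliar with the H-index mentioned above: a researcher has index h if h of his or her papers have received at least h citations each. Here is a rough sketch of that calculation, with invented citation counts, which also hints at how a long list of modestly cited papers can score as well as, or better than, a handful of landmark ones.

```python
# A rough sketch of the H-index calculation: the largest h such that
# h papers have at least h citations each. Citation counts are invented.

def h_index(citations):
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

prolific = [6] * 20                      # twenty papers, six citations each
landmark = [300, 250, 120, 4, 3, 2, 1]   # a few very influential papers

print(h_index(prolific))  # 6
print(h_index(landmark))  # 4: quantity can outweigh per-paper impact
```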
Perhaps this topic should be taken up by the scientific community on an urgent basis. Intellectual rewards aside, a researcher’s career may be made or broken based on the proper (or improper) evaluation of his/her published work. Impact Factor of a journal just seems too frivolous a parameter to judge a researcher’s scientific potential by. Wotsay, Ladies and Gentlemen, will this problem be amenable to collective thinking? 🙂