A couple of weeks ago I blogged about a paper on the (possible) effect of double blinding on the bias against female authors. The paper had stirred up a few other comments, which got me thinking a bit more about why I'm sceptical about double-blind reviews. Then I started thinking too hard, and ended up playing around with a simple model.
It's generally agreed that there is a bias towards better-known authors, so that a well-known author is more likely to have a manuscript recommended for acceptance than someone unknown. The argument for double-blinding is that it removes this bias, because the referee doesn't know who the author is. The problem with this is that it is often possible to guess who the author is (hell, it's sometimes possible to guess who a reviewer is) - a study 10 years ago (Cho et al. 1998) found that reviewers could work out the identity of the authors in about 40% of cases.
Presumably the authors who are recognised are the better known ones. We therefore have a situation where fame (whatever it is exactly) affects both whether a paper will be recommended for acceptance, and also whether the authors will be recognised. What effect does this have on the pattern of acceptance? Rather than just indulging in arm-waving, we can build a model, and indulge in arm-waving with numbers!
The model is simple, but hopefully captures the main points. Each author of a manuscript (for simplicity I will assume that each paper only has one author) has a fame. If the author's identity is known to the reviewer, then the probability of acceptance increases with their fame (the solid black line below).
If the reviewing process is double-blinded, then the probability that the author is recognised increases with fame (the red dotted line). Note that it starts from a lower point, but increases more rapidly. If the author's name is not recognised, then the probability of acceptance is equal to the minimum probability. This is the solid red line.
The technical details are below, for those who care. I have also scaled the probabilities of acceptance, so you can see them.
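For anyone who wants to play along, curves with these qualitative shapes are easy to sketch. Here is a short Python stand-in; the logistic forms and every constant in it are assumptions of mine, not the numbers used for the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

f = np.linspace(-1, 1, 201)  # fame, uniform on [-1, 1]

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Acceptance probability when the author's identity is known
# (the solid black line): rises with fame from a floor.
p_accept = 0.1 + 0.4 * logistic(3 * f)

# Probability that the referee recognises the author under double
# blinding (the red dotted line): starts lower, rises more steeply.
p_recognise = 0.05 + 0.9 * logistic(6 * (f - 0.5))

# Acceptance under double blinding (the solid red line, as I read the
# description): unrecognised authors get the minimum probability.
p_min = p_accept[0]  # p_a(-1)
p_double = p_recognise * p_accept + (1 - p_recognise) * p_min

plt.plot(f, p_accept, "k-", label="acceptance if recognised")
plt.plot(f, p_recognise, "r:", label="probability of recognition")
plt.plot(f, p_double, "r-", label="acceptance, double blind")
plt.xlabel("fame")
plt.ylabel("probability")
plt.legend()
plt.show()
```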
What does this show? Well, if you're a nobody, then the double-blind process means that you do as well as anyone else who isn't recognised, i.e. all but the famous. The famous do well under both systems, as they're recognised anyway. The people who lose out are those in the middle: the ones who are just starting to make a name for themselves, but are yet to be well known. With single blinding, their fame is enough that it helps them. Under double-blinding, though, they are not famous enough that they are recognised, so they are treated the same as a novice.
What this suggests, then, is that double blinding doesn't remove the biases: it just shifts them. So, the very famous actually do better under double blinding, as do the very obscure. Playing around a bit with the model suggests that the general result is robust, but it depends on the probability of recognition starting lower and having a steeper slope.
This is a model, using numbers that were plucked out of the air. But how does it compare to reality? My guess is that the effects are not as severe as shown here, but what is needed is data which can be used to estimate the parameters of the model. In the meantime, I'm not going to submit to any double-blind journals until I have my FRS.
The Maths
Fame, f, is uniformly distributed between -1 and 1. If the identity of the author is known, the probability of acceptance for a fame f, p_a(f), is modelled as a curve that increases with f; otherwise it is the minimum value, p_a(-1). If the manuscript is double-blind reviewed, then the probability that the reviewer correctly recognises the name of the author, p_r(f), also increases with f, but starts lower and rises more steeply.
If a manuscript is reviewed double-blind, the probability that it is accepted is proportional to
p_r(f) p_a(f) + (1 - p_r(f)) p_a(-1)
The final probabilities are normalised, so that they sum to 1, by dividing by the sum of the probabilities.
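Putting the pieces together, here is a minimal sketch of the whole calculation in Python, using the same assumed logistic curves as above (every constant is an assumption standing in for numbers plucked out of the air):

```python
import numpy as np

f = np.linspace(-1, 1, 201)  # fame, uniform on [-1, 1]

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed curves, as in the earlier sketch.
p_a = 0.1 + 0.4 * logistic(3 * f)            # acceptance if recognised
p_r = 0.05 + 0.9 * logistic(6 * (f - 0.5))   # recognition, double blind
p_min = p_a[0]                               # p_a(-1), the floor

single = p_a                                 # identity always known
double = p_r * p_a + (1 - p_r) * p_min       # the expression above

# Normalise so both regimes accept the same total number of papers.
single_n = single / single.sum()
double_n = double / double.sum()

# With these assumptions, the middle of the fame range loses out under
# double blinding, while the obscure and the famous gain.
gain = double_n - single_n
print("worst off at fame", round(float(f[gain.argmin()]), 2))
print("best off at fame", round(float(f[gain.argmax()]), 2))
```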
References
Cho, M.K. et al. (1998) J. Am. Med. Assoc. 280, 243–245.
Sunday, 10 February 2008
Fame, Journals, and Blinding
Posted by Bob O'Hara at 14:38
Labels: academia, refereeing
11 comments:
"It's generally agreed that there is a bias towards better-known authors, so that a well-known author is more likely to have a manuscript recommended for acceptance than someone unknown. The argument for double-blinding is that it removes this bias, because the referee doesn't know who the author is."
I don't disagree with this, but a couple of riders:
(1) As I keep trying to get across on peer to peer ;-), it is the journal editor who decides on what papers to even send for review. So that's a big factor before you even get as far as double blind or not. If you are correct about better known authors, then how can we know that the editor is not biased in their favour before we even get to the peer-review stage? (Nature editors would strongly argue that we aren't biased that way. Whether or not we are, what about the vast majority of STM journals, whose editors are practicing academics, embedded in and part of the system, who choose which to send out?)
I also add, in response to another point you make, that I think, based on manuscripts I've read over the years, even "not so well known" authors refer to their own previous work in the first few refs of a paper -- unless it is their very first paper, of course. The paper you refer to (40 per cent guessing) is quite old now, and I believe that citation practices, certainly in the biomedical area, have become much more "stylised" these days.
Personally, I don't buy the gender bias in journals.... for two main reasons. First, editors genuinely don't know the gender of most authors, and most papers are authored by a mix; and second, there is well-documented gender bias in the scientific profession anyway (hiring, grants, tenure, promotions, short-term contracts being the norm in the main childbearing ages, etc). In doing a journal study, one would have to control for these factors, which seem to me to be far more marked than anything a journal has even been claimed to do -- and your stats seem to indicate that the claim here may not be as clear as the authors had thought.
PS because of your/Blogger's commenting system, I've had to log in using my personal blog ID, but the comment above is in fact Maxine from Peer to Peer (www.blogs.nature.com/peer-to-peer)
Also, sorry, forgot to put a (2) on second paragraph of comment above.
Bob,
I don't understand why the probability of acceptance increases MORE with double blind than single as the identity of a (famous) author is guessed. Shouldn't the model have the double blind probability approach single blind as the confidence of identity increases? Your model seems to indicate that once the identity is guessed in double blind, there is a greater benefit to fame. This doesn't make sense to me.
Maxine - interesting point about editors. I've been vaguely thinking that they're important, but not really analysed their importance seriously. Mind you, I'm not sure you'll be popular with your colleagues for the implication that they're biased. :-)
I'm sure there's more to come on the gender bias story - keep your eye on TREE!
Nina - Ah, I could have explained that better! The probabilities are normalised so that the total probability (i.e. the area under each curve) is the same, i.e. the same number of papers are published in each regime. The totally obscure and the very famous have the same (un-normalised) probabilities. But people in the middle have a lower un-normalised probability, because they can be rejected if they are not recognised. If I didn't normalise, then the double-blind journal would end up with fewer accepted papers, so the normalisation just equalises this, which means raising the probability of acceptance of the obscure and famous.
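A toy example, with numbers invented purely for illustration, makes the normalisation concrete:

```python
# Invented un-normalised acceptance probabilities for an obscure,
# a middling, and a famous author.
single = [0.12, 0.30, 0.50]  # single blind: identity always known
double = [0.12, 0.15, 0.48]  # double blind: middling author rarely recognised

normalise = lambda p: [round(x / sum(p), 3) for x in p]
print(normalise(single))  # [0.13, 0.326, 0.543]
print(normalise(double))  # [0.16, 0.2, 0.64]
# After normalising, the obscure and famous authors' shares rise and the
# middling author's share falls, even though nobody's un-normalised
# probability went up.
```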
Bob
Bob, we've been distracted by needing to comment on a lamentable report over on Holford Watch but we've been meaning to blog about peer-review matters for some time.
We recently came across a peer-reviewed paper with some spectacularly bad methodological issues. The paper was so bad that we were planning to write a letter to the journal rather than just blogging about it. However, the journal doesn't have a facility for publishing letters that will be indexed rather than just online comments. Because it is a pay-to-publish journal, we would have to pay £900 to criticise the article in question - something that (of course) we don't have the budget to do.
Looking into the matter further, it seems that several journals are now publicising that their peer-review process is unbelievably lax, which allows some people to claim that they have peer-reviewed publications even when those publications have no merit. It is not as if these journals are being used solely by the publication-hungry; there are some surprising names in there.
On the other side, it seems as if there are some sensible journals that cannot be indexed because their 'peer-reviewers' are fundamentally opposed to the journal. Must post more on this topic.
In case of Blogger commenting system again, I'll state up-front that this is Maxine (Peer to Peer).
On the "bias towards better known..." etc. I don't think this applies only to journals, any more than I think gender does. I think journals reflect the culture of the scientific community, whatever that is, because authors, peer-reviewers, and readers are all the same people, or from the same pool of people - scientists - scientists manage their own professional systems such as tenure, grants, awards and so on.
That having been said, good journals such as Nature (the journal I know best) spend a lot of money on hiring excellent professional editors to manage an independent and fair peer-review process. Whatever peer-review system Nature uses, we still have to reject 90 per cent of manuscripts submitted. Therefore, whatever system we use, a lot of people are going to think it is unfair (to them, mainly). But this does not mean that it is biased. Just tough.
I've personally accepted papers by "junior, unknown" authors for Nature: I know some brilliant referees, named as "most renowned" scientists by citation studies, who are incredibly supportive and encouraging towards manuscripts by "new" authors. I have also observed over the years that when one uses a "new" peer-reviewer, in the sense of a newly qualified postdoc, these reviewers tend to be far harsher than senior academics. But these are all generalisations. One thing I am sure of: there is no "one" answer, as each individual is different. You can't class people as "being biased because they don't like papers by young authors presenting new ideas" -- some people are like that, and some aren't. The strength of a journal like Nature is that our editors get to know who is a good referee and who isn't.
Incidentally, peer-reviewers themselves are incredibly busy, and more and more is being asked of them (tons of additional supplementary data per manuscript, etc). A colleague was just at a conference, and a scientist there told her that she or he (;-) ) had received six requests to review manuscripts for various journals, just that morning alone.
^my credentials. Thus take this however you wish.
I am wondering whether bias is considered an acceptable sacrifice for effectiveness and efficiency.
Consider that, generally, juniors/young scholars are less experienced in their fields. Consider that this can show up in their papers as less depth, lower significance or originality, and more technical errors. A higher risk of a lower return on investment, if you will.
Then there are the referees, a scarce and overworked resource by a number of accounts. Do (senior) referees have an easier time peer reviewing good papers or bad papers? One would say that papers with less depth, less originality or significance, and more technical errors require more effort from those peer reviewers. Higher maintenance, if you will.
In an unbiased environment, journals could be dealing with a situation that more likely brings higher maintenance but a lower return on investment. Either characteristic on its own is less than desirable; both combined sound like a rather unfortunate issue. Double-blind peer review does not solve this; bias somewhat does.
Given that the primary objective of a journal is communicating quality research, and that an objective of a publisher is profit, is it reasonable for them to be less aggressive towards this kind of bias?
Dear "neither published nor refereed":
What you note would be accurate if all papers published (junior or senior) were free of bias (i.e. only the best science would be published, at least in the top journals, regardless of who's on the paper, their political connections, etc).
The problem - as I and others see it - is NOT as much that junior authors are under more scrutiny (which they are, but I happen to think that science benefits from scrutiny - though of course "too much" of it can be counterproductive -- see the letter by Zucker in Science magazine http://www.sciencemag.org/cgi/content/full/319/5859/32c).
The problem is that, at least in my specialty field, senior authors all too often get a “free pass”, because people “trust” them more. There is no reason for this, as the experimental work should be able to stand (or fall) on its own merits. But here’s where the faith element enters: “so-and-so did this and he/she cannot be wrong” is a comment I heard many times about a paper that made little sense, but was published with fanfare (though eventually it was understood to be “over-interpreted”). I have taken to calling this “faith based science” – which can only make for bad science, for obvious reasons.
Of course, the “sexier” the topic (or the more connected the author), the easier it is to publish questionable (or not well-supported and over-interpreted) research in top journals. Conversely, once this happens, it becomes exceedingly hard for others to publish the correction. Science self-corrects, it is true, but self-correction can take a long time.
PS This is the status quo in my field – which is crowded and quite top-heavy. It is not the case in a number of other fields that I have brushed up against. So whereas it can and does happen, I’m sure it is not the norm.
Tsk, I sure know how to make a point: the grammatical errors in my first post are embarrassing. I still have a long way to go.
‘The problem - as I and others see it - is NOT as much that junior authors are under more scrutiny (which they are, but I happen to think that science benefits from scrutiny - though of course "too much" of it can be counterproductive’
I was under the impression that junior/young scholars did not even get their fair share of scrutiny, but it seems like that is not the case at all. Thanks for the insight.
‘senior authors all too often get a “free pass”, because people “trust” them more. There is no reason for this, as the experimental work should be able to stand (or fall) on its own merits.’
I want to say that I completely agree that ‘so-and-so did this and he/she cannot be wrong’ is a bogus stance to “green light” work. However, trust IS often linked to “familiarity”. People with records are seen as more experienced and they generally have “more to lose” by submitting/publishing nonsense.
I agree that experimental work SHOULD stand on its own, but therein lie the limitations of peer review (or rather, peer reviewers): the lack of resources to reproduce or even verify the (raw) data. Given that peer review is based on trust, peer reviewers work with the notion that the “peers” they are fact-checking have in fact done the right thing and not played with their data. Combine this with the above, and it is possible that there is a slightly higher perceived “risk” that comes with young/junior scholars, in terms of the trustworthiness of science based on correct data.
I think this is more or less the effect the study by Blank on the Am. Econ. Rev. found back in '91: "Authors at top-ranked universities and at colleges and low-ranked universities are largely unaffected by the different reviewing practices, but the authors at near-top-ranked universities and at nonacademic institutions have lower acceptance rates under double-blind reviewing."
Blank, R.M. (1991) The Effects of Double-Blind versus Single-Blind Reviewing: Experimental Evidence from The American Economic Review. American Economic Review 81, 1041-1067.
Does it really have an effect on the pattern of acceptance? Maybe. This information does shed a lot of light. It's difficult to come to a conclusion, though, as the available data are lacking.