Saturday, 16 February 2008

ID Does Predict Something

Well, that was a shock. About a month ago, Bill Dembski put up a post on his blog saying that he had some ID predictions, but what about other people? This generated a lot of discussion (over 200 comments), but little in the way of predictions. And we were all left wondering if Bill actually had any.

It turns out he did, and he revealed them this week. Bizarrely, he did this after being prodded by a couple of sock-puppets. And the predictions were...

(1) ID predicts that although there will be occasional degeneration of biological structures (both macroscopic and microscopic), most structures will exhibit function and thus serve a purpose. Thus most organs should not be vestigial, and most DNA should not be “junk DNA.” ...

Ah, this old canard. Afarensis has had a go at this one, and it also has a potted rebuttal in the Index To Creationist Claims. But I think there is one point worth adding. We're told that ID only asks whether there is a designer, but it says nothing about the identity or properties of the designer (which is good. They'll be embarrassed when they find out Ididit). But this prediction about junk DNA assumes a rather pragmatic designer who wouldn't put something into their design unless it was functional. What if the designer was slightly odd, and just wanted to throw something in for the hell of it? Perhaps the designer just wants to say 'Alu to us all. This is only really an ID prediction if one makes particular assumptions about the designer. For example, that it doesn't care what colour the wheel should be.

(2) Many systems inside the cell represent nanotechnology at a scale and sophistication that dwarfs human engineering. Moreover, our ability to understand the structure and function of these systems depends directly on our facility with engineering principles (both in developing the instrumentation to study these systems and in analyzing what they do). Engineers have developed these principles by designing systems of their own, albeit much cruder than what we find inside the cell. Many of these cellular systems are literally machines: electro-mechanical machines, information-processing machines, signal-transduction machines, communication and transportation machines, etc. They are not just analogous to humanly built machines but, as mathematicians would say, isomorphic to them, that is, they capture all the essential features of machines. ID predicts that the cell would have such engineering features; by contrast, Darwinian theory has consistently underestimated the sophistication of the machinery inside the cell.

Somewhere in there there might be a prediction.

Ah, found it. It's that "machines" in the cell and human-built machines have the same essential properties. Like being designed by humans, or having their primary structure determined by DNA. Any questions?

The analogy between the different types of machine is just that: an analogy. It isn't perfect. Human-designed machines aren't coded by DNA, and aren't parts of structures that start dividing and reproducing. A problem, then, is how do we know when the analogy has gone too far? If we take Dr. Dembski at face value, it never does.

(3) Conservation of information results (also referred to as No Free Lunch theorems, which are well established in the engineering and mathematical literature — see indicate that evolution requires an information source that imparts at least as much information to evolutionary processes as these processes in turn are capable of expressing. In consequence, such an information source (i) cannot be reduced to materialistic causes (e.g., natural selection), (ii) suggests that we live in an informationally open universe, and (iii) may reasonably be regarded as intelligent. The conservation of information counts as a positive theoretical reason to accept intelligent design and quantifies the informational hurdles that neo-Darwinian processes must overcome. Moreover, ID theorists have applied these results to actual biological systems to show that they are unevolvable by Darwinian means. ID has always predicted that there will be classes of biological systems for which Darwinian processes fail irremediably, and conservation of information is putting paid to this prediction.

OK, Dembski works on evolutionary information, so I guess we should have expected this. But it's not clear what the NFL theorems have to do with conservation of information. They say that blind search does no worse than any other search strategy when averaged over all fitness surfaces, but some of those fitness surfaces will look very strange. And in reality, evolution works on a small set of such surfaces, with similar properties (I'm in a bit of a rush, so I won't dig out any links just now). Dembski did produce one manuscript claiming conservation of information, but his proof was to assume "for consistency's sake" that p1 <>2, and then prove log(p1) <>2). I never did get an explanation for what he meant by "for consistency's sake".
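The averaging claim in the NFL theorems can be checked by brute force on a toy problem. This is only a sketch under my own assumptions: the four-point domain, the three fitness values, and both search strategies are invented for illustration, and none of it comes from Dembski's manuscripts. It enumerates every possible fitness surface and compares a blind fixed-order search with a search that reacts to what it has seen so far.

```python
from itertools import product

DOMAIN = list(range(4))   # four candidate points (hypothetical toy "genotypes")
VALUES = (0, 1, 2)        # possible fitness values (assumed for illustration)

def fixed_order_search(fitness, k):
    """Blind search: visit points 0, 1, 2, ... regardless of what is seen."""
    return DOMAIN[:k]

def adaptive_search(fitness, k):
    """Stay close after a good value, jump far away after a bad one."""
    visited = [0]
    while len(visited) < k:
        last = visited[-1]
        unvisited = [x for x in DOMAIN if x not in visited]
        if fitness[last] >= 1:
            nxt = min(unvisited, key=lambda x: abs(x - last))
        else:
            nxt = max(unvisited, key=lambda x: abs(x - last))
        visited.append(nxt)
    return visited

def average_best(search, k):
    """Mean of the best fitness found in k evaluations, over ALL surfaces."""
    surfaces = list(product(VALUES, repeat=len(DOMAIN)))
    total = sum(max(f[x] for x in search(f, k)) for f in surfaces)
    return total / len(surfaces)

for k in range(1, len(DOMAIN) + 1):
    print(k, average_best(fixed_order_search, k), average_best(adaptive_search, k))
```

The two columns agree for every k, because the average runs over every conceivable surface, including the very strange ones. Restricting `surfaces` to a structured subset is where strategies can start to differ.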

So, there you go: about what we would expect, 0 out of 3. Of course, I'm biased, so I'm sure the UDites will score differently.

Sunday, 10 February 2008

Fame, Journals, and Blinding

A couple of weeks ago I blogged about a paper on the (possible) effect of double blinding on the bias against female authors. The paper had stirred up a few other comments, which got me thinking a bit more about why I'm a bit sceptical about double-blind reviews. Then I started thinking too hard, and ended up playing around with a simple model.

It's generally agreed that there is a bias towards better-known authors, so that a well-known author is more likely to have a manuscript recommended for acceptance than someone unknown. The argument for double-blinding is that it removes this bias, because the referee doesn't know who the author is. The problem with this is that it is often possible to guess who the author is (hell, it's sometimes possible to guess who a reviewer is) - a study 10 years ago (Cho et al. 1998) found that reviewers could work out the identity of the authors in about 40% of cases.

Presumably the authors who are recognised are the better known ones. We therefore have a situation where fame (whatever it is exactly) affects both whether a paper will be recommended for acceptance, and also whether the authors will be recognised. What effect does this have on the pattern of acceptance? Rather than just indulging in arm-waving, we can build a model, and indulge in arm-waving with numbers!

The model is simple, but hopefully captures the main points. Each author of a manuscript (for simplicity I will assume that each paper only has one author) has a fame. If the author's identity is known to the reviewer, then the probability of acceptance increases with their fame (the solid black line below).

If the reviewing process is double-blinded, then the probability that the author is recognised increases with the fame (the red dotted line). Note that it starts from a lower point, but increases more rapidly. If the author's name is not recognised, then the probability of acceptance is equal to the minimum probability. This is the solid red line.

The technical details are below, for those who care. I have also scaled the probabilities of acceptance, so you can see them.

What does this show? Well, if you're a nobody, then the double-blind process means that you do as well as anyone else who isn't recognised, i.e. all but the famous. The famous do well under both systems, as they're recognised anyway. The people who lose out are those in the middle: the ones who are just starting to make a name for themselves, but are yet to be well known. With single blinding, their fame is enough that it helps them. Under double-blinding, though, they are not famous enough that they are recognised, so they are treated the same as a novice.

What this suggests, then, is that double blinding doesn't remove the biases: it just shifts them. So, the very famous actually do better under double blinding, as do the very obscure. Playing around a bit with the model suggests that the general result is robust, but it depends on the probability of recognition starting lower and having a steeper slope.

This is a model, using numbers that were plucked out of the air. But how does it compare to reality? My guess is that the effects are not as severe as shown here, but what is needed is data which can be used to estimate the parameters of the model. In the meantime, I'm not going to submit to any double-blind journals until I have my FRS.

The Maths
Fame, f, is uniformly distributed between -1 and 1. The probability of acceptance for a fame f, pa(f), is modelled like this:

if the identity of the author is known, otherwise it is the minimum value. If the manuscript is double-blind reviewed, then the probability that the reviewer correctly recognises the name of the author, pr(f), is

If a manuscript is reviewed double-blind, the probability that it is accepted is proportional to

pr(f)pa(f) + (1-pr(f))pa(-1)

The final probabilities are normalised, so that they sum to 1, by dividing by the sum of the probabilities.
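
For anyone who wants to play with the model, here is a minimal sketch. The logistic forms for pa(f) and pr(f), and the parameter values a0, a1, r0 and r1, are assumptions made purely for illustration; they are not the functions or numbers behind the plots. The only things taken from the description above are that recognition starts lower and rises more steeply than acceptance, the double-blind formula, and the normalisation.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# ASSUMED functional forms and parameter values (not taken from the post).
def p_accept(f, a0=0.0, a1=1.0):
    """pa(f): acceptance probability when the author is known; rises with fame."""
    return logistic(a0 + a1 * f)

def p_recognise(f, r0=-1.0, r1=4.0):
    """pr(f): chance a blinded reviewer recognises the author; lower start, steeper slope."""
    return logistic(r0 + r1 * f)

def double_blind(f):
    """pr(f)pa(f) + (1 - pr(f))pa(-1): unrecognised authors get the minimum."""
    pr = p_recognise(f)
    return pr * p_accept(f) + (1.0 - pr) * p_accept(-1.0)

def normalise(values):
    """Divide by the sum so the probabilities add up to 1."""
    total = sum(values)
    return [v / total for v in values]

n = 201  # grid of fame values standing in for the uniform distribution on [-1, 1]
fame = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
single = normalise([p_accept(f) for f in fame])
double = normalise([double_blind(f) for f in fame])

for label, i in (("obscure", 0), ("middling", n // 2), ("famous", n - 1)):
    print(f"{label:9s} fame={fame[i]:+.1f} single={single[i]:.5f} double={double[i]:.5f}")
```

With these made-up values the obscure and the famous get a larger share of acceptances under double blinding, and the middling authors get a smaller one. Changing r0 and r1 is a quick way to see how much that depends on the recognition curve.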

Cho, M.K. et al. (1998) J. Am. Med. Assoc. 280, 243–245.
