Friday, November 27, 2009

ClimateGate and Reflections on Trusting Trust

If Alan Turing is the Isaac Newton of Computer Science, then Ken Thompson is probably the Albert Einstein. Turing established a conceptual basis for a science in its infancy; Thompson was a dividing line between the old and the new. Before Thompson (and his collaborator, Dennis Ritchie), all was mainframes, punch cards, and batch processing run by an elite priesthood of the Data Center; after was Unix and the explosion of personal computing - a democratization of computing, if you like.

Thompson and Ritchie developed Unix, so it was fitting that they received the Association for Computing Machinery's Turing Award - kind of a Nobel Prize for computing, if you will. Thompson's acceptance speech - Reflections on Trusting Trust - was another groundbreaker, and is considered one of the seminal papers of my field, Computer Security. In it, he describes how to make an entirely undetectable Trojan Horse (a "back door" offering unauthorized access), by fiddling with the compiler (the program that turns computer source code into executable programs). The first part of the process is to change the source code for the compiler itself to add the code that implements the back door. Of course, anyone who looked at the source code would know that the game was rigged, so there's a sleight of hand that has to occur:
"First we compile the modified source with the normal C compiler to produce a bugged binary. We install this binary as the official C. We can now remove the bugs from the source of the compiler and the new binary will reinsert the bugs whenever it is compiled. Of course, the login command will remain bugged with no trace in source anywhere."
An undetectable hack that lets you control any computer, without anyone being the wiser, because you control the underlying code. The exploit here targets trust.
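
For readers who think in code, here is a toy sketch of that sleight of hand. It is not Thompson's actual implementation (his was in C, inside the real C compiler); it assumes a "compiler binary" can be modeled as a Python function that turns source text into runnable objects. Every name in it (`bugged_compile`, `COMPILER_SRC`, the "ken" password, and so on) is hypothetical and made up for illustration.

```python
# Toy model of the trick described above. All names are hypothetical.
# A "compiler binary" here is just a Python function that turns source
# text into a namespace of runnable objects.

LOGIN_SRC = '''
def login(user, password):
    return password == "s3cret"
'''

# The compiler's source as a reviewer would see it: perfectly clean.
COMPILER_SRC = '''
def compile_source(source):
    namespace = {}
    exec(source, namespace)
    return namespace
'''

# What the bugged binary quietly appends when it recognizes login.
BACKDOOR_PATCH = '''
_original_login = login
def login(user, password):
    return password == "ken" or _original_login(user, password)
'''


def honest_compile(source):
    """A faithful compiler: runs exactly the source it is given."""
    namespace = {}
    exec(source, namespace)
    return namespace


def bugged_compile(source):
    """Stands in for the bugged binary installed as the official compiler."""
    # Bug 1: miscompile login so that it also accepts the backdoor password.
    if "def login(" in source:
        source = source + BACKDOOR_PATCH
    namespace = {}
    exec(source, namespace)
    # Bug 2: when asked to compile the (clean) compiler source, hand back
    # a compiler that still carries both bugs.
    if "def compile_source(" in source:
        namespace["compile_source"] = bugged_compile
    return namespace


# The bugged binary is installed; the compiler source has been cleaned up.
installed_compiler = bugged_compile

# Rebuilding the compiler from clean source does not remove the bug.
rebuilt_compiler = installed_compiler(COMPILER_SRC)["compile_source"]

# Clean login source, compiled with the rebuilt compiler, is backdoored.
login = rebuilt_compiler(LOGIN_SRC)["login"]
trusted_login = honest_compile(LOGIN_SRC)["login"]

print(login("root", "ken"))          # True: the backdoor works
print(login("root", "wrong"))        # False: ordinary bad passwords still fail
print(trusted_login("root", "ken"))  # False: an honest build has no backdoor
```

Note that neither `LOGIN_SRC` nor `COMPILER_SRC` contains a trace of the back door. Reading the source tells you nothing; the problem lives entirely in the binary that everyone trusts.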

The most important revelations from the Hadley/CRU data leak are, in a very meaningful sense, similar. The code that has been revealed is dodgy at best, and there are very serious questions about the validity of the data. Nevertheless, there are many, many scientists who believe that the Earth is warming, and that Mankind is at least partially to blame. There is a consensus of sorts in the scientific community, although it is admittedly not universally held.

A question that is not (yet) being asked about climate change is this: how would someone create a scientific consensus in the absence of solid data and computer models? Trusting trust.

Dr. Jones and the CRU team are in control of one of the main data sets that all climate scientists use in their analyses (referred to as HadCRUt; the other major one is NASA's GISStemp - note that NASA's Gavin Schmidt features prominently in the CRU emails as a member of the "Hockey Team").

What is clear about HadCRUt (as well as GISStemp) is that they are opaque - the data sets are terribly hard to understand, poorly documented, and adjusted in a manner that is not well explained (if, indeed, it is explained at all). In the case of CRU, the original (unmodified) data is no longer available, and seems to have been destroyed.

Yet every single climate scientist uses these data sets for their analysis of global temperature.

So, if the guardians of these data sets were to want to ensure a scientific consensus that the globe is warming, that this is a recent phenomenon, and that mankind is behind it, all they need to do is modify the data sets. All researchers pick up the modified data sets, have no (easy) way to validate the soundness of the data, and unsurprisingly produce similar results. Hey, the data show conclusively that the planet is warming. Oh noes! Thermageddon!

Trusting trust.

The response of the Global Warmers does not produce a feeling of trust. In the face of repeated examples of what seems to be subverting the peer review process and conspiring to avoid Freedom of Information Act requests, what we are presented with is essentially "How dare you impugn the reputation of these scientists?"

Ignoring the obvious answer that Dr. Jones and company have done that to themselves in their own emails, this response is entirely beside the point. We shouldn't trust anybody. We shouldn't even trust the data until everyone understands where it came from, how it got here, and whether (and how) it has been modified.

At that point, we will have a baseline of trust, and the discussion can begin. Until then, we may have an undetectable Trojan Horse in the ostensible justification for a multi-trillion-dollar re-engineering of western civilization. The press has not yet caught on to this, although the scientific community is beginning to discuss it. Thompson again (from an admittedly different context, but relevant to this):
"The press must learn that misguided use of a computer is no more amazing than drunk driving of an automobile."
Until Jones and Schmidt step aside and allow independent validation of the data, the entire field of Climate Science will remain - for very good reason - under a cloud of distrust.

1 comment:

bob said...

I do not believe the redundant use of trust is necessary. The problem was misplaced trust.