Total Drek

Or, the thoughts of several frustrated intellectuals on Sociology, Gaming, Science, Politics, Science Fiction, Religion, and whatever the hell else strikes their fancy. There is absolutely no reason why you should read this blog. None. Seriously. Go hit your back button. It's up in the upper left-hand corner of your browser... it says "Back." Don't say we didn't warn you.

Monday, June 09, 2008

The Measure of a Man

Recently, just to pass the time, I've been doing a bit of light reading. Specifically, I've been working my way through Mark Perakh's intriguing book Unintelligent Design. As you might guess, it's a critique of the ideas of Wild Bill Dembski generally, and of his books Intelligent Design, The Design Inference, and No Free Lunch in particular. One of the arguments centers on information theory and has really got me thinking.

To begin, let's consider information theory. This is, as you might guess, a sort of interdisciplinary area studying signals and communication. It is generally considered to have been founded by Claude Shannon back in 1948, and it employs definitions of information that may seem a little counter-intuitive to many of us. The particular flavor at issue here, usually credited to Andrey Kolmogorov and Gregory Chaitin and a close cousin of Shannon's original measure, regards a text (for example) as having an information content equal to the length of the shortest possible computer program required to reproduce it. So, the easier the text is for a computer to generate, the less information that text carries.

To make this a little more explicit, let's consider a pair of examples. The first example is a string of numbers like this: "5696585638565615275755151850760772515." I produced this string more or less at random by slapping keys on my keyboard. This of course means that it is not truly random (given my hand placement, the middle keys were struck rather a lot), but it provides a simple example of an otherwise unordered string of numbers. On the other hand, let's consider a different string of numbers: "1 11 21 1211 3112 132112 312213 232221 134211." Believe it or not, this second string of numbers is not random. Instead, it is derived from a simple rule: after the first entry, each successive entry describes the previous one. So, for example, "11" would be read as "one 1," "21" would be read as "two 1s," and so on.

Because the second string of numbers is the deterministic result of a simple rule, a computer program needs only that rule in order to reproduce the string accurately and, indeed, extend it to any arbitrary length we should require. In a sense, the only information in the second string is carried by the first number, which determines the remainder of the sequence; every other number is simply a logical consequence of that first number and the rule. In contrast, the first string is effectively random, and no number can be used to deduce the next in the sequence. To reproduce it, a computer program would effectively have to record the sequence in full and read it back out of memory. Thus, by this measure, the first sequence contains more "information," because a longer program is needed to produce it, while the second sequence contains less.
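To make the "seed plus rule" point concrete, here's a minimal Python sketch of the classic run-length version of the rule. (Different variants of the rule order their counts differently, so its output won't match my hand-typed terms exactly after the fourth entry, but the principle is identical.)

```python
from itertools import groupby

def look_and_say(seed="1", terms=9):
    """Generate the run-length "look and say" sequence: each term
    describes the previous one, so "1" -> "11" (one 1) -> "21"
    (two 1s) -> "1211" (one 2, one 1), and so on."""
    sequence = [seed]
    for _ in range(terms - 1):
        previous = sequence[-1]
        sequence.append("".join(str(len(list(run))) + digit
                                for digit, run in groupby(previous)))
    return sequence

print(" ".join(look_and_say()))
# 1 11 21 1211 111221 312211 13112221 1113213211 31131211131221
```

The entire infinite sequence collapses into those dozen lines plus the seed, which is exactly the sense in which it carries so little information.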

This may seem highly counter-intuitive, because most of us are used to employing the word "information" to refer to "meaningful content." Indeed, the random hash that we hear on an untuned radio station would likely be dismissed as containing no information by most people, despite the fact that (like the random series of numbers) it would be more difficult to reproduce than an actual broadcast. In other words, static is regarded as containing more information than meaningful communication. The explanation, however, is that the meaningfulness of a signal is not necessarily inherent in that signal. Consider, for a moment, a police drama in which an agreement is made that when one character coughs twice, the rest of the police squad will storm a building. Is there any way that those two coughs could be analyzed and dissected so as to reveal an unequivocal command to storm the building? Obviously not: in this case the meaningfulness is derived from properties of the sender (i.e. the cougher) and the receiver (i.e. the rest of the police) but is not otherwise inherent. Likewise, the meaningfulness of a text written in a language we cannot read is exceedingly low. Certainly we surmise that it must mean something, but we are incapable of distinguishing a meaningful sentence in, say, Cyrillic from a random hash of letters unless we can actually read that alphabet and the relevant language.* As a result of all this, we cannot quantify the meaningfulness of a message with any ease, but we can quantify information in the formal sense above. It is this distinction between meaning and information that, as a side note, proves so useful to Perakh in slapping the shit out of Dembski.
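You can see the static-versus-broadcast asymmetry for yourself with an ordinary compressor. Compression is only a crude stand-in for "shortest possible program" (the true minimum is uncomputable in general), but as a sketch it makes the point:

```python
import os
import zlib

# Random bytes play the role of radio static; repetitive text plays
# the role of a meaningful (and therefore redundant) broadcast.
static = os.urandom(10_000)
broadcast = (b"the quick brown fox jumps over the lazy dog. " * 223)[:10_000]

print(len(zlib.compress(static, 9)))     # about 10,000 bytes: static doesn't compress
print(len(zlib.compress(broadcast, 9)))  # dramatically smaller: mostly redundant
```

By this yardstick the static "contains" more information than the broadcast, exactly as the theory says, even though it means nothing at all.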

Now, pausing information theory for a moment, let's talk about Alan Turing. The father of computer science, Turing developed a method for determining when a computer should be regarded as intelligent or sentient, a test that has become known as the Turing test. I've discussed his ideas before, but the basic logic of the test is simple: place a human in a room and allow them to correspond with several entities. Some of those entities are other humans, but others are computers. To the extent that the first human cannot reliably distinguish the humans from the machines, we must regard those machines as sentient. This may seem a little simplistic but, really, it just mirrors the process we use when talking to other humans. We cannot directly observe each other thinking but, because we speak and act in a manner that implies that we are intelligent and sentient,** we generally assume that other humans are intelligent and sentient. The Turing test simply makes it possible for a machine- an artificial construct- to be given the same benefit of the doubt that we normally give to other hominids.
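Here's a toy sketch of that logic in Python, not a serious test protocol (the judge, respondents, and single canned question are all my own stand-ins): if the machine's responses are drawn from the same behavior as the human's, the judge is reduced to guessing.

```python
import random

def turing_test(judge, human, machine, rounds=1000):
    """Toy harness for the imitation game: each round the judge
    interrogates a hidden respondent and guesses which kind it is.
    The machine "passes" to the extent the judge's accuracy stays
    near chance (0.5)."""
    correct = 0
    for _ in range(rounds):
        is_machine = random.random() < 0.5
        answer = (machine if is_machine else human)("How was your day?")
        guess = judge(answer)  # returns "machine" or "human"
        correct += guess == ("machine" if is_machine else "human")
    return correct / rounds

# A machine that mimics the human exactly is indistinguishable:
human = lambda question: "Fine, thanks."
print(turing_test(lambda answer: random.choice(["human", "machine"]),
                  human, machine=human))  # ~0.5: the judge can only guess
```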

So how does all this come together? Well, here's the thing: let's say that we managed to put together an artificial intelligence that could replicate, but not exceed, the mental capabilities of the average human. It passes, with flying colors, the most stringent Turing test (or derivative examination) we can construct. That A.I. would, presumably, be defined at least in part by a series of software commands.*** Interestingly enough, we can precisely measure the length and complexity of a computer program, and that length puts a ceiling on the amount of "information" the program contains in the algorithmic sense described above. And if this computer program is capable of mimicking a human, if it must be regarded as intelligent and sentient, then we have a way of measuring the "information" content of a single human individual.
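In practice nobody can compute the true shortest program, but any concrete program gives an upper bound, and compressing its source tightens that bound a little. A minimal sketch of the measurement (the human-equivalent A.I. is hypothetical, so a toy program stands in for its source):

```python
import zlib

def information_ceiling(program_source: bytes) -> int:
    """Upper bound on the algorithmic information in a program: the
    size of its compressed source. Any program that produces the
    behavior is a valid description, so its length bounds the true
    (uncomputable) minimum from above."""
    return len(zlib.compress(program_source, 9))

# Toy stand-in for the hypothetical A.I.'s source code:
toy_source = b"print(' '.join(str(n * n) for n in range(100)))"
print(information_ceiling(toy_source), "bytes, at most")
```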

And that, when you get right down to it, is pretty f-ing cool.


* Actually, information theory would provide a way to distinguish nonsense from a message but, sadly, it would not get us any closer to deciphering said message.

** The folks on Conservapedia being a notable exception.

*** For those not used to thinking about issues like this, I'm engaging in a lot of handwaving here. A functional A.I. would probably not rely on software as usually conceived any more than human thought depends on a hand calculator.


3 Comments:

Anonymous Dan Hirschman said...

I think you lost a term or two in the "Look and say" sequence. In particular the 5th term is missing, making the sequence very confusing :)

One confusion I have with this definition of information is connected to a conversation I had with a computer science major friend when he was learning about algorithms for zipping things. The problem is, there's no best answer - different algorithms will work better on different kinds of data. If that particular sequence in your post ("569...760") were very common, we could just name it sequence 1 and everyone would know what it was (though I suppose if it were an infinite sequence, like the Look and Say, and it had no obvious pattern, that would not work). Anyway, long and short of it, I don't get information theory.

Monday, June 09, 2008 11:53:00 AM  
Blogger Marf said...

I feel that's a faulty way of measuring the "information content" of a human. That would be the information content of an AI's programming. Humans process information differently.

Think of it this way (let's say information amount is equal to the number of characters):
#1 AI's version: 1+1=2;
#2 AI's version: A=1; A+A=2;
Human's version: One plus one equals two.

Humans may have much less efficient, or much more efficient, methods of processing information than an AI. Just because they are capable of the same tasks does not mean they perform them in the same way (or with the same amount of information).

Monday, June 09, 2008 10:13:00 PM  
Blogger Drek said...

Dan: Well, I wouldn't be surprised to learn I dropped a term as this blog isn't exactly carefully reviewed.

As to your concern about information theory, I think this can be dealt with, actually. First, it's important to keep in mind that I.T. stipulates that the program is the shortest possible. Thus, it's more or less assuming you have the most efficient algorithm.

Secondly, keep in mind that in order to define "760" as "sequence 1" you must, somewhere in the program, include that definition. As the numbers "760" must be represented in binary, that imposes a cost on the program. Granted, you could build that translation into hardware, but that's just putting part of the program into a different medium. So, long story short, that approach doesn't completely avoid the issue at hand. (There's a toy sketch of this cost at the end of this comment.)

Third, in the event that "760" were unusually common, then I.T. would regard it as containing less information than a random set of three numbers. In other words, any time you see "7" you can guess that "60" follows and, thus, everything after the 7 is redundant. So you're creating a situation of non-randomness when the example postulated a random string in which no one sequence is any more common than another.

Finally, I think you have to keep in mind that I.T. speaks of computers in a very theoretical, rather than concrete, way. Much as we can define and analyze Turing-complete systems abstractly, without actual hardware, it treats computer programs in a broad sense, not in terms of any actual language.
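To put a number on that second point, here's the toy sketch I promised (purely my own illustration): defining a token for "760" means the definition itself has to travel with the message, so the substitution only pays off when "760" recurs.

```python
def dictionary_encode(text: str, phrase: str, token: str = "\x01") -> str:
    # The header carries the definition; that's the unavoidable cost.
    return token + "=" + phrase + ";" + text.replace(phrase, token)

print(len(dictionary_encode("xx760yy", "760")), "vs", len("xx760yy"))
# 11 vs 7: a single use of "760" doesn't repay the definition
print(len(dictionary_encode("760a760b760c760", "760")), "vs", len("760a760b760c760"))
# 13 vs 15: repeated uses amortize it
```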

Marf: Actually, I agree that this approach wouldn't necessarily tell us a lot about humans, though in my case it's because the definition of "information" is pretty abstract. That said, I agree with you that an A.I. probably wouldn't figure things out the same way we do. At the same time, however, if it appeared to have the same cognitive abilities that we do, then determining its information content would help us place a bound on our own information content. Perhaps not the most accurate measurement possible but, nevertheless, a useful one.

Tuesday, June 10, 2008 9:48:00 AM  
