Wednesday, January 07, 2009

Jeff Shallit's "Information" (but really raw K-complexity) Quiz

Jeff Shallit..

"Q1: Can information be created by gene duplication or polyploidy? More specifically, if x is a string of symbols, is it possible for xx to contain more information than x?"

No. While raw/naked K-complexity may indeed be increased, raw K-complexity is not information. The second x is just a repeat of the information in the first, and it can therefore be compressed back into the first x. The first x, however, as an optimized information string, cannot be further compressed without information loss. The fact that there are two x's (instead of one) is system-level information (a fact about the system), not information contained in x.
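The compressibility of the second x can be checked with a rough sketch. This is only an illustration under stated assumptions: zlib's compressed length is used as a crude, computable upper-bound stand-in for K-complexity (which is itself uncomputable), and the 2000-byte pseudo-random string `x` and the helper name `compressed_len` are hypothetical choices made for the example.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Byte length of the zlib-compressed data (a crude proxy, not true K-complexity)."""
    return len(zlib.compress(data, 9))


# Hypothetical stand-in for an already "optimized" (hard to compress) string x.
rng = random.Random(0)
x = bytes(rng.getrandbits(8) for _ in range(2000))
xx = x + x  # the duplicated string

len_x = compressed_len(x)
len_xx = compressed_len(xx)

# The second copy costs only a small number of extra bytes, far fewer than len_x.
print("x :", len_x, "bytes compressed")
print("xx:", len_xx, "bytes compressed")
```

Under this proxy the measured length of xx comes out somewhat larger than that of x, but nowhere near double, because the compressor encodes the second copy as back-references to the first.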

"Q2: Can information be created by point mutations? More specifically, if xay is a string of symbols, is it possible that xby contains significantly more information? Here a, b are distinct symbols, and x, y are strings."
No. While raw K-complexity may indeed be increased, random (unspecified) mutations can only destroy information. Point mutations are -- by definition -- uncorrelated and non-symbolic (at the level in question). The replacement of "a" (a correlated, coordinated, specified symbol) with "b" (an uncorrelated, uncoordinated, unspecified, i.e. random, character) at the very least negates the information contained in "a" and may well damage the rest of the string -- a string that may well depend critically upon "a."

"Q3: Can information be created by deletion? More specifically, if xyz is a string of symbols, is it possible that xz contains significantly more information?"
No. While raw K-complexity may again be increased, the loss of the functioning symbol "y" constitutes a loss of information.

"Q4: Can information be created by random rearrangement? More specifically, if x is a string of symbols, is it possible that some permutation of x contains significantly more information?"
No. "Random rearrangement/permutation" ALONE always replaces coordinated (non-random) information with uncoordinated (random) events, and information is almost always lost. While raw K-complexity may indeed be increased, this raw K-complexity is not information and is wholly parasitic upon the underlying system level of complexity.
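The raw K-complexity increase conceded above can be illustrated with a small sketch. As an assumption, zlib's compressed length again serves as a rough, computable proxy for K-complexity; the highly ordered example string and the names `ordered`, `permuted`, and `compressed_len` are hypothetical and chosen only for this illustration.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Byte length of the zlib-compressed data (a crude proxy, not true K-complexity)."""
    return len(zlib.compress(data, 9))


# A highly ordered string: 1000 zeros followed by 1000 ones.
ordered = b"0" * 1000 + b"1" * 1000

# A random permutation of exactly the same symbols.
symbols = list(ordered)
random.Random(0).shuffle(symbols)
permuted = bytes(symbols)

len_ordered = compressed_len(ordered)
len_permuted = compressed_len(permuted)

print("ordered :", len_ordered, "bytes compressed")
print("permuted:", len_permuted, "bytes compressed")
```

The permuted string contains the same symbols in the same proportions, yet it compresses far less well. The sketch measures compressed length only; it says nothing about whether the permuted string is specified or functional.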

"Q5. Can information be created by recombination? More specifically, let x and y be strings of the same length, and let s(x, y) be any single string obtained by "shuffling" x and y together. Here I do not mean what is sometimes called "perfect shuffle", but rather a possibly imperfect shuffle where x and y both appear left-to-right in s(x, y) , but not necessarily contiguously. For example, a perfect shuffle of 0000 and 1111 gives 01010101, and one possible non-perfect shuffle of 0000 and 1111 is 01101100. Can an imperfect shuffle of two strings have more information than the sum of the information in each string?"
No. This question is somewhat ambiguous, however. Is that A: "an imperfect shuffle of two strings of information," or B: "an imperfect shuffle of two strings of meaningless binary digits"? In both cases, however, the answer is no. Unless the information (A) is suitably protected, a sustained RANDOM shuffling will only decrease information. While raw K-complexity may once again be increased by this action, raw unspecified K-complexity is not specified/correlated and it is consequently not information.
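The shuffle operation described in Q5 can be made concrete with a short sketch. The function names below (`perfect_shuffle`, `imperfect_shuffle`, `is_shuffle`) are hypothetical and are not taken from Shallit's post; the sketch only assumes the definition quoted above, namely that both x and y must appear left-to-right in s(x, y), though not necessarily contiguously.

```python
import random


def perfect_shuffle(x: str, y: str) -> str:
    """Strictly alternate the symbols of two equal-length strings."""
    return "".join(a + b for a, b in zip(x, y))


def imperfect_shuffle(x: str, y: str, rng: random.Random) -> str:
    """Interleave x and y at random while keeping each string's own order."""
    n = len(x) + len(y)
    x_slots = set(rng.sample(range(n), len(x)))  # positions that take symbols of x
    xi, yi = iter(x), iter(y)
    return "".join(next(xi) if i in x_slots else next(yi) for i in range(n))


def is_shuffle(s: str, x: str, y: str) -> bool:
    """Check whether s is some order-preserving interleaving of x and y."""
    if len(s) != len(x) + len(y):
        return False
    # reachable[j]: s[:i + j] can be built from x[:i] and y[:j] (row i, rolled in place)
    reachable = [False] * (len(y) + 1)
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            from_x = i > 0 and reachable[j] and x[i - 1] == s[i + j - 1]
            from_y = j > 0 and reachable[j - 1] and y[j - 1] == s[i + j - 1]
            reachable[j] = from_x or from_y
    return reachable[len(y)]


print(perfect_shuffle("0000", "1111"))         # 01010101, as in the quoted example
print(is_shuffle("01101100", "0000", "1111"))  # True: the quoted imperfect shuffle
print(imperfect_shuffle("0000", "1111", random.Random(1)))
```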

According to Shallit however..

"The answer to each question is 'yes.' In fact, for questions Q2-Q5, I can even prove that the given transformation can arbitrarily increase the amount of information in the string, in the sense that there exist strings for which the given transformation increases the complexity by an arbitrarily large multiplicative factor. I won't give the proofs here, because that's part of the challenge: ask your creationist to provide a proof for each of Q1-Q5. "

Yes to raw K-complexity increase but no to semantic information increase. Proofs of raw K-complexity increase are not proofs of informational K-complexity increase.

Jeff Shallit continues...

"Now I asserted that creationists usually cannot answer these questions correctly, and here is some proof.

"Q1. In his book No Free Lunch, William Dembski claimed (p. 129) that "there is no more information in two copies of Shakespeare's Hamlet than in a single copy. This is of course patently obvious, and any formal account of information had better agree." Too bad for him that Kolmogorov complexity is a formal account of information theory, and it does not agree."

"Q2. Lee Spetner and the odious Ken Ham are fond of claiming that mutations cannot increase information. And this creationist web page flatly claims that "No mutation has yet been found that increased the genetic information." All of them are wrong in the Kolmogorov model of information."

"Q4. R. L. Wysong, in his book The Creation-Evolution Controversy, claimed (p. 109) that "random rearrangements in DNA would result in loss of DNA information". Wrong in the Kolmogorov model."

The Kolmogorov model is incomplete. The Shannon model is also incomplete. The Shannon model needs to be supplemented with a correlational improbability, and the Kolmogorov model needs to be supplemented with irreducible correlational couplings. These additions are necessary to avoid the fallacy of... "Because both random strings and information strings are K-complex (true)... therefore, randomness = information (false)." The second part of this generic statement ("randomness = information") is false. When I produced this post I did not produce a random string of letters. Instead I produced information -- a textual (B) correlation of what I have in my mind (A).

In order to complete Shannon's and Kolmogorov's theories, the extended/projected nature of information must be mathematically formalized. I have proposed doing this by means of a correlational improbability figure in Shannon's theory and irreducible coupling for information in Kolmogorov's theory.

If there is no correlation between the email that you send your friend and the information he/she receives, then the "extension" (from A to B) has failed and the information has been lost. "Information" thus depends critically upon correlational integrity (input-to-output mapping). Randomness is not correlated/coordinated (by definition) and does not depend on correlational integrity. Randomness depends upon lack of correlation.

Information (A) is always information "about" something (B). Randomness is not information "about" anything. Graphically now, "in-FORM-ation" is K-complex on the "y" axis while being irreducibly K-simple (ordered/formed) on the "x" axis. A random string is just naked/raw K-complexity. The introduction of random events into the DNA information string is a losing proposition.

As I see it..

In Q1 Jeff is trying to defend RANDOM gene duplications.
In Q2 Jeff is trying to defend RANDOM point mutations.
In Q3 Jeff is trying to defend RANDOM gene deletions.
In Q4 Jeff is trying to defend RANDOM gene shuffling/rearrangement.
In Q5 Jeff is trying to defend RANDOM gene shuffling/recombination.


William Brookfield said...

In writing this text I am projecting “informational” patterns that initially exist in my mind or brain (A) outward in correlated and coordinated form into my computer (B) and over the Internet (C) for others to read (D) at distant places. I am suggesting that a valid theoretical model of “information” must include information’s correlated and projected nature -- and that any valid mathematical model of information must provide a numeric quantification of this ongoing “coordinated projectionality.” As far as I can tell, neither Kolmogorov’s nor Shannon’s theory of information includes any such quantification.

Only minds can give birth to information. This is because only the “mind’s eye” (with its insight, hindsight and foresight) possesses an inner mental space large enough to initially nurture and hold information for subsequent projection outward.

Thanks to the massive effort of computer inventors and designers we now have stable correlated computer environments safe/stable enough for the processing, storage and transmission of digitized information. It is the shape that information takes when digitized that is of interest to information theorists (such as myself).

Anonymous said...

I am trying to understand your critique, but these terms confuse me:

-- "raw K-complexity"

-- "system level information"

-- "optimized information string"

-- "non-symbolic"

-- "correlated, coordinated specified symbol"

-- "damage" to a string

-- "functioning symbol"

-- "coordinated information"

-- "informational K-complexity"

Please refer me to some papers or books in the information theory literature where these definitions can be found. Thanks!

William Brookfield said...

Hi Anonymous,

The key distinction is between #1, "raw K-complexity," and #2, "informational K-complexity." Raw K-complexity is just a random/haphazard string of symbols -- just gibberish. Informational K-complexity is not random because all of the symbols are specified and functional components (decision nodes) of a larger structure.

When one is talking about random gene duplications, random mutations, etc., the K-complexity being produced is the wrong type of K-complexity, i.e., raw unspecified K-complexity -- not specified aperiodic K-complexity (information).

There is a crucial difference between UNspecified aperiodic complexity and specified aperiodic complexity.

Raw one-dimensional K-complexity is produced by the collision of randomness (a uniform probability distribution) and any digitized system that is attempting to accurately express it. "Digitization" is a problem for randomness because every digit (ATGC) is itself a manifestation of order (the opposite of randomness). DNA (ATGC) is a digitized system. The result is a lawless, haphazard string, i.e., raw/naked K-complexity.

Informational K-complexity, on the other hand, is specified in that it contains specific instructions that map (correlate) to (biological) function and form. This is what is needed for functional DNA.

Randomness plus digitization provide only unspecified raw K-complexity (noise).

Informational K-complexity is specific and requires error correction (i.e., protection from random errors in order to maintain functionality). Random/spurious/non-symbolic events are the enemy of order and the enemy of information.

See perhaps – "Error Correction Runs Deep" plus “Error Correction Runs Yet Deeper” -- both by Mike Gene.

Non-informational K-complexity is unspecified. Being random and non-functional, it requires no protection from random errors (for it is itself just a string of random errors).

Anonymous said...

So you are saying that none of the terms you are using can be found in the information theory literature?