Paper #4

Paper: Yockey

Respondent: Bossard

Discussion

Mills: I happen to agree with Dr. Bossard in that I would like to see the evidence there ever was a two-letter code. First, if you have a two-letter code with a spacer, for me this is no simpler than a three-letter code. I think it is pretty much pure speculation that there ever was a two-letter code. Once you start getting modifications, I think almost all would be lethal.

I was disappointed that you didn't present here what the probability or information content of cytochrome c is. I grant you that it has 104 amino acids, 22 of these are invariant, i.e., they cannot be changed, which is about 20% of them. In twelve positions, there are two different amino acids that may be used, and this goes on down to two positions in which there might be as many as eleven variations.

Yockey: Eighteen.

Mills: Well, that disagrees with the data I've seen but the point is, Did you calculate on the basis of that? What is the probability of forming something like that by chance?

Yockey: It's in one of the 1977 papers.

Mills: Do you have it here?

Yockey: No.

Rust: 2 x 10⁻⁶⁵

Mills: Okay, that is still extremely improbable. It is important to note that there are different possibilities in certain positions of proteins to have more than one amino acid. The only information that we have on that is in cases where people have studied sequence of amino acids in proteins. In one case, I think there were 70 different cytochrome c's that this data comes from. So it is probably not going to change very much from that value.

In regard to your statement that there could be 14 amino acids, that is at least theoretically possible and certainly it could be tested. As far as I know there are more than 14 amino acids in all cytochrome c's that are found biologically. I presume that someone could synthesize this and then see if it has appropriate activity. One cannot necessarily assume that it would have appropriate activity for all species.

Yockey: Two responses. It has been said by Crick that any alteration of the code would be lethal. But I am assuming here that you start out with a doublet code and then you add newcomers. Those newcomers take their own place in line. They don't disturb everybody else.

Mills: I see what you mean but you are assuming to begin with that you have basically three nucleotides in a sequence, the first two of which provide the code, and the third is a spacer.

Yockey: Yes.

Mills: I can't see that's simpler than a triplet code to begin with. It may be mathematically ...

Yockey: No. It is de facto a doublet code because the third nucleotide doesn't make any difference.

Mills: I'll grant you that. I'm just saying it's no simpler biologically.

Yockey: It's just a spacer.

Mills: But it has to be put in there to be a spacer. If it isn't there then you've destroyed the code.

Yockey: No, because it is just a spacer. Your receiver simply looks at the first two, puts that amino acid in, and doesn't bother to look at the third one.

Mills: I grant you that. What I am saying is that that third nucleotide has to be there if it's a spacer. If it's omitted by mutation, deletion...

Yockey: It's not going to be omitted.

Mills: Well, one of the ways of changing things is by deletion, in which case then you change all the succeeding codes.

Yockey: No. You don't change the code if you make a deletion. You simply change the message.

Mills: All along the line.

Yockey: Exactly. But not the code. It's good that you mentioned that because there are a lot of people who don't know the difference between the code and the message. They use the terms interchangeably. This is wrong. The code is a set of rules. The message is what you need to describe a cytochrome c or a hemoglobin molecule. Two different things completely.
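
[Editorial illustration: the distinction drawn in this exchange between the code and the message can be sketched in a few lines of Python. The doublet table, the example sequence, and the function name `translate` are hypothetical, chosen only for illustration; they are not taken from Dr. Yockey's papers. The sketch assumes a receiver that reads three nucleotides at a time, decodes the first two, and ignores the third.]

```python
# Hypothetical doublet code table (the "code": a fixed set of rules).
# Only four of the sixteen possible doublets are assigned here, for brevity.
DOUBLET_CODE = {"GC": "Ala", "GG": "Gly", "GA": "Asp", "GU": "Val"}


def translate(message, code=DOUBLET_CODE):
    """Read the message three nucleotides at a time.

    The first two nucleotides select the amino acid; the third is a
    spacer and is never inspected. Unassigned doublets decode to "???".
    """
    residues = []
    for i in range(0, len(message) - 2, 3):
        doublet = message[i:i + 2]
        residues.append(code.get(doublet, "???"))
    return residues


original = "GCAGGUGACGUG"                    # the "message"
spacer_changed = "GCU" + original[3:]        # substitute the first spacer
spacer_deleted = original[:2] + original[3:]  # delete the first spacer

print(translate(original))        # ['Ala', 'Gly', 'Asp', 'Val']
print(translate(spacer_changed))  # ['Ala', 'Gly', 'Asp', 'Val']
print(translate(spacer_deleted))  # ['Ala', 'Val', '???']
```

[In this sketch a substitution in the spacer position leaves the decoded sequence unchanged, while a deletion shifts every later reading position and so alters the message "all along the line." The table `DOUBLET_CODE` is the same in all three calls: the rules are untouched, only the message differs.]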

Denton: I want to ask Dr. Yockey a question I've wondered about ever since I read his papers. I'd like to object to these probability calculations. I don't deny that in fact the way you carried out the calculations is valid. But it seems to me that if it were so in nature that there was a long sequence of molecules leading from an alpha helix to a double alpha helix, then to possibly a bit of beta sheet, slowly approaching the function of cytochrome c--and if there were very many of such networks approaching many other molecules with the properties of cytochrome c--then the fact that cytochrome c which exists in organisms on earth, and which differs from cytochrome b completely in 3-D form, the fact that it has this particular form--which is of course unique and highly improbable in itself--is irrelevant because there are 10⁶⁵ other possible molecules which can carry out that function. You have shown that a particular molecular conformation is highly improbable. But you haven't shown how many other possible molecules can carry out that function within the space of all possibilities.

Yockey: That's not true. I calculated the total number of cytochrome c molecules, which is something like 10⁶¹. How I did it is in the paper. This is the family of cytochrome c molecules.
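
[Editorial illustration: to give a sense of scale for the "space of all possibilities" under discussion, the following Python sketch compares the family size quoted here (about 10⁶¹) with the number of all sequences of 104 residues drawn from 20 amino acids. It assumes every sequence is equally likely, which is a simplifying assumption of this note and not a reconstruction of the calculation in the 1977 papers; the variable names are illustrative.]

```python
import math

SEQUENCE_LENGTH = 104    # residues in cytochrome c, as stated in the discussion
ALPHABET_SIZE = 20       # assumption: the 20 standard amino acids
LOG10_FAMILY_SIZE = 61   # "something like 10^61" functional sequences

# Work in log10 throughout so the numbers stay readable.
log10_total = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)
log10_fraction = LOG10_FAMILY_SIZE - log10_total

print(f"all sequences   ~ 10^{log10_total:.1f}")
print(f"family fraction ~ 10^{log10_fraction:.1f}")
```

[Under the equal-likelihood assumption the family occupies roughly one part in 10⁷⁴ of the full sequence space. This says nothing about how many unrelated molecules might perform the same function, which is the point Dr. Denton presses in what follows.]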

Denton: You are interested in cytochrome c because it performs a biological function.

Yockey: That's right.

Denton: I'm asking, what if there exist billions of other molecules accessible to a random search which possess the same function? What is the ultimate point of the calculation?

Yockey: Again, for a lousy fifty bucks about a year from now you can find out. This is in chapter 6. I really can't discuss it any more than what's in the papers which are available to you, because it's a rather long chapter and it has a good deal of analytical geometry in it. It has to do with how you find out what the family of acceptable amino acids at a particular site is.

Denton: When I was with Murray Eden a few months ago at the NIH [National Institutes of Health] we discussed your paper. And I said to him, Isn't this paper flawed because he hasn't defined the space of possibilities. It is mathematically correct the way you did it, obviously. But in the prebiotic soup there might have been billions of molecules that could have carried out any particular function.

Yockey: You know how many hamburgers have been sold by some of these places? About 10¹¹ or something like that. So a billion (10⁹) is a very small number. Also, in the published paper I used Grantham's means of deciding how close they were. I'm replacing that by another paper which came out in 1987 and which answers your question. It's a good point, and if I hadn't done the work, your comment would be germane. But read chapter 6.

Denton: But what I am bringing up is the question of probability after the effect.

Yockey: If you buy it from me I won't charge you a sales tax.

Van Till: The title of our Conference is Sources of Information Content in DNA. My question is simply this. Is information, defined as it is in this context, the kind of thing that needs a source? If so, in what sense? If so, what kind of source would be sufficient?

Yockey: Well, yes, it needs a source. But if you ask me who done it, I'd have to say that I don't know. But the message is stored in the DNA. Now, we can talk about the message being transferred from here to there, which we do on a telephone. Or from now to then, or from sometime in the past to now.

Van Till: I am particularly concerned with that last one.

Yockey: If you take any sequence, remember my story about the Sultan and the Mullahs. Any sequence, whether it is written by infidels or believers, will go through this telephone system. This is what we are concerned about. I am not concerned with who made the message. I am concerned with how it goes through the system. What the entropy is, things of that sort.

Van Till: I am still concerned to get at the title of the conference and wonder if we are using terms consistently among one another or not. What does it mean to talk about a source of information? Is it the sort of thing that would be generated in the course of biochemical reactions? Are we using this word source in a way which implies something legitimate or are we asking a non-question?

Yockey: In information theory, the source is described as a Markov process--any Markov process that you choose. In other words, you jump from one letter to another to another to another, and it could be a play by Shakespeare or it could be nonsense. All we can do is to measure its entropy. But we get a lot of things out of that. One thing that I have proved is that the Central Dogma is a theorem, not a basic principle. That's going to shake up a lot of people. I've already published that. I think it's important to understand that the Pythagorean theorem is not a basic property of geometry, it's a theorem. There are a lot of other code systems that have a Central Dogma. And any error correction code has a Central Dogma. You get that purely from the mathematics.
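
[Editorial illustration: the remark that a source "could be a play by Shakespeare or it could be nonsense" and that "all we can do is to measure its entropy" can be sketched in Python. The function below is a zero-order estimate based on single-letter frequencies only; it is a simplification chosen for this note, not the full entropy of a Markov source, and the function name and sample text are illustrative.]

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(sequence):
    """Zero-order Shannon entropy estimate, H = -sum(p * log2(p)),
    using the observed frequency of each symbol as its probability."""
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


text = "to be or not to be that is the question"

# Shuffle the same letters into nonsense (fixed seed for repeatability).
letters = list(text)
random.Random(0).shuffle(letters)
nonsense = "".join(letters)

print(entropy_bits_per_symbol(text))
print(entropy_bits_per_symbol(nonsense))
```

[Both calls give the same value up to rounding, because the measure depends only on symbol frequencies and not on meaning. A first-order Markov estimate would also take letter-to-letter transitions into account and would in general distinguish the two strings, but it would still say nothing about what either string means.]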

Thaxton: In the letter that I sent to all conference participants, I indicated some of the kinds of sources of information that I have seen in the origin-of-life literature. Much of the literature talks about solar photons as ultimately the source of the information. I am hoping some of you will address this question. Solar photons seem to me a rather lame possibility to produce the kind of information we have been talking about. You spoke, Dr. Yockey, of an electrical engineer as something that can produce information. Electrical engineers, most of them I've known at least, have had intelligence as opposed to a rock or a solar photon. We know by experience that electrical engineers can produce this kind of information. But we don't know by experience that solar photons can.

Van Till: My concern is whether we need to look outside the physical system for a source of information or whether it can be generated by processes which take place in the normal course within the system. That brings me back to a concern expressed earlier in the day: Is the created order marked by a functional completeness and does it have built into its design the ability to generate systems like you and me? Or does it have the kind of functional incompleteness which requires information to be fed in continuously from outside in order to generate the systems that we are talking about?

Thaxton: When you say "generated by the system," what do you mean?

Van Till: By processes taking place within the system--let's say chemical processes.

Thaxton: Is that what I've been calling natural processes?

Van Till: Yes.

Thaxton: Okay, so then by your definition all intelligences are ipso facto beyond.

Van Till: I lost the connection. It was too large a leap.

Thaxton: We've been talking about intelligent causes too. Yet when you talk about causes within the system, you restrict them to natural ones. My question is, Is human intelligence within the system?

Van Till: Okay, that's why we talked earlier about two levels to questions about cause. The term "natural cause" deals with a level quite different from the term "intelligent cause." I wonder if we are getting into some semantic difficulties because we are using the term "cause" at multiple levels and we are not making distinctions where distinctions may be necessary.

Thaxton: I'd like to get back to the question of solar photons and I'd like to ask Dr. Yockey: In your understanding, what kinds of source would be legitimate for this code?

Yockey: When you talk about radiation you are talking about Boltzmann-Gibbs entropy. That is not isomorphic with Shannon-Wiener entropy. The two are entirely different things. One is physics; one is biology.

Thaxton: There's a quotation from Grammatical Man by Jeremy Campbell, p. 16 [Simon and Schuster, New York, 1982]: "Evidently nature can no longer be seen as matter and energy alone. Nor can all her secrets be unlocked with the keys of chemistry and physics, brilliantly successful as these two branches of science have been in our century. A third component is needed for any explanation of the world that claims to be complete... Nature must be interpreted as matter, energy, and information." With the advent of information theory we have to realize that information is not reducible to matter and energy but is a third fundamental entity. But what exactly is information? Can you explain some of the different ways we use the word "information"?

Yockey: We get tripped up if we try to make our theories by means of words. If you stick to the mathematics then you don't have any problem. What you try to capture in the word "information" is only part of what we know. It does not include knowledge. As Shannon pointed out in his first monograph, knowledge and meaning are sort of relative. We use the telephone to transfer meaning. We don't have any way of measuring meaning. But we can measure the entropy of the signal it goes through.

Bradley: I think people who do origin-of-life research today recognize that there is a problem with the natural processes we are aware of in generating what seems to be the required functional complexity. Attempts to resolve that problem take several forms. One is to try to conceive of much simpler systems that can somehow provide the function. I think what Wilder-Smith [The Creation of Life, The Natural Sciences Know Nothing of Evolution] has been doing is saying, Look, you're never going to make it with the chance paradigm because what it postulates is too complicated to produce in some accidental way over time. Sidney Fox would say that even at relatively more simple molecular complexities, we can get enough differentiation in terms of function that over time natural selection might help us to weed through these many, many possibilities and come up with some functional selection that would ultimately lead to the kind of life functions that are necessary. But I think everybody today who works in the field recognizes that this is the core problem. I think it is fair for people from a process point of view to say that at the moment we don't have answers. We don't necessarily conclude there aren't answers within the realm of process.

As for a source of information, whether photons or energy flow through the system or whatever, there is a widespread recognition we don't know how to answer that question yet. I've been to two Gordon Conferences, and the last major international conference on the origin of life, and I think everybody who works in the field would acknowledge that. There is nobody blithely saying we have the answer to how to get the necessary complexity. Two approaches to resolving that question are searching for new principles or for much simpler systems that work.

Thaxton: Relating to Dr. Yockey's cytochrome c work in the Journal of Theoretical Biology, one of the criticisms I have heard of that work is that it presupposes a cytochrome c of a length we now know. This argument says certainly we should be able to have something much shorter and thereby improve the probabilities. This was Russell Doolittle's criticism of your work a couple years back. I would like to get your response to that objection.

Yockey: I don't invent facts. I just deal with the ones that exist. That's how long the thing is. I'm not going to shorten it or lengthen it.

Thaxton: So the query of a functional cytochrome c less than 100 amino acids long is speculation. We have to wait until we get one shorter to talk about it?

Yockey: Show me one. I haven't seen it yet.

Wilcox: Throw it back the other direction. Cytochrome c is pretty useless except as a component of the respiratory assembly of the mitochondria. It has no intrinsic biological capabilities, does it? It can't produce life. All it does is transfer an electron between two other cytochromes, which means that however big or small it is, what you've really got to have is a functional assembly, not just a cytochrome c.

Yockey: As I explained in the cytochrome c paper, it is a simple globular protein. It doesn't have 500 amino acids which would make it very difficult to deal with. Its function is well known. It doesn't have any gaps. I'm sure that if I had a graduate student who knew a lot about hemoglobin, I would suggest that the graduate student do a similar study of hemoglobin. But hemoglobin is a complicated molecule. You could spend your whole life learning about hemoglobin. Cytochrome c was a very handy way to start this discussion. That's why I used it.

Wilcox: All I meant by my question was, Can you have a functioning cytochrome c without its context?

Yockey: I just took the sequences I found in the list. I didn't feel I should shorten it, because that would be arbitrary. And I didn't think I should lengthen it. That would be arbitrary too.

Meyer: I want to get some clarification of the differences between some of these categories in information theory. It seems they are especially important in regard to the kind of argument that Dr. Thaxton has been making. It is not really a question, as I understand it, of the quantification of information but rather the identification of qualitatively distinct kinds of information. And especially as that relates to the distinction between randomness and specified complexity. Wicken makes the distinction between structural and thermodynamic information--that structural information is an information sequence that actually conveys a message. This corresponds to the terminology of specified complexity. And I want to know if there is any mathematical way to distinguish between a random sequence and specified complexity if you don't know the code in advance, or the function of the organism. How do you make this qualitative distinction between a specified complexity and a random sequence when the mathematics seems to be identical in treating them?

Yockey: Briefly the answer is that it is not possible to prove whether a given sequence is random or not, because you are talking about elements which are members of themselves. Our friend, Mr. Epimenides, for example. So again, you will have to wait till the book comes out, and I will discuss that. I would like to thank everybody very much for it will be very helpful as I go through this material to make sure the points you raise are clearly explained and clearly stated. I do appreciate every comment that has been made here. Very helpful. Very constructive.