Tuesday, April 14, 2020

size of genome

http://sandwalk.blogspot.com/2011/03/how-big-is-human-genome.html?showComment=1300992550197#c1663047036846499582





manuel "moe" g said...

[Part 1 of 2]

Forgive my ignorance, but I am trying to make sense of different descriptions of the human genome, and different descriptions of the information needed to fully specify a large mammal, like a man.

You talk about 3.5 Gb for the genome, and, giving Ray Kurzweil the benefit of the doubt, 50 million bytes after lossless compression.
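
The gap between those two figures can be made concrete with a rough calculation. This is only a sketch under assumptions that are not stated above: that "Gb" means gigabase pairs, and that each base is stored in 2 bits. The variable names are made up for illustration.

```python
# Assumptions (not from the original discussion): "Gb" = gigabase pairs,
# and each base (A, C, G or T) is encoded in 2 bits.
BASE_PAIRS = 3_500_000_000      # genome size quoted above
BITS_PER_BASE = 2               # assumed: 4 possible bases -> 2 bits
COMPRESSED_BYTES = 50_000_000   # Kurzweil's compressed figure quoted above

# Uncompressed size of the sequence in bytes (8 bits per byte).
raw_bytes = BASE_PAIRS * BITS_PER_BASE // 8

# How much the raw sequence would have to shrink to reach 50 million bytes.
ratio = raw_bytes / COMPRESSED_BYTES

print(f"raw size: {raw_bytes / 1e6:.0f} million bytes")
print(f"implied compression ratio: {ratio:.1f}x")
```

Under these assumptions the raw sequence is 875 million bytes, so the 50 million byte figure implies a compression ratio of about 17.5.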

If someone made extravagant claims about a computer program that runs on some unknown hardware and unknown OS, I would be unamused if they handed me a thumb-drive containing the compressed binary executable, and nothing more. This single file would demonstrate nothing.

I would demand the original source code, the specification for the code (including the business decisions the code is meant to automate, at the very least), some documentation demonstrating that I can move back and forth between points in the specification and the source code lines encoding that part of the specification, and the code for the automated tests (so an automated test can demonstrate what changes to the code will still keep it within specification, at the very least).

And maybe the same for some of the libraries and hardware, maybe even needing the full specification of the libraries, OS, and hardware if they are all very novel, quite unlike any I have worked with before.

So there would be a dramatic explosion of information needed, moving from the binary executable to a bare minimum specification of a computer program as defined above.

manuel "moe" g said...

[Part 2 of 2]

In the debate between PZ and Kurzweil, PZ makes this point:

http://scienceblogs.com/pharyngula/2010/08/ray_kurzweil_does_not_understa.php

"""

Let me give you a few specific examples of just how wrong Kurzweil's calculations are. Here are a few proteins that I plucked at random from the NIH database; all play a role in the human brain.

First up is RHEB (Ras Homolog Enriched in Brain). It's a small protein, only 184 amino acids, which Kurzweil pretends can be reduced to about 12 bytes of code in his simulation. Here's the short description.

MTOR (FRAP1; 601231) integrates protein translation with cellular nutrient status and growth signals through its participation in 2 biochemically and functionally distinct protein complexes, MTORC1 and MTORC2. MTORC1 is sensitive to rapamycin and signals downstream to activate protein translation, whereas MTORC2 is resistant to rapamycin and signals upstream to activate AKT (see 164730). The GTPase RHEB is a proximal activator of MTORC1 and translation initiation. It has the opposite effect on MTORC2, producing inhibition of the upstream AKT pathway (Mavrakis et al., 2008).

Got that? You can't understand RHEB until you understand how it interacts with three other proteins, and how it fits into a complex regulatory pathway.

"""

I am inclined to grant PZ the point, and say his understanding of the immensity of the task outstrips Kurzweil's understanding.

Would the explosion of information needed to move from the complete genome to the complete specification of a large mammal be on the same order as the explosion of information needed to move from the binary executable to a bare minimum specification of a computer program as defined above? Did I capture the gist of it, or am I hopelessly mistaken?
