Connectome as a Book


Your Connectome is a map of your brain.  Every neuron, every synapse.

I am only a few pages into Connectome, but was intrigued by a sentence: "Human DNA....has three billion letters....would be a million pages long if printed as a book."  The companion question, "How many pages for the Connectome?" might be answered later in the book, but I thought I would take a shot at it here.

Here is the punchline: Your Connectome book is 6.7 million times longer than your DNA book.

That human DNA is about a million pages is not too surprising, although it probably is not optimized. According to Quora, there are between 1,500 and 1,800 letters per page.  I am going to use a round number, namely 2,000.  Then the 3x10^9 DNA letters would actually fill 1.5 million pages.  But this is very wasteful.  Each DNA letter needs only 2 bits, so even a plain 8-bit character can encode four of them, and the book should really only be about 400K pages.  And this book is much more interesting; instead of endless GATC's, you get a full 256-character set to work with. Further compression is definitely possible, and in fact, up to 99% compression has been shown. This is largely due to repetitive structures in DNA that can be encoded efficiently.  Applied to the 1.5 million page version, that means we can write the DNA book with only 15,000 pages.  (Robert Jordan's series The Wheel of Time is 11,000 pages and probably a little bit more readable.)
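For anyone who wants to check the DNA arithmetic, here is a minimal Python sketch. The variable names are mine, and the inputs (2,000 letters per page, four DNA letters per character, 99% compression measured against the raw 1.5 million pages) are just the assumptions from the paragraph above, not measured values.

```python
# Rough page counts for the "DNA book", using the round numbers from the text.
LETTERS_PER_PAGE = 2000          # assumed round number (Quora says 1,500 to 1,800)
DNA_LETTERS = 3 * 10**9          # three billion letters

# One printed character per DNA letter.
raw_pages = DNA_LETTERS // LETTERS_PER_PAGE

# Each letter (G, A, T, C) needs 2 bits, so one 8-bit character holds four.
LETTERS_PER_CHAR = 8 // 2
packed_pages = raw_pages // LETTERS_PER_CHAR

# Assumption: "99% compression" means keeping 1% of the raw page count.
compressed_pages = raw_pages // 100

print(raw_pages, packed_pages, compressed_pages)
```

The packed figure comes out at 375,000 pages, which is where the "about 400K" above comes from.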

Now, the Connectome, according to Wikipedia, has 10^10 neurons and 10^14 synapses.  So, on average, there are 10,000 synapses per neuron.  If we imagine our Connectome book to be (somewhat) efficiently coded, we could do the following.  With 10^10 neurons we need 34 bits to encode the "neuron address", since 2^33 is a bit less than 10^10 and 2^34 is a bit more.  I am going to assume that neurons have a "local space" (i.e., are connected to other neurons that are physically close by), and can be addressed relative to themselves with only 32 bits (if a connection is out of local space, we can use an escape sequence and a full 34 bit address).  These 32 bits give us a nice round 4 characters per neuron address.  Our encoding is then the following: <marker><neuron number><neuron for synapse 1>....<neuron for synapse 10,000>.
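The bit counting can be checked with a few lines of Python. This is only a sketch under the assumptions above: `bits_for` is a helper name I made up, and the page estimate ignores the per-neuron marker, the neuron number, and any escape sequences.

```python
# Address sizes for the naive Connectome encoding.
NEURONS = 10**10
SYNAPSES = 10**14
LETTERS_PER_PAGE = 2000  # same round number as for the DNA book


def bits_for(count):
    """Bits needed to give each of `count` items a distinct address (0..count-1)."""
    return (count - 1).bit_length()


synapses_per_neuron = SYNAPSES // NEURONS      # average fan-out
full_address_bits = bits_for(NEURONS)          # absolute neuron address
local_address_chars = 32 // 8                  # assumed 32-bit relative address

# Page count if every synapse costs one 4-character relative address.
naive_pages = SYNAPSES * local_address_chars // LETTERS_PER_PAGE

print(synapses_per_neuron, full_address_bits, local_address_chars, naive_pages)
```

So before any further tricks, the book would already run to about 2x10^11 pages.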

But this is still highly wasteful.  Let's sort each neuron's connections by neuron address and then use a differential encoding, storing only the gap from one target address to the next. With 10^10 neurons, and 10^4 connections per neuron, the average "distance" between consecutive targets will be just 10^10 / 10^4 = 10^6, or a measly million. This takes only 20 bits to encode, or 2.5 characters.  Let's round down to 2 characters per synapse, assuming more compression is possible.  (I am assuming that there is no repetitive structure in the Connectome, so while some more compression is possible, it is probably not going to be much.)  Now we have 2 * 10^14 / 2000 = 10^11 pages in our Connectome book.  That is a hundred billion pages, or approximately 6.7 million times as many pages as the 15,000-page DNA book.
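Here is a small Python sketch of the differential encoding idea, followed by the final page arithmetic. The function names `delta_encode` and `delta_decode` are hypothetical, the example addresses are made up, and a real encoder would also need variable-length gaps and the escape sequence, which I leave out.

```python
# Differential (delta) encoding of one neuron's sorted target addresses.
def delta_encode(targets):
    """Sort the target addresses and return the gaps between neighbours."""
    gaps = []
    previous = 0
    for address in sorted(targets):
        gaps.append(address - previous)
        previous = address
    return gaps


def delta_decode(gaps):
    """Rebuild the sorted target addresses from the gaps."""
    addresses = []
    total = 0
    for gap in gaps:
        total += gap
        addresses.append(total)
    return addresses


# Made-up example: three targets roughly a million addresses apart.
example_gaps = delta_encode([5_000_000, 1_000_000, 3_000_000])

# Page arithmetic with the round numbers from the text.
NEURONS = 10**10
SYNAPSES = 10**14
LETTERS_PER_PAGE = 2000
DNA_PAGES = 15_000                               # compressed DNA book

average_gap = NEURONS // (SYNAPSES // NEURONS)   # 10^10 / 10^4
gap_bits = (average_gap - 1).bit_length()        # bits for an average gap
CHARS_PER_SYNAPSE = 2                            # 2.5 characters, rounded down
connectome_pages = CHARS_PER_SYNAPSE * SYNAPSES // LETTERS_PER_PAGE
ratio_in_millions = round(connectome_pages / DNA_PAGES / 10**6, 1)

print(example_gaps, average_gap, gap_bits, connectome_pages, ratio_in_millions)
```

The gaps are much smaller numbers than the addresses themselves, which is the whole point: small numbers need fewer bits.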

Amazon can deliver one, but might struggle with the other.


Of course, I have used the assumption that, just as folding is ignored in the DNA book, the XYZ coordinates of synapses are not required for the Connectome book, just the connection graph.








