Your Connectome is a map of your brain. Every neuron, every synapse.
I am only a few pages into Connectome, but I was intrigued by a sentence: "Human DNA... has three billion letters... would be a million pages long if printed as a book." The companion question, "How many pages for the Connectome?", might be answered later in the book, but I thought I would take a shot at it here.
Here is the punchline: Your Connectome book is about 6.7 million times as long as your DNA book.
That human DNA is about a million pages is not too surprising, although that encoding is probably not optimized. According to Quora, a page holds between 1,500 and 1,800 letters. I am going to use a round number, namely 2,000. Then the 3x10^9 DNA letters would actually be 1.5 million pages. But this is very wasteful. Each DNA letter needs only 2 bits, so even with plain 8-bit characters we can encode four DNA letters per character, and the book should really only be about 400K pages. And this book is much more interesting: instead of endless GATCs, you get a full 256-character set to work with. Further compression is definitely possible, and in fact up to 99% compression has been shown. This is largely due to repetitive structures in DNA that can be encoded efficiently. So, taking 99% off the 1.5 million pages, we can write the DNA book with only 15,000 pages. (Robert Jordan's series The Wheel of Time is 11,000 pages and probably a little bit more readable.)
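As a sanity check, the page counts above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic in this post: the names (`dna_book_pages`, `LETTERS_PER_PAGE`, and so on) are made up for illustration, and the 99% figure is applied to the one-letter-per-character page count, which is the reading that yields 15,000 pages.

```python
# Back-of-envelope page counts for the DNA book.
# All constants are the round numbers used in the text, not measured values.
DNA_LETTERS = 3 * 10**9      # letters (bases) in human DNA
LETTERS_PER_PAGE = 2000      # rounded up from the 1,500 to 1,800 quoted above
BASES_PER_CHAR = 4           # 2 bits per base, 8 bits per character
COMPRESSION_PERCENT = 99     # assumed best-case compression


def dna_book_pages():
    """Return (raw, packed, compressed) page counts as integers."""
    raw = DNA_LETTERS // LETTERS_PER_PAGE               # one base per character
    packed = raw // BASES_PER_CHAR                      # four bases per character
    compressed = raw * (100 - COMPRESSION_PERCENT) // 100
    return raw, packed, compressed


raw, packed, compressed = dna_book_pages()
print(f"raw: {raw:,}  packed: {packed:,}  compressed: {compressed:,}")
```

The packed count comes out at 375,000 pages, which the text rounds to 400K.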
Now, the Connectome, according to Wikipedia, has 10^10 neurons and 10^14 synapses. So, on average, there are 10,000 synapses per neuron. If we imagine our Connectome book to be (somewhat) efficiently coded, we could do the following. With 10^10 neurons we need 34 bits to encode the "neuron address", since 2^33 is about 8.6x10^9 (too few) and 2^34 is about 1.7x10^10. I am going to assume that neurons have a "local space" (i.e., they are connected to other neurons that are physically close by), and can be addressed relative to themselves with only 32 bits (if a connection is out of local space, we can use an escape sequence and a full 34-bit address). These 32 bits give us a nice round 4 characters per synapse. Our encoding is then the following: <marker><neuron number><neuron for synapse 1>....<neuron for synapse 10,000>.
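The address-size arithmetic can be checked with a short Python sketch. The helper name `address_bits` and the constants are illustrative assumptions, and the sketch ignores the marker, the neuron number, and any escape sequences.

```python
# Address sizes for the first, simple Connectome encoding.
# Constants are the Wikipedia round numbers quoted in the text.
NEURONS = 10**10
SYNAPSES = 10**14
LOCAL_ADDRESS_BITS = 32      # assumed size of a "local space" relative address
BITS_PER_CHAR = 8


def address_bits(count):
    """Smallest number of bits that gives `count` items distinct addresses."""
    return (count - 1).bit_length()


synapses_per_neuron = SYNAPSES // NEURONS
chars_per_synapse = LOCAL_ADDRESS_BITS // BITS_PER_CHAR

print(address_bits(NEURONS))     # bits for a full neuron address
print(synapses_per_neuron)       # average synapses per neuron
print(chars_per_synapse)         # characters per locally addressed synapse
```

Using `int.bit_length` keeps everything in exact integer arithmetic, which avoids floating-point rounding surprises from `math.log2` near powers of two.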
But this is still highly wasteful. Let's sort each neuron's connections by neuron address and then use a differential encoding, storing only the gap from one address to the next. With 10^10 neurons and 10^4 connections per neuron, even if a neuron's connections were spread over the whole address range, the average "distance" between consecutive synapses would be just 10^6, or a measly million. This takes only 20 bits to encode, or 2.5 characters. Let's round down to 2 characters per synapse, assuming more compression is possible. (I am assuming that there is no repetitive structure in the Connectome, so while some more compression is possible, it is probably not going to be much.) Now we have 2 * 10^14 / 2000 = 10^11 pages in our Connectome book. That is a hundred billion pages, or approximately 6.7 million times as many pages as the 15,000-page DNA book.
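Here is a minimal Python sketch of the differential (delta) encoding idea and of the final page count. The function names are hypothetical, the toy addresses are invented for illustration, and the 2 characters per synapse is the rounded-down assumption from the paragraph above rather than a demonstrated result.

```python
# Delta encoding of a neuron's sorted target addresses, plus the page count.
NEURONS = 10**10
SYNAPSES = 10**14
CHARS_PER_PAGE = 2000
CHARS_PER_SYNAPSE = 2        # assumed: 20 bits (2.5 chars) rounded down
DNA_BOOK_PAGES = 15_000      # compressed DNA book from earlier in the post


def delta_encode(targets):
    """Sort addresses, then keep the first one and the gaps between neighbours."""
    ordered = sorted(targets)
    return ordered[:1] + [b - a for a, b in zip(ordered, ordered[1:])]


def average_gap_bits():
    """Bits needed for the average gap between sorted target addresses."""
    synapses_per_neuron = SYNAPSES // NEURONS
    average_gap = NEURONS // synapses_per_neuron
    return (average_gap - 1).bit_length()


def connectome_pages():
    """Pages in the Connectome book at CHARS_PER_SYNAPSE characters per synapse."""
    return CHARS_PER_SYNAPSE * SYNAPSES // CHARS_PER_PAGE


print(delta_encode([7, 3, 10]))                       # toy addresses
print(average_gap_bits())
print(f"{connectome_pages():,} pages")
print(f"{connectome_pages() / DNA_BOOK_PAGES:,.0f} times the DNA book")
```

The gaps are small numbers even when the addresses themselves are large, which is why sorting first pays off.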
Amazon can deliver one, but might struggle with the other.
Of course, I have assumed that, just as folding is ignored in the DNA book, the XYZ coordinates of synapses are not required for the Connectome, just the connection graph.