"Come quickly, Watson," said Sherlock Holmes, "I've been asked to review a mysterious sequence, whose importance I'm only now beginning to comprehend."
The unidentified stranger handed Holmes a piece of paper inscribed with symbols and said it was a map of unparalleled value.
Holmes gazed thoughtfully at the map, then slowly lifted his eyes and coldly surveyed his subject's beaming countenance. "You have an affinity for the ocean," said Holmes, "that you indulged to excess as a reckless youth. An experience as a medic in the military changed your life and gave you a reason to do more than surf. A community college graduate, you achieved positions of high power and left them just as easily. You have a thirst for recognition and a talent for achieving goals that others thought were many years away."
Dr. Venter looked stunned. "You saw all that in the sequence of my genome?" he gasped.
"No," remarked Holmes, "I read your book."(*)
This is how I imagine a meeting between Watson, Holmes, and Venter might play out. Holmes would then ask, "What can we learn from Venter's genome that we don't already know?"
Watson might be somewhat perplexed. There was certainly one thing that really puzzled me.
Aren't all human autosomes diploid?
The title of the PLoS article (2) made it seem like there was something unusual about having solved the sequence of a diploid human. This really puzzled me because we have 23 pairs of chromosomes in addition to our mitochondrial DNA. Except for the mitochondrial DNA, all the chromosomes in women come in pairs. In men, there are two chromosomes without partners, X and Y, but we don't hold it against them.
Anyway, I was confused about the emphasis on Venter's genome being diploid.
By implication, it seemed that Watson's genome was not diploid, or at least that there was something different about it.
It took some sleuthing this morning, but I did figure it out.
Watson's genome was sequenced with the 454 instrument. The beauty of 454 sequencing is that you don't have to clone your DNA in order to sequence it. You sequence single molecules, lots of single molecules, and use assembly algorithms, and I'm guessing a reference genome sequence, to put the sequence together.
How was Venter's genome project different from Watson's?
Understanding how Venter's genome was sequenced and put back together again was a bit challenging because the PLoS paper didn't include anything about the sequencing in the materials and methods section. (I think they were a bit remiss here, but oh well.)
In order to figure out how Venter's group actually sequenced the DNA, I had to use the accession numbers from the paper and hunt around the NCBI until I could find some traces (electropherograms) from the project. I also found some annotations that told me that the sequencing was accomplished using the whole genome shotgun approach. (I wrote a whole series on genome sequencing earlier.)
Anyway, I wanted a trace so I could figure out if the Venter group was using any of the next-generation sequencing technologies. Eventually, I found a trace, opened it in FinchTV, and clicked the i icon to learn about the sequencing instrument. FinchTV presents all kinds of information from chromatogram files. It told me that this trace was obtained from an ABI 3730 sequencer and that it was base-called with KB v. 1.2 (in case you're wondering about this after my obsession with base callers).
I went back to the paper and, at last, I remembered. The Venter sequence was derived from large, CLONED pieces of DNA. If they could map the sequences back to the original clones, they could reconstruct the chromosomes. This is what they did.
You can think about the two different procedures like this: in the Watson version, everything was mixed in a blender, the sequence of each little piece determined, and the entire thing reconstructed from itty bitty parts. In the Venter version, the sequence was broken into large parts. Then the large parts were broken into smaller parts, sequenced and put back together. The difference is that with Venter's genome, it was possible to figure out how the smaller parts fit into the larger parts, and to reconstruct contiguous pieces.
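To make the blender analogy concrete, here is a toy sketch in Python. It is not the pipeline either group used: the two short sequences, the clone names, and the helper functions are all made up for illustration, and real reads overlap and contain errors. It only shows why knowing which large piece a read came from lets you rebuild each chromosome copy separately, while a pile of unlabeled reads tells you only that two different bases exist at a site.

```python
from collections import defaultdict

# Two hypothetical copies of the same short region (made-up sequences).
# They differ at positions 2 and 9.
maternal = "ACGTACGTACGT"
paternal = "ACTTACGTAGGT"

def reads_from(sequence, clone_id, read_len=4):
    """Cut one large piece into non-overlapping reads, remembering its clone."""
    return [(clone_id, start, sequence[start:start + read_len])
            for start in range(0, len(sequence), read_len)]

# Clone-based toy data: every read carries the label of the piece it came from.
labeled_reads = reads_from(maternal, "clone_A") + reads_from(paternal, "clone_B")

# Blender toy data: the same reads with the clone label thrown away.
unlabeled_reads = [(start, bases) for _, start, bases in labeled_reads]

def alleles_per_site(reads):
    """Without labels we only learn which bases occur at each position."""
    seen = defaultdict(set)
    for start, bases in reads:
        for offset, base in enumerate(bases):
            seen[start + offset].add(base)
    # Keep only the positions where more than one base was observed.
    return {pos: sorted(b) for pos, b in seen.items() if len(b) > 1}

def rebuild_by_clone(reads):
    """With labels, reads from one clone stitch back into one chromosome copy."""
    pieces = defaultdict(dict)
    for clone_id, start, bases in reads:
        pieces[clone_id][start] = bases
    return {clone_id: "".join(p[s] for s in sorted(p))
            for clone_id, p in pieces.items()}

het_sites = alleles_per_site(unlabeled_reads)
haplotypes = rebuild_by_clone(labeled_reads)

print("Variable sites (no labels):", het_sites)
print("Rebuilt copies (with labels):", haplotypes)
```

The unlabeled reads reveal that positions 2 and 9 vary, but not which bases travel together on the same chromosome. The labeled reads give back both copies intact.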
Of course the HapMap data helped too, but this is the general idea.
Why do we care? Is this going to be on the test?
Ah, yes, the pre-med question.
Unlike Watson's data, Venter's data allows us to look much more closely at the difference between the two sets of chromosomes. The paper reports that the maternal and paternal sets are quite different and 44% of the genes are heterozygous.
Good news, Craig, your parents weren't closely related!
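If you're wondering what it means to call a gene heterozygous, here is a minimal Python sketch, assuming the simplest possible definition: a gene counts as heterozygous when its maternal and paternal copies differ anywhere. The gene names and sequences are invented, and the result has nothing to do with the paper's 44% figure. It only illustrates the bookkeeping.

```python
# Hypothetical genes, each with a made-up (maternal, paternal) pair of sequences.
genes = {
    "gene_1": ("ATGGCC", "ATGGCC"),  # identical copies: homozygous
    "gene_2": ("ATGACC", "ATGGCC"),  # one difference: heterozygous
    "gene_3": ("ATGTTT", "ATGTTT"),  # identical copies: homozygous
    "gene_4": ("ATGCAA", "ATGCAG"),  # one difference: heterozygous
}

def is_heterozygous(maternal_copy, paternal_copy):
    """Assumed definition: the two copies differ in at least one position."""
    return maternal_copy != paternal_copy

heterozygous = [name for name, (m, p) in genes.items() if is_heterozygous(m, p)]
fraction = len(heterozygous) / len(genes)

print(f"{len(heterozygous)} of {len(genes)} genes heterozygous ({fraction:.0%})")
```

This kind of count needs the two copies kept apart, which is exactly what a diploid assembly provides.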
Venter's data provide an amazing glance at the amount of variation between the two sets of chromosomes in one individual human. The differences determine health, personality, looks, all the things that make us distinct individuals.
And that is why we care.
It's elementary, my dear Watson.
1. The Genome War: How Craig Venter Tried to Capture the Code of Life and Save the World by James Shreeve.
2. Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, et al. (2007) The Diploid Genome Sequence of an Individual Human. PLoS Biol 5(10): e254. doi:10.1371/journal.pbio.0050254
3. Erika Check (2007) James Watson's genome sequenced. Published online 1 June 2007. doi:10.1038/news070528-10