Basics: How do you sequence a genome?

Sandra Porter

About a week ago, I offered to answer questions about subjects that I've either worked with, studied or taught.

I haven't had many questions yet, but I can certainly answer the ones I've had so far. Today, I'll answer the first question:

How do you sequence a genome?

Before we get into the technical details, there are some other genomic questions that you might like answered.

How much does it cost to sequence a genome?

I remember being at the O'Reilly bioinformatics conference in 2002 and hearing Lee Hood challenge the DNA sequencing community to lower the cost of sequencing a human genome to $1000. It was all pretty exciting!

We're not there yet, but we're getting closer. I've heard secondhand, from one of our customers, that it costs about $10,000 to sequence an average-sized bacterial genome, once you've purchased your sequencers, bought your software, and built your lab. Just for a bit of perspective, an average bacterial genome is about 750 times smaller than the human genome.

I'll leave you to do the math, but I imagine it scales pretty well. Ten million for a human genome seems about right, especially considering the original version was estimated to cost about 3 billion dollars.
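For anyone who would rather see the math done, here is a minimal sketch in Python. It uses only the two figures quoted above (about $10,000 per bacterial genome and a 750-fold size difference) and assumes that cost scales linearly with genome size, which is a simplification. The function and variable names are made up for illustration.

```python
# Figures quoted in the post; linear scaling with genome size is an assumption.
BACTERIAL_COST_USD = 10_000      # secondhand estimate for one bacterial genome
SIZE_RATIO = 750                 # human genome size / average bacterial genome size
ORIGINAL_COST_USD = 3_000_000_000  # rough estimate for the original human genome


def scaled_cost(cost_small, size_ratio):
    """Scale a per-genome cost linearly with genome size."""
    return cost_small * size_ratio


human_estimate = scaled_cost(BACTERIAL_COST_USD, SIZE_RATIO)
savings_factor = ORIGINAL_COST_USD / human_estimate

print(f"Estimated human genome cost: ${human_estimate:,}")      # $7,500,000
print(f"Cheaper than the original by about {savings_factor:.0f}x")  # 400x
```

The straight multiplication gives $7.5 million, which is in the same ballpark as the ten million guessed above.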

What kind of infrastructure do you need to have?

You will need lots of robots for pipetting and preparing DNA, DNA sequencing instruments, computers, and software for tracking samples, evaluating sequence quality, and assembling the sequences at the end.

Some of the other types of equipment will depend on the methods that you're using. If you're using an older method, you'll need autoclaves and special incubators for growing bacteria. If you're using a newer method, like pyrosequencing, you'll need a special clean room where you can work with a lower risk of contamination.

Fine, so how do you go about doing it?

This used to be an easier question to answer, but now that pyrosequencing (from 454) has come along, the answer isn't as simple.

Still, I can divide the steps into three general parts, and then, since there are some nice movies and Flash® animations on the internet, I will send you out to go watch them.

Here are the steps:

  • Break the genome into lots of small pieces at random positions.
  • Determine the sequence of each small piece of DNA.
  • Use an assembly program to figure out which pieces fit together.

The last two steps are a lot like determining what was written in the Dead Sea Scrolls.
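To make the three steps a little more concrete, here is a toy sketch in Python. It is not how a real assembly program works: it assumes error-free reads, no repeated regions, and a simple greedy strategy that keeps merging the two pieces with the longest overlap. The function names, the tiny "genome", and the start positions are all invented for illustration, and the start positions are fixed here (rather than random) so the result is reproducible.

```python
def shotgun_reads(genome, starts, read_len):
    """Steps 1 and 2: cut pieces out of the genome and 'read' them.

    In a real project the start positions are random; here they are
    passed in so that the example is reproducible.
    """
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that matches a prefix of b.

    Returns 0 if the best match is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Step 3: repeatedly merge the pair of pieces with the longest overlap."""
    pieces = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps; return separate contigs
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


genome = "ATGCCGTAAGCTTGACA"                      # made-up 17-base "genome"
reads = shotgun_reads(genome, starts=[0, 4, 8, 11], read_len=8)
contigs = greedy_assemble(reads)

print(reads)    # ['ATGCCGTA', 'CGTAAGCT', 'AGCTTGAC', 'TTGACA']
print(contigs)  # ['ATGCCGTAAGCTTGACA']
```

With these four overlapping reads, the greedy merge rebuilds the original string. Real genomes have repeats and sequencing errors, which is why real assemblers also use quality scores and much more careful overlap logic.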

Stay tuned. There will be more.

