sequence analysis

In our series on why $1000 genomes cost $2000, I raised the issue that the $1000 genome is a value based on simplistic calculations that do not account for the costs of confirming the results. Next, I discussed how errors are a natural consequence of the many processing steps required to sequence DNA and why results need ... Read more

You might think the coolest thing about the Next Generation DNA Sequencing technologies is that we can use them to sequence long-dead mammoths, entire populations of microbes, or bits of bone from Neanderthals.


... Read more

Last spring, I gave my first hands-on workshop in working with Next Generation Sequencing data at the Eighth Annual UT-ORNL-KBRIN Bioinformatics Summit at Fall Creek Falls State Park in Tennessee. The proceedings from that conference are now online at BMC Bioinformatics, and it's fun to look back and reflect on all that I learned at the conference and all that's happened since.

... Read more

We'll have a blast, I promise! But there's one little thing we need to discuss first...

I want to explain why I'm going to use nucleotide sequences for the blast search. (I used protein the other day.) It's not just because someone told me to; there is a solid, rational reason for this.

The reason is the redundancy in the genetic code.

Okay, that probably didn't make any sense to those of you who didn't already know the answer. Here it is. ... Read more
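As a rough illustration of what redundancy in the genetic code means in practice, here is a minimal Python sketch. It is not taken from the original post: the two DNA sequences are made up, and `CODON_TABLE` holds only the handful of standard codons the example needs. It shows that two sequences can differ at several nucleotide positions and still encode the same peptide, so a nucleotide comparison can reveal differences that a protein comparison would hide.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
}


def translate(dna):
    """Translate a DNA string, codon by codon, using the partial table."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_mismatches(a, b):
    """Count positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical sequences, invented for this illustration.
seq_a = "ATGCTTTCTGGT"  # ATG CTT TCT GGT
seq_b = "ATGTTGAGCGGA"  # ATG TTG AGC GGA

protein_a = translate(seq_a)
protein_b = translate(seq_b)
nucleotide_differences = count_mismatches(seq_a, seq_b)
protein_differences = count_mismatches(protein_a, protein_b)

print(protein_a, protein_b)
print(nucleotide_differences, protein_differences)
```
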

Genome sequences from California and Texas isolates of the H1N1 swine flu are already available for exploration at the NCBI. Let's do a bit of digital biology and see what we can learn.

Activity 1. What kinds of animals get the flu?

For the past few years we've been worrying about avian (bird) flu. Now, we're hearing about swine (pig) flu.

All of this news might make you wonder just who gets the flu besides pigs, birds, and humans. We can find out by looking at the data.

Over the past few years, researchers have been sequencing ... Read more

In which we identify unknown human proteins.

Yesterday, I wrote about using the BLOSUM 62 matrix to calculate a score for matches between two proteins. Those scores give us a good start on understanding how blastp determines whether two sequences are matching by chance or because they're more likely to be related. But that's not all there is to calculating a blast score, and there's at least one other statistic to consider as well, the E value.
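For readers who want a feel for how an E value relates to a score, here is a small Python sketch of the Karlin-Altschul relationship, E = K·m·n·e^(-λS), where m and n are the query and database lengths and S is the raw alignment score. This is an illustration, not the method from the original post: the `LAMBDA` and `K` values are assumed (they are figures commonly quoted for BLOSUM62 with gap costs of 11 and 1), and the query and database lengths are invented.

```python
import math


def expected_hits(raw_score, query_len, db_len, lam, k):
    """E value: expected number of chance alignments scoring >= raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalized (bit) score, which folds lambda and K into the score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Assumed statistical parameters; real values depend on the scoring
# matrix and gap penalties that the search actually uses.
LAMBDA = 0.267
K = 0.041

# Hypothetical search: a 250-residue query against 5 million residues.
QUERY_LEN = 250
DB_LEN = 5_000_000

e_low = expected_hits(40, QUERY_LEN, DB_LEN, LAMBDA, K)
e_high = expected_hits(200, QUERY_LEN, DB_LEN, LAMBDA, K)

print(f"raw score 40:  E = {e_low:.3g}")
print(f"raw score 200: E = {e_high:.3g}")
```

The higher the score, the smaller the E value, because fewer alignments that good are expected to turn up by chance in a database of that size.
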

It all comes down to biochemistry
... Read more

Here's a fun puzzler for you to figure out.

The blast graph is here:

[Image: blast graph]

The table with scores is here, click the table to see a bigger image:

... Read more