BLASTing through the kingdom of life

Sandra Porter

No biology course is complete these days without learning how to do a BLAST search.

Here I describe an assignment and an animated tutorial that teachers can readily adopt, and explain how teachers can obtain the password-protected answer key.

Development of the tutorial and the activity were supported by funding from the National Science Foundation.

This is reposted from the original DigitalBio blog.

This popular activity, designed to accompany the BLAST for beginners tutorial, has been updated to incorporate student comments and teacher requests. Originally developed for the BIO 99 teacher workshop, this activity has been one of the most popular items on Geospiza's web site. We have seen the activity used in several venues from high school courses to workshops for researchers offered by the Lawrence Livermore National Laboratory.

Students BLAST through the kingdom of life by using blastn to identify 16 "unknown" sequences. The 16 sequences were chosen to represent diverse organisms, from RNA viruses that infect yeast all the way to humans. To minimize confusion, the set was compiled from cDNA sequences and from intron-less sequences from bacteria or viruses. Further, every sequence in this set codes for some kind of protein that might be recognizable to students, such as amylase (an enzyme found in spit that breaks down starch) or DNA polymerase (makes DNA). This version of the activity, updated last summer, includes an example sequence along with the answers.
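For teachers who want to show what happens behind the web form, here is a minimal Python sketch of preparing a blastn search. It is not part of the activity itself. It assumes the unknowns are stored in FASTA format and that the search targets the `nt` database through NCBI's BLAST URL API; the helper names `parse_fasta` and `blastn_submission` and the short example sequence are made up for illustration. The sketch only builds the form data and does not contact NCBI.

```python
from urllib.parse import urlencode

# NCBI BLAST URL API endpoint; shown for reference, no request is sent here.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def parse_fasta(text):
    """Return {header: sequence} for FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts at each ">" header line.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines are joined into one uppercase string.
            records[header] += line.upper()
    return records


def blastn_submission(sequence, database="nt"):
    """Build (but do not send) the form data for a blastn search.

    The database name is an assumption; the worksheet may specify another.
    """
    return urlencode({
        "CMD": "Put",
        "PROGRAM": "blastn",
        "DATABASE": database,
        "QUERY": sequence,
    })


# Made-up placeholder, not one of the 16 sequences in the activity.
example = ">unknown_1\nATGGCGTACG\nTTAGC\n"

for name, seq in parse_fasta(example).items():
    print(name, len(seq), blastn_submission(seq))
```

In class, students would still paste each sequence into the BLAST web page as the tutorial shows; the sketch just makes visible that a search boils down to a program name, a database, and a query sequence.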

There are three pieces of information needed for this activity. These are:

1. The taxonomically diverse sequences, located in the Data Set section.

2. A worksheet and answer key, located in the Worksheet section.

3. The BLAST for beginners animated tutorial, located at the top of the Tutorial section.

All of these sections are part of Geospiza's Bioinformatics Teaching Materials.

Unlike "canned" activities, this one has students use real sequences and real databases. Because new information is continually added to the databases, the exact results of a database search change over time, even though the sequence itself and its original source do not.

On one hand, this can be disconcerting when it's unexpected. On the other hand, knowing that these are living and changing resources is exciting. Students know when they use these resources and programs that they're not using old or simplified techniques that are only employed in a classroom setting.

An unfortunate consequence is that grading gets a bit more challenging. The continual addition of information to the NCBI databases used in this activity means that some information that's unknown today might be known tomorrow. The majority of the answers in our key will not change, but new information might be added. Our current plan is to update the answers on a yearly basis, or when we're alerted to problems.

The answer key is password-protected to limit access by students. If you wish to get the password, send an e-mail to digitalbio at with your name, position, and the name of your school.