I made this video (below the fold) to illustrate the steps involved in making a phylogenetic tree. The basic steps are to:
- Build a data set
- Align the sequences
- Make a tree
In the class that I'm teaching, we're making these trees in order to compare sequences from our metagenomics experiment with the multiple copies of 16S ribosomal RNA (rRNA) genes that we can find in single bacterial genomes. Bacteria contain between 2 and 13 copies of 16S rRNA genes, and we're interested in knowing how much they differ from each other. Later, we'll compare the 16S rRNA genes from multiple species of bacteria to see how much these genes differ across a variety of bacteria.
What's in the video?
The video is about 14 minutes long, so here's a quick description of what it contains.
We begin by getting data and making a data set. Some of our class data come from our metagenomics data sets. We get other data from the NCBI. The video shows how we get all the sequences for all of the 16S rRNA genes from single genomes.
Then we edit the data set to remove the paragraph characters and shorten some of the sequence descriptions. In real life, most of the time we spend doing bioinformatics goes to editing and formatting data.
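If you'd rather script this editing step than do it by hand, here is a minimal Python sketch. It assumes the data set is in FASTA format (description lines starting with ">"), and it shortens each description to its first word. The function name `clean_fasta`, the `max_label` cutoff, and the example labels are made up for illustration; the video does the editing manually.

```python
def clean_fasta(text, max_label=20):
    """Join wrapped sequence lines and shorten each description line."""
    records = []
    label, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new record begins: save the one we were collecting.
            if label is not None:
                records.append((label, "".join(chunks)))
            # Keep only the first word of the description, truncated.
            words = line[1:].split()
            label = words[0][:max_label] if words else "unnamed"
            chunks = []
        elif label is not None:
            chunks.append(line)  # one piece of a wrapped sequence
    if label is not None:
        records.append((label, "".join(chunks)))
    return "\n".join(f">{name}\n{seq}" for name, seq in records)


# Hypothetical two-record data set with wrapped sequence lines.
raw = ">copy1 16S ribosomal RNA, example genome\nACGT\nTTGA\n\n>copy2 another long description\nGGCC\n"
print(clean_fasta(raw))
```

Each record comes out as one short description line followed by the whole sequence on a single line.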
After that, we use JalView, a client-server program, to connect with a web service at the University of Dundee where the sequences are aligned by ClustalW. I've written about JalView before; now you can see it in action.
Making the tree is the simplest part. When ClustalW aligns the sequences, it also performs the calculations that can guide tree-building. We use the neighbor-joining method in this video. Neighbor-joining trees group sequences together by the number of amino acid or nucleotide differences. The sequences that are most similar are placed most closely together on the tree.
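To make "number of differences" concrete, here is a small Python sketch that counts the mismatched positions between every pair of aligned sequences. This is only the pairwise comparison that a distance-based method starts from, not the neighbor-joining algorithm itself and not what ClustalW or JalView run internally. The sequences and names are toy examples, and in this sketch a gap character counts as a difference like any other mismatch.

```python
from itertools import combinations


def count_differences(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def difference_table(aligned):
    """Return {(name1, name2): number of differences} for every pair."""
    return {
        (n1, n2): count_differences(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


# Toy alignment (made-up sequences, all the same length).
aligned = {
    "copy1": "ACGTACGT",
    "copy2": "ACGTACGA",
    "copy3": "TCGAACGA",
}

for pair, diffs in difference_table(aligned).items():
    print(pair, diffs)
```

In this toy example, copy1 and copy2 differ at one position, so they are the most similar pair and would sit closest together on the tree.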
UPDATE: I want to clarify a few things. This video only shows one quick and easy method. The merits of different types of tree-building programs are not discussed.
Other topics that are not included:
- rooted vs. unrooted trees
- building a tree for publication
All of those topics would be important if we were going to build a tree and publish it. If we just want to investigate relationships, the method in the video will suffice.