Thursday, February 10, 2011

Module 4, Lab 14 - Bioinformatics
A simple bioinformatics pipeline to infer a phylogeny


A 'good' electropherogram of the RRss gene from an earthworm (Lumbricus; top),
and a 'bad' electropherogram of the RRss gene from a cnidarian (Hydra; bottom)

_____________________________________________________

During this lab, students followed a simple bioinformatics pipeline (workflow) to process data similar to those they generated in the lab. We started by visualizing DNA sequencing electropherograms and analyzing the differences between a 'good' one (from which reliable information can be extracted) and a 'bad' one (from which no reliable information can be obtained).

Then we did a crude edit of the information contained in the electropherograms and exported the data in fasta file format. Once we had pooled sequence data (ribonucleotide reductase small subunit [RRss], a nuclear gene) from 9 animals into a single fasta file, we did a multiple sequence alignment using an online version of ClustalW available on the European Bioinformatics Institute website.
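For readers who want to see what pooling sequences into a single fasta file looks like, here is a minimal Python sketch. It is not the tool we used in the lab; the function names (parse_fasta, format_fasta) and the two short sequences are made up for illustration and are not real RRss data.

```python
def parse_fasta(text):
    """Parse fasta-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalize case.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(records, width=60):
    """Write (header, sequence) tuples back out as fasta text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical toy sequences (NOT real RRss data), one per "animal".
lumbricus = ">Lumbricus_RRss\nATGGCTAAC\nGTTCCA\n"
hydra = ">Hydra_RRss\natggcaaacgtt\n"

# Pool the records from both sources into one fasta text.
pooled = parse_fasta(lumbricus) + parse_fasta(hydra)
pooled_fasta = format_fasta(pooled)
print(pooled_fasta)
```

The pooled text is what would then be pasted or uploaded into an alignment tool such as ClustalW.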

The alignment was used to generate a nexus file, which can be used to perform phylogenetic analyses. We ran two very basic analyses, one under the criterion of maximum parsimony (MP) and one under the criterion of maximum likelihood (ML), using an online version of PHYLIP available at the Mobyle@Pasteur web portal. Students will compare the outcomes of the two analyses and write a short essay about the process followed to obtain the phylogenies.
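To give a rough idea of what a nexus file built from an alignment contains, here is a small Python sketch. It assumes the alignment is already available as (name, aligned sequence) pairs of equal length; the to_nexus function and the three toy sequences are hypothetical, and this is not how the file was produced in the lab.

```python
def to_nexus(alignment):
    """Build a minimal nexus DATA block from (name, aligned_sequence) pairs."""
    if not alignment:
        raise ValueError("alignment is empty")
    nchar = len(alignment[0][1])
    # Every row of an alignment must have the same number of columns.
    for name, seq in alignment:
        if len(seq) != nchar:
            raise ValueError(f"{name} has length {len(seq)}, expected {nchar}")
    pad = max(len(name) for name, _ in alignment) + 2
    lines = [
        "#NEXUS",
        "",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(alignment)} NCHAR={nchar};",
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
        "  MATRIX",
    ]
    for name, seq in alignment:
        lines.append(f"    {name.ljust(pad)}{seq}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


# Hypothetical toy alignment (NOT real RRss data); '-' marks a gap.
toy_alignment = [
    ("Lumbricus", "ATGGCTAA"),
    ("Hydra", "ATG-CAAA"),
    ("Homo", "ATGGCGAA"),
]

nexus_text = to_nexus(toy_alignment)
print(nexus_text)
```

The length check matters because phylogenetic programs treat each column as one character; sequences of different lengths mean the data were not aligned first.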

The following learning outcomes should have been met:
  • Introduction to the concept and the field of bioinformatics
  • Introduction to NCBI, the main sequence data repository in the Americas and one of the main ones in the world
  • Introduction to the main database in the NCBI website: GenBank
  • Understanding of how to interpret an electropherogram (for DNA sequencing)
  • Basic understanding of the fasta and nexus file formats
  • Basic use of the BLAST algorithm
  • Introduction to the concept of sequence alignment
  • Basic understanding of how to perform a phylogenetic analysis