Bioinformatics and Sequence Analysis Workshop
Most recent workshop: April 6 & 7, 2006

Sponsored by: The Department of Biological Sciences at Tennessee State University and Pittsburgh Supercomputing Center's National Resource for Biomedical Supercomputing

This two-day workshop is designed to teach researchers at Tennessee State University the basics of sequence analysis. Workshop participants will learn how to use tools for sequence analysis on the supercomputers at Pittsburgh Supercomputing Center (PSC). Lectures will be led by Alex Ropelewski, Hugh Nicholas, and others. Computer laboratory sessions will follow each lecture. Each participant will receive an account at the PSC to allow them to experiment with the techniques described throughout the workshop. Each of the four topics described below will be covered in a lecture followed by a computer laboratory session:

  • Bioinformatics and Sequence Analysis Resources on the Internet
    The advent of the Internet and the World Wide Web has substantially increased the information and computational resources available to experimental biologists. This lecture will describe a number of these online resources, including resources and services available at the Pittsburgh Supercomputing Center. We will also discuss the use of the EMBOSS software and how to locate and retrieve specified sequences from a sequence database.
  • Database Searches and Pairwise Alignments
    This lecture presents both the mathematical and biological foundations of sequence comparison. It will describe in detail the various sequence alignment algorithms, including the approximations that heuristic methods (e.g., FASTA and BLAST) make to decrease computational time compared with searches that use rigorous algorithms (e.g., Needleman-Wunsch, Smith-Waterman). The lecture will also describe the mathematical and biological bases for the similarity matrices in current use and how these matrices should be used properly. The goal is to give researchers the background on search protocols that they need to make informed choices in their database searches and to understand the shortcomings of using a database searching tool naively.
  • Multiple Sequence Alignments
    This lecture covers aligning several sequences in their entirety (global alignments) and how global alignments are related to underlying biological phylogenies. During the lecture, we will describe the mathematical and biological basis for each of several multiple alignment programs (e.g., ClustalW, MSA, and T-COFFEE) and demonstrate the inherent weaknesses in each approach.
  • Discovering Diagnostic Patterns and Motifs in Unaligned Sequences
    Participants will be introduced to motifs and patterns and the various formats in which they are commonly represented, including consensus residues, regular expressions, hidden Markov models, and weight matrices. The lecture discusses methods available to identify short, well-conserved local patterns, such as promoter sequences or catalytic sites. It also discusses profile analysis and expectation maximization techniques.
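
As a small illustration of the rigorous algorithms named under Database Searches and Pairwise Alignments, the following Python sketch computes a Smith-Waterman local alignment score. It is not taken from the workshop materials. The function name and the scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for illustration; a real search would normally use a similarity matrix of the kind discussed in that lecture.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not workshop defaults.
    """
    # prev holds the previous row of the dynamic programming matrix;
    # only two rows are needed when just the score is required.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or extend a gap in one sequence or the other.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # The floor of 0 lets a local alignment restart anywhere.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Every cell of the matrix is filled, so the running time grows with the product of the two sequence lengths. Heuristic programs such as FASTA and BLAST reduce that cost by examining only part of the search space, at the risk of missing some alignments.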


Figure: W.H. McClain and H.B. Nicholas. "Differences Between Transfer RNA Molecules." J. Mol. Biol. 194, 635-642 (1987). Graphic by David Deerfield and Joe Lappa, PSC.

NRBSC projects are made possible by these sponsors: NIH, Pittsburgh Supercomputing Center, and NCRR.