Exercises for bioinformatics.psc.edu:
Homology Modeling Using MODELLER4
Purpose: To become familiar with the homology modeling program MODELLER4 and the SWISS-MODEL homology modeling server (described below). The example scripts are meant to be modified so that you can model your own protein(s).
Online MODELLER manual: http://guitar.rockefeller.edu/modeller/manual/manual.html
Untarring data files
- We will first untar a .tar file to your home directory on tourney.psc.edu. Issue the commands:
cd $HOME
tar -xvf /biomed/examples/model_examples.tar
cd workshop
- Verify that the following directories have been created, by issuing the command: ls -l
input1 - contains the script for fully automated alignment and model building, given the templates.
input2 - contains scripts for performing template searching, alignment of the templates and target, and finally the model building.
input3 - contains scripts for model building given an alignment and templates. Also included is a script for refining a section of the model, determining sequence identities and comparing the structures.
Okay, you now have all the example scripts and the outputs from these scripts. Feel free to run the example scripts yourself, but the output is provided if you want to skip this. Note: The attached notes explain the PIR format that MODELLER uses. Modeling your own protein will most likely involve some manual editing of your sequence file.
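Before editing a sequence file by hand, it can help to see how a PIR-style entry is laid out. The following is a minimal sketch, assuming the standard NBRF/PIR layout (a ">P1;code" line, one description line, then sequence lines terminated by "*"). The function name parse_pir and the entries in the usage example are made up for illustration; the MODELLER-specific fields on the description line are not interpreted here.

```python
def parse_pir(text):
    """Parse NBRF/PIR-style text into a list of (code, description, sequence).

    Assumes each entry is: a '>XX;code' line, one description line,
    then one or more sequence lines, the last of which ends in '*'.
    """
    entries = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith(">") and ";" in line:
            code = line.split(";", 1)[1].strip()
            description = lines[i + 1].strip() if i + 1 < len(lines) else ""
            i += 2
            chunks = []
            while i < len(lines):
                chunk = lines[i].strip()
                i += 1
                if chunk.endswith("*"):
                    chunks.append(chunk[:-1])  # drop the terminator
                    break
                chunks.append(chunk)
            # Remove any internal whitespace; gap characters ('-') are kept.
            sequence = "".join("".join(chunks).split())
            entries.append((code, description, sequence))
        else:
            i += 1
    return entries


# Made-up entry, for illustration only (not a real protein sequence).
demo = """>P1;demo1
illustrative description line
ACDEFGHIKL
MNPQ*
"""
for code, description, sequence in parse_pir(demo):
    print(code, len(sequence), sequence)
```

Checking the parsed codes and sequence lengths this way is a quick sanity test after manual editing, since a missing "*" terminator or a misplaced header line shows up immediately.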
In MODELLER4, all the input files are *.top files which control the program execution. So to run an input script named model-default, one simply types: modeller4 model-default.
- Fully automated alignment and model building
- cd $HOME/workshop/input1
- modeller4 model-default &
The input file, model-default.top, will align the target to the templates and build the model. The target is a known protein (1fdx), so we have a way to judge how well the method performs. The knowns are 5fd1, 1fdn, 1fxd and 2fxb.
[Figure: templates 5fd1, 1fdn, 1fxd; target 1fdx]
These proteins are all ferredoxins involved in electron transport. The routine is full_homol. It outputs:
- model-default.log - errors (hopefully none important), restraint violations
- Alignment.seg.ali - the alignment produced by MODELLER
- 1fdx.ini - the initial model produced
- 1fdx.mat - matrix of pairwise protein distances from alignment
- 1fdx.rsr - the restraints file (somewhat cryptic)
- 1fdx.sch - the schedule file (again somewhat cryptic)
- 1fdx.V99990001 - the violations of the output structure
- 1fdx.D00000001 - the progress of the optimization
- 1fdx.B99990001 - the structure output by MODELLER (PDB)
- Sample Template searching, alignment and Model construction
- cd $HOME/workshop/input2
- modeller4 search.top & - uses input 1fdx.chn (PIR format)
- modeller4 malign.top & - multiple template alignment. This routine produces the alignment file alignment.seg.ali, as well as files that are named *.ent.fit, which are the pdb files with the coordinates superimposed.
- To look at this superposition, combine the files into a single file with the command below, then display that file with Rasmol:
cat *.ent.fit > superimpose.pdb (this file is included in the output dir)
- modeller4 get-model.top & - builds a homology model using two templates; two models are produced, with different values of the pdf (probability distribution function)
- Modeling given an alignment and model segment refining
- cd $HOME/workshop/input3
- modeller4 model-fast.top & - demonstrates the use of the routine to produce a quick-and-dirty model and is not discussed here further.
- modeller4 model-default.top & - takes the alignment, aligment.ali (in PIR format) and the templates and produces a model.
- modeller4 compare.top & - a script that you can use to determine how close your model structure is to the crystal structure and to the nearest template.
- Output files:
model-default produces output with extensions that end in 1, for example: *.B99990001
model-fast produces output with extensions that end in 2, for example: *.B99990002
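To get a feel for what "how close" means when comparing a model with the crystal structure, here is a rough sketch that computes the C-alpha RMSD between two PDB files. It is not what compare.top does internally: it performs no superposition and assumes both structures are already in the same frame and share chain IDs and residue numbers. The function names ca_coordinates and ca_rmsd are hypothetical.

```python
import math


def ca_coordinates(pdb_text):
    """Return {(chain, residue number): (x, y, z)} for C-alpha atoms.

    Uses the fixed columns of PDB ATOM records.
    """
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))
            coords[key] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def ca_rmsd(pdb_a, pdb_b):
    """RMSD over C-alpha atoms present in both structures.

    No fitting is done: the structures are assumed to be superimposed
    already (for example, the *.fit files written by an alignment step).
    """
    a = ca_coordinates(pdb_a)
    b = ca_coordinates(pdb_b)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no C-alpha atoms in common")
    total = 0.0
    for key in shared:
        total += sum((p - q) ** 2 for p, q in zip(a[key], b[key]))
    return math.sqrt(total / len(shared))
```

In use, one would read the model (for example a *.B99990001 file) and the reference structure into strings and pass both to ca_rmsd. Because no fitting is done, the value is only meaningful for structures that have been superimposed beforehand.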
Using the SWISS-MODEL server
- Using your Netscape browser, enter the SWISS-MODEL site: http://www.expasy.ch/swissmod/SWISS-MODEL.html
- Click on First Approach Mode
- Enter your home e-mail address, where the results will be sent.
- Enter your name
- Enter the request title.
- Copy and paste your sequence into the area provided for it. Or, if you know the SWISS-PROT AC code, enter that. Don’t press Send Request just yet, unless there is no known structure in your protein family.
- Check your PDB code against the ExPDB code. Remember that only single-chain proteins can be modeled.
- Enter the ExPDB code
- Results options:
- Normal mode
- Include a WhatCheck report of the final model
- Other options corresponding to PHD Prediction and Fold recognition are optional.
- It took approx. 10 minutes to obtain my results by e-mail. You will want to FTP your results to your local CTC machine.
- Inspect the results:
- Where does the What_Check report indicate that the structure is in error? In areas of secondary structure or loops?
- Does the structure-sequence alignment show at least 30% sequence identity?
- Are there gaps in helices or strands or are they contained in the loops?
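The 30% identity check above can be done by hand, but a small script makes it repeatable. This is a sketch under one particular convention: identity is counted over columns where neither sequence has a gap. Other definitions (for example, dividing by the length of the shorter sequence) give different numbers, so treat the result as a guide. The function name percent_identity and the example strings are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Illustrative aligned strings (not real protein sequences).
target = "ACDE-FG"
template = "ACDQKFG"
print(f"{percent_identity(target, template):.1f}")  # 5 of 6 compared columns match: 83.3
```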
Supported Programs for Homology Modeling in /biomed/bin
- convertpir - converts a standard NBRF/PIR format file into the proprietary format used by MODELLER4.0.
- extractpdb - outputs the amino acid residue sequence from a standard pdb file.
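For readers without access to /biomed/bin, the idea behind extracting a sequence from a PDB file can be sketched as follows. This is not the extractpdb program itself, whose implementation is not shown in this handout; it is an illustrative stand-in that reads ATOM records, takes each residue once, and maps three-letter codes to one-letter codes. The function name sequence_from_pdb is hypothetical, and non-standard residues are written as X.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(pdb_text):
    """Return the one-letter sequence of residues seen in ATOM records.

    Residues are taken in file order; all chains are concatenated.
    Residues missing from the coordinates are simply absent.
    """
    sequence = []
    seen = set()
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        # Chain ID plus residue number and insertion code identify a residue.
        key = (line[21:22], line[22:27])
        if key in seen:
            continue
        seen.add(key)
        residue = line[17:20].strip().upper()
        sequence.append(THREE_TO_ONE.get(residue, "X"))
    return "".join(sequence)
```

Note that a sequence taken from coordinates can be shorter than the full protein sequence when residues are unresolved in the structure, which matters when preparing an alignment.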