Prime Structure Prediction - Find Homologs Step
This step is used to search for homologs of the query sequence. These homologs can then be used as templates to build a model structure.
For details about the Find Homologs step, see Find Homologs in the Prime User Manual.
To search for homologs, click the Search button near the middle of the Prime panel. Depending on what is selected in the Options dialog box, BLAST or PSI-BLAST will be used to search for templates.
Optionally, run HMMER on the Pfam database by clicking the Identify Globally Conserved Residues button, to search for globally conserved residues from a multiple sequence alignment of homologous proteins. The conserved residues are shown in the sequence viewer, colored according to the template color if both the template and the query have the conserved residue. This allows you to easily identify conserved residues that are not in both the template and the query.
When the homolog search is complete, select the template that you would like to use to build a model by clicking the sequence in the Homologs table. The template structure is retrieved from the PDB (either a local copy or the web site) and is displayed in the Workspace for inspection. Clicking on another template in the table deselects the first template and displays the new template structure.
Click Options to open the Find Homologs - Options dialog box. Here you can choose the kind of search you want to run (BLAST or PSI-BLAST) or change the settings for either type of search. You can also choose whether to run the search on the BLAST web site or using a local copy of the BLAST database. By default, a local copy is used.
For example, the default database for searching is NCBI PDB (non-redundant). Other options include:
For more information on BLAST and its options, go to http://www.ncbi.nlm.nih.gov/books/NBK21097/.
If the homologs you want to use are not in the NCBI sequence database, or if you have your own homologs that you want to use, you can import them using the Import buttons. On import, homologs are aligned quickly using ClustalW.
To import homologs from a file, click From File. You can import homologs in PDB and Maestro format.
To import homologs from Maestro, select them in the Project Table and click From Project Table. This allows you to import homologs in any format recognized by Maestro.
If you want to build a homology model for a single chain, and have found a satisfactory single template, you can simply select the desired template in the table.
If you want to use more than one template as a composite (chimera) model, in which each template is used for a different region of the query, you can select multiple templates in the table with shift-click or control-click. The template structures must then be aligned before you use them.
Likewise, if you want to build a consensus model, you can select multiple templates in the table, but they do not need to be aligned.
If you want to build a homology model for a homomultimer, you should select a single chain of a template that is also a homomultimer with the same number of chains. The remaining chains are retrieved and used when the model is built.
For heteromultimers, each chain should be built in a separate run before they are combined to build the final multimer. You can use any of the above models to build each chain, including a homomultimer model if the heteromultimer contains a homomultimer.
Spatial alignment of multiple templates is only required if you want to build a composite (or chimera) model. Otherwise, any alignment is done as necessary; in particular, alignment should not be done if you want to build a multimer model, because the templates must be in the correct spatial relation to each other for building.
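The alignment rules for the different model types can be summarized as a small lookup. This is only a sketch for reference: the function name alignment_rule and the model-type keywords are illustrative labels, not Prime options or commands.

```shell
#!/bin/sh
# Sketch: summarize when the selected templates need Align Structures.
# The keywords below are illustrative labels, not Prime settings.
alignment_rule() {
    case $1 in
        single)       echo "not applicable" ;;  # only one template is selected
        composite)    echo "required" ;;        # chimera: each template covers a different query region
        consensus)    echo "not required" ;;    # multiple templates, no spatial alignment needed
        homomultimer) echo "do not align" ;;    # chains must keep their spatial relation to each other
        *)            echo "unknown model type: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   alignment_rule composite
```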
If you have already created a file with structurally aligned multiple templates, you can use the Import button to import it. Otherwise, when you have selected your templates, click Align Structures to structurally align the templates.
Only templates with some degree of structural similarity can produce a meaningful structural alignment. If Align Structures fails, a warning message appears:
WARNING The structural alignment did not produce a suitable superimposition. The selected templates should not be used together for structure building.
Templates with multiple chains may sometimes require special handling. In some cases, Align Structures fails to align multiple templates with multiple chains. Furthermore, a chain from a given template file cannot be structurally aligned to another chain from the same template file. It can also happen that Align Structures aligns a template to chain B of another template, but you need an alignment with chain A. In these cases, it is necessary to generate single-chain templates, as described in the next section, Extracting Single Chains.
If Align Structures fails to align templates with multiple chains correctly, use the Get PDB File panel to extract single chain templates from the PDB database, either from a local installation or from the web. To open this panel, you can:
Choose Project → Get PDB in the main window.
Click the Get PDB button on the Project toolbar in the main window.
To extract a particular chain:
The chain is downloaded into the working directory and imported into the Maestro project. Once you have imported all the desired chains, you can select them in the Project Table, then in the Find Homologs step, click From Project Table.
For example, if Align Structures were to have difficulty aligning several multiple-chain template structures, you would follow these steps to correct the problem:
Extract the single chains from the chosen multiple-chain templates using the instructions above.
Select the chains in the Project Table panel.
In the Find Homologs step, click From Project Table. The new homologs appear in the Homologs table.
Select the single-chain version of each multiple-chain template, and click Align Structures again.
You can also use the getpdb utility from the command line to extract chains from the PDB database. To use getpdb, you must either have access to the web to download the chain, or have a local installation of the PDB, to which one of the following environment variables is set:
If it is only the SCHRODINGER variable that has been set, getpdb uses the default location of the database, $SCHRODINGER/thirdparty.
To extract an entire protein from the PDB database, type:
$SCHRODINGER/utilities/getpdb 4-letter_PDB_code
For example,
$SCHRODINGER/utilities/getpdb 1aaa
retrieves the PDB file for the structure with PDB code 1AAA.
To extract a particular chain, type:
$SCHRODINGER/utilities/getpdb 4-letter_PDB_code:Chain_ID
For example, if the first multiple-chain template is the protein 1ems, and the chain needed is chain A, use the command:
$SCHRODINGER/utilities/getpdb 1ems:A
This command extracts all residues, including HETATMs, in chain A (as well as any HETATMs with no chain name).
Note: The Chain_ID variable is case sensitive, though the PDB_code is not. In this example, 1EmS:A would also work, but 1ems:a would produce errors.
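The case rule in the note can be captured in a small wrapper that builds the getpdb argument. This is a minimal sketch for a POSIX shell. The helper name getpdb_spec is hypothetical and is not part of the Schrödinger utilities. It lowercases the PDB code, which is not case sensitive, and passes the chain ID through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Schrodinger distribution):
# build a "code:chain" argument for getpdb.
getpdb_spec() {
    # The PDB code is not case sensitive, so normalize it to lowercase.
    code=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # The chain ID IS case sensitive, so it is passed through untouched.
    chain=${2:-}
    # The command syntax expects a 4-letter PDB code.
    if [ "${#code}" -ne 4 ]; then
        echo "error: PDB code must be 4 characters: $1" >&2
        return 1
    fi
    if [ -n "$chain" ]; then
        printf '%s:%s\n' "$code" "$chain"
    else
        printf '%s\n' "$code"
    fi
}

# Usage:
#   "$SCHRODINGER/utilities/getpdb" "$(getpdb_spec 1EmS A)"
```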
After the getpdb utility has extracted 1ems:A, it returns the following message:
saved data to file: 1ems_A.pdb
You can then click From File in the Find Homologs step and import the single-chain file, 1ems_A.pdb. The new homolog appears in the Homologs table.
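When several multiple-chain templates need single-chain versions, the getpdb calls can be scripted. The following is a sketch under stated assumptions: the function names extract_chains and spec_to_file and the DRY_RUN switch are hypothetical, the output file name is inferred from the single 1ems_A.pdb example in this topic, and getpdb is assumed to signal failure through its exit status.

```shell
#!/bin/sh
# Sketch: extract several single-chain templates in one pass.
# GETPDB defaults to the utility path used in this topic.
# Set DRY_RUN=1 to print the commands instead of running them.
GETPDB=${GETPDB:-${SCHRODINGER:-}/utilities/getpdb}

# Assumption: output naming follows the single example in this topic
# (1ems:A -> 1ems_A.pdb). Check the "saved data to file" message to confirm.
spec_to_file() {
    printf '%s.pdb\n' "$(printf '%s' "$1" | tr ':' '_')"
}

extract_chains() {
    for spec in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Dry run: show the command that would be executed.
            echo "$GETPDB $spec"
        elif "$GETPDB" "$spec"; then
            echo "import with From File: $(spec_to_file "$spec")"
        else
            # Assumes getpdb returns a nonzero exit status on failure.
            echo "warning: getpdb failed for $spec" >&2
        fi
    done
}

# Usage (chain IDs are case sensitive):
#   extract_chains 1ems:A
```

Further code:chain arguments can be appended to the extract_chains call, one per template chain.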
When you are satisfied with the template or templates you have selected, follow the Comparative Modeling Path by proceeding to the Edit Alignment step.
If no satisfactory templates can be identified by BLAST/PSI-BLAST and you have not imported a template, you may wish to proceed to the Fold Recognition step.
If, after proceeding to further steps, you decide that you would like to try a different template, you can avoid overwriting your existing workflow by starting a new one before you navigate back to Find Homologs to choose a different homolog. Select Save As from the File menu. This will create a copy of the current run, allowing you to select a new template and build on it. You will then be able to compare results from each run. See the Using Prime Runs help topic for more information.
If you select multiple templates and proceed to Edit Alignment, you will be reminded of the requirements for use of multiple templates.
If one of these messages appears when you click Next to go to the Edit Alignment step, you will not be able to proceed along the Comparative Modeling Path until the problem is resolved:
You may change your search options and continue searching, or else proceed to Fold Recognition.
Check to be sure the homolog file exists (in either $SCHRODINGER/thirdparty/database/pdb/data/structures/divided/pdb/*/ or $SCHRODINGER_THIRDPARTY/database/pdb/data/structures/divided/pdb/*/) and is in the right format (gzip compressed, pdb*.ent.gz).
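The file check described above can be done from the command line. This is a minimal sketch: the function name find_pdb_entry is hypothetical, and it assumes the entry file is named pdb followed by the lowercase PDB code, which matches the pdb*.ent.gz pattern. It searches the subdirectories of the local PDB copy for the entry and uses gzip -t to confirm that the file is valid gzip data.

```shell
#!/bin/sh
# Sketch: look for a template's gzip-compressed entry file in a local PDB copy.
# Usage: find_pdb_entry <pdb_root> <pdb_code>
# <pdb_root> is, for example, "$SCHRODINGER/thirdparty" or "$SCHRODINGER_THIRDPARTY".
find_pdb_entry() {
    root=$1
    # Assumption: entry files use the lowercase PDB code (pdb<code>.ent.gz).
    code=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    for f in "$root"/database/pdb/data/structures/divided/pdb/*/pdb"$code".ent.gz; do
        # An unmatched glob stays literal, so confirm the file really exists.
        [ -f "$f" ] || continue
        # gzip -t verifies the compressed data without unpacking it.
        if gzip -t "$f" 2>/dev/null; then
            echo "$f"
            return 0
        fi
        echo "error: $f is not valid gzip data" >&2
        return 2
    done
    echo "error: no pdb$code.ent.gz found under $root" >&2
    return 1
}

# Usage:
#   find_pdb_entry "$SCHRODINGER/thirdparty" 1ems
```

A return status of 1 means the file is missing, and 2 means it exists but is not in gzip format.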
Click this button to perform a search for homologs using BLAST. The search is performed with the standard choice of database, similarity matrix, gap costs, and so on, but you can change these options by clicking Options. The results of the search are returned in the Homologs table.
Set BLAST options for the search, including whether to do a BLAST search or a PSI-BLAST search. Opens the Find Homologs - Options dialog box.
Import homologs into the Homologs table. You can import them from one of two sources:
Import the homologs from a PDB file or a Maestro file. Opens a file selector, in which you can navigate to and load a single homolog.
Import the selected entries in the Project Table into the Homologs table. The entries must be selected before you click this button.
Run a search with HMMER on the Pfam database, to identify the query family and provide information about which residues are conserved in the consensus sequence. The sequence viewer shows the match between the query and the consensus sequence of the family.