Prime Structure Prediction — Find Homologs Step

Summary
Working in the Find Homologs Step
Find Homologs Step Features
Related Topics

Summary

This step is used to search for homologs of the query sequence. These homologs can then be used as templates to build a model structure.

For details about the Find Homologs step, see Find Homologs in the Prime User Manual.

Searching for Homologs

To search for homologs, click the Search button near the middle of the Prime panel. Depending on what is selected in the Options dialog box, BLAST or PSI-BLAST will be used to search for templates.

Optionally, click the Identify Globally Conserved Residues button to run HMMER on the Pfam database. This search identifies globally conserved residues from a multiple sequence alignment of homologous proteins. The conserved residues are shown in the sequence viewer, and are colored with the template color where both the template and the query have the conserved residue. This makes it easy to spot conserved residues that are not present in both the template and the query.

When the homolog search is complete, select the template that you would like to use to build a model by clicking the sequence in the Homologs table. The template structure is retrieved from the PDB (either a local copy or the web site) and is displayed in the Workspace for inspection. Clicking on another template in the table deselects the first template and displays the new template structure.

Find Homologs Search Options

Click Options to open the Find Homologs - Options dialog box. Here you can choose the kind of search you want to run—BLAST or PSI-BLAST—or change the settings for either type of search. You can also choose whether to run the search on the BLAST web site or using a local copy of the BLAST database. By default a local copy is used.

The default database for searching is NCBI PDB (non-redundant). Other options include:

NCBI PDB (all), which shows all PDB hits, including PDB files with identical sequences, and
NCBI NR, which is recommended for PSI-BLAST runs. If NCBI NR is selected, only nonredundant PDB hits are displayed in the Homologs table.

For more information on BLAST and its options, go to http://www.ncbi.nlm.nih.gov/books/NBK21097/.

Importing Homologs

If the homologs you want to use are not in the NCBI sequence database, or if you already have your own homolog structures, you can import them using the Import buttons. On import, homologs are aligned quickly using ClustalW.

To import homologs from a file, click From File. You can import homologs in PDB and Maestro format.

To import homologs from Maestro, select them in the Project Table and click From Project Table. This allows you to import homologs in any format recognized by Maestro.

Selecting Templates

If you want to build a homology model for a single chain, and have found a satisfactory single template, you can simply select the desired template in the table.

If you want to use more than one template as a composite (chimera) model, in which each template is used for a different region of the query, you can select multiple templates in the table with shift-click or control-click. The template structures must then be aligned before you use them.

Likewise, if you want to build a consensus model, you can select multiple templates in the table, but they do not need to be aligned.

If you want to build a homology model for a homomultimer, you should select a single chain of a template that is also a homomultimer with the same number of chains. The remaining chains are retrieved and used when the model is built.

For heteromultimers, each chain should be built in a separate run before they are combined to build the final multimer. You can use any of the above models to build each chain, including a homomultimer model if the heteromultimer contains a homomultimer.

Aligning Structures

Spatial alignment of multiple templates is only required if you want to build a composite (or chimera) model. Otherwise, any alignment is done as necessary; in particular, alignment should not be done if you want to build a multimer model, because the templates must be in the correct spatial relation to each other for building.

If you have already created a file with structurally aligned multiple templates, you can use the Import button to import it. Otherwise, when you have selected your templates, click Align Structures to structurally align the templates.

Only templates with some degree of structural similarity can produce a meaningful structural alignment. If Align Structures fails, a warning message appears:

WARNING The structural alignment did not produce a suitable superimposition. The selected templates should not be used together for structure building.

Templates with multiple chains may sometimes require special handling. In some cases, Align Structures fails to align multiple templates with multiple chains. Furthermore, a chain from a given template file cannot be structurally aligned to another chain from the same template file. It can also happen that Align Structures aligns a template to chain B of another template, but you need an alignment with chain A. In these cases, it is necessary to generate single-chain templates, as described in the next section, Extracting Single Chains.

Extracting Single Chains

If Align Structures fails to align templates with multiple chains correctly, use the Get PDB File panel to extract single chain templates from the PDB database, either from a local installation or from the web. To open this panel, you can:

Choose Project → Get PDB in the main window.
Click the Get PDB button on the Project toolbar in the main window.

To extract a particular chain:

Enter the PDB ID in the PDB ID text box.
Enter the single-letter chain name in the Chain name text box.
(Optional) Choose the source for the structure.
Click Download.

The chain is downloaded into the working directory and imported into the Maestro project. Once you have imported all the desired chains, you can select them in the Project Table, then in the Find Homologs step, click From Project Table.

For example, if Align Structures were to have difficulty aligning several multiple-chain template structures, you would follow these steps to correct the problem:

Extract the single chains from the chosen multiple-chain templates using the instructions above.
Select the chains in the Project Table panel.
In the Find Homologs step, click From Project Table. The new homologs appear in the Homologs table.
Select the single-chain version of each multiple-chain template, and click Align Structures again.

You can also use the getpdb utility from the command line to extract chains from the PDB database. To use getpdb, you must either have access to the web to download the chain, or have a local installation of the PDB, to which one of the following environment variables is set:

SCHRODINGER_PDB
SCHRODINGER_THIRDPARTY
SCHRODINGER

If it is only the SCHRODINGER variable that has been set, getpdb uses the default location of the database, $SCHRODINGER/thirdparty.
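
As a rough illustration of how these variables relate, the following shell sketch reports which location a local lookup would start from. The helper name pdb_root is hypothetical and is not part of the Schrödinger distribution. The order in which the variables are checked (most specific first) is an assumption made for this sketch; the only behavior stated above is that $SCHRODINGER/thirdparty is the default when SCHRODINGER alone is set.

```shell
# Hypothetical helper (not a Schrodinger utility): print the location that a
# local PDB lookup would start from.
# Assumption: the variables are consulted from most to least specific.
pdb_root() {
    if [ -n "${SCHRODINGER_PDB:-}" ]; then
        printf '%s\n' "$SCHRODINGER_PDB"
    elif [ -n "${SCHRODINGER_THIRDPARTY:-}" ]; then
        printf '%s\n' "$SCHRODINGER_THIRDPARTY"
    elif [ -n "${SCHRODINGER:-}" ]; then
        # Default location when only SCHRODINGER is set.
        printf '%s\n' "$SCHRODINGER/thirdparty"
    else
        echo "no local PDB location is configured" >&2
        return 1
    fi
}
```

Running pdb_root before getpdb is a quick way to confirm that at least one of the three variables is set in the current shell.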

To extract an entire protein from the PDB database, type:

$SCHRODINGER/utilities/getpdb 4-letter_PDB_code

For example,

$SCHRODINGER/utilities/getpdb 1aaa

retrieves the PDB file for the structure with PDB code 1AAA.

To extract a particular chain, type:

$SCHRODINGER/utilities/getpdb 4-letter_PDB_code:Chain_ID

For example, if the first multiple-chain template is the protein 1ems, and the chain needed is chain A, use the command:

$SCHRODINGER/utilities/getpdb 1ems:A

This command extracts all residues in chain A, including HETATMs, as well as any HETATMs with no chain name.

Note: The Chain_ID variable is case sensitive, though the PDB_code is not. In this example, 1EmS:A would also work, but 1ems:a would produce errors.
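
The case rule in this note can be illustrated with a small shell function that lowercases only the PDB code and copies the chain ID through unchanged. The function name normalize_spec is hypothetical, and the format check assumes the conventional 4-character PDB code (a digit followed by three letters or digits). It is a sketch for preparing arguments, not a description of what getpdb does internally.

```shell
# Hypothetical helper: validate a getpdb argument of the form CODE or CODE:CHAIN
# and lowercase only the PDB code. The chain ID is never altered, because it is
# case sensitive.
normalize_spec() {
    spec=$1
    code=${spec%%:*}                      # text before the first colon
    case $code in
        [0-9][A-Za-z0-9][A-Za-z0-9][A-Za-z0-9]) ;;
        *) echo "not a 4-character PDB code: $code" >&2; return 1 ;;
    esac
    code=$(printf '%s' "$code" | tr '[:upper:]' '[:lower:]')
    case $spec in
        *:*) printf '%s:%s\n' "$code" "${spec#*:}" ;;   # chain ID copied as typed
        *)   printf '%s\n' "$code" ;;
    esac
}
```

With this sketch, normalize_spec 1EmS:A prints 1ems:A, while an argument such as 1ems:a is passed through with its lowercase chain ID intact, so a mistyped chain ID is still the user's responsibility to catch.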

After the getpdb utility has extracted 1ems:A, it returns the following message:

saved data to file: 1ems_A.pdb

You can then click From File in the Find Homologs step and import the single-chain file, 1ems_A.pdb. The new homolog appears in the Homologs table.
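
When several single-chain templates are needed, the getpdb calls can be scripted. The wrapper below is a minimal sketch: fetch_chains and the GETPDB override are hypothetical names introduced here, and the error handling assumes that getpdb returns a nonzero exit status on failure, which is not stated in this topic.

```shell
# Hypothetical wrapper: call getpdb once per CODE or CODE:CHAIN argument.
# Set GETPDB=echo to preview the arguments without downloading anything.
fetch_chains() {
    getpdb_cmd=${GETPDB:-$SCHRODINGER/utilities/getpdb}
    for spec in "$@"; do
        # Assumption: getpdb signals failure through its exit status.
        "$getpdb_cmd" "$spec" || { echo "getpdb failed for $spec" >&2; return 1; }
    done
}

# Example call, using the specifications shown earlier in this topic:
#   fetch_chains 1ems:A 1aaa
```

Each extracted file can then be imported with From File in the Find Homologs step, as described above.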

Proceeding to the Next Step

When you are satisfied with the template or templates you have selected, follow the Comparative Modeling Path by proceeding to the Edit Alignment step.

If no satisfactory templates can be identified by BLAST/PSI-BLAST and you have not imported a template, you may wish to proceed to the Fold Recognition step.

Saving Runs With Different Templates

If, after proceeding to further steps, you decide that you would like to try a different template, you can avoid overwriting your existing workflow by starting a new one before you navigate back to Find Homologs to choose a different homolog. Select Save As from the File menu. This will create a copy of the current run, allowing you to select a new template and build on it. You will then be able to compare results from each run. See the Using Prime Runs help topic for more information.

Error Messages

If you select multiple templates and proceed to Edit Alignment, you will be reminded of the requirements for use of multiple templates.

If one of these messages appears when you click Next to go to the Edit Alignment step, you will not be able to proceed along the Comparative Modeling Path until the problem is resolved:

No Homologs Found: You may change your search options and continue searching, or else proceed to Fold Recognition.
Invalid File: Check that the homolog file exists (in either $SCHRODINGER/thirdparty/database/pdb/data/structures/divided/pdb/*/ or $SCHRODINGER_THIRDPARTY/database/pdb/data/structures/divided/pdb/*/) and is in the right format (gzip-compressed, named pdb*.ent.gz).
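
To check for the Invalid File condition by hand, it can help to construct the expected path of the homolog file. The sketch below uses a hypothetical helper, pdb_entry_path. It assumes that the subdirectory shown as * in the paths above is named after the middle two characters of the PDB code, which is the usual layout of a "divided" PDB mirror; verify this against your own installation.

```shell
# Hypothetical helper: build the expected path of a local PDB entry.
#   $1  root directory, for example "$SCHRODINGER/thirdparty"
#   $2  4-character PDB code (any case)
# Assumption: the subdirectory is the middle two characters of the code.
pdb_entry_path() {
    root=$1
    code=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    sub=$(printf '%s' "$code" | cut -c2-3)
    printf '%s/database/pdb/data/structures/divided/pdb/%s/pdb%s.ent.gz\n' \
        "$root" "$sub" "$code"
}

# Example check: does the file exist, and is it a valid gzip archive?
#   f=$(pdb_entry_path "$SCHRODINGER/thirdparty" 1ems)
#   [ -f "$f" ] && gzip -t "$f" && echo "OK: $f"
```

The gzip -t option only tests the integrity of the compressed file; it does not confirm that the contents are a valid PDB entry.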

Find Homologs Step Features

BLAST Homology Search button
Options button
Import buttons
- From File button
- From Project Table button
Identify Globally Conserved Residues button
Family name text box
Homologs table
Align Structures button

BLAST Homology Search button

Click this button to perform a search for homologs using BLAST. The search is performed with the standard choice of database, similarity matrix, gap costs, and so on, but you can change these options by clicking Options. The results of the search are returned in the Homologs table.

Options button

Set BLAST options for the search, including whether to do a BLAST search or a PSI-BLAST search. Opens the Find Homologs - Options dialog box.

Import buttons

Import homologs into the Homologs table. You can import them from one of two sources:

From File button: Import the homologs from a PDB file or a Maestro file. Opens a file selector, in which you can navigate to and load a single homolog.
From Project Table button: Import the selected entries in the Project Table into the Homologs table. The entries must be selected before you click this button.

Identify Globally Conserved Residues button

Run a search with HMMER on the Pfam database to identify the query family and to provide information about which residues are conserved in the consensus sequence. The sequence viewer shows the match between the query and the consensus sequence of the family.

Family name text box

Homologs table

Align Structures button