Multiple Sequence Viewer Panel

Summary
Opening the Multiple Sequence Viewer Panel
Using the Multiple Sequence Viewer Panel
Multiple Sequence Viewer Panel Features
Acknowledgement
Related Topics

Summary

The Multiple Sequence Viewer is an alignment, visualization, and manipulation toolkit for multiple sequences.

Opening the Multiple Sequence Viewer Panel

To open the Multiple Sequence Viewer panel, you can:

Choose Tools → Multiple Sequence Viewer in the main window.

Using the Multiple Sequence Viewer Panel

The Multiple Sequence Viewer panel provides a range of tools for manipulating and aligning multiple sequences.

The Multiple Sequence Viewer (MSV) has its own projects, which contain all the sequences in the project along with associated data. The project is stored in a single file with a .msv extension, and by default is stored inside the Maestro project. You can save it externally if you wish.

The project can include sequences imported directly into the project, and sequences that are displayed in the Workspace. Directly imported sequences remain in the project unless explicitly deleted. Sequences in the Workspace are transient: when structures are included in or excluded from the Workspace, the sequence is added to or removed from the project.

Although the data is stored separately from the Maestro project data, there are interactions between the Workspace and the MSV, which depend on settings that you make:

Changes that are made in the Workspace are propagated to the MSV. These changes cover inclusion and exclusion of entries; deletion, mutation, or insertion of residues; and selection of residues.
Changes to the residue selection in the MSV can be propagated back to the Workspace.
Changes to the sequences in the MSV are not propagated back to the Maestro structure.
Color schemes can be transferred between the MSV and the Workspace.

Speed Optimization Tips

The speed of the sequence viewer depends on the number and length of sequences loaded into the viewer. Typically, alignments of 20 sequences of 300 residues or fewer can be interactively viewed and edited. Below are a few speed optimization tips.

Hide the ruler (Settings → Display Ruler off)
Use unwrapped mode (Settings → Wrap Sequences off)
Disable annotation grouping (Settings → Group Annotations off)
Collapse all sequences (Sequences → Collapse All Sequences)
Hide annotations (Annotations → Clear All Annotations)
Reduce the size of the sequence area: move sequence area splitter to the right, shrink entire window

Remember that the speed of operations involving Maestro sequences may also depend on the complexity of the Maestro Workspace.

Menu bar

File menu
Edit menu
Sequences menu
Alignment menu
Color menu
Annotations menu
Tools menu
Maestro menu
Settings menu

File menu

This menu provides tools for creating, opening, and saving Multiple Sequence Viewer projects; importing and exporting sequences; saving images; and closing the panel.

New Query

Create a new, empty query in a new tab.

Rename Query

Rename the current query. The name appears on the tab for the query.

Duplicate Query

Create a copy of the selected sequences in the current query in a new tab.

Delete Query

Delete the current query. All sequences, structures, and data associated with the query are removed (unless they are also in another query tab).

Open

Open an existing Multiple Sequence Viewer (MSV) project. Opens a file selector, in which you can navigate to and select the project.

Save

Save the current MSV project. If the project has not yet been saved, a file selector opens, in which you can navigate to a location and name the project. If the project has been saved previously, the project is simply saved.

Save As

Save the current MSV project with a new name. Opens a file selector, in which you can navigate to a location and name the project. After saving, the current project is the one with the new name.

Import Sequences

Import sequences into the project. Sequences can be imported from a range of file formats: FASTA, SWISSPROT, GCG, PIR, EMBL, as well as PDB and Maestro. Opens a file selector, in which you can choose the file type, navigate to and select the sequence file.

The file selector has four options:

Align to query sequence—When the sequences are imported, align them to the query sequence.
Replace matching sequences—If an imported sequence exactly matches a sequence that is already in the MSV, replace the existing sequence with the imported sequence, while preserving the alignment. If the imported sequence has structural data, import the data as well. If there are multiple matching sequences in the MSV, replace only the first instance.
Translate DNA / RNA sequences—Translate DNA and RNA sequences into the sequence for the proteins they code for, using standard genetic code. The protein sequence is imported instead of the nucleic acid sequence.
Incorporate PDB files into Maestro—When PDB sequences are imported, add the corresponding structures as entries to the Maestro project.

Structural data (ATOM records), B-factors, and secondary structure assignments are also imported if the data are in the PDB file. A nonstandard version of FASTA format is accepted, in which a residue can be preceded by its residue number; numbering is otherwise sequential, starting from 1.
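
The Translate DNA / RNA sequences option can be illustrated with a minimal sketch of translation with the standard genetic code. The function name, the choice of reading frame 1, the use of X for unrecognized codons, and stopping at the first stop codon are assumptions made for illustration; the MSV may handle these cases differently.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in the conventional T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate DNA or RNA in reading frame 1 (assumed), up to the first stop."""
    dna = nucleotides.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")  # X: unrecognized codon
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))  # MAK
```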

Export Sequences

Export sequences from the project to a file in FASTA format or as a plain text file. Opens a file selector, in which you can navigate to a location and name the file. The file selector has three options:

Save annotations—Save the SSP and SSA annotations in the file.
Save similarity values—Save the calculated percent similarity between sequences. The values are added to the sequence name in the exported FASTA file as ID (identity), SIM (similarity), HOM (homology), e.g.
```
>1abc|ID:7.14|SIM:19.64|HOM:17.86
```
Export only selected part of the alignment—Export only the residues that are selected in all sequences.

These options only apply to export in FASTA format.
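
If you process exported files in a script, the similarity values can be read back from the header. The following is a minimal sketch that assumes the header layout shown in the example above and that the sequence name itself contains no | character; the function name is hypothetical.

```python
def parse_similarity_header(header):
    """Split an exported FASTA header into (name, {"ID": ..., "SIM": ..., "HOM": ...})."""
    name, *fields = header.lstrip(">").strip().split("|")
    values = {}
    for field in fields:
        key, _, value = field.partition(":")
        if key in ("ID", "SIM", "HOM"):  # ignore any other fields
            values[key] = float(value)
    return name, values


name, values = parse_similarity_header(">1abc|ID:7.14|SIM:19.64|HOM:17.86")
print(name, values["ID"], values["SIM"], values["HOM"])
```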

Save Image

Save an image of the sequence viewer. Available formats are PNG, EPS, and PDF. Opens a file selector, in which you can navigate to a location and name the file. The selector has one option: Export image of the entire alignment, which is selected by default. If it is not selected, the image is of only the visible part of the alignment (what you can see in the sequence display area of the panel).

Close

Close the current Multiple Sequence Viewer (MSV) project and close the panel.

Edit menu

From this menu you can choose to undo or redo actions, edit sequences, and delete sequences. The sequence editor panel accepts standard editor key strokes, such as Ctrl+C (⌘C), Ctrl+X (⌘X), and Ctrl+V (⌘V) for copying, cutting, and pasting text.

Undo

Undo the last editing operation. The operation is appended to the menu item text, for example Undo Load File.

Redo

Redo the last undone editing operation. The operation is appended to the menu item text, for example Redo Load File.

New Sequence

Create a new sequence by entering the letter codes for the residues. Opens the Sequence Editor dialog box, in which you can name the sequence and type in the letter codes for the sequence.

Edit Sequence

Edit an existing sequence as a string of letter codes or create a new sequence. Opens the Sequence Editor dialog box, in which you can change the name of the sequence, and add or delete residues or gaps by character code. Only the 20 standard amino acid codes, X (unknown residue), and - and ~ (gap symbols) are recognized. When you have finished editing, you can either replace the existing sequence (if one was selected), or add it as a new sequence to the end of the list.

Paste in FASTA Format

Insert entire sequences into the sequence viewer in FASTA format. Opens the Sequence Editor dialog box, in which you can paste the sequences and edit them. The same editing rules apply, except that lines beginning with a > character are treated as a sequence header. These lines also separate one sequence from the next. Multiple sequences thus delineated are saved as separate sequences.

This feature is useful for copying an alignment from a web site and adding it directly to the MSV without having to save a file. You can also add sequences by importing a file.
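
The header and character rules described above can be sketched as a small parser. This is only an illustration: the function name, the default name for a sequence without a header, and the decision to raise an error on unrecognized characters are assumptions, not the behavior of the Sequence Editor.

```python
# Recognized symbols: 20 standard amino acids, X (unknown), and the gap symbols - and ~
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | set("X-~")


def parse_pasted_fasta(text):
    """Return a list of (name, residues); lines starting with > begin a new sequence."""
    sequences = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                name = "unnamed"  # hypothetical default for headerless text
            codes = line.upper()
            unknown = set(codes) - ALLOWED
            if unknown:
                raise ValueError(f"unrecognized symbols: {sorted(unknown)}")
            chunks.append(codes)
    if name is not None:
        sequences.append((name, "".join(chunks)))
    return sequences


print(parse_pasted_fasta(">seqA\nMK-L\nAA\n>seqB\nGG~\n"))
```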

Duplicate Selected Sequences

Duplicate the selected sequences, and place each duplicate immediately below its parent.

Find Pattern

Find a pattern in the sequences. Displays the Find toolbar if it is not displayed, and puts focus in the text box so you can start typing the pattern.

Renumber Residues

Renumber the residues in one or more sequences. Opens the Renumber Residues panel, where you can either renumber the selected sequences so that residues at the same position in the sequence viewer have the same residue number; or import and align a template for a single sequence, and apply the numbering of the template to the selected sequence.

Delete Selection

Delete the sequences that are selected in the viewer.

Remove Redundant Sequences

Remove sequences whose sequence identity is greater than a given threshold. Opens a dialog box in which you can set the threshold, then click Remove to perform the action. The default threshold is 100%. For each pair of sequences that are considered identical, the shorter of the two is deleted; if they are of the same length, the second (lower down in the sequence viewer) is deleted. In the latter case, you can change which sequence is discarded by reordering the sequences. This task operates on all sequences. If you click Cancel in the dialog box, the redundant sequences become the selection, so you can perform operations on them.
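
The selection rule can be sketched as follows. The identity definition used here (identical residues over the columns where both aligned sequences have a residue) is an assumption, as is the use of "greater than or equal to" so that the default threshold of 100% catches exact matches; the function names are hypothetical.

```python
GAPS = "-~"


def percent_identity(a, b):
    """Identity over columns where both sequences have a residue (assumed definition)."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in GAPS and y not in GAPS]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def find_redundant(sequences, threshold=100.0):
    """Return the indices of the sequences that would be removed."""
    def length(seq):
        return sum(c not in GAPS for c in seq)

    removed = set()
    for i in range(len(sequences)):
        if i in removed:
            continue
        for j in range(i + 1, len(sequences)):
            if j in removed:
                continue
            if percent_identity(sequences[i], sequences[j]) >= threshold:
                # Drop the shorter sequence; on a tie, drop the lower one (j).
                loser = i if length(sequences[i]) < length(sequences[j]) else j
                removed.add(loser)
                if loser == i:
                    break
    return sorted(removed)


print(find_redundant(["MKTAY", "MKTAY", "MK---", "GGGGG"]))  # [1, 2]
```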

Sequences menu

From this menu you can control which sequences are selected, which sequences are shown, and the order in which they are listed.

Hide Selected: Hide the selected sequences.
Show All: Display all sequences.
Clear All: Delete all sequences from the project.
Select All: Select all sequences.
Deselect All: Deselect all sequences.
Invert Selection: Invert the selection of the sequences: select the unselected sequences, and deselect the selected sequences.
Expand All: Expand all sequences so that the associated data, such as secondary structure assignment, is displayed. This command is mapped to Ctrl+DOWN ARROW (⌘DOWN ARROW).
Collapse All: Hide the associated data for all sequences. This command is mapped to Ctrl+UP ARROW (⌘UP ARROW).
Set Color of Sequence Name: Set the color used for the name of the selected sequences. Opens a color selector, in which you can select a color. This feature allows you to color-code the names of the sequences.
Sort by Tree Order: Order sequences by the phylogeny tree generated by ClustalW. The tree is displayed in the leftmost part of the display area, which is normally hidden. When this option is selected, the other sorting items are not available. The nodes of the tree have a shortcut menu, with items Swap Branches to swap the order of the branches originating at the node, Select Sequences for selecting sequences from the branches originating at the node, and Hide Branch, for hiding the sequences for the branch.
Sort Ascending: Sort the sequences by the value of a given property, in ascending order. The properties that can be used are Name, Chain ID, Length, Number of Gaps, Sequence Identity, Sequence Similarity, Sequence Homology, and Sequence Score. Homology is calculated as the percentage of residues with identical side-chain chemical properties (as defined for the Side-Chain Chemistry color scheme).
Sort Descending: Sort the sequences by the value of a given property, in descending order. The properties that can be used are the same as for Sort Ascending.
Move Up: Move the selected sequences up one position in the list.
Move Down: Move the selected sequences down one position in the list.
Move to Top: Move the selected sequences to the top of the list.
Move to Bottom: Move the selected sequences to the bottom of the list.
Get PDB Structures: Download structural information (secondary structures, B-factors, coordinates) for sequences that came from the PDB, such as those found in a BLAST search. The information is obtained from a local copy of the PDB or from the RCSB web site. If sequences are selected, information is obtained for these sequences; otherwise it is obtained for all sequences. The PDB sequence replaces the corresponding sequence in the MSV.
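
To make the homology measure used for sorting more concrete, here is a minimal sketch that computes identity and homology against a query, using the residue groups of the Side-Chain Chemistry color scheme. The choice of denominator (columns where both sequences have a residue) and the function names are assumptions made for illustration.

```python
GAPS = "-~"
# Residue groups of the Side-Chain Chemistry color scheme
SIDE_CHAIN_GROUPS = ["DE", "RKH", "GAVILM", "FYW", "STNQ", "C", "P"]
GROUP_OF = {res: idx for idx, group in enumerate(SIDE_CHAIN_GROUPS) for res in group}


def identity_and_homology(query, other):
    """Return (percent identity, percent homology) for two aligned sequences."""
    compared = identical = same_group = 0
    for q, o in zip(query, other):
        if q in GAPS or o in GAPS:
            continue  # assumed: gapped columns are not counted
        compared += 1
        if q == o:
            identical += 1
            same_group += 1
        elif q in GROUP_OF and GROUP_OF[q] == GROUP_OF.get(o):
            same_group += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * same_group / compared


def sort_by_homology(query, others, descending=False):
    """Order aligned sequences by their homology to the query."""
    return sorted(others, key=lambda s: identity_and_homology(query, s)[1],
                  reverse=descending)


print(identity_and_homology("DKAF-", "EKGWC"))  # (25.0, 100.0)
```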

Alignment menu

This menu provides tools for automatic sequence alignment with ClustalW, residue selection, and manual sequence alignment tasks. Some of the features are also available on the toolbar. Residues are not renumbered when gaps are added or removed.

Multiple Alignment

Align the selected sequences simultaneously using ClustalW. If there are columns (residues) selected, the alignment is performed only on the selected residues. You can run an alignment on several discontinuous selected regions at the same time.

Pairwise Alignment

Align the selected sequences pairwise using a Smith-Waterman algorithm, and the settings from the Alignment Settings dialog box.
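
For readers unfamiliar with the Smith-Waterman algorithm, the following is a textbook sketch of local alignment. It is not the MSV implementation: it uses a simple match/mismatch score and a linear gap penalty (all three values are arbitrary assumptions), whereas the panel uses a similarity matrix with separate gap opening and gap extension penalties from the Alignment Settings dialog box.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the scoring matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(0,
                              score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("AAMKTAYGG", "CCMKTAYPP"))  # (10, 'MKTAY', 'MKTAY')
```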

Align and Merge

Align new sequences with a query sequence without changing the existing alignment. Gaps are inserted into the existing alignment to preserve the residue matching.

Align by Residue Numbers

Align sequences so that residues with identical residue numbers (and insertion codes) are aligned. This is useful for families of proteins that share common numbering schemes, such as antibodies.
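
Alignment by residue number can be pictured as placing every residue in the column that belongs to its number and insertion code. The sketch below assumes each sequence is given as a list of hypothetical (number, insertion code, letter) tuples and that columns are ordered by number and then by insertion code.

```python
def align_by_residue_numbers(sequences):
    """Return gapped strings in which equal (number, insertion code) share a column."""
    # One column per distinct (number, insertion code) found in any sequence
    columns = sorted({(num, icode) for seq in sequences for num, icode, _ in seq})
    aligned = []
    for seq in sequences:
        lookup = {(num, icode): letter for num, icode, letter in seq}
        aligned.append("".join(lookup.get(key, "-") for key in columns))
    return aligned


first = [(1, "", "Q"), (2, "", "V"), (3, "", "Q")]
second = [(2, "", "V"), (2, "A", "G"), (3, "", "K")]
print(align_by_residue_numbers([first, second]))  # ['QV-Q', '-VGK']
```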

Use Constraints

Apply constraints on pairwise alignments, so that the constrained residues are in the same position (same column) after the alignment.

When you select this option, a constraint row is displayed between the query sequence and the other sequences. To add a constraint, click on a residue in the query sequence and then on a position in one of the other sequences. The constraints are displayed as red lines connecting the constrained residue pair. To remove a constraint, click on the constrained residue pair again. To remove all constraints, choose Clear Constraints from the Alignment menu or from the sequence shortcut menu.

You can also allow or disallow gaps in secondary structure elements, in the Pairwise Alignment Settings dialog box.

Clear Constraints

Remove all constraints on the alignment.

Alignment Settings

Opens the Pairwise Alignment Settings dialog box, in which you can choose the similarity matrix type, set the gap opening penalty and the gap extension penalty, choose whether to allow gaps in secondary structure elements, and create a substitution matrix from an existing alignment. This substitution matrix can be selected as Custom from the matrix option menu for subsequent alignments.

Lock Gaps

Lock gaps in the alignment so that they are not filled when performing manual alignment. If you insert a gap after locking the gaps, the new gap is not automatically locked. If you have a residue selection, the gaps are only locked in the selected region.

Unlock Gaps

Unlock previously locked gaps so that they can be filled.

Select Identities

Select residues that are identical in all sequences. Gaps are ignored.

Select Aligned Blocks

Select blocks of residues for which there are no gaps in any of the sequences.
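
The two column selections above (Select Identities and Select Aligned Blocks) can be sketched as follows. The reading of "gaps are ignored" as "gap symbols are skipped when comparing the residues in a column" is an assumption, and the function names are hypothetical.

```python
GAPS = "-~"


def identity_columns(alignment):
    """Indices of columns whose non-gap residues are all the same (assumed rule)."""
    selected = []
    for i, column in enumerate(zip(*alignment)):
        residues = {c for c in column if c not in GAPS}
        if len(residues) == 1:
            selected.append(i)
    return selected


def aligned_blocks(alignment):
    """Inclusive (start, end) column ranges that contain no gaps in any sequence."""
    columns = list(zip(*alignment))
    blocks, start = [], None
    for i, column in enumerate(columns):
        gap_free = not any(c in GAPS for c in column)
        if gap_free and start is None:
            start = i
        elif not gap_free and start is not None:
            blocks.append((start, i - 1))
            start = None
    if start is not None:
        blocks.append((start, len(columns) - 1))
    return blocks


alignment = ["MK-TAY", "MKGTA-", "MRGTAY"]
print(identity_columns(alignment))  # [0, 2, 3, 4, 5]
print(aligned_blocks(alignment))    # [(0, 1), (3, 4)]
```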

Select Columns with Structure

Select columns in which at least one of the residues has 3D structure (atom coordinates) associated with it. This is useful for multiple aligned structures with SEQRES regions that don't have crystal coordinates in parts of their sequences.

Expand Selection

Expand the residue selection in the selected sequences to include the corresponding residues in all of the selected sequences. If no sequences are selected, the residue selection is expanded to cover all sequences.

Expand Selection from Query

Expand the residue selection in the query sequence to include the corresponding residues in the selected sequences, or in all sequences if no sequences are selected.

Hide Selected Columns

Hide the columns in which residues in all sequences are selected.

Hide Unselected Columns

Hide the columns in which not all residues are selected (columns in which one or more residues are not selected).

Show All Columns

Show all columns for all sequences.

Select All

Select all residues.

Invert Selection

Invert the selection of the residues: select the unselected residues, and deselect the selected residues.

Deselect All

Deselect all residues.

Delete

Delete the selected residues. You can also press Shift+Backspace (⇧Delete).

Crop

Delete the unselected residues.

Remove Empty Columns

Remove columns that consist entirely of gaps.
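
As an illustration of this operation on an alignment held as a list of equal-length strings (a hypothetical representation), a minimal sketch:

```python
GAPS = "-~"


def remove_empty_columns(alignment):
    """Drop every column in which all sequences have a gap symbol."""
    keep = [i for i, column in enumerate(zip(*alignment))
            if any(c not in GAPS for c in column)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


print(remove_empty_columns(["M--K", "M-GK"]))  # ['M-K', 'MGK']
```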

Fill with Gaps

Replace the selected residues with gaps, in the selected sequences.

Remove Gaps

Remove gaps in the residue selection of the selected sequences by shifting residues to the left. If there is no selection, all gaps are removed, including gaps at the beginning of the sequences.

Track Changes

Track the regions of the sequence that have been edited, by showing an annotation that marks the edited regions. The latest edit is shown in black, and earlier edits in progressively lighter shades, up to the fifth last edit.

Reset History

Clear the history of changes stored when changes are tracked.

Color menu

Apply a color scheme to the sequences.

Residue Type

Color the residues by residue type. The colors are:

ACFILMPVW	blue	(hydrophobic)
DE	red	(acidic)
HKR	green-yellow	(basic)
GNQSTY	orange	(other)

Residue Similarity

Color residues by similarity. Identical residues are red, similar residues (positive BLOSUM62 pairwise score) are orange, other residues are white.

Hydrophobicity (Kyte-Doolittle)

Color residues by Kyte-Doolittle hydrophobicity. Hydrophilic residues are blue, hydrophobic residues are red, residues with zero hydrophobicity are white.

Hydrophilicity (Hopp-Woods)

Color residues by Hopp-Woods hydrophilicity. Hydrophilic residues are red, hydrophobic residues are blue, residues with zero hydrophilicity are white.

Taylor Scheme

Color residues with the Aminochromography color scheme developed by William Taylor (Protein Engineering 1997, 10, 743–746). In this scheme, well conserved parts of the alignment exhibit bright, clear colors, while parts that are not well conserved have brownish, dull colors.

Constant color

Color all residues with the color chosen from the submenu. Twelve colors are offered on the submenu, with a Custom item so that you can choose your own color in a color selector.

Secondary Structure

Color the sequence by the secondary structure assignment (SSA). If no SSA is available but one or more secondary structure predictions (SSPs) are available, the predictions are used to color the sequence. The colors for multiple predictions are averaged, so positions where all predictions agree have bright colors, and the positions of disagreement are more gray. If neither an SSP nor an SSA is available, the color of the sequence is not changed.

B Factor

Color the residues by their temperature factor (PDB B factor), on a green-white-red scale, with green for the lowest values and red for the highest.

Residue Propensities

Color residues by the residue propensities. The schemes that are available on the submenu are described in the table below.

Scheme	Residues	Color	Description
Helix Propensity	AMLEQK	red	helix-forming
	VIFW	magenta	weak helix-forming
	CSTNDHR	gray	ambivalent
	PGY	blue	helix-breaking
Strand Propensity	VILMTFWY	blue	strand-forming
	ACSNQHR	gray	ambivalent
	DEKGP	red	strand-breaking
Turn Propensity	GSDNP	cyan	turn-forming
	AVLIMHFWC	magenta	turn-breaking
	EQTKRY	gray	ambivalent
Helix Terminators^*	GTMRKHF	green	helix-starting
	SNDELWP	red	helix-ending
	CQAVIY	gray	ambivalent
Exposure Tendency	RNDQEHK	blue	surface
	ACGPSTWY	gray	ambiguous
	ILMFV	orange	buried
Steric Group	GACS	red	small, noninterfering
	TVNDILPM	magenta	ambiguous
	QEKR	cyan	sticky polar
	HFYW	blue	aromatic
Side-Chain Chemistry	DE	red	acidic, hydrophilic
(the default)	RKH	blue	basic, hydrophilic
	GAVILM	green	neutral, hydrophobic, aliphatic
	FYW	orange	neutral, hydrophobic, aromatic
	STNQ	cyan	neutral, hydrophilic
	C	yellow	primary thiol
	P	dark gray	imino acid

^*This item is only present on the Color Blocks submenu of the Annotations menu.

Mark Residues

Mark the selected residues with the color chosen from this submenu. This color overrides any other color applied. To remove this color, select the residues and choose Unmark Residues from the submenu.

Adjust Color Range

Set the limits of sequence identity used when weighting the color density by alignment quality. Opens a dialog box, in which you can set the lower and upper threshold for the sequence identity. Residues with identity below the lower threshold are colored white; residues with identity above the upper threshold have full color, and the color density for residues with identity between the two thresholds is set using a linear scale.
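
The linear scale between the two thresholds can be written out as a short sketch. The function names are hypothetical, and blending the color toward white in RGB space is an assumption about how "color density" is rendered.

```python
def color_density(identity, lower, upper):
    """0.0 at or below the lower threshold, 1.0 at or above the upper, linear between."""
    if identity <= lower:
        return 0.0
    if identity >= upper:
        return 1.0
    return (identity - lower) / (upper - lower)


def shade(rgb, density):
    """Blend an (r, g, b) color toward white as the density falls (assumed rendering)."""
    return tuple(round(255 - (255 - channel) * density) for channel in rgb)


density = color_density(50, lower=20, upper=80)
print(density)  # 0.5
print(shade((255, 0, 0), density))
```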

Color Sequences

Show or hide the sequence coloring.

Adjust Text Contrast

Use white for the text on residues colored with dark colors, and black for the text on residues colored with light colors. If this option is not selected, the text is black for all residues.

Annotations menu

Annotations are a means of representing additional information associated with the sequences. There are two classes of annotations: global annotations (consensus sequence, mean hydrophobicity) and local annotations. The global annotations are calculated for the entire set of sequences; the local annotations are computed for each sequence individually. Depending on the annotation type, they are presented as histogram plots (hydrophobicity, B-factor), color bars ("Color Blocks"), alphanumeric strings (consensus sequence, SSP, Pfam), or graphical representations (secondary structure assignments).

Consensus Sequence

Display the consensus sequence at the top of the panel. The consensus sequence is the sequence that is composed of the most frequently occurring residue at each position in the sequence; if there are two residues that have the same frequency of occurrence, a + symbol is used, and the residues for this position are shown in its tooltip. The sequence is annotated with a histogram of the number of sequences that are represented by each residue in the consensus, with information on the percentage in the tooltip.
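
The consensus rule can be sketched in a few lines. Skipping gap symbols when counting, and writing a gap for a column that contains only gaps, are assumptions; the function name is hypothetical.

```python
from collections import Counter

GAPS = "-~"


def consensus(alignment):
    """Most frequent residue per column, or + when the top frequency is tied."""
    result = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c not in GAPS)  # assumed: gaps skipped
        if not counts:
            result.append("-")
            continue
        ranked = counts.most_common()
        top = ranked[0][1]
        tied = [residue for residue, n in ranked if n == top]
        result.append(tied[0] if len(tied) == 1 else "+")
    return "".join(result)


print(consensus(["MKT", "MRT", "MKS", "MRA"]))  # M+T
```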

Consensus Symbols

Display a row at the top of the panel that contains symbols for the degree of consensus. The symbols follow the ClustalW conventions:

`*`	Single, fully conserved residue.
`:`	One of the following "strong" groups is fully conserved: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW.
`.`	One of the following "weaker" groups is fully conserved: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY.
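
The symbol assignment can be sketched directly from the groups listed above. Leaving a blank for columns that contain a gap follows the usual ClustalW behavior but is an assumption here, as is the function name.

```python
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK", "NDEQHK",
        "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Return *, :, ., or a blank for one alignment column."""
    residues = set(column)
    if residues & set("-~"):
        return " "  # assumed: no symbol for columns with gaps
    if len(residues) == 1:
        return "*"
    if any(residues <= set(group) for group in STRONG):
        return ":"
    if any(residues <= set(group) for group in WEAK):
        return "."
    return " "


alignment = ["MSCW", "MTAK"]
print("".join(consensus_symbol(col) for col in zip(*alignment)))
```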

Sequence Logo

Display logo annotation. In this annotation, residue symbols at each position whose frequency of occurrence at that position is greater than a threshold are drawn in a vertical stack in order of frequency, with the height of the residue symbols proportional to the frequency of occurrence. (See Schneider T.D.; Stephens R.M. Sequence Logos: A New Way to Display Consensus Sequences. Nucleic Acids Res. 1990, 18, 6097.)

Mean Hydrophobicity

Display histogram of Kyte-Doolittle hydrophobicity for each residue in the alignment, averaged over all sequences. Hydrophobic residues have positive values; hydrophilic residues have negative values.
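
A minimal sketch of the per-column average, using the published Kyte-Doolittle scale. Skipping gaps and unknown residues when averaging, and reporting 0.0 for a column with no residues, are assumptions about how the annotation treats those cases.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(alignment):
    """Average hydrophobicity of the residues in each alignment column."""
    means = []
    for column in zip(*alignment):
        values = [KYTE_DOOLITTLE[c] for c in column if c in KYTE_DOOLITTLE]
        means.append(sum(values) / len(values) if values else 0.0)
    return means


print(mean_hydrophobicity(["IA", "V-"]))
```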

Mean Isoelectric Point

Add Global Annotations

Add all global annotations (listed above) to the display.

Remove Global Annotations

Remove all global annotations from the display. Does not affect sequence-dependent annotations.

Residue Numbers

Display residue numbers above the sequence. The numbers are given every 5 residues, and are left-aligned to the left edge of the residue in the sequence. This is useful for tracking sequence changes after residue deletion, for example. The ruler only gives absolute alignments.

Secondary Structure Assignment

Display secondary structure assignment for the sequence.

B-Factor

Display histogram of temperature factors for each residue in the sequence.

Disulfide Bonds

Display disulfide bonds as lines connecting cysteine residues, colored from black (strongest prediction) to light gray (weakest prediction).

Hydrophobicity

Display histogram of Kyte-Doolittle hydrophobicity for each residue in the sequence. Hydrophobic residues have positive values; hydrophilic residues have negative values.

Isoelectric Point

Display histogram of isoelectric points in a 5-residue window. The isoelectric point of the isolated amino acid and the 5-residue window value are displayed in the tooltip for each histogram bar.

Ligand Contacts

Display a row in which residue positions are colored by the shortest distance between any ligand heavy atom and any heavy atom in the residue at that position. Red is used if the distance is less than 4 Å, orange is used if the distance is less than 6 Å, and gray is used otherwise. This annotation takes a little time to generate, and requires a structure for the sequence.
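
The distance thresholds translate into a simple classification. In this sketch, the heavy-atom coordinates are assumed to be available as (x, y, z) tuples in Å; the function name and input format are hypothetical.

```python
import math


def contact_color(residue_atoms, ligand_atoms):
    """Color for a residue based on its shortest heavy-atom distance to the ligand."""
    shortest = min(math.dist(r, l) for r in residue_atoms for l in ligand_atoms)
    if shortest < 4.0:
        return "red"
    if shortest < 6.0:
        return "orange"
    return "gray"


ligand = [(0.0, 0.0, 0.0)]
print(contact_color([(3.0, 0.0, 0.0), (9.0, 0.0, 0.0)], ligand))  # red
```

The all-pairs loop is the reason the annotation can take some time for large structures.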

Antibody CDRs

Display assignment of the three VL and VH regions. The residues are colored red, and a red line is displayed in the annotation for each region, labeled Ln or Hn for n = 1, 2, or 3.

Select Antibody CDRs

If the sequence is an antibody, select the CDRs in the sequence.

Antibody Numbering Scheme

Choose the numbering scheme to use for residues in an antibody, from Chothia, Enhanced Chothia, Kabat, IMGT, and AHo. This numbering scheme is only in effect when the Antibody CDRs annotation is turned on. When you export the antibody to a file, it is exported with the selected numbering scheme.

Color Blocks

Display a single row of color blocks that are colored according to one of the residue properties described under Residue Propensities for the Color menu. The row is presented like a sequence, but without letter codes, only the colors for the property. This submenu includes all the items that are on the Residue Propensities submenu of the Color menu, with the addition of the Helix Terminator item, and also has commands to display annotations for all properties on the submenu or to remove all color block annotations.
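
A color block row is essentially a per-residue lookup in one of the propensity schemes. The sketch below uses the Helix Propensity groups from the table under Residue Propensities; the color used for gaps and unknown residues is an assumption.

```python
# Helix Propensity scheme from the Residue Propensities table
HELIX_PROPENSITY = {
    "AMLEQK": "red",      # helix-forming
    "VIFW": "magenta",    # weak helix-forming
    "CSTNDHR": "gray",    # ambivalent
    "PGY": "blue",        # helix-breaking
}
COLOR_OF = {res: color for group, color in HELIX_PROPENSITY.items() for res in group}


def color_blocks(sequence, scheme=COLOR_OF, default="white"):
    """Return one color per position; 'default' (assumed) covers gaps and unknowns."""
    return [scheme.get(residue, default) for residue in sequence]


print(color_blocks("AVPG-"))  # ['red', 'magenta', 'blue', 'blue', 'white']
```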

User Annotations

Add or remove customizable annotations. There are three types of user annotation offered:

Mark Alignment Region—Draw a rectangle over the selected part of the entire alignment (all columns). The rectangle position is fixed at the initial ruler position: it doesn't change if you change the alignment.
Mark Rectangular Region—Draw a rectangle over the currently selected block of residues. The rectangle is anchored to the top left residue and its position moves with this residue if you edit the alignment.
Add Custom Annotation—Add an annotation that you can edit to mark residues with whatever text symbols you choose. You can edit the annotation in Edit mode, and change its name from the default "Custom annotation" by right-clicking on the annotation and choosing Rename Sequence.

Clear Annotations

Remove all annotations from the selected sequences, leaving just the sequences. This action removes any secondary structure assignments and predictions, as well as the annotations added from this menu. If no sequences are selected, annotations are removed from all sequences.

Tools menu

This menu provides access to programs for finding homologs, finding families, and predicting secondary structure.

Find Homologs (BLAST)

Run a BLAST search to find homologs for the first sequence. Opens the Blast Search Settings panel, in which you can make choices for the search and start the job. A progress dialog box is displayed, and the results are displayed in the BLAST Search Results dialog box.

Show Results of BLAST Search

Show the output of the latest BLAST search, in the BLAST Search Results dialog box. The dialog box allows you to select homologs, sort results by one of a range of properties, download PDB structures for the homologs, and incorporate selected homologs into the project.

Find Family (Pfam)

Run a Pfam search to find families for the selected sequences, or for all sequences if no sequence is selected.

Predict

Run prediction programs. This submenu has the following items:

Run All Predictions: Run all the predictions listed below.
Secondary Structure: Run the secondary structure programs to obtain a prediction of the secondary structure of the selected sequences, or for all sequences if no sequence is selected. A dialog box opens that shows job progress.
Solvent Accessibility: Predict accessibility of each residue to solvent. If more than 25% of total residue surface area is predicted to be exposed to the solvent, the residue is marked "e" (exposed, colored blue), otherwise it is marked "b" (buried, colored yellow).
Domain Arrangement: Predict the arrangement of domains. Residues marked gray are likely to form a domain. Residues marked red are likely to be in linker (inter-domain) regions.
Disordered Regions: Calculate a disorder score and classify residues by this score. The score is normalized to a 0 to 1 range. If a residue has a disorder score less than 0.5, it is marked light gray. If the score is between 0.5 and 0.9, the residue is marked orange. If the score is greater than 0.9, the residue is marked red.
Disulfide Bridges: Predict disulfide bridges between cysteines. Predicted bonds are drawn as lines connecting cysteine residues, colored from black (strongest prediction) to light gray (weakest prediction).
Beta Strand Contacts: Predict the contacts between beta sheets.
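
The marking rules for Solvent Accessibility and Disordered Regions can be summarized in a short sketch. The function names are hypothetical, and the exposed surface area is assumed to be supplied as a fraction between 0 and 1.

```python
def accessibility_mark(exposed_fraction):
    """'e' (exposed) if more than 25% of the surface area is exposed, else 'b' (buried)."""
    return "e" if exposed_fraction > 0.25 else "b"


def disorder_color(score):
    """Color for a disorder score normalized to the range 0 to 1."""
    if score < 0.5:
        return "light gray"
    if score <= 0.9:
        return "orange"
    return "red"


print(accessibility_mark(0.40), disorder_color(0.95))  # e red
```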

Clear Predictions

Remove secondary structure predictions.

Build Homology Model

Build a structure for the query sequence, using the Prime Build Structure tools. The structure is built for the query sequence, and templates must be selected from among the sequences that have PDB structures. The Build Homology Model panel opens, in which you can select options for use of templates, for multimer models, and for building the structure; and then run the job. The structure is incorporated into the Maestro project when the job finishes, and the sequence is added to the MSV and aligned with the query (or queries, for a heteromultimer).

Analyze Binding Site

Calculate sequence identity in multiple alignment columns that are within a certain spatial distance from the ligand in the query. You must have a query sequence with a structure and a set of sequences that are aligned to the query. Choosing this menu item opens the Analyze Binding Site dialog box, in which you can select sequences by percentage identity to the query in the binding site, or analyze the percentage identity, similarity, and homology of the aligned sequences to the query within a range of distances from the binding site.

Compare Sequences

Compare all sequences or the selected sequences, by identity, similarity, or homology. The percentage is displayed in a table, like a heat map, with cells color-coded by the percentage value. Opens the Compare Sequences panel, in which you can choose the comparison measure to display, switch between all or selected sequences, and refresh the display after changing the alignment.

Show Job Settings

Opens the Multiple Sequence Viewer Job Settings dialog box, so that you can make settings for any job that is run from the MSV.

Show Job Log

Show the log file for the most recent job in a dialog box.

Maestro menu

This menu provides options and actions for the interaction with the sequences and their structures as stored in the Maestro project and displayed in the Workspace.

Incorporate Entries from Workspace: Incorporate the sequences and the corresponding structures from the entries that are in the Workspace into the MSV.
Incorporate Selected Entries from Project Table: Incorporate the sequences and the corresponding structures from the entries that are selected in the Project Table into the MSV.
Include Incorporated Entries in Workspace: When structures are imported into the MSV and incorporated into the Maestro project, include the structures in the Workspace.
Associate Maestro Entries: Associate Maestro entries with sequences in the MSV without adding new sequences, modifying the alignment, or importing structures into the MSV. Opens the Associate Maestro Entries with MSV Sequences dialog box, in which you can choose an entry chain from the Workspace (list on the left) and an MSV sequence (list on the right), and click Associate Selected Pair to associate the sequence with the entry. The sequence identity for the selected pair is shown below the lists, and colored green if the identity exceeds 95% or red if it does not. The residues that do not match the entry sequence are marked as structureless in the MSV by using a less intense color. This facility is useful if the sequences are imported into the MSV and the structures are independently imported into Maestro.
Superimpose Structures According to Sequence Alignment: Superimpose structures according to their sequence alignment. Uses the Superposition panel, with sequence identities selected as the atoms for superposition.
Protein Structure Alignment: Run the Prime Protein Structure Alignment program on the selected (or all) protein structures and return the alignments. The sequences you select must have structures associated with them.
Get Colors from Maestro Workspace: Color sequences with the colors that they have in the Maestro Workspace. The color of the alpha carbon is used to color the residues.
Apply Colors to Workspace: Apply the colors from the MSV to the sequences and structures in Maestro.
Color Entry Surface: Color the molecular surface in the Workspace using the colors from the corresponding sequence in the MSV. If sequences are selected, only the colors of the selected sequences are applied.
Update from Maestro: Update the sequences in the MSV that originated from Maestro with any changes made in Maestro.
Update Maestro Workspace Selection from MSV: Update the atom selection in the Maestro Workspace from the residue selection in the MSV.
Update Automatically from Maestro: When this option is selected, changes made to sequences in the Maestro project are automatically propagated to the MSV. If it is not selected, you can choose Update from Maestro to apply changes made in Maestro to the MSV.
Allow Structural Changes: When in Edit mode, allow mutation operations on the sequence to change the structure in Maestro. If this option is not selected, you will not be allowed to mutate sequences that came from Maestro structures. This option has no effect on deletions in the sequence, which are not propagated to the structure.

Settings menu

This menu provides settings for control of what is displayed.

Wrap Sequences: When selected, wrap the sequences so that the display consists of multiple rows of sequences, and can be scrolled vertically. When unselected, display the sequences in a single row that scrolls horizontally. Operations on unwrapped sequences are generally faster, especially when there are many sequences.
Group Annotations by Type: When selected, group the annotations of the same type and display the groups as separate rows, below the sequences. When unselected, display the annotations for each sequence directly below the sequence.
Font Size: Change the font size for the text in the sequence viewer. This submenu offers a selection of point sizes for the font.
Display Ruler: Display a "ruler" that marks the residue positions in the sequence viewer. These positions are not the same as the residue numbers in each sequence, which can be offset from the origin and have gaps.
Display Tooltips: Show information about the panel and its contents in tooltips (text displayed when the pointer pauses over the relevant part of the panel).
Display Header Row: Display a heading row above the sequences with labels for each section.
Replace Identities with Dots: Replace residue symbols with dots for all residues that are identical to those in the query sequence. This feature makes it easy to find the mutations from the query sequence in the other sequences.
Pad Alignment with Gaps: Add gaps to the end of each sequence so that the sequences are the same length.
Display Sequence Boundaries: Display the residue number of the first and last visible residue in each row for each sequence. The numbers are displayed to the left and right of the sequence.
Display Percentage Identity: Display the percentage identity with the query sequence to the right of the sequence. By default, the query sequence is the consensus sequence. You can set the query sequence by right-clicking on a sequence and choosing Set as Query from the shortcut menu. The name of the query sequence is displayed in the status area.
Display Percentage Similarity: Display the percentage similarity to the query sequence to the right of the sequence. Two residues are similar if they have a positive BLOSUM62 pairwise score. The percentage similarity is calculated from the number of similar residues divided by the number of aligned residues. By default, the query sequence is the consensus sequence. You can set the query sequence by right-clicking on a sequence and choosing Set as Query from the shortcut menu. The name of the query sequence is displayed in the status area.
Display Percentage Homology: Display the percentage homology to the query sequence to the right of the sequence. Homology is calculated as the percentage of residues with identical side-chain chemical properties (as defined for the Side-Chain Chemistry color scheme). By default, the query sequence is the consensus sequence. You can set the query sequence by right-clicking on a sequence and choosing Set as Query from the shortcut menu. The name of the query sequence is displayed in the status area.
Display Score: Display the BLOSUM62 similarity score to the right of the sequence. The score is calculated relative to the query sequence.
Include Gaps in Sequence Identity Calculations: When calculating sequence identity, count gaps as though they were residues, rather than ignoring them. For example, a column consisting of 2 different residues and 8 gaps would have a sequence identity of 20% if gaps are included, but 50% if gaps are ignored.
Calculate Sequence Identity Only in Selected Columns: When calculating sequence identity, restrict the calculation to the columns (residues) that are selected. This allows you to calculate the identity of regions of a sequence rather than the whole sequence.
Update Sequence Profile: Update the internal profile that is used for sequence identity and consensus calculations, and sequence coloring. If Automatically Update Sequence Profile is off, use this command to manually update the profile before doing calculations or applying coloring.
Automatically Update Sequence Profile: Automatically update the internal profile that is used for sequence identity and consensus calculations and for sequence coloring whenever changes are made. Deselect this option to improve performance when you have a large number of sequences.
Ask Before Accessing a Remote Server: Display a dialog box requesting confirmation of the action before retrieving information from a web server.
Return to Default Settings: Reset all settings on the Settings menu to the default values.
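
The identity and similarity measures described for Display Percentage Identity, Display Percentage Similarity, and Replace Identities with Dots can be illustrated with a short Python sketch. This is not the MSV implementation. The function names are hypothetical, `BLOSUM62_EXCERPT` holds only a handful of entries from the published BLOSUM62 matrix (a real calculation needs the full matrix), and treating "aligned residues" as the positions where neither sequence has a gap is an assumption.

```python
# A few BLOSUM62 scores for illustration only; the full matrix has 210 unique pairs.
BLOSUM62_EXCERPT = {
    ("I", "V"): 3, ("I", "L"): 2, ("K", "R"): 2, ("D", "E"): 2,
    ("A", "G"): 0, ("D", "L"): -4, ("G", "L"): -4,
}
GAPS = "-~"  # locked and unlocked gap characters


def pair_score(a, b):
    """Look up the score for two different residues, in either order."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    return BLOSUM62_EXCERPT[(b, a)]  # KeyError if the pair is not in the excerpt


def percent_identity_and_similarity(query, other):
    """Return (identity %, similarity %) over positions where both have a residue."""
    aligned = identical = similar = 0
    for q, o in zip(query, other):
        if q in GAPS or o in GAPS:
            continue  # assumption: gap positions are not counted as aligned
        aligned += 1
        if q == o:
            identical += 1
            similar += 1  # every BLOSUM62 diagonal score is positive
        elif pair_score(q, o) > 0:
            similar += 1
    if aligned == 0:
        return 0.0, 0.0
    return 100 * identical / aligned, 100 * similar / aligned


def identities_as_dots(query, other):
    """Replace residues in `other` that match the query with dots."""
    return "".join(
        "." if q == o and o not in GAPS else o for q, o in zip(query, other)
    )


identity, similarity = percent_identity_and_similarity("IKDAL-", "VRDGD-")
print(identity, similarity)                    # 20.0 60.0
print(identities_as_dots("IKDAL-", "VRDGD-"))  # VR.GD-
```

In this example five positions are aligned, one is identical (D/D), and two more score positively (I/V and K/R), giving 20% identity and 60% similarity.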

Toolbars

There are two rows of tools on the tool bars. The first row contains toolbars that only have buttons. These toolbars are described together. The second row contains toolbars that have other tools besides buttons. These toolbars are described separately. You can show or hide any of the toolbars from the shortcut menu.

Toolbar Buttons
Sequence Editing Modes
Fetch Toolbar
Find Pattern Toolbar

Toolbar Buttons

The buttons on the toolbars that only have buttons are described below.

Import Sequences: Import sequences into the Multiple Sequence Viewer project from a file. The file must be in FASTA or PDB format. Opens a file selector, in which you can navigate to and select the file. Same as File → Import Sequences.
Export Sequences: Export sequences from the Multiple Sequence Viewer project to a FASTA file. Opens a file selector, in which you can navigate to a location and name the file. Same as File → Export Sequences.
Undo: Undo the last action. Same as Edit → Undo.
Redo: Redo the last action. Same as Edit → Redo.
Lock gaps: Lock gaps in the sequences so that they are not filled when performing manual alignment. The locking is applied once. Gaps created after locking is done are not automatically locked. To lock them, click this button again. Locked gaps are indicated by a dash (`-`); unlocked gaps are indicated by a tilde (`~`). If you have a residue selection, gaps are only locked in the selected region.
Unlock gaps: Unlock gaps in the sequences after they have been locked. This allows gaps to be filled when performing manual alignment.
Pairwise Alignment: Align multiple sequences pairwise using ClustalW. Same as Alignment → Pairwise Alignment.
Multiple Alignment: Align multiple sequences simultaneously using ClustalW. Same as Alignment → Multiple Alignment.
Color Matching Residues Only: Apply the current color scheme only to residues that are identical in all sequences (gaps are ignored).
Weight Colors by Alignment Quality: Set the color density according to the sequence identity.
Average Colors in Columns: Average the colors in each column and color all residues in the column by the average color.
Zoom in: Increase the width of each residue so that the horizontal scale is expanded.
Zoom out: Decrease the width of each residue so that the horizontal scale is contracted. When the residues are narrower than the text, the text is no longer displayed. The residues can be identified by their tooltips.
Wrap Sequences: Wrap the sequences so that the display consists of multiple rows of sequences, and can be scrolled vertically. When unselected, display the sequences in a single row that scrolls horizontally. Same as Settings → Wrap Sequences.
Build Homology Model: Build a 3D model of the sequence using Prime. Same as Tools → Build 3D Model.
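
The effect of the Lock gaps and Unlock gaps buttons on the gap symbols can be pictured with a small Python sketch. It only models the display convention stated above (`~` for an unlocked gap, `-` for a locked gap) on a plain string. The function names and the `start`/`end` parameters, which stand in for a residue selection, are hypothetical.

```python
def lock_gaps(sequence, start=0, end=None):
    """Turn unlocked gaps (~) into locked gaps (-) within [start, end)."""
    end = len(sequence) if end is None else end
    region = sequence[start:end].replace("~", "-")
    return sequence[:start] + region + sequence[end:]


def unlock_gaps(sequence, start=0, end=None):
    """Turn locked gaps (-) back into unlocked gaps (~) within [start, end)."""
    end = len(sequence) if end is None else end
    region = sequence[start:end].replace("-", "~")
    return sequence[:start] + region + sequence[end:]


print(lock_gaps("AC~~DE~F"))        # AC--DE-F  (no selection: all gaps locked)
print(lock_gaps("AC~~DE~F", 0, 4))  # AC--DE~F  (only the selected region)
print(unlock_gaps("AC--DE-F"))      # AC~~DE~F
```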

Sequence Editing Modes

The Mode option menu allows you to select one of four sequence editing modes so that you can edit the alignment, and in some cases, edit the sequence itself.

You can lock the sequence downstream (to the right) of the residue or block that you are moving, so that the downstream part of the sequence moves as a block, without creating or removing gaps. To lock or unlock the sequence downstream, click the Lock Sequence Downstream button to the right of the Mode option menu. Locking is on by default.

You can also use the Lock Gaps and Unlock Gaps toolbar buttons to prevent gaps from collapsing while editing the alignment.

The four sequence editing modes are described below.

Select and Slide

Use this mode to select multiple residues and slide them.

To select residues, drag over the residues, or shift-click the first and last residues in the range. You can drag across multiple sequences to select residues, and you can drag in the ruler to select residues in all sequences.
To deselect selected residues, control-click the residues.
To slide the selected residues, drag them to their new location.

Grab and drag

Use this mode to drag single residues to a new location. If there are residues adjacent to the residue you drag, they are also moved as you drag. Gaps are filled as you drag, unless they are locked.

Edit

Edit the sequence or the SSPs by typing. You can mutate residues by typing in the replacement residue code, and delete residues. If the sequence is a Maestro sequence, a mutation operation also mutates the residues in the structure, but you must have Allow Structural Changes selected on the Maestro menu to perform the mutation. Deletions do not change the structure, regardless of the Allow Structural Changes setting. You can also edit sequences as text by using the tools on the Edit menu. You can change an SSP by typing in the replacement code (E, H, or -).

The allowed key strokes are listed below.

SPACE—insert a single gap under the cursor and move the downstream sequence one position to the right.
BACKSPACE (DELETE)—delete a single gap to the left of the cursor and move the downstream sequence one position to the left. The cursor stays on the current residue, which moves to the left.
DELETE (Fn+DELETE)—delete the single gap under the cursor and move the downstream sequence one position to the left. The cursor stays at the same ruler position.
Shift+BACKSPACE (⇧DELETE)—delete the selected residues.
Arrow keys—move the cursor around the alignment. Using CTRL with the left and right arrow keys moves the cursor by 10 residues.
HOME—move the cursor to the leftmost position that is displayed in the current sequence (the left side of the sequence display). Pressing HOME again moves the cursor to the first residue in unwrapped mode, scrolling the display.
END—move the cursor to the rightmost position that is displayed in the current sequence (the right side of the sequence display). Pressing END again moves the cursor to the last position in unwrapped mode, scrolling the display. The cursor is not necessarily over a residue.

Insert and remove gaps

Insert single gaps by clicking with the left mouse button. Delete single gaps by clicking with the right mouse button. If multiple sequences are selected and you click in one of the selected sequences, the gaps are inserted or deleted in all selected sequences. If you click in a single selected sequence or in a sequence that is not selected, the gaps are inserted or deleted only in the sequence you clicked on.

Fetch Toolbar

The Fetch tool allows you to fetch sequences from the Protein Data Bank or the Entrez Protein Database, from the appropriate web sites.

To fetch sequences from the PDB, type the four-character code into the Fetch text box and press ENTER. The sequence is retrieved from the RCSB web site and added to the project.
To fetch sequences from Entrez, type the access code into the Fetch text box and press ENTER. The access code format is database|code, where database is the code for the database (gi for GenBank, pdb for the Protein Data Bank, emb for the EMBL Sequence Database, and so on), and code is the sequence code for the database. Examples: gi|12345, pdb|2aba, emb|CAA44029.1. The sequence is retrieved from the NCBI web site and added to the project.
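
The two input forms accepted by the Fetch text box can be told apart with a few lines of Python. This is only a sketch of the format rules given above, not of how the MSV itself parses the input. The function name and the returned tuples are hypothetical, and no network request is made.

```python
def classify_fetch_code(text):
    """Classify Fetch input as a PDB code or an Entrez database|code pair."""
    text = text.strip()
    if "|" in text:
        database, code = text.split("|", 1)
        if not database or not code:
            raise ValueError(f"expected database|code, got {text!r}")
        return ("entrez", database, code)
    if len(text) == 4 and text.isalnum():
        return ("pdb", text)
    raise ValueError(f"not a four-character PDB code or database|code: {text!r}")


print(classify_fetch_code("2aba"))            # ('pdb', '2aba')
print(classify_fetch_code("gi|12345"))        # ('entrez', 'gi', '12345')
print(classify_fetch_code("emb|CAA44029.1"))  # ('entrez', 'emb', 'CAA44029.1')
```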

Find Pattern Toolbar

This toolbar provides a tool for finding a pattern in the sequences. The pattern used in the search is an extended PROSITE pattern, which makes use of secondary structure and property information. The pattern has the following syntax.

residue	Find occurrences of the specified residue. The residue must be given as an upper case letter. For example, `A` finds alanine.
`[`list`]`	Find occurrences of any of the residues listed. For example, `[AIL]` finds occurrences of A, I, or L.
`{`list`}`	Exclude all occurrences of any of the residues listed. For example `{ED}` ensures that occurrences of E and D are not found.
`a`	Find acidic residues (D and E)
`b`	Find basic residues (K and R)
`e`	Find residues in an extended (beta-strand) region
`f`	Find residues in a flexible region (B-factor greater than the chain average)
`h`	Find residues in a helical region
`o`	Find hydrophobic residues (A, C, F, I, L, P, V, W, and Y)
`p`	Find aromatic residues (F, W, and Y)
`s`	Find solvent-exposed residues
`x . ?`	Find any residue. Any of these three characters can be used.
`@`number	Find the residues with the specified PDB residue number (not ruler position). Insertion codes are not recognized, so all residues with the given number are found, whatever their insertion code.
`(`m`)` `(`m`,`n`)`	Find the specified number of contiguous occurrences of a residue (or residue type). The second form specifies a variable number of occurrences, e.g. `o(2,4)` means find two to four consecutive hydrophobic residues.

To run a simple search for a sequence of residues, just type in the sequence.

For a more complex search, in which you apply conditions at each residue position, you can combine these elements to create a search pattern, such as G-[IL]-o{AC}. There is an implied AND between contiguous elements: so in the example given, o{AC} means a hydrophobic residue that is not Ala or Cys. Each such sequence of elements that applies to a single residue must be separated from the next sequence by a - character. The search takes place when you press Enter. The patterns that are found are highlighted, and all other residues are colored white.
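
To make the pattern syntax concrete, the following Python sketch translates the sequence-only elements of a pattern into a regular expression and searches a sequence with it. It is an illustration, not the MSV search code. All names are hypothetical, the residue classes are copied from the table above, and the elements that need structural data (`e`, `f`, `h`, `s`, and `@`number) are rejected. Treating an all-uppercase pattern without `-` separators as a literal residue sequence is an assumption.

```python
import re

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
PROPERTY_CLASSES = {
    "a": set("DE"),         # acidic
    "b": set("KR"),         # basic
    "o": set("ACFILPVWY"),  # hydrophobic
    "p": set("FWY"),        # aromatic
}
WILDCARDS = set("x.?")
# One element: [list], {list}, (m) or (m,n), or a single character.
ELEMENT = re.compile(r"\[([A-Z]+)\]|\{([A-Z]+)\}|\((\d+)(?:,(\d+))?\)|(.)")


def position_to_regex(position):
    """Combine the elements for one residue position with an implied AND."""
    allowed = set(AMINO_ACIDS)
    repeat = ""
    for match in ELEMENT.finditer(position):
        include, exclude, low, high, single = match.groups()
        if include:
            allowed &= set(include)
        elif exclude:
            allowed -= set(exclude)
        elif low:
            repeat = "{%s}" % low if high is None else "{%s,%s}" % (low, high)
        elif single in WILDCARDS:
            continue
        elif single in PROPERTY_CLASSES:
            allowed &= PROPERTY_CLASSES[single]
        elif single in AMINO_ACIDS:
            allowed &= {single}
        else:
            raise ValueError(f"unsupported pattern element: {single!r}")
    if not allowed:
        raise ValueError(f"no residue can satisfy {position!r}")
    return "[" + "".join(sorted(allowed)) + "]" + repeat


def pattern_to_regex(pattern):
    if pattern.isalpha() and pattern.isupper():
        return pattern  # simple search: a plain run of residue letters
    return "".join(position_to_regex(p) for p in pattern.split("-"))


def find_pattern(pattern, sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(pattern_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


print(pattern_to_regex("G-[IL]-o{AC}"))             # [G][IL][FILPVWY]
print(find_pattern("G-[IL]-o{AC}", "MGIVAGLAGLF"))  # [(1, 'GIV'), (8, 'GLF')]
```

In the example, the segment GLA at positions 5 to 7 is not reported because `o{AC}` excludes alanine.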

As well as typing in patterns, you can store and retrieve your own patterns from the Select Pattern option menu. The patterns are listed at the top of this option menu, with four default items, Deamidation Site, Glycosylation Site, Proteolysis Site, and Oxidation Site. The last item is Edit Patterns, which opens the Edit Patterns dialog box, in which you can change patterns, including these default patterns, and add and delete patterns. You can edit the table cells to change the pattern name, the definition, and the "hotspot". The Hotspot column contains the residue index in the pattern that should be selected when the pattern is found. This feature is ignored by the MSV but is used in other applications that rely on the MSV (notably BioLuminate).

Sequence display area

The MSV can handle multiple query sequences. Each is displayed in a separate tab in the sequence display area. You can create a new tab by clicking the "+" button to the top right of the display area, or by choosing Sequences → New Query. Using the "+" button opens a dialog box to import a sequence; the menu choice creates an empty query. The tabs are labeled Query 1, Query 2, and so on, by default. You can change the name by choosing Rename Query from the tab shortcut menu. Closing a tab with the X button removes the sequences and related data from the MSV.

The sequence display area in each tab is divided into three sections. The first of these is empty until you do an alignment with ClustalW. When the alignment is done, this area contains a phylogeny tree diagram. The second section displays the sequence name and the names of the various annotations. The annotations can be expanded or collapsed individually, by clicking the "tree node" (box) immediately to the left of the sequence name. They can also be expanded or collapsed globally with Ctrl+DOWN ARROW and Ctrl+UP ARROW (⌘UP ARROW). Selected sequences are colored slate blue. The third section displays the sequences and their annotations, global annotations, and a ruler.

Sequences are represented by the standard residue letter symbols. Residues for which atom coordinates are missing are colored in a paler shade than residues for which atom coordinates are available.

Secondary structure predictions are represented by the characters H (helix), E (extended), and - (everything else). Secondary structure assignments are indicated by tubes for helices and arrows for extended structures.

There are three shortcut menus. The sequence shortcut menu opens when you right-click in the sequence name section. The alignment shortcut menu opens when you right-click on a sequence. The query shortcut menu opens when you right-click on the tab name.

Sequence shortcut menu

This shortcut menu contains items from several of the main menus, and one additional item. If you right-click on a sequence when there is no selection, the sequence you click on is selected. Otherwise, right-clicking does not change the selection. The menu items are listed below, with descriptions or links to their descriptions in the main menus.

Set As Query Sequence: Make the selected sequence the query sequence. This item is only present on the menu if there is a single sequence selected. To make the consensus sequence the query, first display it with Annotations → Consensus Sequence, then use this command to make it the query.
Select Ligand Contacts: Select the residues that are in contact with the ligand. This is a useful way of selecting residues in a binding site. This item is present when you right-click on a Ligand Contacts annotation.
Rename Sequence: Rename the sequence. Opens a dialog box in which you can enter a new name.
Translate DNA / RNA sequence: Translate a DNA or RNA sequence into the sequence for the protein it codes for, using the standard genetic code.
Find Homologs (BLAST Search)
Find Family (Pfam Search)
Predict: These items are the same as on the Tools menu.
Get PDB Structures
Select All
Deselect All
Invert Selection
Hide Selected
Show All
Set Color of Sequence Name
Move Up
Move Down
Move to Top
Move to Bottom: These items are the same as on the Sequences menu.
Delete Selection: This item is the same as on the Edit menu.
Annotations: This submenu is a copy of the Annotations menu.
Clear Annotations: Clear (remove) the annotations for the selected sequences.
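
As an illustration of what the Translate DNA / RNA sequence item in this menu does, the following Python sketch translates a nucleotide sequence with the standard genetic code. It is not the MSV implementation. Reading from the first base, stopping at the first stop codon, and writing X for unrecognized codons are assumptions, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nucleotides):
    """Translate DNA or RNA in reading frame 1, stopping at the first stop codon."""
    seq = nucleotides.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))  # MAK
print(translate("AUGGCCAAAUAA"))  # MAK
```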

Alignment shortcut menu

This shortcut menu contains items from the Alignment menu, and some additional items.

Select Identities
Select All
Invert Selection
Deselect All
Delete
Multiple Sequence Alignment: These items are the same as on the Alignment menu.
Anchor Residues Outside Selection: Set anchors on the residues outside the selection so that they do not move at all during alignment. The residues outside the selection are grayed out. This action prevents new gaps from being created in a sequence, but you can slide residues into existing gaps. To remove the anchors, click in an area outside the sequences.
Clear Restricted Regions: Remove the anchors that were set by Anchor Residues Outside Selection.

Query shortcut menu

This shortcut menu is displayed when you right-click the tab that contains the query name. It has two items, which are also on the Sequences menu.

Rename Query: Rename the current query. The name appears on the tab for the query.
Duplicate Query: Create a copy of the selected sequences in the current query in a new tab.

Status area

The status area at the foot of the panel displays information on the current task, on the sequences in the MSV project, and on the query sequence.

Acknowledgement

The Multiple Sequence Viewer was developed in collaboration with Dr. Jano Jusuf and Dr. Stanley Krystek from Bristol-Myers Squibb.