
Analysis

From pfaat





How do I map conservation scores in the alignment to the structure?

This feature converts the conservation score displayed below each column into a color and maps that color onto the structure. The colors of the rainbow designate the range of conservation scores: red represents an invariant position and violet represents a highly variable position.

1. Associate and view a structure as described here.

2. Optional: On the Analysis menu, click Conservation scores -> Information score... This recomputes the conservation scores for the selected sequences or selected columns.

3. On the Analysis menu, click Map Conservation score to Structure...

4. Select the sequence name that is associated with the structure.

5. Click OK.
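
The score-to-color mapping described above can be illustrated with a short Python sketch. It is not Pfaat's own code: the function name `conservation_to_rgb`, the default score range of 0 to 1, and the assumption that a higher score means a more conserved column are all hypothetical choices made for illustration. The only part taken from this page is the rainbow convention of red for invariant and violet for highly variable.

```python
import colorsys


def conservation_to_rgb(score, min_score=0.0, max_score=1.0):
    """Map a conservation score to a rainbow (R, G, B) triple in 0..255.

    Assumption (not from Pfaat): a higher score means a more conserved
    column, so max_score maps to red and min_score maps to violet.
    """
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    # Clamp the score into range, then normalize so 1.0 means invariant.
    clamped = min(max(score, min_score), max_score)
    t = (clamped - min_score) / (max_score - min_score)
    # In HSV, hue 0.0 is red and hue 0.75 (270 degrees) is violet.
    hue = (1.0 - t) * 0.75
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


if __name__ == "__main__":
    for s in (1.0, 0.75, 0.5, 0.25, 0.0):
        print(s, conservation_to_rgb(s))
```

Intermediate scores pass through orange, yellow, green, and blue, so a structure colored this way shows conserved regions in warm colors and variable regions in cool colors.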


How do I find trends in the multiple sequence alignment that explain data in a sequence annotation?

The PLSR (partial least squares regression) method finds trends in a multiple sequence alignment (residues and their properties) that best describe a set of sequence annotation values. The sequence annotation values are typically IC50 data or binding data. In this example, we want to predict the residues that determine specificity for NT3 (which binds TRKC) and NGF (which binds TRKA and NGFR). Each dialog described below has a Help button that provides more details.

1. Select the binding site columns as described here.

2. Shift-select all sequences where the cluster ID equals one or two. Selecting sequence names is described here. The cluster IDs are a convenient way to distinguish the binding interactions: cluster 1 corresponds to the NT3 family, which binds TRKC, and cluster 2 corresponds to the NGF family, which binds TRKA and NGFR.

3. On the Analysis menu, click PLSR -> Build...

4. Select cluster id as the Y variable. Check the selected columns box and All properties. Check the Selected Sequences only box. Click the Next button. The selected sequences will be displayed.

Change the name and location of the property file if desired. Click the Yes button to continue.

5. On the Build options dialog, select binary transform and type 2 in the binary cutoff field. This means any Y variable (cluster ID) below 2 will be transformed to -1 and any value greater than or equal to 2 will be transformed to +1. Click the Next button.

6. Skip the permutation test by clicking on the Next or Skip button.

7. On the output dialog, change the names and locations of the output files if desired. Click the Go button. Each coefficient is converted to its absolute value (so that negative coefficients count as much as positive ones), and the mean is computed and displayed for each residue. Residues (columns) with a large mean coefficient contribute strongly to the multivariate model and are predicted to be specificity determining. The plots are written to a PDF and can be viewed later.
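
Two of the numeric steps above, the binary transform in step 5 and the per-residue averaging in step 7, can be sketched in Python. This is only an illustration of the arithmetic described on this page, not Pfaat's implementation. The function names, the dictionary layout (column number mapped to a list of coefficients, one per property), and the example numbers are all hypothetical.

```python
def binary_transform(y_values, cutoff):
    """Values below the cutoff become -1; values >= the cutoff become +1."""
    return [-1 if y < cutoff else 1 for y in y_values]


def mean_abs_coefficient(coefficients_by_column):
    """Average the absolute coefficients for each alignment column.

    coefficients_by_column is a hypothetical layout: a dict mapping a
    column number to the list of model coefficients for that column
    (one per residue property). Columns with no coefficients are skipped.
    """
    return {
        column: sum(abs(c) for c in coeffs) / len(coeffs)
        for column, coeffs in coefficients_by_column.items()
        if coeffs
    }


def rank_columns(coefficients_by_column):
    """Return (column, mean |coefficient|) pairs, largest mean first."""
    means = mean_abs_coefficient(coefficients_by_column)
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Cluster IDs 1 and 2 with a binary cutoff of 2, as in step 5.
    print(binary_transform([1, 1, 2, 2], cutoff=2))
    # Made-up coefficients for three columns, for illustration only.
    example = {12: [0.5, -1.5], 47: [0.1, -0.1], 63: [-2.0, 1.0]}
    print(rank_columns(example))
```

Taking the absolute value before averaging keeps large negative and large positive coefficients from cancelling, so the ranking reflects how much a column contributes regardless of sign.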


How do I reconstruct a Neighbor Joining tree?

1. If you are interested in reconstructing a tree based on the binding site, select the binding site columns as described here. Warning: the reliability of tree reconstruction tends to decrease as you reduce the number of columns (sites).

2. If you want to reconstruct a tree for selected sequences, select a set of sequences as described here.

3. On the Analysis menu, click Neighbor Joining Tree...

4. Select a matrix that will be used to compute pair-wise distances.

5. Check selected sequences and selected columns if you performed steps 1 and 2.

6. Check the Bootstrapping box to perform bootstrap analysis. This will increase the compute time, as a separate tree is computed for each iteration.

7. Optional: Click the Warning button to view the warning. Click the OK button to run.
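
To show why bootstrapping multiplies the compute time, here is a minimal Python sketch of the usual phylogenetic bootstrap, in which alignment columns are resampled with replacement and a tree is then built from each resampled alignment. This page does not say exactly how Pfaat resamples, so treat the column-resampling scheme as an assumption. The function names and the toy alignment are hypothetical, and the tree-building step itself is left out.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one pseudo-replicate: columns drawn with replacement.

    alignment is a list of aligned sequences of equal length.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # The same column picks are applied to every sequence so that
    # residues stay aligned within the replicate.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate n_replicates resampled alignments, reproducibly."""
    rng = random.Random(seed)
    return [bootstrap_columns(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Toy alignment, made up for illustration.
    toy = ["ACDEFG", "ACDQFG", "TCDEYG"]
    for replicate in bootstrap_replicates(toy, n_replicates=3):
        # A tree would be reconstructed from each replicate here.
        print(replicate)
```

Each replicate has the same dimensions as the original alignment, so running 100 iterations costs roughly 100 tree reconstructions.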


How do I compute percent identities between all sequence pairs?

1. If you are only interested in the binding site, select the binding site columns as described here.

2. If you only want to compute percent identities between one sequence and all other sequences, select your sequence of interest (e.g., NGF_HUMAN) as described here. Skip this step if you are interested in all pairs.

3. On the Analysis menu, click % ID / Comparison Matrix...

image:Percentidentitymatrix.png


4. Select Percent Identity as the Similarity type.

5. Select Selected Sequences in the Columns section.

6. Select All Sequences in the Rows section.

7. If you are only interested in the binding site, check the Selected Columns Only box.

8. Optional: Specify a file name to which the percent identities will be written.

9. Optional: Specify a file name to which the number of identical residues will be written.

10. Click OK.
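
The quantities written in steps 8 and 9 can be illustrated with a small Python sketch. It is not Pfaat's code, and percent identity has several definitions in common use. The one assumed here counts only columns where neither sequence has a gap and divides the number of identical residues by the number of such columns. The function names, gap characters, and toy sequences are hypothetical.

```python
GAP_CHARS = "-."  # assumed gap symbols


def pairwise_identity(seq_a, seq_b, columns=None):
    """Return (identical_count, percent_identity) for two aligned sequences.

    columns optionally restricts the comparison to selected column
    indices (0-based), mirroring a 'selected columns only' option.
    Assumption: columns with a gap in either sequence are not compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    indices = range(len(seq_a)) if columns is None else columns
    identical = 0
    compared = 0
    for i in indices:
        a, b = seq_a[i], seq_b[i]
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    percent = 100.0 * identical / compared if compared else 0.0
    return identical, percent


def identity_matrix(row_seqs, col_seqs, columns=None):
    """Compare every row sequence with every column sequence.

    Both arguments are dicts mapping a sequence name to its aligned
    sequence. The result maps (row_name, col_name) to the pair returned
    by pairwise_identity.
    """
    return {
        (row_name, col_name): pairwise_identity(row_seq, col_seq, columns)
        for row_name, row_seq in row_seqs.items()
        for col_name, col_seq in col_seqs.items()
    }


if __name__ == "__main__":
    # Toy sequences, made up for illustration.
    all_seqs = {"seq1": "ACD-FG", "seq2": "ACE-FG", "seq3": "TCDQF-"}
    selected = {"seq1": all_seqs["seq1"]}
    for pair, (count, percent) in identity_matrix(all_seqs, selected).items():
        print(pair, count, round(percent, 1))
```

Passing all sequences as rows and one selected sequence as columns mirrors steps 5 and 6. Passing the same dictionary for both gives the full all-pairs matrix.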