From: Nomi H. <no...@us...> - 2003-04-21 17:01:56
Update of /cvsroot/gmod/apollo/doc/html
In directory sc8-pr-cvs1:/tmp/cvs-serv12797

Added Files:
	synteny.html
Log Message:
Synteny user manual (split off from main user manual, which was getting too big).

--- NEW FILE: synteny.html ---
<html>
<head>
<title>Apollo Synteny User Guide</title>
</head>
<body bgcolor="#FFFFFF">
<H1 align=center>Apollo Synteny User Guide</H1>
<CENTER><img SRC="images/synteny-small.gif" width=166 height=115></CENTER>
<P>
The Synteny viewer in Apollo lets you view the genomes of two species simultaneously, together with comparative analyses between the genomes. This viewer is still under development, and is not yet available as a default option in Apollo. If you'd like to try it out, you can edit conf/apollo.cfg in your Apollo directory and uncomment one of the two DataAdapterInstall lines for the SyntenyAdapter:
<pre>
// Uncomment if you want to read synteny data from Ensembl databases
//DataAdapterInstall "apollo.dataadapter.synteny.SyntenyAdapter" "synteny.style"
// OR uncomment if you want to read synteny data from GFF files
//DataAdapterInstall "apollo.dataadapter.synteny.SyntenyAdapter" "synteny.style.gff"
</pre>
<h3><a NAME="Compara"></a>Reading cross-species comparative data</h3>
There are currently two input modes for viewing synteny data:
<ul>
<li> Single-species data present in ensembl databases, with comparative data present in an ensembl_compara database. Examples of these databases can be found on the mysql instance at kaka.sanger.ac.uk:3306. This is easily the most flexible input format: Apollo should be able to point at and read the databases on kaka as soon as you install it. </li>
<li> Single-species data and comparative data provided in GFF files. This method is the least tested (but - oddly enough - probably the most reliable, given the relatively small number of moving parts in the machine).
Contact <a href=mailto:vv...@sa...>Vivek Iyer</a> if this manual doesn't cover it adequately. </li>
</ul>
<h4>Configuring Apollo to read ensembl-compara data</h4>
<p> The nicest way to browse comparative data is to view the single-species data and the compara data in ensembl format. (You can view such databases on the mysql instance at kaka.sanger.ac.uk:3306 - examples of single-species databases look like "homo_sapiens_core_11_31", and an example compara database looks like "ensembl_compara_11_1".) When Apollo starts up (or if you use the File->Open menu option), choose the "Synteny" option from the drop-down labelled "Choose a data adapter". You will be presented with a tabbed pane with three tabs: one tab for each single species, to specify its data source and read options, and a middle tab to specify where the comparative information comes from. With the configuration that Apollo is shipped with, the tabs look like this: </p>
<div align="center"><img src="images/synteny-mouse.gif" alt="synteny-mouse-tab" width="599" height="423"> <br> </div>
<p> Notice the startling similarity of the "mouse" and "human" single-species tabs to the <a href="userguide.html#Reading_Ensembl_via_ensj">ensj-adapter</a> discussed in the main user guide. This is not an accident: they work the same way. In particular, you can specify which region of the chromosome to load using a <b>gene</b> stable id or a region of the chromosome.
In addition, you must get the input datasource right: that is, set up the DataSource panel to point at your favourite known ensembl mouse database. Here it's pointing at the current instance on kaka: </p>
<div align="center"><img src="images/synteny-mouse-datasource.gif" alt="synteny-mouse-datasource"> <br> </div>
<p> The "human" tab will show analogous information for the human data.</p>
<p>The middle tab points at the ensembl-compara database: </p>
<div align="center"><img src="images/synteny-compara.gif" alt="synteny-compara-tab" width="654" height="590"> <br> </div>
<p> This database stores two types of alignments between species: dna-dna aligns, and protein-protein aligns. You can load either or both types by checking the appropriate checkboxes at the top of the tab. </p>
<p> The source of the alignments must be an ensembl-compara database. The current one on kaka is ensembl_compara_11_1, as shown. The button marked "View Full Synteny Panel" will display a picture of synteny blocks found between mouse chromosomes and human chromosomes - it's another way of selecting which regions in the two species you want to see. More on this shortly.</p>
<h4>Using Apollo to read cross-species data from ensembl and ensembl-compara tables</h4>
Once you've configured the right datasources in the various tabs, you can try to read data. In order to read anything, Apollo must be supplied with EITHER
<ul>
<li>One or more chromosome regions, e.g. Chromosome 1, 1-1000000 of Human (either with or without a corresponding region of mouse), OR</li>
<li>A gene stable id from one species or the other.</li>
</ul>
<p> For instance, you might want to view comparative information for mouse genes related to the region Chromosome 1:1Mb-2Mb in human.
The following information entered on the tabs will tell Apollo to do this: </p>
<div align="center"><img src="images/synteny-mouse-region.gif" alt="synteny-mouse-region"> <br> </div>
<p> Note that the region data fields are blank, which means Apollo will try to infer the mouse region from the mouse-human links it sees attached to the human genes, which it reads as directed: </p>
<div align="center"><img src="images/synteny-human-region.gif" alt="synteny-human-region"> <br> </div>
<p> If you now hit the "OK" button, the data will be read, and Apollo will display the genes from both species, with the alignments shown in the middle: </p>
<div align="center"><img src="images/synteny-apollo-frame.gif" alt="synteny-apollo-frame" width="800" height="800"> <br> </div>
<p> What this display shows is that Apollo found that all the human-mouse links that mapped onto the region Human Chr 1:1Mb-2Mb ended on Mouse Chr 4: 151Mb-152Mb. Apollo therefore loaded that mouse region, and showed the links. This quite complicated frame has the following features to ease navigation:
<ol>
<li>There are two types of links in the centre pane: dna-dna alignments and protein-protein alignments. They are actually just features, and so are coloured according to this adapter's ".tiers" file, and can be switched on and off with the "Types Panel". By default the synteny adapter uses the "ensj.tiers" file, and these links are features of type "syntenyd" and "syntenyp" respectively. You can suppress the display of the (large number of) dna-dna alignments by opening the <a href="userguide.html#TypesPanel">types panel</a> and selecting not to show the type "syntenyd".
This will (in this case) significantly speed up the rendering of the display.</li>
<li>If you left-click on a link in the centre frame, it will centre the display on that link, and highlight the homologous genes that the link represents (if the link is a dna-dna alignment, then the aligning sequence will be highlighted when you're zoomed in close enough).
<div align="center"><img src="images/synteny-single-link.gif" alt="synteny-single-link" width="800" height="600"> <br> </div>
-- notice that the homologous genes indicated by the link are highlighted in red. Links between genes on opposite strands are drawn to look like "bow ties" instead of parallelograms. Clicking in the centre panel anywhere <em>except</em> on a link will centre the display at the selected point. </li>
<li>If you left-click on an exon in one of the panels, its homologous gene (if it exists) will be highlighted in the other panel. The structure of exons of the selected transcript and the stable id of the gene will be displayed (as always) in the panels to the left.</li>
<li>The zoom buttons (x2, x10 etc) and scrollbars behave for each single-species panel as usual. The combination SHIFT+zoom button will simultaneously zoom both species panels.</li>
</ol> </p>
<p> There are some menu-directed actions that are quite useful in this context, for instance "Edit->Find" (to allow you to zoom in on a gene by stable id) and "View->Reverse Complement" (to "straighten out" the alignments in a region which has moved from the forward strand in one species to the reverse strand in the other). These actions are only applicable to a single species' panel at a time. To determine which species the menu action will be applied to, pull down the "File" menu. You will see the names of both species in the menu.
<div align="center"><img src="images/synteny-file-menu.gif" alt="synteny-single-link" width="200" height="190"> <br> </div> </p>
<p> Choose the species name from the menu, and you will notice the corresponding name of the species (next to the zoom buttons) highlighted in red.
<div align="center"><img src="images/synteny-species-focus.gif" alt="synteny-species-focus" width="200" height="200"> <br> </div>
Now you can use the Edit->Find menu option, Reverse Complement and so on, and the action will apply to the panel for the chosen species. </p>
<h4>Guided browsing with the full synteny panel</h4>
<p>Alignments between different species have been grouped together into larger blocks called synteny regions. These regions are stored in the ensembl-compara database, and are visible in a panel invoked with the "View Full Synteny Panel" button on the compara (middle) tab of the data adapter (see the left image below). If you push this button, you will see a panel loaded with many images of related chromosomes, like the one in the middle image (this particular image shows human chromosomes syntenic to Mouse Chromosome 1: 21Mb-33Mb <-> Human Chromosome 6). Left-clicking on any of the marked regions on the central chromosome will bring up a "this one please" menu option. </p>
<div align="center"> <img src="images/synteny-full-panel.gif" alt="synteny-full-panel"> <img src="images/synteny-chromosome-panel.gif" alt="synteny-chromosome-panel"> <img src="images/synteny-chromosome-panel-selected.gif" alt="synteny-chromosome-panel-selected"> </div>
<p>Selecting this option will "drag" the selected pair of regions (eg Mouse 1:back to the chromosome/start/end ranges into <em>both</em> of the single-species adapters, that is, the ranges will be completely specified for both species. Nifty, eh?
</p>
<h4>When genes in your selected (query) region are homologous to genes in more than one target chromosome...</h4>
If you enter a region or stable id for only one species, then Apollo will examine the compara alignments for that region (and the pair of species) to work out which region to load for the "other" species. For instance, say you are comparing Mouse and Human genomes, and you select Human Chromosome 1:1-1Mb, and start a data load. Apollo will then examine the Mouse-Human protein-protein alignments, and deduce that the first Mb of Human Chr 1 actually maps to three Mouse chromosomes:
<ul>
<li>Mouse Chr 2: 112Mb-113Mb (where there are 12 protein-protein aligns) </li>
<li> Mouse Chr 4: 150Mb-152Mb (where there are 12 protein-protein aligns) </li>
<li> Mouse Chr 17: 45Mb-46Mb (where there is 1 protein-protein align) </li>
</ul>
<p>Apollo can only display one of these regions at a time, so it will offer you the choice of which one to display, by showing the following panel:</p>
<div align="center"> <img src="images/synteny-align-block-chooser.gif" alt="align-block-chooser" width="465" height="150"> </div>
<p>You can choose which region to load by selecting the appropriate radio button and pushing the "OK" button. Apollo will then proceed to display the region you have selected on the Mouse against the first Mb of Human Chr 1.</p>
<h4>Changing species, and browsing three species simultaneously</h4>
The synteny.style file contains extra entries (beyond display options) that determine which species are loaded into the various tabs of the Synteny adapter.
By changing these entries, you can change
<ul>
<li>Which species are loaded (eg "Rat" instead of "Mouse") </li>
<li>The order in which the species appear (eg the "Human" tab appears, then the "Human-Rat" tab, and then the "Rat" tab), and</li>
<li>The number of species which are simultaneously compared (eg you could load three species simultaneously).</li>
</ul>
<p>Rather than describing these entries in detail here, we have shipped Apollo with a number of example synteny.style files to accommodate these various cases. These files are all present in the conf/ subdirectory of the Apollo distribution, and are labelled according to their contents. For instance, the style file which will make Apollo browse Mouse vs Rat is labelled synteny.style.mouse-rat. To activate this file, simply copy it over the current synteny.style file. Then when Apollo is restarted, you will see the following adapters:
<div align="center"> <img src="images/synteny-mouse-rat.gif" alt="synteny-mouse-rat" width="600" height="425"> </div>
- and you can browse mouse/rat alignments as before.</p>
<p> To browse mouse, rat and human genomes simultaneously, copy synteny.style.mouse-rat-human over synteny.style. Restarting Apollo will bring up three "pairs" of adapters:
<div align="center"> <img src="images/synteny-mouse-rat-human.gif" alt="synteny-mouse-rat" width="600" height="425"> </div>
-- again, any pair of species is configured exactly as before. Browsing can be initiated by choosing a single region or gene stable id in any species.</p>
<h4>Reading Compara data from GFF Files</h4>
If you have data from two species in Sanger GFF format, and links between that data (also in Sanger GFF format), then you can load the data into the Synteny viewer as before. Here's how to do this:
<ol>
<li>Copy the file conf/synteny.style.gff over conf/synteny.style. This will configure the compara-adapter (a composite) to use Apollo's gff adapters to read in data. </li>
<li>Start Apollo.
Choose "Synteny" from the Adapter list as usual. You will be presented with the following panel:
<P>
<div align="center"> <img src="images/synteny-gff-adapter.gif" alt="synteny-gff-adapter" width="622" height="242"> </div>
<P>
Note that I've labelled the species "Species1" etc: since the only thing that determines the data loaded is what is in the input files, I saw little point in making the logical species names anything more interesting (such as "human" etc). </li>
<li> Choose files for each of the individual species' features and the link file. There is a single-species file (called "chr2.200000-4000000.gff") in the data/ directory of the Apollo distribution. To start off with, you can use this file to load information for both species 1 and species 2 (why not?). In addition, there is a file provided (data/links.gff) that contains a few links between the individual species' features. Once you have chosen the files, push the "OK" button to start loading, and Apollo will load the chosen features and the links between them. </li>
<li>The mechanics of browsing the displayed data are the same as when you load data from the EnsJ adapters, with the exception that there is no underlying sequence information.</li>
</ol>
<h4>Currently known bugs</h4>
<p>Compara-data browsing is admittedly buggy at time of writing: we will iron the bugs out soon. Here are some of the issues I currently know about:</p>
<ol>
<li>Adapters can have potentially confusing behaviour (eg changes made to the user entries in the GUI are still there when the adapters are redisplayed via File->New).</li>
<li>The Synteny choice in the drop-down is a tar-pit - you can't choose something else with File->New once you've chosen it (and vice versa: you can't switch between single-species and synteny browsing).</li>
<li>Currently, switching the "active" species - eg from Mouse -> Human via the "File" menu - sometimes results in <em>both</em> species labels being painted red.
This is hard to reproduce, and therefore hard to track down.</li>
<li>Can't select an exon from the structure panel to the left of the display: NullPointer problem. This is not subtle.</li>
<li>Loading GFF: Chromosome range is labelled "null" right now.</li>
</ol>
<P><HR><h3><a href=userguide.html>Back to main Apollo user guide</a></h3>
</body>
</html>