<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Orphan Fragment Localization</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>Recent changes to Orphan Fragment Localization</description><atom:link href="https://sourceforge.net/p/bait/wiki/Orphan%20Fragment%20Localization/feed" rel="self"/><language>en</language><lastBuildDate>Thu, 04 Jul 2013 21:14:08 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/bait/wiki/Orphan%20Fragment%20Localization/feed" rel="self" type="application/rss+xml"/><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v5
+++ v6
@@ -7,7 +7,6 @@

 While many genomes have been given the designation of 'completed', there are a number of large and small contigs that have yet to be placed on the scaffold. These contigs locate somewhere within the genome, but have yet to be placed due to a lack of sequence overlap.  These fragments likely locate within the myriad sequence gaps present in these genomes, which presumably have a high degree of sequence complexity. 

-![BAIT pipeline for orphan fragment analysis](http://i295.photobucket.com/albums/mm141/rareaquaticbadger/BAIT/Slide2_zps220d5a74.jpg "BAIT pipeline for orphan fragment analysis")

 Typical Run
 ===========
@@ -30,15 +29,19 @@
 -r
 &gt;Turns on the SCE identification algorithm. It triggers the CNV program DNAcopy to identify any intervals where the strand inheritance state changes from one state to another; for example, it will identify a region where reads map entirely to Watson on one side (WW) and to both Watson and Crick on the other side (WC). After the CNV algorithm has identified a large interval, a simple script scans that interval for the point where the state change occurs.  Identifying SCEs is essential for locating where orphan scaffolds fit on a chromosome, so this option is required when running an orphan fragment assembly.

+![BAIT pipeline for orphan fragment analysis](http://i295.photobucket.com/albums/mm141/rareaquaticbadger/BAIT/Slide2_zps220d5a74.jpg "BAIT pipeline for orphan fragment analysis")
+
 Output Files
 ============

 Orphan fragment plot
 --------------------

-The orphan fragment plots contain
+The orphan fragment PDF contains a single ideogram plot for every orphan fragment found in the genome. For example, in a genome with 44 orphan scaffolds, 44 pages are generated, with a single fragment on each page.  Each ideogram shows the percentage concordance between the strand inheritance pattern of the orphan fragment and the strand inheritance pattern of the chromosome. The histograms on the left show the concordance in the + orientation (where the orphan fragment is in the same orientation as the chromosome) and the histograms on the right show the concordance in the - orientation (where the orphan fragment is in the opposite orientation to the chromosome).  

 ![Orphan fragment plot](http://i295.photobucket.com/albums/mm141/rareaquaticbadger/BAIT/Unknown_31212_q20_2012-12-04_Page_01_zps54e74dff.jpg "Orphan fragment plot")
+
+In the example above, the fragment has the same template state as the top of chromosome 1 in the reverse orientation in 91.07% of libraries.  This means that when chr1 is WC the fragment is WC, when chr1 is CC the fragment is WW, and when chr1 is WW the fragment is CC.  The red region shows the peak agreement, that is, the most highly concordant region in the genome.  This is the most likely location of the orphan scaffold.  If a gap file is used (by invoking the -G option), any gap regions within the region of peak agreement are printed out to identify the most likely locations that the scaffold maps to.  The number of libraries with informative data is also used, together with the proportion of WC fragments (fragments that are 100% WC are likely highly repetitive and cannot be mapped to a particular location).
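
Picking the region of peak agreement can be sketched as follows. This is only an illustration, not BAIT's own code: the function name `peak_agreement`, the segment labels and every percentage except the 91.07% quoted above are made up for the example.

```python
# Illustrative sketch only; names and most values are hypothetical.
def peak_agreement(segments):
    """Return (segment, orientation, percent) for the most concordant region.

    segments is a list of (name, forward_percent, reverse_percent) tuples,
    one per chromosome region.
    """
    candidates = []
    for name, forward_pct, reverse_pct in segments:
        candidates.append((forward_pct, "+", name))
        candidates.append((reverse_pct, "-", name))
    # The highest concordance in either orientation marks the peak.
    pct, orientation, name = max(candidates, key=lambda c: c[0])
    return name, orientation, pct


segments = [
    ("chr1_top", 12.5, 91.07),   # reverse orientation agrees best
    ("chr1_rest", 40.0, 55.0),
    ("chr2_top", 48.2, 50.9),
]
best = peak_agreement(segments)
print(best)
```

Here the fragment would be assigned to the top of chromosome 1 in the - orientation, matching the reading of the plot described above.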

 Orphan fragment heatmap
 -----------------------
@@ -49,3 +52,13 @@
 ---------------------

 The orphan fragment table provides a list of locations of particular fragments.  It gives the fragment name and the predicted interval location of the fragment within the genome.  If the -G option is used and a gap region or regions are coincident with this location, these gap sites are also displayed. The percentage concordance and the number of libraries covered are also given.
+
+**Jump to:**
+
+[Wiki Main Page](https://sourceforge.net/p/bait/wiki/Home/ "Wiki main page")
+[What is Strand-seq and how does it work?](https://sourceforge.net/p/bait/wiki/Introduction%20to%20Strand-seq%20and%20BAIT/ "A brief outline of Strand-seq and the BAIT pipeline")
+[Tutorial for strand inheritance studies](https://sourceforge.net/p/bait/wiki/Strand%20Inheritance/ "For immortal strand theory / silent sister hypothesis / epigenetic projects")
+[Tutorial for sister chromatid exchange studies](https://sourceforge.net/p/bait/wiki/Sister%20Chromatid%20Exchange/ "For localization and counts of SCE, and comparison of SCE locations to genomic landscapes")
+[Tutorial for identifying genomic rearrangements](https://sourceforge.net/p/bait/wiki/Identification%20of%20misorients/ "For finding genomic rearrangements")
+[Tutorial for localization of orphan fragments](https://sourceforge.net/p/bait/wiki/Orphan%20Fragment%20Localization/ "For localizing unplaced and unlocalized scaffolds in chromosome/complete-stage genomes")
+[Tutorial for building early stage genomes](https://sourceforge.net/p/bait/wiki/Genome%20Building/ "For clustering contigs from scaffold/chromosome-stage genomes into chromosomes and inferring relative orders")
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Thu, 04 Jul 2013 21:14:08 -0000</pubDate><guid>https://sourceforge.net0971eb10f804faf0fd910b6661b74f4857ee1a93</guid></item><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v4
+++ v5
@@ -6,6 +6,8 @@
 Strand-seq allows us to align contigs without overlapping sequence.  Here, we can use the strand inheritance pattern as a unique signature. Each contig will have a template state of WW, WC or CC.  If two contigs share the same template state in every single library studied, it is extremely likely that these contigs are spatially close.  We can therefore use the concordance between contigs to measure their similarity: the closer two contigs are to 100% concordance, the more likely they are to be very close together.  As with sequence overlap, if contigs are misoriented with respect to each other we need to 'reverse complement' our strand states; a WW signature becomes CC and a CC signature becomes WW, while a WC signature is unchanged (it becomes CW, which is equivalent). We therefore need to measure concordance in the forward orientation (what is the similarity where both are WW, both are CC or both are WC) and in the reverse orientation (what is the similarity where one is WW and the other is CC, where one is CC and the other is WW, or where both are WC).

 While many genomes have been given the designation of 'completed', there are a number of large and small contigs that have yet to be placed on the scaffold. These contigs locate somewhere within the genome, but have yet to be placed due to a lack of sequence overlap.  These fragments likely locate within the myriad sequence gaps present in these genomes, which presumably have a high degree of sequence complexity. 
+
+![BAIT pipeline for orphan fragment analysis](http://i295.photobucket.com/albums/mm141/rareaquaticbadger/BAIT/Slide2_zps220d5a74.jpg "BAIT pipeline for orphan fragment analysis")

 Typical Run
 ===========
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Wed, 03 Jul 2013 17:34:04 -0000</pubDate><guid>https://sourceforge.nete06aa6cd6c124fb405b35a09c1321c39c948ab70</guid></item><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v3
+++ v4
@@ -5,7 +5,7 @@

 Strand-seq allows us to align contigs without overlapping sequence.  Here, we can use the strand inheritance pattern as a unique signature. Each contig will have a template state of WW, WC or CC.  If two contigs share the same template state in every single library studied, it is extremely likely that these contigs are spatially close.  We can therefore use the concordance between contigs to measure their similarity: the closer two contigs are to 100% concordance, the more likely they are to be very close together.  As with sequence overlap, if contigs are misoriented with respect to each other we need to 'reverse complement' our strand states; a WW signature becomes CC and a CC signature becomes WW, while a WC signature is unchanged (it becomes CW, which is equivalent). We therefore need to measure concordance in the forward orientation (what is the similarity where both are WW, both are CC or both are WC) and in the reverse orientation (what is the similarity where one is WW and the other is CC, where one is CC and the other is WW, or where both are WC).

-While many genomes have been given the designation of 'completed', there are a number of large and small contigs that have yet to be placed on the scaffold. 
+While many genomes have been given the designation of 'completed', there are a number of large and small contigs that have yet to be placed on the scaffold. These contigs locate somewhere within the genome, but have yet to be placed due to a lack of sequence overlap.  These fragments likely locate within the myriad sequence gaps present in these genomes, which presumably have a high degree of sequence complexity. 

 Typical Run
 ===========
@@ -34,8 +34,16 @@
 Orphan fragment plot
 --------------------

+The orphan fragment plots contain
+
+![Orphan fragment plot](http://i295.photobucket.com/albums/mm141/rareaquaticbadger/BAIT/Unknown_31212_q20_2012-12-04_Page_01_zps54e74dff.jpg "Orphan fragment plot")
+
 Orphan fragment heatmap
 -----------------------

+IMAGE AND DESCRIPTION TO FOLLOW
+
 Orphan fragment table
 ---------------------
+
+The orphan fragment table provides a list of locations of particular fragments.  It gives the fragment name and the predicted interval location of the fragment within the genome.  If the -G option is used and a gap region or regions are coincident with this location, these gap sites are also displayed. The percentage concordance and the number of libraries covered are also given.
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Fri, 28 Jun 2013 21:39:11 -0000</pubDate><guid>https://sourceforge.net38adf4650ef85f8411055c6b8c940c1e511e0265</guid></item><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v2
+++ v3
@@ -1,7 +1,11 @@
 Introduction
 ============

+Contigs are primarily assembled using sequence complementarity: if two contigs share sequence overlap, they can be merged to form a bigger contig.  In this respect, the sequence complementarity can be thought of as a unique signature that identifies that two contigs belong together.  In some cases, contigs will be misoriented with respect to each other; there will still be sequence overlap, but one of the contigs will match only when the *reverse complement* of the sequence is used. The signature is still there, but requires flipping.

+Strand-seq allows us to align contigs without overlapping sequence.  Here, we can use the strand inheritance pattern as a unique signature. Each contig will have a template state of WW, WC or CC.  If two contigs share the same template state in every single library studied, it is extremely likely that these contigs are spatially close.  We can therefore use the concordance between contigs to measure their similarity: the closer two contigs are to 100% concordance, the more likely they are to be very close together.  As with sequence overlap, if contigs are misoriented with respect to each other we need to 'reverse complement' our strand states; a WW signature becomes CC and a CC signature becomes WW, while a WC signature is unchanged (it becomes CW, which is equivalent). We therefore need to measure concordance in the forward orientation (what is the similarity where both are WW, both are CC or both are WC) and in the reverse orientation (what is the similarity where one is WW and the other is CC, where one is CC and the other is WW, or where both are WC).
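
The forward and reverse concordance described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BAIT's implementation: the names `FLIP` and `concordance` are hypothetical, and the template states are invented, with one state per library listed in the same library order for both contigs.

```python
# Minimal sketch; FLIP and concordance are hypothetical names, not BAIT code.
# 'Reverse complementing' a template state swaps WW and CC and leaves WC alone.
FLIP = {"WW": "CC", "CC": "WW", "WC": "WC"}


def concordance(states_a, states_b):
    """Return (forward, reverse) percentage concordance across libraries."""
    pairs = list(zip(states_a, states_b))
    if not pairs:
        raise ValueError("no libraries to compare")
    # Forward: both contigs show the same state in a library.
    forward = sum(a == b for a, b in pairs)
    # Reverse: the states match once the second contig is flipped.
    reverse = sum(a == FLIP[b] for a, b in pairs)
    return 100.0 * forward / len(pairs), 100.0 * reverse / len(pairs)


# Invented template states for five libraries.
chromosome = ["WW", "WC", "CC", "WC", "WW"]
orphan = ["CC", "WC", "WW", "WC", "WW"]
fwd, rev = concordance(chromosome, orphan)
print(fwd, rev)
```

Note that WC libraries count towards both orientations, so only WW and CC libraries help distinguish the forward orientation from the reverse one.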
+
+While many genomes have been given the designation of 'completed', there are a number of large and small contigs that have yet to be placed on the scaffold. 

 Typical Run
 ===========
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Fri, 28 Jun 2013 19:43:16 -0000</pubDate><guid>https://sourceforge.net0f9cf7c6f5b3bbabbc24b276d9dfaf6c4d464cf2</guid></item><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v1
+++ v2
@@ -0,0 +1,37 @@
+Introduction
+============
+
+
+
+Typical Run
+===========
+
+    BAIT -A 1 -rv -G gapFile.bed -o outputName
+
+or
+
+    BAIT -A 3 -rv -o outputName
+
+-A 1
+&gt;This option activates the orphan plotting pipeline that creates an ideogram plot indicating the most likely location of the orphan scaffold based on strand concordance.  This option is best used when there are &lt;100 orphan fragments as an ideogram is plotted for each orphan fragment present.
+
+-A 3
+&gt;This option activates the orphan plotting pipeline that creates a heatmap indicating the most likely location of the orphan scaffold based on strand dissimilarities.  This option is best used when there are &gt;100 orphan fragments as it makes data interpretation far easier.  This option cuts the genome by SCE location and clusters all orphan fragments against each set of chromosome pieces in turn, creating one heatmap for each chromosome. The locations of fragments within each chromosome are determined by where they arrange in an optimal cluster between the chromosome pieces.
+
+-G
+&gt;This option plots a BED-format file onto the standard BAIT ideogram.  The file produces white gaps along the ideogram to indicate regions that have not yet been sequenced. A gap file can easily be downloaded from the UCSC Table Browser for any organism and used.  In the context of the orphan fragment plotter, the gap file serves a second purpose.  If any gaps are contained in the region to which the orphan fragment is localized, these gaps are printed as the most likely locations of the fragment within this region.
+
+-r
+&gt;Turns on the SCE identification algorithm. It triggers the CNV program DNAcopy to identify any intervals where the strand inheritance state changes from one state to another; for example, it will identify a region where reads map entirely to Watson on one side (WW) and to both Watson and Crick on the other side (WC). After the CNV algorithm has identified a large interval, a simple script scans that interval for the point where the state change occurs.  Identifying SCEs is essential for locating where orphan scaffolds fit on a chromosome, so this option is required when running an orphan fragment assembly.
+
+Output Files
+============
+
+Orphan fragment plot
+--------------------
+
+Orphan fragment heatmap
+-----------------------
+
+Orphan fragment table
+---------------------
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Fri, 28 Jun 2013 19:02:50 -0000</pubDate><guid>https://sourceforge.net960a57f31b4c7a473dad056cbd332fa90f34b024</guid></item><item><title>Orphan Fragment Localization modified by Mark_HIlls</title><link>https://sourceforge.net/p/bait/wiki/Orphan%2520Fragment%2520Localization/</link><description/><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mark_HIlls</dc:creator><pubDate>Tue, 25 Jun 2013 17:43:19 -0000</pubDate><guid>https://sourceforge.net847240a3fd5335f9275d0839100aebf07bc3ba10</guid></item></channel></rss>