High-throughput sequencing has been widely used to find novel protein-binding RNAs, through methods such as CLIP-Seq and Genomic SELEX. Typical analyses of the resulting data use a histogram of aligned read counts per base, whose intensity over a given region of an RNA indicates the relative frequency of binding events at the underlying sequence in the experiment. This technique can, however, obscure the precise locations of binding sites that lie close to each other, as occurs in high-throughput data for enzymes whose activity requires multiple RNA binding sites, and for overlapping distinct ncRNA loci derived from processing events. We present a density clustering approach based on the OPTICS algorithm that can detect any number of RNA binding sites indicated by the data, without prior knowledge of the expected length or number of binding sites in the transcript.
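The limitation of the per-base histogram can be illustrated with a small sketch. The example below is not the OPTICS-based method described above; it is a deliberately simplified stand-in that groups read midpoints by a fixed gap threshold, only to show how two nearby sites that merge into one covered region in a coverage histogram can still be separated by looking at read density. The read coordinates, the function names (`per_base_coverage`, `covered_regions`, `group_by_gap`), and the `max_gap` value are all hypothetical and chosen for illustration.

```python
from collections import Counter


def per_base_coverage(reads):
    """Count how many reads cover each base.

    Reads are (start, end) half-open intervals on one transcript.
    """
    coverage = Counter()
    for start, end in reads:
        for pos in range(start, end):
            coverage[pos] += 1
    return coverage


def covered_regions(coverage, min_depth=1):
    """Merge consecutive positions with depth >= min_depth into half-open regions."""
    positions = sorted(p for p, depth in coverage.items() if depth >= min_depth)
    regions = []
    for p in positions:
        if regions and p == regions[-1][1]:
            regions[-1][1] = p + 1  # extend the current region by one base
        else:
            regions.append([p, p + 1])  # start a new region
    return [tuple(r) for r in regions]


def group_by_gap(values, max_gap):
    """Simplified 1-D grouping (NOT OPTICS): split where neighbours are > max_gap apart."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


# Hypothetical toy data: three reads around one site, three around a nearby site.
reads = [(100, 130), (102, 132), (104, 134), (125, 155), (127, 157), (129, 159)]

coverage = per_base_coverage(reads)
regions = covered_regions(coverage)  # the histogram view: one contiguous region

midpoints = [(start + end) / 2 for start, end in reads]
clusters = group_by_gap(midpoints, max_gap=5)  # the density view: two groups

print("covered regions:", regions)
print("midpoint clusters:", clusters)
```

In this toy case the coverage histogram yields a single contiguous region, while the read midpoints fall into two groups. The fixed `max_gap` is an assumption of the sketch; an OPTICS-based approach orders points by reachability distance instead of relying on one fixed distance threshold.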