Re: [Useq-users] balanced number of reads?

Hello Noboru,

I'm not seeing the increase in the number of regions when you subsample the
input control to match the chIP sample.

I see 283 regions with the full control and 18 with the matched control when
thresholding at a qvalue of 20 (FDR 0.01) and a log2Ratio of 1 (2x).
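
For readers unfamiliar with the units, the two thresholds above can be related to a plain FDR and fold change. This is only a small sketch: it assumes the qvalue is reported on a -10*log10(FDR) scale, which matches the "20 (0.01)" pairing above, and the function names are made up for illustration.

```python
import math


def fdr_to_scaled_qvalue(fdr: float) -> float:
    """Convert an FDR in (0, 1] to a -10*log10 scaled score (assumed scale)."""
    if not 0.0 < fdr <= 1.0:
        raise ValueError("fdr must be in (0, 1]")
    return -10.0 * math.log10(fdr)


def scaled_qvalue_to_fdr(qvalue: float) -> float:
    """Inverse of fdr_to_scaled_qvalue."""
    return 10.0 ** (-qvalue / 10.0)


def log2_ratio_to_fold(log2_ratio: float) -> float:
    """Convert a log2 ratio to a linear fold change."""
    return 2.0 ** log2_ratio


if __name__ == "__main__":
    print(round(fdr_to_scaled_qvalue(0.01), 3))  # FDR 1% on the scaled axis
    print(round(fdr_to_scaled_qvalue(0.04), 3))  # FDR 4%, as in the quoted message
    print(log2_ratio_to_fold(1.0))               # log2Ratio 1 as a fold change
```

Under that assumption, the FDR 4% cutoff mentioned in the quoted message is a noticeably looser threshold than the 1% used here, so the two region counts are not directly comparable.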

Here's what I did:

1) Run Tag2Point to convert your bed datasets to binary PointData
2) Run the PointDataManipulator to filter out duplicate reads. Both datasets
look good, with 94% unique
3) Run ScanSeqs to window scan your data
4) Run EnrichedRegionMaker to collapse overlapping windows that exceed the
above thresholds into a list of putative peaks.
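
To make step 4 concrete, here is a rough sketch of thresholding windows and collapsing the overlapping survivors into regions. It is not the EnrichedRegionMaker code: the `Window` type, the function name, and the choice of `>=` for "exceed" are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    """Hypothetical scored window; field names are illustrative only."""
    chrom: str
    start: int
    stop: int
    qvalue: float      # assumed to be on a -10*log10(FDR) scale
    log2_ratio: float


def merge_enriched(
    windows: List[Window],
    min_qvalue: float = 20.0,
    min_log2_ratio: float = 1.0,
) -> List[Tuple[str, int, int]]:
    """Keep windows meeting both thresholds, then merge overlapping ones."""
    passing = sorted(
        (w for w in windows
         if w.qvalue >= min_qvalue and w.log2_ratio >= min_log2_ratio),
        key=lambda w: (w.chrom, w.start, w.stop),
    )
    regions: List[Tuple[str, int, int]] = []
    for w in passing:
        if regions and regions[-1][0] == w.chrom and w.start <= regions[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            chrom, start, stop = regions[-1]
            regions[-1] = (chrom, start, max(stop, w.stop))
        else:
            regions.append((w.chrom, w.start, w.stop))
    return regions


if __name__ == "__main__":
    demo = [
        Window("chr1", 0, 100, 25.0, 1.5),
        Window("chr1", 50, 150, 30.0, 2.0),
        Window("chr1", 120, 200, 5.0, 2.0),   # fails the qvalue threshold
        Window("chr1", 300, 400, 22.0, 1.0),
    ]
    print(merge_enriched(demo))
```

The region count reported after this kind of merge depends on both thresholds, so runs should be compared at identical cutoffs.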

For the reduced control dataset, I used SubSamplePointData to randomly
subsample the duplicate-filtered input PointData down to 3398890, and then
ran ScanSeqs and the EnrichedRegionMaker.
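
For anyone who wants to reproduce the balancing step outside USeq, random subsampling without replacement can be sketched as below. This is an illustrative stand-in, not the SubSamplePointData implementation; the function name and the `seed` parameter are assumptions.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def subsample(items: Sequence[T], target: int, seed: Optional[int] = None) -> List[T]:
    """Randomly keep `target` items without replacement, preserving input order.

    If `target` is at least the input size, everything is returned unchanged.
    """
    if target < 0:
        raise ValueError("target must be non-negative")
    if target >= len(items):
        return list(items)
    rng = random.Random(seed)
    # Draw distinct indices, then sort them so the output keeps the input order.
    keep = sorted(rng.sample(range(len(items)), target))
    return [items[i] for i in keep]


if __name__ == "__main__":
    control_positions = list(range(1000))   # stand-in for control read positions
    reduced = subsample(control_positions, 100, seed=1)
    print(len(reduced))
```

Fixing the seed makes a subsampled run repeatable, which helps when comparing region counts between the full and reduced controls.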

I wonder where the discrepancy arose. I've attached the two spreadsheet
results from the EnrichedRegionMaker.

-cheers, D

-- 
David Austin Nix, PhD
Bioinformatics Shared Resource
Huntsman Cancer Institute, Room 3165
2000 Circle of Hope, Salt Lake City, UT 84112
(801) 587-4611
dav...@hc...
http://bioserver.hci.utah.edu

------ Forwarded Message
From: Noboru Jo Sakabe <ns...@uc...>
Date: Thu, 06 Jan 2011 11:02:53 -0600
To: David Nix <dav...@gm...>
Subject: genome build

    Hi David, I forgot to mention it's mm9.

David Nix wrote: 
>  
> Hmm, that's a bit worrying.  There's no need to balance reads with USeq.
> This is internally controlled. More data should increase the number of
> regions returned at a given FDR, not decrease it. Your result is rather odd.
> Would you mind posting the data to a web-accessible directory somewhere?
> Label it chIP and Input and let me know what genome build it is; I'd like to
> run some tests.
> 
> -cheers, D
> 
> 
> On 1/5/11 3:41 PM, "Noboru Jo Sakabe" <ns...@uc...> wrote:
> 
>   
>  
>>  
>>     Hi David, I ran Useq on a sample that has a lot fewer reads than input.
>>     I got very few peaks.
>>     Then I balanced treatment and input, randomly selecting reads from input.
>>     Then I got ~14k peaks at FDR 4%. QuEST had also found a similar number
>>     of peaks.
>>     I know that balancing reads is an issue in MACS, I would like to know
>>     if this is also true for Useq. I believe it is, given my results, but
>>     could you comment on this?
>>     Thank you!
>> 
>> noboru
>> 

------ End of Forwarded Message