geneBody_coverage graphs misleading
More and more researchers now include HLA sequences in the genome reference, which means the sequence names contain * and : (see http://hla.alleles.org/nomenclature/naming.html). With HLA in the reference, the sequence names (chrom) in the BAM file contain :, which makes sam.py fail at this line: (chrom, i_st, i_end) = i.split(":"). Would it be possible to update the code to use a different delimiter instead of :?
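One possible workaround, sketched here rather than taken from the RSeQC source: if the key really is three colon-separated fields with the coordinates last, splitting from the right keeps any colons inside the sequence name intact. The function name parse_region is hypothetical, and converting the coordinates to int is an assumption about how they are used downstream.

```python
def parse_region(region):
    """Split a 'chrom:start:end' key into its three parts.

    Hypothetical helper, not part of RSeQC. Splitting from the right
    with a limit of 2 means colons inside the chromosome name (as in
    HLA allele names) stay part of the name.
    """
    chrom, start, end = region.rsplit(":", 2)
    # Assumption: start and end are plain integer coordinates.
    return chrom, int(start), int(end)


if __name__ == "__main__":
    print(parse_region("chr1:100:200"))
    print(parse_region("HLA-A*01:01:01:01:0:3503"))
```

This only helps when the coordinates never contain a colon themselves; switching to a delimiter that cannot appear in sequence names would be the more robust change.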
Not sure which GTF2BED you were using. You can either download the BED file directly from the UCSC Table Browser or convert the GTF to BED using https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/gtf2bed.html
Hi Rimpi, I'm experiencing the same issue that you're having as well - were you ever able to find a fix or does anyone else have some recommendations?
Hi Maria, I was trying to use read_distribution.py and got an error. I downloaded the GTF files from UCSC and converted them to BED using gtf2bed. Can you please help me? processing /projects/ccs/schurerlab/Rimpi/Genomes/UsedForSTAR/hg38.ensGene.bed ... Traceback (most recent call last): File "/usr/local/bin/read_distribution.py", line 302, in <module> main() File "/usr/local/bin/read_distribution.py", line 191, in main intergenic_down1kb_base,intergenic_down5kb_base,intergenic_down10kb_base) = process_gene_model(options.ref_gene_model)...
Hi, We are trying to use RSeQC with our RNA-Seq samples. I have tried several BED files (the hg19 one provided, plus the RefSeq and UCSC BED files downloaded from the folder on SourceForge), but none of them seem to work. When I use the 'junction_annotation.py' module, my output shows that 100% of the transcripts assembled are novel, which is not true. I am not sure what is going on or why this is the case. Any help/thoughts on this would be highly appreciated. Thank you. Aditi Kulkarni
The BED file must have 12 columns (standard BED format, https://genome.ucsc.edu/FAQ/FAQformat.html#format1). Otherwise, all non-standard BED lines will be skipped.
I have the same problem using a UCSC-formatted BED file: it contains 8 columns. All the lines are skipped. Any ideas?
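A quick way to see whether a BED file will be skipped is to tally how many columns each line has. The sketch below is only an illustration and not part of RSeQC: the function name bed_column_counts is hypothetical, and skipping lines that start with "#", "track" or "browser" is an assumption about which header lines may be present.

```python
from collections import Counter


def bed_column_counts(lines):
    """Tally the number of whitespace-separated fields per BED line.

    Hypothetical helper. Blank lines and header lines are ignored
    (assumption: headers start with '#', 'track' or 'browser').
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[len(line.split())] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py annotation.bed
    with open(sys.argv[1]) as handle:
        for n_cols, n_lines in sorted(bed_column_counts(handle).items()):
            print(f"{n_cols} columns: {n_lines} lines")
```

If the tally shows anything other than 12 columns, the file is not in the 12-column layout described above and needs to be re-exported or converted.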
Thanks for the response. It is true that I don't know which strand the assembled transcripts belong to. However, if the library prep was strand-specific, then for each assembled transcript all F reads that map to it should map in the same direction, right? If it was unstranded, then the F reads would map equally in both directions. Or am I missing something?
RSeQC compares the "strand of reads" (after alignment) to the "strand of gene" (from your BED file) to determine whether the RNA-seq experiment is strand-specific. Without a reference genome and a reference gene model, RSeQC can't tell if the library was prepared as strand-specific. After de novo assembly, you basically get mRNA sequences (partial or complete), but you still don't know which strand (+ or -) the mRNA is encoded on. Liguo
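To make the comparison concrete, here is a minimal sketch of the idea for single-end reads. It is not RSeQC's implementation: the function name strand_agreement and the input format (a list of read strand and gene strand pairs) are assumptions made for illustration.

```python
def strand_agreement(pairs):
    """Fraction of reads whose strand matches the overlapping gene's strand.

    Hypothetical helper. `pairs` is a list of (read_strand, gene_strand)
    tuples, each strand being '+' or '-'. Returns None for empty input.
    """
    if not pairs:
        return None
    same = sum(1 for read_strand, gene_strand in pairs if read_strand == gene_strand)
    return same / len(pairs)


if __name__ == "__main__":
    example = [("+", "+"), ("-", "-"), ("+", "-"), ("-", "+")]
    print(strand_agreement(example))
```

A fraction close to 1 or close to 0 would suggest a strand-specific library, while a value near 0.5 would suggest an unstranded one. Either way the gene strand has to come from an annotation, which is why the BED file is needed.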
BED file for infer_experiment.py
Hi Maria, The BED file was downloaded from the UCSC Table Browser, I have no idea why...
Thanks for your reply. I am sorry, I was not clear. I mean, why only rRNA genes are...
Hi Maria, I just downloaded hg19_RefSeq.bed.gz, it contains 63169 transcripts (i.e...
question about hg19_RefSeq.bed file
Great stuff, thanks for the quick response! We work with a lot of non-model organisms...
Hi Phil, broken link has been fixed. We have no plan to support GTF as input. Most...
GTF to BED conversion link broken