Name                      Modified     Size
dsimM252v1.2.gtf.gz       2023-05-05   4.4 MB
prepare_annotation.Rmd    2023-05-05   31.1 kB
dsimM252v1.2.fa.README    2023-05-05   1.2 kB
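# build v1.2 from v1.1: at sites whose REF contains N, apply homozygous non-reference calls (1/1 or 2/2) whose ALT has the same length as REF
bcftools consensus -f dsimM252v1.1.fa -i 'QUAL>0 & REF~"N" & (GT="1/1" | GT="2/2") & STRLEN(ALT)=STRLEN(REF)' Pool_25a_nlocs.vcf.gz > dsimM252v1.2.fa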

# number of Ns remaining
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' dsimM252v1.2.fa | tr '\n' '+' | sed 's/+//g' | wc -c
838158
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' dsimM252v1.1.fa | tr '\n' '+' | sed 's/+//g' | wc -c
856534
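
The counting pipeline above can be written more compactly. This is a minimal sketch, not the command that was actually run: count_n and toy.fa are hypothetical names, and the toy file only stands in for the real assemblies. Like the original pipeline, it counts uppercase N only.

```shell
# Hypothetical helper: count uppercase N characters in the sequence lines of a FASTA file.
count_n() {
    # drop header lines, delete every character except N, count what is left
    # (the final tr strips the padding some wc implementations add)
    grep -v '^>' "$1" | tr -cd 'N' | wc -c | tr -d ' '
}

# toy input (an assumption) so the example runs without the real assemblies
printf '>2L\nACGNNT\nNNA\n>4\nNACG\n' > toy.fa
count_n toy.fa   # prints 5
```

On the real files this would be called as count_n dsimM252v1.2.fa and count_n dsimM252v1.1.fa.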
# only named chromosomes incl. mtDNA
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' 1.fa | tr '\n' '+' | sed 's/+//g' | wc -c
513866
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' 2.fa | tr '\n' '+' | sed 's/+//g' | wc -c
502052
# only 2L, 2R, 3L, 3R, 4, X
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' 1.fa | tr '\n' '+' | sed 's/+//g' | wc -c
363190
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' 1.fa | tr '\n' '+' | sed 's/+//g' | less -S
[vetlinux05@pgnsrv043 M252_Nlocs]$ sed '/^>/d;s/[^N]//g;/^$/d' 2.fa | tr '\n' '+' | sed 's/+//g' | wc -c
353967
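
The commands that produced the subset files 1.fa and 2.fa are not recorded here. One possible way to restrict a FASTA file to a list of sequence names is sketched below. It assumes the record name is the first word of the header line; subset_fasta and the toy file names are hypothetical.

```shell
# Hypothetical helper: print only the FASTA records whose name (first word of the
# header, without the leading ">") appears in the space-separated list in $2.
subset_fasta() {
    awk -v names="$2" '
        BEGIN { n = split(names, a, " "); for (i = 1; i <= n; i++) keep[a[i]] = 1 }
        /^>/  { name = substr($1, 2); printing = (name in keep) }
        printing
    ' "$1"
}

# toy input (an assumption) with one unplaced scaffold that should be dropped
printf '>2L some description\nACGN\n>scaffold_7\nNNNN\n>X\nNTTN\n' > toy2.fa
subset_fasta toy2.fa "2L 2R 3L 3R 4 X" > toy2_major.fa
cat toy2_major.fa
```

The N count of the subset can then be taken with the same sed pipeline as above.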

# Summary: v1.2 fills in 18376 Ns (856534 -> 838158) without changing the sequence lengths; 9223 of them, about half, are in the major chromosomes (2L, 2R, 3L, 3R, 4, X)
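
The claim that sequence lengths are unchanged can be checked by comparing per-record lengths of the two assemblies. The sketch below is one way to do that, with a hypothetical helper seq_lengths and toy files in place of the real FASTA files. It ends with the differences implied by the counts reported above.

```shell
# Hypothetical helper: print "name<TAB>length" for every record in a FASTA file.
seq_lengths() {
    awk '
        /^>/ { if (name != "") print name "\t" len; name = substr($1, 2); len = 0; next }
             { len += length($0) }
        END  { if (name != "") print name "\t" len }
    ' "$1"
}

# toy stand-ins (assumptions) for the v1.1 and v1.2 assemblies
printf '>2L\nACGNNT\nNNA\n' > old.fa
printf '>2L\nACGTAT\nNNA\n' > new.fa
if [ "$(seq_lengths old.fa)" = "$(seq_lengths new.fa)" ]; then
    echo "lengths unchanged"
fi

# differences between the N counts reported above
echo $((856534 - 838158))   # all sequences: 18376
echo $((363190 - 353967))   # 2L, 2R, 3L, 3R, 4, X: 9223
```

On the real data the comparison would be run on dsimM252v1.1.fa and dsimM252v1.2.fa.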
Source: dsimM252v1.2.fa.README, updated 2023-05-05