RegEx coverage

Anonymous
2015-11-24
2016-07-15
  • Anonymous

    Anonymous - 2015-11-24

    Dear Marc and Metawatt users,

    I've got a FASTA file with headers that contain coverage information. The default RegEx used by MetaWatt to extract the coverage does not work properly, so I am now trying to find a new, working RegEx pattern.

    I tried several different patterns. I used online RegEx builders/testers (e.g. https://regex101.com/) to check whether each pattern actually works. Patterns validated by regex101 would for some reason not work in MetaWatt. How am I now going to find a RegEx that works?

    Ideally I would like to find working RegEx patterns for the following types of headers:

    CLC bio (example)
    >TruPE-73_S88_L001_R1_001_(paired)_trimmed_(paired)_contig_1 Average coverage: 30.02
    Example RegEx: "[\d]+.[\d]+" (works on regex101, not in MetaWatt)

    SPAdes (example)
    >NODE_1_length_450972_cov_7.14856_ID_1

    Any help would be greatly appreciated.
    Thanks!

    Jeroen

     

    Last edit: Anonymous 2015-11-24
  • Anonymous

    Anonymous - 2015-11-25

    This seems to work to grab a float out of the header...
    ([0-9]+)([.,]+[0-9]+)
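
For anyone who wants to check patterns against the two example headers outside MetaWatt, here is a minimal sketch using Python's re module. It is only a sanity check under stated assumptions: MetaWatt may use a different regex engine or apply the pattern differently (for example, use only the first match), and the helper names first_match and coverage, as well as the two "anchored" patterns, are made up for this illustration. It shows that the first match of "[\d]+.[\d]+" is not the coverage value, because the unescaped "." matches any character, while the pattern above does find the float first in both headers.

```python
import re

# Example headers copied from the original question.
CLC_HEADER = (">TruPE-73_S88_L001_R1_001_(paired)_trimmed_(paired)_contig_1 "
              "Average coverage: 30.02")
SPADES_HEADER = ">NODE_1_length_450972_cov_7.14856_ID_1"


def first_match(pattern, header):
    """Return the text of the first match of pattern in header, or None."""
    m = re.search(pattern, header)
    return m.group(0) if m else None


def coverage(pattern, header):
    """Return capture group 1 of the first match as a float, or None."""
    m = re.search(pattern, header)
    return float(m.group(1)) if m else None


# Pattern from the question: the unescaped "." matches ANY character,
# so the first match is taken from the sample name, not the coverage.
print(first_match(r"[\d]+.[\d]+", CLC_HEADER))     # 001
print(first_match(r"[\d]+.[\d]+", SPADES_HEADER))  # 450972

# Pattern from the reply: requires "." or "," between the digit runs.
print(first_match(r"([0-9]+)([.,]+[0-9]+)", CLC_HEADER))     # 30.02
print(first_match(r"([0-9]+)([.,]+[0-9]+)", SPADES_HEADER))  # 7.14856

# Hypothetical stricter patterns, anchored on the text that precedes the
# coverage value, with the number itself in capture group 1.
CLC_PATTERN = r"Average coverage: ([0-9]+(?:\.[0-9]+)?)"
SPADES_PATTERN = r"_cov_([0-9]+(?:\.[0-9]+)?)"

print(coverage(CLC_PATTERN, CLC_HEADER))        # 30.02
print(coverage(SPADES_PATTERN, SPADES_HEADER))  # 7.14856
```

Anchoring on "Average coverage: " or "_cov_" makes the pattern less likely to pick up a number from the sample name or contig length, at the cost of needing one pattern per assembler.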

     
  • Arthi Ramachandran

    Hi!
    I was wondering whether this issue has been solved? I am having the same issues with regex.
    Thanks!
    Arthi

     
