grep OR in parentheses, e.g., (THIS\|THAT)

Help
Tharg
2014-04-24
2014-04-24
  • Tharg

    Tharg - 2014-04-24

    Hi -

    Doing this, for example :

    grep "(014826\|007381)\t[^\t]*$" aaaa.txt

    will never match 014826 or 007381. But this, for example, will :

    grep "(\n\n\|014826\|007381\|\n\n)\t[^\t]*$" aaaa.txt

    It seems to me that the first and last values in the parentheses are not matched. Although if I try :

    grep "(\|014826\|007381\|)\t[^\t]*$" aaaa.txt

    it matches all lines, as expected, so the first and last are being used in some way.

    What am I doing wrong?

    Many thanks.

     
  • Greg

    Greg - 2014-04-24

    I'm not sure what version of grep you're using or what OS, but I've always used grep -E when using EREs in a grep command.

    It would be helpful to see your raw data (or an example thereof) to help troubleshoot your command.

     
  • Tharg

    Tharg - 2014-04-24

    Thank you.

    My OS is win7

    Grep says it's

    grep (GNU grep) 2.5.1

    using a text file, "TEST.txt" consisting of only the following two lines

    ,,,,AA,
    ,,,,BB,

    and running

    grep "(AA\|BB),$" TEST.txt

    gets me no results.

    however running

    grep "(\n\n\|AA\|BB\|\n\n),$" TEST.txt

    returns both lines, as expected.

    I've tried messing with -e, -E and -P but I've just been trying stuff fairly randomly with these options and have had no real luck.

    Edit: it seems that the trailing comma and end-of-line anchor in the regex aren't necessary to show the same (unexpected, by me) behaviour.

     
    Last edit: Tharg 2014-04-24
  • Greg

    Greg - 2014-04-24

    Using AIX 6.1.6.0 grep:

    grep -E "(AA|BB)" TEST.txt returns your text above
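
    A likely explanation, sketched here assuming GNU grep (where \| in a basic regular expression is a GNU extension): without -E, grep uses basic regular expressions, in which unescaped ( and ) are literal characters. The pattern "(AA\|BB),$" therefore asks for the literal text "(AA" or the literal text "BB)," at end of line. The temporary file and the variable names below are only for illustration.

```shell
# Recreate the two-line TEST.txt from the post above in a temporary file.
test_file=$(mktemp)
printf ',,,,AA,\n,,,,BB,\n' > "$test_file"

# BRE (default): ( and ) are literal, so this looks for "(AA" or "BB),".
# grep exits non-zero when nothing matches, hence the "|| true".
bre_literal=$(grep -c '(AA\|BB),$' "$test_file" || true)

# BRE with escaped parentheses: \( \) group and \| alternates.
bre_grouped=$(grep -c '\(AA\|BB\),$' "$test_file" || true)

# ERE (-E): unescaped ( ) and | are the grouping and alternation operators.
ere_grouped=$(grep -c -E '(AA|BB),$' "$test_file" || true)

# Number of matching lines for each variant.
echo "$bre_literal $bre_grouped $ere_grouped"
rm -f "$test_file"
```

    The same reading accounts for the padded pattern earlier in the thread: with dummy first and last alternatives, the literal parentheses attach to the dummies, leaving AA and BB as clean alternatives in the middle.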

     
  • Tharg

    Tharg - 2014-04-24

    Thanks, you are right, this does work.

    However, for some reason I now have another issue. If I match (\t[^\t]*){7}$ at the end of the regex (so that it looks at the 8th-from-last field in a tab-delimited file), grep dies: after some successful searching it outputs some gibberish characters, then "grep.exe has stopped working".

    It works fine if I use ^([^\t]*\t){12} at the beginning of the regex, though (to look at the 13th field from the start).

    But that's another issue which I suspect isn't grep's fault, I'm just rambling now.

    Thanks again for the help.

     
  • Greg

    Greg - 2014-04-24

    Provide a sample of your text, and what you want to pull out using grep please.
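
    For matching a value in a particular tab-separated field, one alternative to counting fields inside the regex is to let awk split the line. This swaps in a different tool rather than fixing the grep crash, and it assumes awk is available, which the thread does not mention. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample: three tab-separated fields per line.
sample=$(mktemp)
printf 'a\t014826\tx\nb\t999999\ty\nc\t007381\tz\n' > "$sample"

# Keep lines whose 2nd field (counted from the start) is exactly one of the values.
from_start=$(awk -F'\t' '$2 ~ /^(014826|007381)$/' "$sample")

# Count from the end instead: NF is the last field, NF-1 the one before it.
from_end=$(awk -F'\t' '$(NF-1) ~ /^(014826|007381)$/' "$sample")

printf '%s\n' "$from_start"
rm -f "$sample"
```

    For the 8th-from-last field in a real file, the field reference would be $(NF-7); for the 13th from the start, $13.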

     
