#386 Support conventional regular expression

closed
rules (229)
5
2014-08-26
2005-12-07
No

The current XPath search does not support
conventional regular expressions.

For example, if the user would like to search for hard-coded
static IP addresses, a regular expression would
look like
\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

This kind of search on string literals would give
users great power to extend XPath rules.
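
As a rough illustration of what the request is after, the sketch below applies the pattern to a string literal using the JDK's `java.util.regex` package. The class and method names are made up for this example and are not part of PMD; the dots are escaped so that they match only a literal `.`.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of PMD: looks for an IPv4-style substring.
class IpLiteralSketch {
    // One octet in the range 0-255, written the same way as in the request.
    private static final String OCTET = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    private static final Pattern IPV4 = Pattern.compile(
        "\\b" + OCTET + "\\." + OCTET + "\\." + OCTET + "\\." + OCTET + "\\b");

    // True if the text contains something shaped like a dotted-quad address.
    static boolean containsIpv4(String text) {
        return IPV4.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsIpv4("\"10.16.20.32\""));  // true
        System.out.println(containsIpv4("\"hello world\""));  // false
    }
}
```

With unescaped dots the same pattern would also accept input such as `1a2b3c4`, since `.` matches any character.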

Discussion

  • Tom Copeland

    Tom Copeland - 2005-12-07

    Logged In: YES
    user_id=5159

    Hi Siva -

    Thanks for the idea! It looks like we could do this via a
    Jaxen extension function:

    http://jaxen.org/extensions.html

    Unless there's an easier way to do it... I'm not sure. I'll
    ask about it on the Jaxen user's list...

    Yours,

    tom

     
  • Tom Copeland

    Tom Copeland - 2005-12-08

    Logged In: YES
    user_id=5159

    Hi Siva -

    OK, looks like an extension function is the way to go:

    http://archive.jaxen.codehaus.org/user/msg00883.html

    Let's see - do we want to use the JDK 1.4 regex package? Or
    some third-party package - Jakarta ORO or some such?

    Yours,

    tom
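
For readers unfamiliar with XPath extension functions, here is a minimal sketch of the idea. It uses the JDK's own `javax.xml.xpath` API rather than Jaxen, so it is not the code discussed in this ticket. The namespace URI, the `ext` prefix, the function name `regexp`, and the small XML document standing in for an AST are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.regex.Pattern;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only: JDK XPath API, not Jaxen or PMD.
class RegexFunctionSketch {
    // Hypothetical namespace and prefix for the custom function.
    static final String NS = "urn:example:regex";
    static final String PREFIX = "ext";

    // Node-set arguments arrive as a NodeList; anything else is stringified.
    static String asString(Object arg) {
        if (arg instanceof NodeList) {
            NodeList nodes = (NodeList) arg;
            return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
        }
        return String.valueOf(arg);
    }

    // ext:regexp(value, pattern): true if the pattern is found anywhere in value.
    static final XPathFunction REGEXP = args -> Boolean.valueOf(
        Pattern.compile(asString(args.get(1))).matcher(asString(args.get(0))).find());

    // Evaluates the expression against the XML and returns the number of nodes selected.
    static int countMatches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return PREFIX.equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return NS.equals(uri) ? PREFIX : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                    ? Collections.singletonList(PREFIX).iterator()
                    : Collections.<String>emptyList().iterator();
            }
        });
        // Hand out the custom function only for ext:regexp with two arguments.
        xpath.setXPathFunctionResolver((name, arity) ->
            NS.equals(name.getNamespaceURI()) && "regexp".equals(name.getLocalPart()) && arity == 2
                ? REGEXP : null);
        try {
            NodeList hits = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Unit><Literal Image=\"10.16.20.32\"/><Literal Image=\"hello\"/></Unit>";
        String expr = "//Literal[ext:regexp(@Image, '^[0-9]+(\\.[0-9]+){3}$')]";
        System.out.println(countMatches(xml, expr));
    }
}
```

The JDK API requires custom functions to live in a namespace, which is why the sketch needs a prefix; the regex engine behind the function can be swapped without touching the XPath expression.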

     
  • Elliotte Rusty Harold

    Logged In: YES
    user_id=226817

    A couple of things to watch out for:

    1. 127.0.0.1 (loopback address) is probably OK to see in
      source code

    2. Subnet masks might look like IP addresses. I'm not sure
      which ones are common. Maybe 255.255.255.0

    3. Don't forget IPv6 addresses (though you might not want to
      grab those in your first pass at the problem)
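
One way to act on the first two points is to check each match against a short allow-list before reporting it. The sketch below is only an illustration: the class name and the contents of the allow-list (the loopback address and the `255.255.255.0` mask mentioned above) are assumptions, and IPv6 is deliberately left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter, not part of PMD.
class IpAllowListSketch {
    private static final String OCTET = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
    private static final Pattern IPV4 =
        Pattern.compile("\\b" + OCTET + "(?:\\." + OCTET + "){3}\\b");

    // Assumed allow-list: addresses that are probably fine to see in source code.
    private static final Set<String> ALLOWED =
        new HashSet<>(Arrays.asList("127.0.0.1", "255.255.255.0"));

    // True if the text contains at least one IPv4-style address that is not allowed.
    static boolean hasReportableIp(String text) {
        Matcher m = IPV4.matcher(text);
        while (m.find()) {
            if (!ALLOWED.contains(m.group())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasReportableIp("10.16.20.32"));    // true
        System.out.println(hasReportableIp("127.0.0.1"));      // false
        System.out.println(hasReportableIp("255.255.255.0"));  // false
    }
}
```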

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    Here was my attempt to do some work on it...
    The Source: =========================
    import java.util.regex.Pattern;

    // Placeholder pattern that accepts any line; the real rule would use the IP address regex.
    String pattern = ".*";
    String line = "xxx.xxx.xxx.xxx";
    if (Pattern.matches(pattern, line)) {
        System.out.println(line + " matches \"" + pattern + "\"");
    } else {
        System.out.println("NO MATCH");
    }
    =====================================
    To Trigger this logic: ===========================
    Have a new Rule Class called xpath-regex which will work
    based on the following NODE structure of properties:

    <properties>
    <property name="xpath-regex">
    <value>
    <![CDATA[
    //LocalVariableDeclaration/VariableDeclarator/VariableInitializer/Expression/PrimaryExpression/PrimaryPrefix/Literal
    ]]>
    </value>
    <regex>
    <![CDATA[
    \b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
    ]]>
    </regex>
    </property>
    </properties>
    ==================================================

    This is another package I thought we could look at
    http://jregex.sourceforge.net/

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    Right now the target I have in mind is only to provide for a
    regular expression search on a "STRING LITERAL". I am trying
    to use the IP Address concept to show how this could be used.

    This way we can let the user worry about what should or
    should not be hard-coded in their source code :-)

    How many times have we come across these hard-coded values,
    making the code un-portable and ugly :-)

    In general I agree with Elliotte's comment about which IP
    addresses would be allowed. We could include this information
    as part of the Description and/or Sample code.

     
  • Elliotte Rusty Harold

    Logged In: YES
    user_id=226817

    YAGNI. If you have a use case or cases that really require
    regular expressions, then let's consider it, but let's not
    try to solve all problems in advance of actual user
    requirements. This particular use case I think would be more
    appropriately addressed without regular expressions.

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    I understand... But the request for this feature is not
    aimed at solving one particular case.

    To provide a limited background:

    We package PMD as part of our product Optimal Advisor
    [http://www.compuware.com/products/optimalj/2911_ENG_HTML.htm]

    In our current release we provide a UI wrapper that lets the
    user build custom coding rules so they can make productive
    use of the tool. The UI is a simple interface where the
    end-user can fill out a form and have the rule added to an
    existing ruleset.

    After the current release of our product bundled with PMD,
    we have had a few requests from our Engineers in the field
    for some extended rules.

    One such request was the ability to pass a regular
    expression as a parameter in a string search.
    ================

    The above is the reason I felt proper regular expression
    support would give the customer the ability not only to
    build Custom Rules using XPath, but also to extend the XPath
    queries with regular expressions.

     
  • Tom Copeland

    Tom Copeland - 2005-12-15

    Logged In: YES
    user_id=5159

    OK, I've gotten Jakarta-ORO 2.0.8 and have started to fiddle
    with it... seems doable...

    Yours,

    Tom

     
  • Tom Copeland

    Tom Copeland - 2005-12-16

    Logged In: YES
    user_id=5159

    It works! You can actually write XPath like:

    //ClassOrInterfaceDeclaration[regexp( @Image, '/Foo/' )]

    and it'll return the proper nodes. Good times. To use it,
    you'll need to download some new jar files from here:

    http://infoether.com/~tom/siva/pmd-3.4.jar
    http://infoether.com/~tom/siva/jakarta-oro-2.0.8.jar

    and then you can use PMD as usual; just make sure that oro
    gets in the CLASSPATH.

    I'll polish this up a bit, write some unit tests, and then
    check it in... fun stuff!

    Yours,

    Tom

     
  • Tom Copeland

    Tom Copeland - 2005-12-16

    Logged In: YES
    user_id=5159

    Oops, there were some problems... I've uploaded a new
    version, that should work fine. Here's a demo:

    ============================
    $ cat Foo.java
    public class Foo {
    public class Fbb {}
    public class F1o {}
    }
    $ ./pmd.sh Foo.java text scratchpad -debug
    In JDK 1.4 mode
    Loaded rule RegexTest
    Processing /home/tom/pmd/pmd/bin/Foo.java
    /home/tom/pmd/pmd/bin/Foo.java:1 regex test
    /home/tom/pmd/pmd/bin/Foo.java:3 regex test
    ============================

    and here's the test rule:

    ============================
    <rule name="RegexTest" message="regex test" class="">
    <description>
    test
    </description>
    <properties>
    <property name="xpath">
    <value>
    <![CDATA[
    //ClassOrInterfaceDeclaration[regexp(@Image, '/F?o/')]
    ]]>
    </value>
    </property>
    </properties>
    <priority>3</priority>
    <example>
    </example>
    </rule>
    ============================

    Fun stuff!

    Tom
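
To see why only lines 1 and 3 are reported in the demo above, it may help to try the same pattern outside PMD. The sketch below uses `java.util.regex` rather than Jakarta ORO, so it is an approximation, and it assumes the function succeeds when the pattern is found anywhere in the class name, which is what the demo output suggests.

```java
import java.util.regex.Pattern;

// Illustration only: approximates the demo with the JDK regex package.
class RegexDemoSketch {
    // Assumed semantics: the pattern may match anywhere in the name.
    static boolean found(String pattern, String image) {
        return Pattern.compile(pattern).matcher(image).find();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Foo", "Fbb", "F1o"}) {
            System.out.println(name + " -> " + found("F?o", name));
        }
        // Foo -> true
        // Fbb -> false
        // F1o -> true
    }
}
```

`F?o` reads as "an optional F followed by an o", so any name containing an `o` qualifies; `Fbb` is the only one that does not.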

     
  • Tom Copeland

    Tom Copeland - 2005-12-19

    Logged In: YES
    user_id=5159

    Hi Siva -

    OK, this is checked in to CVS. In a nutshell, here's how to
    use it:

    //ClassOrInterfaceDeclaration[regexp(@Image, '/F?o/')]

    And there's a dependency on jakarta-oro, so you'll need to
    put that in your CLASSPATH to use this. You can get an
    updated pmd-3.4.jar file here:

    http://infoether.com/~tom/pmd-3.4.jar

    that contains this new feature.

    Thanks for the suggestion!

    Yours,

    Tom

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    Tom:
    Thanx a lot... That was a nice piece of work and a quick
    turnaround...

    I just tested it out with PMD/Designer and Viewer....

    PMD and Designer are working fine, looks like the Viewer is
    still missing some code...

    Anyway... here is my rule and test case. Think anyone
    would find this useful?

    <rule name="IPHardCodeCheck"
          message="Check for hard coded IP Addresses"
          class="net.sourceforge.pmd.rules.XPathRule">
        <description>
            Check for hard coded IP Address usage. Not a
            nice thing to do!?!?
        </description>
        <properties>
            <property name="xpath">
                <value>
    <![CDATA[
    //PrimaryExpression/PrimaryPrefix/Literal[regexp(@Image,'/(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/')]
    ]]>
                </value>
            </property>
        </properties>
        <priority>3</priority>
        <example>
    <![CDATA[
    public class ConnectTo {
        public void testMethod(Object obj) {
            //..........do lots of stuff
            String sIpAddress = "10.16.20.32";
            // do the work
            //..........do loads more stuff
        }
    }
    ]]>
        </example>
    </rule>

    Thanx

    Siva

     
  • Tom Copeland

    Tom Copeland - 2005-12-22

    Logged In: YES
    user_id=5159

    Hi Siva -

    > Thanx a lot.

    No problemo! By the way, I've modified it slightly so that
    you don't have to wrap the regular expression in //, so now
    you can just do this:

    //ClassOrInterfaceDeclaration[regexp(@Image, 'F?o')]

    vs. this:

    //ClassOrInterfaceDeclaration[regexp(@Image, '/F?o/')]

    It'll save a few characters here and there, good times.

    > looks like the Viewer is
    > still missing some code...

    Oops, yup, haven't updated that, will do.

    > my rule and test case

    Hm, you know, I'm not sure... maybe post it to the forums
    and see if anyone has thoughts on it?

    I'll almost certainly use it as an example in a blog entry :-)

    Yours,

    Tom
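
The thread does not show how the delimiters were made optional. One possible way to accept both forms is to normalize the argument before compiling it, as in the sketch below; the class and method names are hypothetical, and this is not taken from the PMD source.

```java
// Hypothetical helper, not from the PMD source.
class RegexDelimiterSketch {
    // Removes one pair of enclosing slashes if both are present; otherwise
    // returns the argument unchanged.
    static String stripDelimiters(String regex) {
        if (regex.length() >= 2 && regex.startsWith("/") && regex.endsWith("/")) {
            return regex.substring(1, regex.length() - 1);
        }
        return regex;
    }

    public static void main(String[] args) {
        System.out.println(stripDelimiters("/F?o/"));  // F?o
        System.out.println(stripDelimiters("F?o"));    // F?o
    }
}
```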

     
  • Tom Copeland

    Tom Copeland - 2005-12-22

    Logged In: YES
    user_id=5159

    Viewer is fixed now, new pmd-3.4 uploaded:

    http://infoether.com/~tom/pmd-3.4.jar

    Yours,

    Tom

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    Actually my sample comes from the site
    http://www.regular-expressions.info/

     
  • Tom Copeland

    Tom Copeland - 2005-12-29

    Logged In: YES
    user_id=5159

    Ah, OK, I'll tweak that blog entry then, thanks!

    Tom

     
  • Tom Copeland

    Tom Copeland - 2005-12-29

    Logged In: YES
    user_id=5159

    Hi Siva -

    One more change - Daniel Sheppard noted that this function
    exists in XPath 2.0 and is called "matches" in the spec:

    http://tomcopeland.blogs.com/juniordeveloper/2005/12/using_regular_e.html

    I think we should rename it from "regexp" to "matches" so as
    to align with the spec... sound OK to you?

    Yours,

    Tom

     
  • Tom Copeland

    Tom Copeland - 2006-01-04

    Logged In: YES
    user_id=5159

    Hi Siva -

    I went ahead and made this change and uploaded a new jar
    file... so now this should work:

    //ClassOrInterfaceDeclaration[matches(@Image, 'F?o')]

    I'll go ahead and mark this 'pending'.

    Yours,

    Tom
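
A note for Java users on the name: in XPath 2.0, `matches` succeeds if the pattern matches any substring of the input unless the pattern is anchored, whereas Java's `String.matches` requires the whole string to match. The sketch below contrasts the two using `java.util.regex`; the helper name is made up, and XPath's regex dialect is not identical to Java's, so treat it as an approximation.

```java
import java.util.regex.Pattern;

// Illustration only: contrasts substring matching with whole-string matching.
class MatchesSemanticsSketch {
    // Substring search, approximating an unanchored XPath 2.0 matches() call.
    static boolean xpathStyleMatches(String input, String pattern) {
        return Pattern.compile(pattern).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(xpathStyleMatches("Foo", "F?o"));    // true
        System.out.println("Foo".matches("F?o"));               // false
        System.out.println(xpathStyleMatches("Foo", "^F?o$"));  // false
    }
}
```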

     
  • Sivakumar Mambakkam

    Logged In: YES
    user_id=154590

    Hi Tom-

    Works fine and I have tested it with a couple of test cases..

    regards

    Siva

     
  • Tom Copeland

    Tom Copeland - 2006-01-05

    Logged In: YES
    user_id=5159

    Hi Siva -

    Cool, thanks for the confirmation!

    Yours,

    Tom

     
