IP Addresses as Keywords

John Black
2014-06-24
2014-07-10
  • John Black

    John Black - 2014-06-24

    Using v6.6.6 (Friday the 13th Edition). How can I get IP addresses highlighted using a "User Defined Language" (UDL)?

     
  • Loreia2

    Loreia2 - 2014-06-24

    Hi John,

    UDL does not support something this complex. Sorry.
    This will only be possible once regex support is added to UDL.

    Regards,
    Loreia

     
    • John Black

      John Black - 2014-06-24

      thanks

       
  • Thomas

    Thomas - 2014-06-25

    Moin John,

    As Loreia mentioned, this is not possible with UDL.

    As a workaround, you can try the "Mark" function of the Find dialog. As you can see in the attached screenshot, you must enter a regular expression for the IP address (some other regular expressions for IP addresses can be found at http://www.regular-expressions.info/examples.html). If you check the "Bookmark line" option, all lines containing an IP address are bookmarked and you can quickly navigate between these lines (using the F2 key).

    Regards,
    Thomas

     
  • THEVENOT Guy

    THEVENOT Guy - 2014-06-27

    Hi, John Black, and All

    The regex \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b and the method described by Thomas do match every valid IPv4 address.

    However, this regex also matches invalid IP addresses, for example 999.999.999.999 or 100.200.020.050 ( instead of the correct syntax 100.200.20.50 ).
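
    To make the over-matching concrete, here is a small sketch outside Notepad++. It assumes Python's built-in re module, which accepts this simple pattern unchanged; the helper name naive_matches and the sample text are only illustrative.

```python
import re

# Naive pattern from the post: four dot-separated runs of 1 to 3 digits.
NAIVE_IPV4 = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")


def naive_matches(text):
    """Return every substring that the naive pattern accepts."""
    return NAIVE_IPV4.findall(text)


sample = "ok 192.168.0.1, too big 999.999.999.999, padded 100.200.020.050"
for hit in naive_matches(sample):
    print(hit)
```

    All three addresses are reported, including the two invalid ones, because \d{1,3} only counts digits and never checks the value.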

    So, to highlight VALID IPv4 addresses ONLY in the current text, follow the method below :

    • Open the Search dialog ( CTRL + F )

    • Click on the fourth tab, named Mark

    • Fill in the regex, below, in the Find what field :

    \b(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(?1)){3}\b

    • Check the Bookmark line option

    • Check the Purge for each search option

    • Check the Wrap around option

    • Select the Regular expression search mode

    • Click on the Mark All button

    Et voilà !


    NOTES :

    The FIRST part (25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d) matches ONE VALID INDIVIDUAL number ( from 0 to 255 ) of an IPv4 address :

    • 25[0-5]   represents any number between 250 and 255
    • 2[0-4]\d represents any number between 200 and 249
    • 1\d\d     represents any number between 100 and 199
    • [1-9]?\d represents any number between 10 and 99, or any number between 0 and 9, WITHOUT any leading zero
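
    The four alternatives can be checked exhaustively with a short sketch. It assumes Python's built-in re module rather than Notepad++'s Boost engine (this subpattern uses only syntax both share); the name is_valid_octet is made up for the example.

```python
import re

# The group 1 subpattern from the post; re.ASCII restricts \d to 0-9.
OCTET = re.compile(r"25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d", re.ASCII)


def is_valid_octet(text):
    """True when the whole string is a number 0..255 without a leading zero."""
    return OCTET.fullmatch(text) is not None


# Every integer from 0 to 999, written without padding.
accepted = [n for n in range(1000) if is_valid_octet(str(n))]
print(accepted == list(range(256)))

# Zero-padded forms are rejected.
print(is_valid_octet("020"), is_valid_octet("00"), is_valid_octet("099"))
```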

    This regex, for a VALID INDIVIDUAL number of an IPv4 address, is stored as group 1.

    Then, a COMPLETE IPv4 address can be seen as a WORD formed of FOUR valid numbers, as described above, separated by DECIMAL points. So, the final regex is \b(.....)(\.(?1)){3}\b, where the (.....) form stands for the regex explained above.

    IMPORTANT :

    Note that the reference (?1) refers to the EXACT regex of group 1 :
    ( 25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d ) and NOT to the CURRENT value of group 1.

    Still don't see the difference between the two forms \1 and (?1) ?

    Another simple example, to make it clear :

    Consider the simple regex (\d+)_\1. It matches, for example, the two strings 123_123 and 45678_45678, but NOT the strings 123_45678 or 45678_123, because the \1 form is a back reference to the current contents of group 1 !

    On the contrary, the regex (\d+)_(?1) will match all four strings, because the
    (?1) form is a recursive call to the definition of group 1 ( \d+ ), which is reused outside group 1 itself !
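
    The same contrast can be sketched outside Notepad++. One caveat: Python's built-in re module supports the back reference \1 but NOT the (?1) form, so in this sketch the second pattern simply repeats the definition \d+ by hand, which is what (?1) stands for here. The pattern names are illustrative.

```python
import re

# Back reference: the second number must equal the text captured by group 1.
BACKREF = re.compile(r"(\d+)_\1")

# Stand-in for (\d+)_(?1): Python's re has no (?1), so the definition of
# group 1 is written out a second time instead of being called.
SAME_DEFINITION = re.compile(r"(\d+)_(?:\d+)")

samples = ["123_123", "45678_45678", "123_45678", "45678_123"]
results = {
    s: (BACKREF.fullmatch(s) is not None, SAME_DEFINITION.fullmatch(s) is not None)
    for s in samples
}
for text, (by_backref, by_definition) in results.items():
    print(text, by_backref, by_definition)
```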


    Let's go back to our regex ! It rejects ALL the INVALID IPv4 addresses below :

    • 201.45.257.300  ( Number GREATER than 255 )
    • 123.099.04.200  ( LEADING ZERO at the start of a number )
    • 100.200.3        ( FEWER than FOUR blocks of numbers )
    • bar1.1.1.1foo     ( IPv4 address GLUED to surrounding text )
    • 12.34. 56.78     ( Character that is NEITHER a DIGIT NOR a DECIMAL POINT inside the address )

    But, ALL the VALID addresses, below, are detected :

    • 201.45.255.255

    • 123.99.4.200

    • 100.200.3.0

    • bar 127.0.0.1 foo

    • 12.34.56.78

    and also :

    • 0.0.0.0

    • 1.1.1.1

    • 1.10.100.200

    • 127.0.0.1

    • 255.255.255.255

    • 200.100.10.0
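
    Both lists above can be replayed with a short script. This is only a sketch under one assumption: Python's built-in re module has no (?1) call, so the octet subpattern is spelled out again through string concatenation, which should describe the same set of matches as the Notepad++ regex. The helper name find_ipv4 is made up for the example.

```python
import re

# One valid number (0..255, no leading zero), as explained in the notes.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four octets separated by dots, delimited by word boundaries.
IPV4 = re.compile(r"\b" + OCTET + r"(?:\." + OCTET + r"){3}\b", re.ASCII)


def find_ipv4(text):
    """Return every valid IPv4 address found in the text."""
    return [m.group(0) for m in IPV4.finditer(text)]


invalid = ["201.45.257.300", "123.099.04.200", "100.200.3",
           "bar1.1.1.1foo", "12.34. 56.78"]
valid = ["201.45.255.255", "123.99.4.200", "100.200.3.0", "12.34.56.78",
         "0.0.0.0", "1.1.1.1", "1.10.100.200", "127.0.0.1",
         "255.255.255.255", "200.100.10.0"]

for text in invalid:
    print(text, "->", find_ipv4(text))
for text in valid + ["bar 127.0.0.1 foo"]:
    print(text, "->", find_ipv4(text))
```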


    REMARK :

    We could also use a NAMED group, for example 'Byte', instead of the numbered group 1. Then, the SEARCH regex would become :

    \b(?<Byte>25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(?&Byte)){3}\b

    Hope that this additional explanation will be useful to you !

    Cheers,

    guy038

    You'll find good documentation about the Boost C++ Regex library ( similar to PCRE, the Perl Compatible Regular Expressions ), used by Notepad++ since version 6.0, at the TWO addresses below :

    http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

    http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html

    • The FIRST link explains the regular expression syntax used in the SEARCH part

    • The SECOND link explains the format string syntax used in the REPLACEMENT part

     
    Last edit: THEVENOT Guy 2014-06-27