I have a (half-baked) workaround for the https:// issue. The solution is half-baked because the regex engine built into Scintilla (the editing component behind SciTE) does not support POSIX ERE (with alternation) but only POSIX BRE (without alternation). Unfortunately, Notepad++ uses that same regex parser (https://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Unsupported_Regex_Operators), so I could not use alternation to distinguish the different URI schemes. Here is a description of what I have done:
< const char *urlHttpRegExpr = "http://[a-z0-9_\\-\\+.:?&@=/%#]*";
> const char *urlHttpRegExpr = "[A-Za-z]+://[A-Za-z0-9_\\-\\+~.:?&@=/%#]+";
For the URI scheme, the expressions (\w+) and (\D+) are not usable because "http1://" or "htt p://" would be matched. So I used a character set instead: "[A-Za-z]+". I do not know whether there are URI schemes (http://tools.ietf.org/html/rfc3986) with capital letters. If they do not exist, remove them from the set! If the length of a URI scheme is limited in some way or has a minimum size, we can add these limits to the regex string.
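To make the difference between the three scheme expressions concrete, here is a minimal sketch. It assumes std::regex (ECMAScript grammar) as a stand-in, which is not the engine Notepad++ uses, and firstMatch is a hypothetical helper written only for this check.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string if nothing matches.
// ASSUMPTION: std::regex is only a stand-in for the editor's own regex engine.
std::string firstMatch(const std::string &pattern, const std::string &text) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)))
        return m.str(0);
    return "";
}
```

With this helper, firstMatch("\\w+://", "http1://abc") returns "http1://" and firstMatch("\\D+://", "htt p://abc") returns "htt p://", while firstMatch("[A-Za-z]+://", "http1://abc") returns an empty string. Note that a search for "[A-Za-z]+://" in "htt p://abc" still finds the tail "p://", so the character set only keeps digits and blanks out of the matched scheme.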
I also changed the character range for the authority and path. Now capital letters are matched. With the old implementation the match of the URL
will stop at
Lastly, I changed the quantifier for the authority and path from '*' to '+'. Without this change, a bare URI scheme with no authority or path would also be underlined and clickable as a link in Notepad++ (IMHO wrong behaviour).
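The effect of the two changes (capital letters and the '+' quantifier) can be sketched as follows. This is only an approximation: it assumes std::regex in place of the editor's own engine, the names oldUrlRegExpr, newUrlRegExpr and firstUrl are made up for this example, and the sample URLs are my own.

```cpp
#include <regex>
#include <string>

// The two patterns from the diff above, under hypothetical names.
const char *oldUrlRegExpr = "http://[a-z0-9_\\-\\+.:?&@=/%#]*";
const char *newUrlRegExpr = "[A-Za-z]+://[A-Za-z0-9_\\-\\+~.:?&@=/%#]+";

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string if nothing matches.
// ASSUMPTION: std::regex is only a stand-in for the editor's own regex engine.
std::string firstUrl(const char *pattern, const std::string &text) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)))
        return m.str(0);
    return "";
}
```

For the text "see http:// here", the old pattern yields "http://" while the new one yields nothing. For "http://example.org/Readme.TXT", the old pattern stops after "http://example.org/" and the new one matches the whole URL. For "ftp://host/file", only the new pattern matches.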
For a quick check I used the regex search built into Notepad++ on the following lines:
abc htt p://abc
Another solution could be to use a different regex library, such as the small PCRE (http://www.pcre.org/). Such a library can be attached to Scintilla. Look at
http://www.scintilla.org/ScintillaDoc.html and search for "A different regular expression library can be". Maybe this could be done as a plugin?
Finally, a warning:
With my solution the called application has to deal with the URI scheme, because oops:// is matched just like well-known protocols such as http://, ftp:// and so on. The behaviour "let the called application decide what to do with the data" was a much-discussed security issue called the "URI Handling Vulnerability" and was not limited to MS IE and MS Windows (see:
But if you look into the discussions spread around the internet, everybody wants to blame the usual suspects. IMHO this problem is still unsolved because only Microsoft's Windows SHELL32.dll and some browsers have been fixed. The current URL handling in Notepad++ could also be affected by this issue. Long story short: my solution does not introduce a new security issue but implements an extended detection of URIs, which was requested by some Notepad++ users.
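Since the regex cannot use alternation to restrict the scheme, one conceivable follow-up would be to filter the matched text in code before it is handed to another application. The following is only a sketch under my own assumptions: schemeOf and isAllowedScheme are hypothetical helpers, not part of Notepad++, and the list of schemes is an example rather than a recommendation.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: returns the text before "://" in lower case,
// or an empty string if there is no "://" or nothing precedes it.
std::string schemeOf(const std::string &uri) {
    const std::string::size_type pos = uri.find("://");
    if (pos == std::string::npos || pos == 0)
        return "";
    std::string scheme = uri.substr(0, pos);
    std::transform(scheme.begin(), scheme.end(), scheme.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return scheme;
}

// Hypothetical check: true only if the scheme is on the allow-list.
// ASSUMPTION: the three entries below are examples only.
bool isAllowedScheme(const std::string &uri) {
    static const std::set<std::string> allowed = {"http", "https", "ftp"};
    return allowed.count(schemeOf(uri)) > 0;
}
```

With this sketch, isAllowedScheme("http://abc") and isAllowedScheme("HTTPS://ABC") are true, while isAllowedScheme("oops://abc") is false, so an unknown scheme could still be underlined but would not have to be passed on.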