For RegexKitLite, it would be cool to have something to escape the reserved characters.
Consider something like:
NSString * S = @"Do you dance?";
NSString * regex = [NSString stringWithFormat:@"%@",[S stringByEscapingICUREControlCharacters]];
NSRange R = [searchString rangeOfRegex:regex];
In general the format string is much more complicated. Here is what I use:
- (NSString *)stringByEscapingICUREControlCharacters
{
    // see http://www.icu-project.org/userguide/regexp.html
    // "Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . /"
    // The backslash is escaped first so the backslashes added below are not doubled.
    NSMutableString * MS = [NSMutableString stringWithString:self];
    [MS replaceOccurrencesOfString:@"\\" withString:@"\\\\" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"*" withString:@"\\*" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"?" withString:@"\\?" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"+" withString:@"\\+" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"[" withString:@"\\[" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"(" withString:@"\\(" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@")" withString:@"\\)" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"{" withString:@"\\{" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"}" withString:@"\\}" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"^" withString:@"\\^" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"$" withString:@"\\$" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"|" withString:@"\\|" options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"." withString:@"\\." options:NULL range:NSMakeRange(0,[MS length])];
    [MS replaceOccurrencesOfString:@"/" withString:@"\\/" options:NULL range:NSMakeRange(0,[MS length])];
    return [NSString stringWithString:MS];
}
I forgot to mention that the above method applies to search patterns.
I think replacement patterns should only need to escape the '$' and '\' characters.
regards
I'm of mixed thoughts about this. I can certainly understand why you'd want something like this, but getting it to behave correctly is a bit tricky. Correctly, in this case, is loosely defined as 'as the programmer expects, without surprises.'
There's also another way to handle situations like this. You can use \Q...\E to 'quote' parts of a string inside a regular expression, with the text in-between \Q and \E being treated as literal, with no special meaning of any regular expression characters.
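To make the \Q...\E idea concrete, here is a minimal sketch in plain C (so it can also be dropped into an Objective-C file). The function name quote_for_regex is hypothetical and not part of RegexKitLite or ICU. One assumption worth stating loudly: a literal that itself contains \E would end the quoted run early, so the sketch rewrites each embedded \E as \E\\E\Q (close the quote, match a literal backslash and E, reopen the quote).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a RegexKitLite or ICU function): wraps `literal`
   in \Q...\E so a regex engine that supports quoting treats it as plain text.
   Each embedded "\E" is rewritten as "\E\\E\Q" so it cannot end the quote early.
   Returns a malloc'd string the caller must free, or NULL if allocation fails. */
char *quote_for_regex(const char *literal)
{
    static const char q_open[]  = "\\Q";          /* \Q       */
    static const char q_close[] = "\\E";          /* \E       */
    static const char splice[]  = "\\E\\\\E\\Q";  /* \E\\E\Q  */
    const size_t splice_len = sizeof splice - 1;

    assert(literal != NULL);

    /* Count embedded \E sequences so the output buffer can be sized exactly. */
    size_t hits = 0;
    for (const char *p = literal; (p = strstr(p, q_close)) != NULL; p += 2)
        hits++;

    /* \Q + text + (growth per hit) + \E + terminator */
    size_t size = 2 + strlen(literal) + hits * (splice_len - 2) + 2 + 1;
    char *out = malloc(size);
    if (out == NULL)
        return NULL;

    char *w = out;
    memcpy(w, q_open, 2);
    w += 2;
    for (const char *r = literal; *r != '\0'; ) {
        if (r[0] == '\\' && r[1] == 'E') {
            memcpy(w, splice, splice_len);   /* replace the embedded \E */
            w += splice_len;
            r += 2;
        } else {
            *w++ = *r++;
        }
    }
    memcpy(w, q_close, 2);
    w += 2;
    *w = '\0';
    return out;
}
```

For the string from the original example, quote_for_regex("Do you dance?") yields \QDo you dance?\E, which can then be spliced into a larger pattern.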
Also, a much more compact, and faster, way of accomplishing this is to use the following:
[string stringByReplacingOccurrencesOfRegex:@"[\\*\\?\\+\\[\\(\\)\\{\\}\\^\\$\\|\\\\\\.\\/]" withString:@"\\\\$0"]
I haven't tested the above, but I'm fairly sure it's correct. If not, it's close enough to give you an idea of what to do. Basically, for any character in the set of characters composed of "*?+[(){}^$|\./", replace that character with "\{MATCHED_CHARACTER}".
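The same "prefix every character in the set with a backslash" logic can be sketched without a regex at all. The following is plain C (usable from Objective-C); the names escape_chars and ICU_PATTERN_SPECIALS are hypothetical, and the character set is simply the one quoted from the ICU user guide earlier in the thread, not an authoritative list. It works on bytes, which I believe is safe for UTF-8 input because the special characters are all ASCII.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The set quoted from the ICU user guide earlier in the thread. */
static const char ICU_PATTERN_SPECIALS[] = "*?+[(){}^$|\\./";

/* Hypothetical helper: returns a malloc'd copy of `src` in which every
   character found in `specials` is preceded by a backslash. A single pass
   means the backslash itself needs no special ordering.
   The caller frees the result; NULL means allocation failed. */
char *escape_chars(const char *src, const char *specials)
{
    assert(src != NULL && specials != NULL);

    /* Worst case: every character gains one backslash. */
    char *out = malloc(2 * strlen(src) + 1);
    if (out == NULL)
        return NULL;

    char *w = out;
    for (const char *r = src; *r != '\0'; r++) {
        if (strchr(specials, *r) != NULL)
            *w++ = '\\';      /* escape prefix */
        *w++ = *r;            /* the character itself */
    }
    *w = '\0';
    return out;
}
```

Called as escape_chars("Do you dance?", ICU_PATTERN_SPECIALS) this produces Do you dance\?. Passing "$\\" as the set instead covers the replacement-pattern case raised earlier in the thread, where only '$' and '\' need escaping.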
I'll leave this bug open for now and see if time and thought shed any light on what the 'right thing' to do is.
For now, though, I'm inclined to include something in the Cookbook section of the documentation on how to escape strings with special characters.
Oh, I noticed that you're passing NULL to options: in the example you provided. This is technically incorrect, as the options argument is an integer bit mask, not a pointer: NSStringCompareOptions for the Foundation method you're calling, and RKLRegexOptions (typedef'd to uint32_t) for the RegexKitLite methods. NULL is defined as (void *)0, which is a pointer. This 'works' because in C, through a complex set of rules, this is "equivalent enough" to an integer of 0, and RKLNoOptions just happens to be 0 as well. Depending on what warning options you have in effect, you may or may not get a warning, but most warning levels will probably complain about something along the lines of "makes integer from pointer without a cast" if you use NULL like this. Just FYI.
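A small plain-C sketch of the point about option types. Everything named here (regex_options_t, REGEX_NO_OPTIONS, REGEX_CASELESS, is_caseless, demo_options) is a hypothetical stand-in for an integer option mask such as the uint32_t one described above, not the actual RegexKitLite declarations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for an integer option mask and its constants. */
typedef uint32_t regex_options_t;
enum { REGEX_NO_OPTIONS = 0, REGEX_CASELESS = 2 };

/* Returns nonzero when the (hypothetical) caseless bit is set. */
int is_caseless(regex_options_t options)
{
    return (options & REGEX_CASELESS) != 0;
}

/* Usage: pass the named zero constant (or a plain 0), never NULL.
   is_caseless(NULL) would hand a pointer to an integer parameter, which
   compilers typically flag as a pointer-to-integer conversion. */
void demo_options(void)
{
    assert(is_caseless(REGEX_NO_OPTIONS) == 0);
    assert(is_caseless(REGEX_CASELESS) == 1);
}
```

Using the named constant also documents intent at the call site, which a bare 0 or NULL does not.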