From: skaller <sk...@us...> - 2004-05-27 19:46:43
On Fri, 2004-05-28 at 03:02, Richard Jones wrote:
> On Fri, May 28, 2004 at 01:38:37AM +1000, skaller wrote:
> [...]
>
> I'll settle for a String.replace function (see original posting) which
> would allow me to write the required escape_* functions quickly and
> simply.

I have no problem with String.replace!

Also, I have no problem with a charset type, and a function:

  isa chset ch

which tests if ch is in chset. I even have an implementation
of charset... it works with my regexp package so I don't
need any 'isa' function, I can use regexps instead.

--------------------------------------
type charset_t
val charset_of_string: string -> charset_t
val charset_of_int_range: int -> int -> charset_t
val charset_of_range: string -> string -> charset_t
val charset_union: charset_t -> charset_t -> charset_t
val charset_inv: charset_t -> charset_t
val regexp_of_charset: charset_t -> regexp_t
val regexp_underscore: regexp_t
val eol: int
val regexp_dot: regexp_t

-------------- implementation -----------------------
type charset_t = bool array

let charset_of_string s =
  let x = Array.make 256 false in
  for i = 0 to String.length s - 1 do
    x.(Char.code s.[i]) <- true
  done;
  x

let charset_of_int_range x1 x2 =
  let x = Array.make 256 false in
  for i = x1 to x2 do
    x.(i) <- true
  done;
  x

let charset_of_range s1 s2 =
  if String.length s1 <> 1
  then failwith "Charset range(first) requires string length 1";
  if String.length s2 <> 1
  then failwith "Charset range(last) requires string length 1";
  let x1 = Char.code (s1.[0])
  and x2 = Char.code (s2.[0])
  in
  charset_of_int_range x1 x2

let charset_union x1 x2 =
  let x = Array.make 256 false in
  for i = 0 to 255 do
    x.(i) <- x1.(i) || x2.(i)
  done;
  x

let charset_inv y =
  let x = Array.make 256 false in
  for i = 0 to 255 do
    x.(i) <- not y.(i)
  done;
  x

let regexp_of_charset y =
  let res = ref REGEXP_epsilon in
  for i = 0 to 255 do
    if y.(i) then
      res :=
        let r = REGEXP_string (String.make 1 (Char.chr i)) in
        if !res = REGEXP_epsilon
        then r
        else REGEXP_alt (!res, r)
  done;
  !res

let regexp_underscore = regexp_of_charset (charset_of_int_range 0 255)

let eol = Char.code '\n'
let regexp_dot =
  regexp_of_charset
    (charset_union
      (charset_of_int_range 0 (eol - 1))
      (charset_of_int_range (eol + 1) 255))

-- 
John Skaller, mailto:sk...@us...
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
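
The post names an 'isa chset ch' membership test but does not give one. With the bool array representation from the implementation above, it could be as small as the sketch below. The definition of charset_of_string is repeated (in a shorter form) only so the block stands alone. escape_with is a hypothetical helper, not something from the post or from any library: it shows one way an escape_* function might be built on a charset, character by character, rather than with String.replace.

```ocaml
(* Sketch only. Assumes charset_t is the 256-entry bool array used in
   the implementation above. *)
type charset_t = bool array

(* Same behaviour as charset_of_string in the post, repeated here so
   that this block compiles on its own. *)
let charset_of_string s =
  let x = Array.make 256 false in
  String.iter (fun c -> x.(Char.code c) <- true) s;
  x

(* isa chset ch: true if ch is a member of chset. *)
let isa (chset : charset_t) (ch : char) : bool =
  chset.(Char.code ch)

(* Hypothetical helper (not from the post): put a backslash in front
   of every character of s that is a member of chset. *)
let escape_with (chset : charset_t) (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      if isa chset c then Buffer.add_char b '\\';
      Buffer.add_char b c)
    s;
  Buffer.contents b
```

For example, with 'let special = charset_of_string "\"\\"', 'isa special 'a'' is false and 'escape_with special' puts a backslash before each double quote and each backslash in its argument.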