You can subscribe to this list here.
2003 |
Jan
|
Feb
(81) |
Mar
(97) |
Apr
(88) |
May
(80) |
Jun
(170) |
Jul
(9) |
Aug
|
Sep
(18) |
Oct
(58) |
Nov
(19) |
Dec
(7) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(22) |
Feb
(9) |
Mar
(28) |
Apr
(164) |
May
(186) |
Jun
(101) |
Jul
(143) |
Aug
(387) |
Sep
(69) |
Oct
(14) |
Nov
(8) |
Dec
(99) |
2005 |
Jan
(10) |
Feb
(34) |
Mar
(24) |
Apr
(7) |
May
(41) |
Jun
(20) |
Jul
(3) |
Aug
(23) |
Sep
(2) |
Oct
(26) |
Nov
(41) |
Dec
(7) |
2006 |
Jan
(6) |
Feb
(3) |
Mar
(11) |
Apr
|
May
|
Jun
(5) |
Jul
(8) |
Aug
(20) |
Sep
|
Oct
(6) |
Nov
(5) |
Dec
|
2007 |
Jan
|
Feb
(1) |
Mar
|
Apr
(3) |
May
(2) |
Jun
|
Jul
|
Aug
(1) |
Sep
(7) |
Oct
(6) |
Nov
(19) |
Dec
(11) |
2008 |
Jan
|
Feb
(7) |
Mar
(9) |
Apr
(21) |
May
(42) |
Jun
(27) |
Jul
(28) |
Aug
(26) |
Sep
(16) |
Oct
(32) |
Nov
(49) |
Dec
(65) |
2009 |
Jan
(35) |
Feb
(20) |
Mar
(36) |
Apr
(42) |
May
(111) |
Jun
(99) |
Jul
(70) |
Aug
(25) |
Sep
(15) |
Oct
(29) |
Nov
(3) |
Dec
(18) |
2010 |
Jan
(10) |
Feb
(4) |
Mar
(57) |
Apr
(63) |
May
(71) |
Jun
(64) |
Jul
(30) |
Aug
(49) |
Sep
(11) |
Oct
(4) |
Nov
(2) |
Dec
(3) |
2011 |
Jan
(1) |
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
(1) |
Jul
(1) |
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
(1) |
2012 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2013 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2014 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
(2) |
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
(1) |
Dec
|
2015 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2021 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2022 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2023 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
(1) |
2024 |
Jan
(1) |
Feb
(3) |
Mar
(6) |
Apr
(2) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(1) |
2025 |
Jan
(2) |
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(1) |
Aug
(1) |
Sep
(5) |
Oct
|
Nov
|
Dec
|
From: Brian H. <bh...@sp...> - 2004-09-11 15:15:43
|
On Sat, 11 Sep 2004, Bardur Arantsson wrote: > I would have suggested "slice" just to be consistent with > ExtString.String.slice, but since it uses the (index,len) > approach instead of (first,last) and is also much stricter > it might be confusing to use the same name. Actually, (first, last) would be easier to implement. Is there a preference in the group? > > How about "sub"? That's what this operation is called for > regular arrays and strings... > I've changed the name to sub. Most recent versions attached. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Brian H. <bh...@sp...> - 2004-09-11 14:41:38
|
On Fri, 10 Sep 2004, Nicolas Cannasse wrote: > - remove all ;; because they are not needed These were me. I've found it usefull to have the ;; when first compiling modules because they help limit where errors can be. They can be actually usefull. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Brian H. <bh...@sp...> - 2004-09-11 14:37:23
|
On Fri, 10 Sep 2004, Jesse Guardiani wrote: > Actually, I tried this before I submitted the current implementation and I > couldn't get it to work. The problem seemed to be that Dllist has no > concept of Empty, so the length function becomes really difficult to > implement. I moved the concept of "empty" one level up in my implementation. This is because I didn't want to deal with options on the next/prev links. > > Brian, do you think it's possible to implement enums without the "valid" > test? If everyone thinks it's possible then I'll give it another shot. I just > didn't have much luck the first time around. Actually, changing things so that x.next == x is the sign of the end of the list should make enums easier. let rec enum_done = { data = Obj.magic (); next = enum_done; prev = enum_done };; let enum dlst = match dlst.head with | None -> Enum.empty () | Some(hd) -> let next r () = if (!r) == enum_done then raise Enum.no_more_elements else begin let rval = (!r).data in if (!r).next == (!r) then r := enum_done else r := (!r).next; rval end and count r () = let rec loop cnt node = if node.next == node then cnt + 1 else loop (cnt + 1) node.next in if (!r) == enum_done then 0 else loop 0 node in let rec clone r () = let r = ref (!r) in Enum.make ~next:(next r) ~count:(count r) ~clone:(clone r) in let r = ref hd in Enum.make ~next:(next r) ~count:(count r) ~clone:(clone r) ;; -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Nicolas C. <war...@fr...> - 2004-09-11 14:16:47
|
> > > ocamlc/ocamlopt flag) so if I keep the compiler into core language (an > > > intersection between ocaml features and my language new features) > > > > Just out of curiosity, would it be possible to have some information about > > your new language (feel free to answer off-list if you want) ? > > Aw, answer on list please. Who know what cooperation > can be found. Many people are using Ocaml for developing > new languages. Talking about them does sometimes annoy > Ocaml users who aren't doing new languages .. but it > *is* what Ocaml is good for. Well I'm sorry I had first some talks privatly with Alain. Speaking french is more easy for exposing my idea and getting feedback :) Here's a translation of what I wrote : I'm thinking of keeping "core-ML" and adding several functionnalities in order to get introspection and dynamic loading. From a language point of view, it will looks more like CamlLight than OCaml, but a lot of work is needed for runtime and librairies. In particular, it will be possible to generate, compile, and execute code at runtime , then examine results using reflexion, so the language will be more dynamic. The goal is to have a language that helps writing other languages but also enable them to use the same runtime and interact together, a little like .net but more functionnal-oriented. Nicolas |
From: skaller <sk...@us...> - 2004-09-11 13:33:19
|
On Sat, 2004-09-11 at 22:45, Alain Frisch wrote: > On Sat, 11 Sep 2004, Nicolas Cannasse wrote: > > > ocamlc/ocamlopt flag) so if I keep the compiler into core language (an > > intersection between ocaml features and my language new features) > > Just out of curiosity, would it be possible to have some information about > your new language (feel free to answer off-list if you want) ? Aw, answer on list please. Who know what cooperation can be found. Many people are using Ocaml for developing new languages. Talking about them does sometimes annoy Ocaml users who aren't doing new languages .. but it *is* what Ocaml is good for. And would be even better at if it has some more lexer/parser libs. I actually have two parsers -- the Felix one and also I'm using Frontc/Cil to parse C for my wrapper generator. I've tried to extend the latter to handle C++ but got stuck trying to parse constructor declarations. Both systems are currently ocamllex/ocamlyacc based. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Alain F. <Ala...@en...> - 2004-09-11 12:46:01
|
On Sat, 11 Sep 2004, Nicolas Cannasse wrote: > ocamlc/ocamlopt flag) so if I keep the compiler into core language (an > intersection between ocaml features and my language new features) Just out of curiosity, would it be possible to have some information about your new language (feel free to answer off-list if you want) ? -- Alain |
From: Nicolas C. <war...@fr...> - 2004-09-11 12:37:37
|
> > However since my ocaml bootstrapped language will not get polymorphic > > variants I can't use your implementation of regexp... > > Well, at this stage you can regard it as an experiment. > Once a nice structure is discovered you can probably > 'freeze' part of the typing, and use ordinary variants > instead. The remaining part you can either Yes that might be possible... > I mention this because Felix generates C++ so I have > a far worse bootstrapping problem .. I have to implement > everything twice, once in Ocaml and then again in C++. > Clearly I'll use Felix code to generate the C++, > and it looks 'similar' to Ocaml in some ways, but still > it will be hard to maintain two compiler implementations. That's hard :) I choose what I think is a nice bootstrapping way : i'm writing in OCaml a parser for my language that can produce an ocaml AST (and works with -pp ocamlc/ocamlopt flag) so if I keep the compiler into core language (an intersection between ocaml features and my language new features) then I can compile the compiler using only ocaml. However for full boostrapping I need of course a lexer/parser library written in my language, that can be first written in Ocaml then translated, but since I want to experiment the syntax changes I should write them directly in new language. Nicolas |
From: Nicolas C. <war...@fr...> - 2004-09-11 12:33:00
|
> It just seems that lots of small utilities I've written > require simple (but "non-standard" parsers), almost all of > them handwritten up until now because anything but GLR > parsing just seems like premature -- and for my purposes > unnecessary -- optimization, and almost always ends up > being more work when implemented using ocamlyacc (or > similar) because of the various hacks/warkarounds you have > to apply to the grammars to make them LALR(1). > > Of course, if anyone knows of such a beast for OCaml, then > any pointers to it would be appreciated. :) I stop using ocamlyacc. It's too much difficult to get the grammar right, and the approach is to me counterintuitive. I'm using LL(1) stream parsers of CamlP4, they're less more powerful but so much easy to understand, maintain, and extend ! Nicolas |
From: Bardur A. <oca...@sc...> - 2004-09-11 11:30:53
|
On Sat, Sep 11, 2004 at 09:10:05PM +1000, skaller wrote: [--snip--] > Anyhow I think Ocaml could really use a nice set > of lexer/parser libs. Definitely. I've been looking towards implementing a GLR parser (generator) myself, but unfortunately just haven't had the time to take a serious stab at it. It just seems that lots of small utilities I've written require simple (but "non-standard" parsers), almost all of them handwritten up until now because anything but GLR parsing just seems like premature -- and for my purposes unnecessary -- optimization, and almost always ends up being more work when implemented using ocamlyacc (or similar) because of the various hacks/warkarounds you have to apply to the grammars to make them LALR(1). Of course, if anyone knows of such a beast for OCaml, then any pointers to it would be appreciated. :) -- Bardur Arantsson <ba...@im...> <ba...@sc...> - I didn't go to medical school for six friggin' years to be called *Mr.* Evil! Dr. Evil | Austin Powers: Intl. Man of Mystery |
From: skaller <sk...@us...> - 2004-09-11 11:10:15
|
On Sat, 2004-09-11 at 20:10, Nicolas Cannasse wrote: > > (* extend the regexp type -- polymorpic variants are wonderful! *) >=20 > That's true :) > I got live demo from Jacques Garrigue showin variant reusability, that = was > great and this time I asked him if nobody thought of writing an extensi= ble > parser based on polymorphic variants extensibility. Looks like you're d= oing > this with regexp/lexer, Trying to :) > are you planing to do this with parser as well ? > This could lead to an extensible grammar =C3=A0 la Camlp4. Yes, however I thought I'd try the simpler case first. The engine for a parser is more complex, and worse there are many kinds of parser -- LALR, LL(1), LL(k), Earley -- .... >=20 > However since my ocaml bootstrapped language will not get polymorphic > variants I can't use your implementation of regexp... Well, at this stage you can regard it as an experiment. Once a nice structure is discovered you can probably 'freeze' part of the typing, and use ordinary variants instead. The remaining part you can either (a) cheat somehow (b) generate code dynamically (c) use a different encoding -- for example you could use strings, use a mini-parser and a kind of interpreter to build a clunky but still flexible data structure [EG XML is very clunky but you can still do lots with it] I mention this because Felix generates C++ so I have a far worse bootstrapping problem .. I have to implement everything twice, once in Ocaml and then again in C++. Clearly I'll use Felix code to generate the C++, and it looks 'similar' to Ocaml in some ways, but still it will be hard to maintain two compiler implementations. One way I am trying to get around that is to make the language so powerful, I can parameterise enough things such as the grammar, that I only need to implement the same parser engine for both languages -- then grammar changes are reflected into both compilers. For regexps it is easy -- if I can generate an interpretable transition matrix for Ocaml .. it is easy enough to=20 print the matrix as C code and write C to do the equivalent. The actual compiler engine (dfa.ml) is simple enough to recode in C/C++ or Felix. Obviously it get harder for a parser .. the term rewriting in the optimiser will be laborious .. and the lookup=20 algorithm scares the **** out of me :) [That's my most complex module] Anyhow part of this process requires discovering the maximally simple and expressive abstractions: that's more a math problem with an Ocaml test model. If I get nice terse powerful Ocaml representation, it seems like the hard work -- using ordinary variants for you, or a Felix or even C encoding for me -- is more likely to actually succeed. A side effect will be a nice set of Ocaml libraries. It actually seems more sensible to work in the abstract and ignore my specific needs -- quite apart from that attracting other people. And also I=20 can't use either Str or PCRE (I refuse to use Str because it isn't reentrant, and I can't use PCRE because it depends on a non-standard C library) Anyhow I think Ocaml could really use a nice set of lexer/parser libs. I have James Woodyatts combinator library to examine as well. --=20 John Skaller, mailto:sk...@us... voice: 061-2-9660-0850,=20 snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-09-11 10:09:24
|
> Consider this: > > type ('x, 'a) regexp_t' =3D > [ > | `REGEXP_char of int > | `REGEXP_seq of 'x * 'x (** concatenation *) > | `REGEXP_alt of 'x * 'x (** alternation *) > | `REGEXP_aster of 'x (** Kleene closure *) > | `REGEXP_epsilon (** epsilon: null string *) > | `REGEXP_code of 'a (** associated code *) > ] > > type 'a regexp_t =3D ('x, 'a) regexp_t' as 'x > > > (* extend the regexp type -- polymorpic variants are wonderful! *) That's true :) I got live demo from Jacques Garrigue showin variant reusability, that wa= s great and this time I asked him if nobody thought of writing an extensibl= e parser based on polymorphic variants extensibility. Looks like you're doi= ng this with regexp/lexer, are you planing to do this with parser as well ? This could lead to an extensible grammar =E0 la Camlp4. However since my ocaml bootstrapped language will not get polymorphic variants I can't use your implementation of regexp... Nicolas |
From: Nicolas C. <war...@fr...> - 2004-09-11 10:02:33
|
> On the subject of Dllist.enum... I noticed that the count () > function always computes the length of the remaining nodes > by traversing the list. It would be more efficient to do > this, I think: > > 1) The first time count () is called, compute the length > and store it "in the enum". > > 2) Subsequent calls to count () just return the computed > length. > > 2) Every time next () is called, decrese the number of > remaining elements by 1 (if the number hasn't been set, > you don't need to do anything). > > This would only incur minimal overhead for next () and for > any code the uses count () it basically reduces count () > to O(1) averaged over all the elements. This is actually the implementation of List.enum. Jesse you can have a look at it to implement it. Nicolas |
From: skaller <sk...@us...> - 2004-09-11 08:16:21
|
Consider this: type ('x, 'a) regexp_t' = [ | `REGEXP_char of int | `REGEXP_seq of 'x * 'x (** concatenation *) | `REGEXP_alt of 'x * 'x (** alternation *) | `REGEXP_aster of 'x (** Kleene closure *) | `REGEXP_epsilon (** epsilon: null string *) | `REGEXP_code of 'a (** associated code *) ] type 'a regexp_t = ('x, 'a) regexp_t' as 'x (* extend the regexp type -- polymorpic variants are wonderful! *) type 'a gre_t = [ | ('b,'a) regexp_t' | `REGEXP_group of 'b ] as 'b ;; type 'b gcode = [`User of 'b | `Left of int | `Right of int ] ;; let groupify (re: 'a gre_t) : 'a gcode regexp_t = let counter = ref 0 in let rec aux (re: 'a gre_t) : 'a gcode regexp_t = match re with | `REGEXP_group a -> let n = !counter in incr counter; let a = aux a in seq (seq (code (`Left n)) a) (code (`Right n)) | `REGEXP_code a -> `REGEXP_code (`User a) | `REGEXP_alt (a,b) -> let a = aux a in let b = aux b in `REGEXP_alt (a,b) | `REGEXP_seq (a,b) -> let a = aux a in let b = aux b in `REGEXP_alt (a,b) | `REGEXP_aster a -> `REGEXP_aster (aux a) | `REGEXP_epsilon -> `REGEXP_epsilon | `REGEXP_char c -> `REGEXP_char c in aux re What this does is extend the regular expression type to allow a group combinator, but then reduce it by using left/right bracket codes. The intent of this demo is to show how the REGEXP_code thing can be used to *implement* groups without having to make them part of the regular expression compiler engine -- the client can use the group enhanced combinators, reduce it to the standard form with 'groupify', ensuring all brackets are balanced and systematically numbered, and then execute the group enhanced matching automaton with the compiled regexp. I would actually propose not to use this particular form except for compatibility. Named groups are MUCH nicer! In addition, groups could be used to build lists, instead of just the last substring matched: even better, since substring lists can nest in others, we could even build a crude parse tree. BTW: Left n just stores the current position in the input into the 'left' component of the n'th slot of an array. Right n just stores into the 'right' slot. At the end, the string in the range 'left' upto 'right' of array index position k is the k'th group. This only works matching the whole string, for searching and lexing we need save the array for each terminal state so it can be restored on a backtrack (matching never backtracks). I think a similar trick can allow context sensitive searching, and start/end of data detection (and perhaps some other interesting things :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Bardur A. <oca...@sc...> - 2004-09-11 06:32:06
|
On Fri, Sep 10, 2004 at 05:39:22PM -0400, Jesse Guardiani wrote: > Actually, I tried this before I submitted the current implementation and I > couldn't get it to work. The problem seemed to be that Dllist has no > concept of Empty, so the length function becomes really difficult to > implement. > > Brian, do you think it's possible to implement enums without the "valid" > test? If everyone thinks it's possible then I'll give it another shot. I just > didn't have much luck the first time around. > On the subject of Dllist.enum... I noticed that the count () function always computes the length of the remaining nodes by traversing the list. It would be more efficient to do this, I think: 1) The first time count () is called, compute the length and store it "in the enum". 2) Subsequent calls to count () just return the computed length. 2) Every time next () is called, decrese the number of remaining elements by 1 (if the number hasn't been set, you don't need to do anything). This would only incur minimal overhead for next () and for any code the uses count () it basically reduces count () to O(1) averaged over all the elements. -- Bardur Arantsson <ba...@im...> <ba...@sc...> - Be ot or bot ne ot, taht is the nestquoi. | Monty Python's Flying Circus |
From: Bardur A. <oca...@sc...> - 2004-09-11 06:18:34
|
On Wed, Sep 08, 2004 at 05:01:15PM -0500, Brian Hurt wrote: > - added splice, return a subset of an indexed tree. This takes either > O(log N) where N is the length of the source tree, or O((log M)^2) where > M is the length of the returned tree. I can't think of a better way of > doing this- suggestions welcomed. No suggestions on better ways to do it, but I don't think the name is quite appropriate. If "splice" is used in verb form it can only (well, according to M-W online) be used in the sense of insertion, not extraction. In the noun form it just refers to a "a joining or joint made by splicing something". I would have suggested "slice" just to be consistent with ExtString.String.slice, but since it uses the (index,len) approach instead of (first,last) and is also much stricter it might be confusing to use the same name. How about "sub"? That's what this operation is called for regular arrays and strings... -- Bardur Arantsson <ba...@im...> <ba...@sc...> - Your Neutralness, it's a beige alert! - If I don't survive, tell my wife "Hello." Neutral Vice President and Neutral President | Futurama |
From: Nicolas C. <war...@fr...> - 2004-09-11 04:36:25
|
> I'm playing with my regexp package and I'd like > to discuss architecture. [...] > The goal is a simple, powerful, in Caml regular matching > facility, written entirely in Ocaml, and obsoleting Str, PCRE, > and Ocamllex. That's interesting, and I might help here. I'm currently working at designing a language that will be bootstrapped by OCaml, and I need to write a lexer and and LL(1) parser engine in OCaml for that . I do no care about groups since the regexp I'm needing in the language lexer won't be using them, but I'm interested in work about producing and optimizing the tables in realtime. The engine I'm thinking of will be available for the language programmer also so I might need groups after all :) What I like about your design is the possibility of getting compile time regexp ast checking - better than ocaml Str - as well as runtime table production - better than ocamllex . However having a too much abstract implementation for groups and marks might not be a good thing for the end user. You might provide also a string parser that converts/prints good-old regexp strings to/from ast. Nicolas |
From: Nicolas C. <war...@fr...> - 2004-09-11 03:28:29
|
> New version available here: > http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > Changes: > - merge -> splice > - jump -> skip > - enums reimplemented > - added rev_enum function > > My sourceforge ID is 'trevarthan'. I'll be the maintainer. > > Thanks! I would add an "add" function to would be an append returning "unit" (because I don't want all the time to write ignore...). I would also eliminate the use of a data structure for enum operations , the "valid" test can be replaced everywhere by a curr == node , this might simplify implementation. For the syntax/typo : - remove all ;; because they are not needed - please replace your 8 chars indentation by tabs or 2/4 spaces indents It's ok for inclusion into ExtLib, you are now developer of ExtLib and you can commit to the CVS once ready. Please check (that goes for Brian IdxTree too) that your module builds well with Makefile and with install.ml script, and that the documentation generated by ocamldoc , with install script ( and ExtLib CSS ) is consistent with other modules. Thanks Nicolas Cannasse |
From: Jesse G. <je...@wi...> - 2004-09-11 03:11:09
|
On Friday 10 September 2004 5:17 pm, Nicolas Cannasse wrote: > > New version available here: > > http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > > > Changes: > > - merge -> splice > > - jump -> skip > > - enums reimplemented > > - added rev_enum function > > > > My sourceforge ID is 'trevarthan'. I'll be the maintainer. > > > > Thanks! > > I would add an "add" function to would be an append returning "unit" > (because I don't want all the time to write ignore...) OK. Do you think something like "add_before" might be useful too? (prepend with unit return value) I suck with function naming, so please feel free to help me out. > . I would also > eliminate the use of a data structure for enum operations , the "valid" test > can be replaced everywhere by a curr == node , this might simplify > implementation. Actually, I tried this before I submitted the current implementation and I couldn't get it to work. The problem seemed to be that Dllist has no concept of Empty, so the length function becomes really difficult to implement. Brian, do you think it's possible to implement enums without the "valid" test? If everyone thinks it's possible then I'll give it another shot. I just didn't have much luck the first time around. > For the syntax/typo : > - remove all ;; because they are not needed > - please replace your 8 chars indentation by tabs or 2/4 spaces indents OK. I'll make these changes this weekend. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-09-11 01:28:54
|
On Sat, 2004-09-11 at 07:05, Nicolas Cannasse wrote: > It has my approval for inclusion into ExtLib. You can commit it on CVS right > now or wait for more people insights. IMHO the way to handle this is: (1) Make an experimental module in CVS for early commital (2) Based on your approval and agreement with the experimenters a library can be deemed ready to put in the main library. In that case *you* transfer it. It would be nice if more than one person could work on a proposal, without 'polluting' the main, stable, ready-for-the-bigtime library. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-09-11 01:18:17
|
On Sat, 2004-09-11 at 07:29, Nicolas Cannasse wrote: > > I'm playing with my regexp package and I'd like > > to discuss architecture. > [...] > > The goal is a simple, powerful, in Caml regular matching > > facility, written entirely in Ocaml, and obsoleting Str, PCRE, > > and Ocamllex. > > That's interesting, and I might help here. > I'm currently working at designing a language that will be bootstrapped by > OCaml, and I need to write a lexer and and LL(1) parser engine in OCaml for > that . Me too. In fact the code I posted is taken from the Felix compiler where I'm following the same development program. If we do some work generalising the engine, I'll probably paste the new code back into it. See below some comments on recursive transition machines (RTN). > The engine I'm thinking of will be available for the language programmer > also so I might need groups after all :) Groups are like tuples -- they're OK for small regexps. For more complex analysis, marks are more general. Marks allow you to match a string and return a trailing mark which indicates the class -- that is, a token. > What I like about your design is the possibility of getting compile time > regexp ast checking - better than ocaml Str - as well as runtime table > production - better than ocamllex . However having a too much abstract > implementation for groups and marks might not be a good thing for the end > user. You might provide also a string parser that converts/prints good-old > regexp strings to/from ast. It depends on the end users needs whether the combinator design for regexps is convenient. If you're an end user writing a programming language lexer it is nicer than string regexps. If you're doing a quick and dirty phone number extraction as in Bagley Shootout, a string encoding is easier to write -- hehe but not much. EVERY language that passes that test is WRONG! Even the specified correct answer is wrong. Clearly it is possible*** to write a function: perl_re_to_regexp: string -> regexp_t and of course also posix_re_to_regexp emacs_re_to_regexp ... and of course also unicode_re_to_regexp: ustring -> regexp etc etc and this *should* be done. I just haven't done it yet because there are some intermediate steps. For example the engine I showed uses arbitrary marks: groups are just a special case of that, but there needs to be an implementation supporting them. [Polymorphic variants are invaluable here!!] Another example: context. It makes no sense to check context when matching a string -- context is only meaningful in searching and lexing. You can search for the re R with H in front and T after it using a match like this: .* H\(R\)T.* which shows searching is just a special case of matching. *** There are a couple of caveats. Most string regexps can do a couple of things that aren't regular. For example match corresponding pairs. (I forget the syntax). This ability is strictly non-regular, and must NOT be supported. A convenient LL(1) parser is the correct way to do this. In fact, 'convenient' has a strong meaning. LL(1) parser engines are weak compared to a pushdown stack of regexps. In fact, a stack of regexps can parse any context free language. Its called a RTN -- recursive transition network. Python's parser is one. Basically RTN is specified using a meta-grammar such as EBNF : the RHS productions can use regular combinators (i.e you can have ? * and + as well as | and concatenation) However as well as matching tokens on the RHS, you can match a non-terminal, which recursively invokes another finite state automaton. RTN's mean the regular engine must be able to accept 'tokens' from the engine it controls as inputs, not 'characters' and of course 'tokens' must also be allowed to have attributes. The engine I posted only accepts integers as inputs -- it needs to be made polymorphic in the character type, without loss of efficiency.. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-09-11 00:36:17
|
On Sat, 2004-09-11 at 04:50, Jesse Guardiani wrote: > Richard Jones wrote: > > > (Repost: original was apparently tagged as sp*m). > > > > Not really sure I understand it from a user's point of view. > > > > For me what really matters is that I could write something like: > > > > if string =~ /(a+)(b*)/ then ... > > > > The regular expression (Perl-compatible, please!) ought to be compiled > > and checked at compile time, and the compiler should handle caching > > compiled expressions containing variables. > > That would certainly be nice. Felix actually uses a syntax like this: open Lexer; regexp digit = ["0123456789"]; regexp digits3 = digit digit digit; regexp digits4 = digits3 digit; regexp area_code = digits3 | "(" digits3 ")"; regexp exchange = digits3; regexp phone = area_code " " exchange (" " | "-") digits4; reglex start to finish with | phone => .... | _ => "" endmatch which is an excerpt from the Bagley Shootout test. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-09-11 00:24:41
|
On Sat, 2004-09-11 at 02:54, Richard Jones wrote: > (Repost: original was apparently tagged as sp*m). > > Not really sure I understand it from a user's point of view. > > For me what really matters is that I could write something like: > > if string =~ /(a+)(b*)/ then ... This can be done down the track. That's under the heading 'parsers'. One write a parser for 'Perl Re' or "Emacs' or 'Glob' or 'Posix' and translates that string to a REGEXP and then run thru the engine. String based Re's are convenient for 'Micky Mouse' jobs. They're totally unsuitable as fundamental components. Reasons: (1) for complex regexps, regular *expressions* are untenable. Regular *definitions* are mandatory. Thats when you use a sequence of named regexps like in Lex. (2) Encoding both the regexp operators and data in the same string is untenable for complex regexps. All that escaping is a problem. Numbered groups is a very bad idea for complex regexps like a lexer specification for a programming language. And there is another reason too: (3) Encoding character data and regexp operators in a string is not i18n compatible. It isn't possible to deem certain characters as the operators, because you don't know what the character set is. (4) Strings are 8 bit. The engine must support 32 bit. (5) String based Re's aren't typesafe. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-09-10 21:52:56
|
> - Changed name > > - Everything from Funarr still supported > > - I stuck with a weight balanced tree. Chris Okasaki's Random Access > Lists are interesting, but they're geared towards list replacement, not > array replacement. For example, I can't figure out how to delete elements > in the middle of a random access list- I can with a weight balanced tree. > > - added append, to append two indexed trees in O(log N) time. The > returned tree is balanced. > > - added splice, return a subset of an indexed tree. This takes either > O(log N) where N is the length of the source tree, or O((log M)^2) where > M is the length of the returned tree. I can't think of a better way of > doing this- suggestions welcomed. > > - Added list, array, and enum conversions. > > - Added rev function to reverse an indexed tree > > - I realized that I could do a binary search on the indexed tree in the > same amount of time it takes me to get a single element. So I added two > binary search functions- search, which returns the found element, and > find_index, which returns the index of the found element. Note that with > these functions it now becomes possible to implement a threaded map > structure, by combining an index tree with a map. I would rename Idxtree to IdxTree , using a maj T for case. search_index instead of find_index would be more consistent. I would also drop the O(N) notations were it's obvious that all elements are browsed : of_list / to_array / iter / fold for example. I would also remove rev_enum and of_rev_enum which are a little bit particular and can be acheived using to_enum + rev or of_enum + rev. And I would add iteri + mapi functions. It has my approval for inclusion into ExtLib. You can commit it on CVS right now or wait for more people insights. Thanks, Nicolas |
From: Brian H. <bh...@sp...> - 2004-09-10 21:48:34
|
On Fri, 10 Sep 2004, Nicolas Cannasse wrote: > I would rename Idxtree to IdxTree , using a maj T for case. OK. > > search_index instead of find_index would be more consistent. You're right. They were originally find and find_index, but I didn't like changing the semantics of find, so I changed it's name to search, but didn't change find_index. Thinko. > > I would also drop the O(N) notations were it's obvious that all elements are > browsed : of_list / to_array / iter / fold for example. No. They could be O(N log N) implemented poorly. For example: let iter f idxtree = for i = 0 to (length idxtree) - 1 do f (get idxtree i) done ;; I am explicitly disallowing this implementation. > I would also remove > rev_enum and of_rev_enum which are a little bit particular and can be > acheived using to_enum + rev or of_enum + rev. At additional cost. But I'm not sure how usefull these functions really are. Anyone else have an opinion they'd like to weigh in with? > > And I would add iteri + mapi functions. Good idea. > > It has my approval for inclusion into ExtLib. You can commit it on CVS right > now or wait for more people insights. I'll look at checking it in. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Jesse G. <je...@wi...> - 2004-09-10 18:54:02
|
Richard Jones wrote: > (Repost: original was apparently tagged as sp*m). > > Not really sure I understand it from a user's point of view. > > For me what really matters is that I could write something like: > > if string =~ /(a+)(b*)/ then ... > > The regular expression (Perl-compatible, please!) ought to be compiled > and checked at compile time, and the compiler should handle caching > compiled expressions containing variables. That would certainly be nice. To your knowledge, has this been discussed on the INRIA list? What did they say? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |