From: William S F. <ws...@fu...> - 2010-06-21 18:28:49
|
[Moved to swig-devel list from swig-user as this is a development discussion]. Vadim Zeitlin wrote: > Hello, > > I've uploaded the patches for regex support to SF, please see > > https://sourceforge.net/tracker/?func=detail&aid=3018506&group_id=1645&atid=301645 > > There are 10 of them and they should be applied in order, from 1 to 10, as > several of them depend on each other. Here is a brief overview of all of > them: > > The patches (1) and (2) are not really related to regex support at all and > I had sent them before but as I needed them to be able to run tests I > included them again just in case. Patch (3) is optional and just removes > the warnings which appeared in compilation of original Torsten's code due > to DohWrite() non-const-ness. I could have added casts to the code, of > course, but it seemed to be cleaner to correct DohWrite() instead. If this > is against SWIG policy you can omit applying this patch. > Fixes for const handling is fine, and yes casting away const isn't really acceptable as one day the code will be all const correct. Don't hold your breath though :) By and large const isn't done well as unfortunately DOH wasn't designed with const in mind and being at the lowest level it impacts everything else. > Patches (4) and (7) add documentation for the previously existing %rename > features and so are independent of regex support as well but the next > patches depend on them. Notably (5) and (8) which implement regex encoder > and regexmatch function respectively and also document them. > SWIG_LIB_PCRE in patch (5) - couldn't this be named more generically and submitted to the autoconf archive? Swig_name_regexmatch_value in patch (8) formats code to 80 chars, please add a patch to use 160 chars. > Patch (9) removes the old rxspencer support. It doesn't need to be applied > neither, i.e. both rxspencer and PCRE can coexist but as you said it was > better to remove it, this is what I did. > > Finally, patch (10) just fixes Torsten's changes when using libpcre > statically. > > > > Compared to Torsten's patch I added support for [not]regexmatch and > regextarget, documentation and another unit test. I also tested the new > functionality under Debian both natively and when cross-compiling. In the > former case you just need to install the appropriate package > (libpcre3-dev). For cross-compiling I built libpcre myself from sources > using Debian i586-mingw32msvc cross-compiler. This was as easy as running > "configure --host=i586-mingw32msvc --disable-shared --prefix=$HOME/mingw && > make install". > Can you add this additional configuring to Tools/mkwindows.sh for running on mingw, and I can check this works on Linux. The windows version of SWIG can be built with this script (although I last checked it on mingw/cygwin a very long time ago, but use it on Ubuntu for the official releases). > To build SWIG itself I then just had to point its configure to the right > pcre-config: "PATH=$HOME/mingw/bin configure --host=i586-mingw32msvc". > This may not be ideal but IMHO it's easy enough and is the best I can do, > hopefully it won't be a big burden for you. > > But this leads me to my first question in a list of open questions I still > have: > > Q1. I don't like the fact that PCRE support is not on by default, IMHO we > don't need --with-pcre=check at all and should default to --with-pcre > and let people who want to disable it because they don't have the > library to use --without-pcre explicitly. What do you think about this? > Agreed, it should be on by default. Also by default, SWIG should fail to configure if pcre is not found. Some open source projects are highly configurable and it is possible to build an executable with various features turned on or off. We've not really had that with SWIG before (except unoffiically with rxspencer). I'd rather have pcre support always in though, but I suppose we'd better allow building without pcre with an explicit --without-pcre to configure. Then we'd also need an option to report how SWIG was built, something similar to what vim outputs; there are a huge number of configure time options for vim: $ vim --version VIM - Vi IMproved 7.1 (2007 May 12, compiled Jan 8 2009 02:15:11) Included patches: 1-314 Compiled by buildd@rothera.buildd Huge version with GTK2 GUI. Features included (+) or not (-): +arabic +autocmd +balloon_eval +browse ++builtin_terms ... etc ... > And a few more ones: > > Q2: Old rxspencer encoder didn't do anything if regex support wasn't > compiled in. The new regex one exits the program with an error message > in such case. IMHO the new behaviour is better but I can change it to > silently doing nothing if you prefer. Should I? > > And the same question about rxsmatch -- should it just never match or > exit the program too, as it does now? > I think anything that relies on pcre and is not available should exit SWIG fatally with a helpful error message pointing to the source code requiring regex handling. > Q3: Code handling substitutions in regex replacement part normally expects > '@' to be followed by a digit but its handling of not normal cases > seems inconsistent to me: A bare '@' at the end of the string is kept > as is but there is no way to quote '@' anywhere else, it is just > skipped silently if it's not followed by a digit. I don't think '@' is > valid in identifiers in any output language anyhow but I'm not sure > about this. In any case, I think that we should either provide a way to > quote it anywhere (using "@@"?) or not allow it anywhere, including at > the end of the string. > I'll pass on this one, but I don't follow why SWIG's handling of @ should be different to perl or any other language using regular expressions. Or is this something wrong with pcre? > Q4: Currently the regex isn't implicitly anchored to the beginning/end of > the string. This means that regexmatch$name="Max" will match > FindMaximum() which might be unexpected. You need to use "^Max$" > explicitly to match this string only. I wonder if we shouldn't rather > implicitly anchor the string by adding '^' and '$' to it ourselves? > No, personally I think the default is expected and it is normal to add want it. I would go ahead and commit your patches as you supplied them and then feel free to just commit any modifications as discussed or as you see fit. William |
From: William S F. <ws...@fu...> - 2010-11-15 22:19:22
|
Vadim Zeitlin wrote: > On Thu, 19 Aug 2010 19:28:15 +0100 William S Fulton > <ws...@fu...> wrote: > > WSF> > WSF> Vadim Zeitlin wrote: > WSF> > WSF> > But while doing this I discovered something else that I > hadn't > WSF> > WSF> > realised before: apparently there can be only a single > %rename > WSF> > WSF> > attached to a declaration. So if you want to rename > stuff globally > WSF> > WSF> > and try to do e.g. > WSF> > WSF> > > WSF> > WSF> > %rename("%(regex/^Set(.*)/put\\1/)s") ""; > WSF> > WSF> > %rename("%(regex/^Get(.*)/get\\1/)s") ""; > ... > WSF> There must be some iteration through the list somewhere in the > code as > WSF> there is priority as to what is a better match > > You were right, of course, there is already an iteration over all > rename_list elements in Swig_name_nameobj_lget() in naming.c. And the > problem lies in this function too: it currently assumes that if there > is no > "target name", then the %rename always matches. This is, however, not true > in the above case as a %(regex)s expansion may fail. So I'd like to > propose > the following patch: > > --- a/Source/Swig/naming.c > +++ b/Source/Swig/naming.c > @@ -1302,7 +1302,13 @@ Hash *Swig_name_nameobj_lget(List *namelist, > Node *n, String *prefix, String *na > : Swig_name_match_value(tname, sname); > Delete(sname); > } else { > - match = 1; > + /* Applying the renaming rule may fail if it contains a > %(regex)s expression that doesn't match the given name. */ > + String *sname = NewStringf(Getattr(rn, "name"), name); > + if (sname) { > + if (Len(sname)) > + match = 1; > + Delete(sname); > + } > } > } > if (match) { > > It makes the above example work for me and, AFAICS, shouldn't break > anything else. Do you have any objections against committing this? > Go for it, as usual please add plenty of testcases and documentation with examples with the checkin. > Thanks, > VZ > > P.S. Thanks a lot for adding swigprint gdb command, it made debugging this > incomparably easier. Thanks, I wish I'd done that years ago! William |
From: Vadim Z. <vz...@ze...> - 2010-11-16 14:12:48
|
On Mon, 15 Nov 2010 22:19:07 +0000 William S Fulton <ws...@fu...> wrote: WSF> Vadim Zeitlin wrote: WSF> > I'd like to propose the following patch: WSF> > WSF> > --- a/Source/Swig/naming.c WSF> > +++ b/Source/Swig/naming.c WSF> > @@ -1302,7 +1302,13 @@ Hash *Swig_name_nameobj_lget(List *namelist, WSF> > Node *n, String *prefix, String *na WSF> > : Swig_name_match_value(tname, sname); WSF> > Delete(sname); WSF> > } else { WSF> > - match = 1; WSF> > + /* Applying the renaming rule may fail if it contains a WSF> > %(regex)s expression that doesn't match the given name. */ WSF> > + String *sname = NewStringf(Getattr(rn, "name"), name); WSF> > + if (sname) { WSF> > + if (Len(sname)) WSF> > + match = 1; WSF> > + Delete(sname); WSF> > + } WSF> > } WSF> > } WSF> > if (match) { WSF> > WSF> > It makes the above example work for me and, AFAICS, shouldn't break WSF> > anything else. Do you have any objections against committing this? WSF> > WSF> Go for it, as usual please add plenty of testcases and documentation WSF> with examples with the checkin. I extended the existing Python rename_pcre_encoder unit test and added the same test for C# and Java, all of them pass. Please let me know if I broke anything, VZ |
From: Vadim Z. <vz...@ze...> - 2010-06-21 22:32:18
|
On Mon, 21 Jun 2010 19:28:30 +0100 William S Fulton <ws...@fu...> wrote: WSF> [Moved to swig-devel list from swig-user as this is a development WSF> discussion]. Sorry, this was my mistake, I intended to send this to -devel. Thanks for rerouting it! WSF> Fixes for const handling is fine, and yes casting away const isn't WSF> really acceptable as one day the code will be all const correct. Don't WSF> hold your breath though :) By and large const isn't done well as WSF> unfortunately DOH wasn't designed with const in mind and being at the WSF> lowest level it impacts everything else. Yes, I noticed, and while I (as well as probably everybody else) don't have the energy to fix all of them at once I thought it could be nice to fix the one that was exposed by this patch. WSF> > Patches (4) and (7) add documentation for the previously existing %rename WSF> > features and so are independent of regex support as well but the next WSF> > patches depend on them. Notably (5) and (8) which implement regex encoder WSF> > and regexmatch function respectively and also document them. WSF> > WSF> SWIG_LIB_PCRE in patch (5) - couldn't this be named more generically and WSF> submitted to the autoconf archive? There is already a macro for testing for PCRE in the archive but it doesn't use pcre-config as I hoped it would. Submitting this version there is probably indeed the right thing to do, I'll do it tomorrow. WSF> Swig_name_regexmatch_value in patch (8) formats code to 80 chars, please WSF> add a patch to use 160 chars. Ok, I didn't know about this rule. What other settings should be used for SWIG code, is "set sw=2 ts=8 noet tw=160" all I need (as you quoted vim version output below I hope this was an understandable question for you :-)? WSF> Can you add this additional configuring to Tools/mkwindows.sh for WSF> running on mingw, and I can check this works on Linux. Just to make it clear: I did do my tests on Linux (Debian amd64). But I'll update the script. WSF> > Q1. I don't like the fact that PCRE support is not on by default, IMHO we WSF> > don't need --with-pcre=check at all and should default to --with-pcre WSF> > and let people who want to disable it because they don't have the WSF> > library to use --without-pcre explicitly. What do you think about this? WSF> > WSF> Agreed, it should be on by default. Also by default, SWIG should fail to WSF> configure if pcre is not found. Ok, I'll change this. WSF> Then we'd also need an option to report how SWIG was built, something WSF> similar to what vim outputs WSF> options for vim: Currently we have % swig -version SWIG Version 2.0.1 Compiled with g++ [x86_64-unknown-linux-gnu] Please see http://www.swig.org for reporting bugs and further information Should I just add a line "This version of SWIG has regular expression support." after "Compiled with" line? WSF> > And a few more ones: WSF> > WSF> > Q2: Old rxspencer encoder didn't do anything if regex support wasn't WSF> > compiled in. The new regex one exits the program with an error message WSF> > in such case. IMHO the new behaviour is better but I can change it to WSF> > silently doing nothing if you prefer. Should I? WSF> > WSF> > And the same question about rxsmatch -- should it just never match or WSF> > exit the program too, as it does now? WSF> > WSF> I think anything that relies on pcre and is not available should exit WSF> SWIG fatally with a helpful error message pointing to the source code WSF> requiring regex handling. The current behaviour is good then. WSF> > Q3: Code handling substitutions in regex replacement part normally expects WSF> > '@' to be followed by a digit but its handling of not normal cases WSF> > seems inconsistent to me: A bare '@' at the end of the string is kept WSF> > as is but there is no way to quote '@' anywhere else, it is just WSF> > skipped silently if it's not followed by a digit. I don't think '@' is WSF> > valid in identifiers in any output language anyhow but I'm not sure WSF> > about this. In any case, I think that we should either provide a way to WSF> > quote it anywhere (using "@@"?) or not allow it anywhere, including at WSF> > the end of the string. WSF> > WSF> I'll pass on this one, but I don't follow why SWIG's handling of @ WSF> should be different to perl or any other language using regular WSF> expressions. Or is this something wrong with pcre? SWIG uses "@1" instead of "\1" in Perl (and everywhere else). I think it's not a bad idea to continue to do this because it will make it simpler for people having code using rxspencer to upgrade and also because it avoids the double backslashes galores but it could be changed to use '\\' if you prefer. WSF> > Q4: Currently the regex isn't implicitly anchored to the beginning/end of WSF> > the string. This means that regexmatch$name="Max" will match WSF> > FindMaximum() which might be unexpected. You need to use "^Max$" WSF> > explicitly to match this string only. I wonder if we shouldn't rather WSF> > implicitly anchor the string by adding '^' and '$' to it ourselves? WSF> > WSF> No, personally I think the default is expected and it is normal to add WSF> want it. WSF> WSF> I would go ahead and commit your patches as you supplied them and then WSF> feel free to just commit any modifications as discussed or as you see fit. I could update the patches before you commit them if you prefer but it would have to wait until tomorrow as it's getting too late to start doing this today... Thanks for looking over this, VZ |
From: William S F. <ws...@fu...> - 2010-06-22 06:43:56
|
Vadim Zeitlin wrote: > WSF> Swig_name_regexmatch_value in patch (8) formats code to 80 chars, please > WSF> add a patch to use 160 chars. > > Ok, I didn't know about this rule. What other settings should be used for > SWIG code, is "set sw=2 ts=8 noet tw=160" all I need (as you quoted vim > version output below I hope this was an understandable question for you :-)? > I use this in .vimrc: au BufRead */swig*/* set tabstop=8 | set shiftwidth=2 | set softtabstop=2 | set expandtab " SWIG au BufRead */swig*/Source/* set tabstop=8 | set shiftwidth=2 | set softtabstop=2 | set noexpandtab " SWIG Source The formatting rules are really those used in the 'make beautify-file' target in the Source directory. This doesn't get run often though as it doesn't work 100%, requiring some manual fixes afterwards :( > WSF> Then we'd also need an option to report how SWIG was built, something > WSF> similar to what vim outputs > WSF> options for vim: > > Currently we have > > % swig -version > > SWIG Version 2.0.1 > > Compiled with g++ [x86_64-unknown-linux-gnu] > Please see http://www.swig.org for reporting bugs and further information > > Should I just add a line "This version of SWIG has regular expression > support." after "Compiled with" line? > Let's add a new line for configured options like in vim, using +pcre for --with-pcre and -pcre for --without-pcre: % swig -version SWIG Version 2.0.1 Compiled with g++ [x86_64-unknown-linux-gnu] Configured options: +pcre Please see http://www.swig.org for reporting bugs and further information > WSF> > Q3: Code handling substitutions in regex replacement part normally expects > WSF> > '@' to be followed by a digit but its handling of not normal cases > WSF> > seems inconsistent to me: A bare '@' at the end of the string is kept > WSF> > as is but there is no way to quote '@' anywhere else, it is just > WSF> > skipped silently if it's not followed by a digit. I don't think '@' is > WSF> > valid in identifiers in any output language anyhow but I'm not sure > WSF> > about this. In any case, I think that we should either provide a way to > WSF> > quote it anywhere (using "@@"?) or not allow it anywhere, including at > WSF> > the end of the string. > WSF> > > WSF> I'll pass on this one, but I don't follow why SWIG's handling of @ > WSF> should be different to perl or any other language using regular > WSF> expressions. Or is this something wrong with pcre? > > SWIG uses "@1" instead of "\1" in Perl (and everywhere else). I think it's > not a bad idea to continue to do this because it will make it simpler for > people having code using rxspencer to upgrade and also because it avoids > the double backslashes galores but it could be changed to use '\\' if you > prefer. > I don't think the rxspencer should dictate how we do things going forwards unless it is the best option. Some upgrade documentation in the CHANGES file will suffice, so here we can mention to use \\ instead of @ if we want to go this way. I really don't like the idea that the rest of the world uses \1 and SWIG uses @1. Why don't we accept \1 ? Aren't these quoted within square brackets in any case, so the C string escaping would not be needed? What do other C oriented languages do for escaping backslashes in regular expressions? > WSF> I would go ahead and commit your patches as you supplied them and then > WSF> feel free to just commit any modifications as discussed or as you see fit. > WSF> > Q4: Currently the regex isn't implicitly anchored to the beginning/end of > WSF> > the string. This means that regexmatch$name="Max" will match > WSF> > FindMaximum() which might be unexpected. You need to use "^Max$" > WSF> > explicitly to match this string only. I wonder if we shouldn't rather > WSF> > implicitly anchor the string by adding '^' and '$' to it ourselves? > WSF> > > WSF> No, personally I think the default is expected and it is normal to add > WSF> want it. Sorry, half my sentence got chomped. I'm just trying to say that a user should explicitly add "^" and "$" when needed. > > I could update the patches before you commit them if you prefer but it > would have to wait until tomorrow as it's getting too late to start doing > this today... No great rush, I'll leave you to commit them as you see fit. William |
From: Vadim Z. <vz...@ze...> - 2010-06-22 11:37:23
|
On Tue, 22 Jun 2010 07:43:42 +0100 William S Fulton <ws...@fu...> wrote: WSF> The formatting rules are really those used in the 'make beautify-file' WSF> target in the Source directory. AFAICS the only important addition there is that opening braces of the functions should be on the same line as function declaration itself. I'll update the patches to respect this setting. WSF> Let's add a new line for configured options like in vim, using +pcre for WSF> --with-pcre and -pcre for --without-pcre: Ok, added. WSF> I don't think the rxspencer should dictate how we do things going WSF> forwards unless it is the best option. Some upgrade documentation in the WSF> CHANGES file will suffice, so here we can mention to use \\ instead of @ WSF> if we want to go this way. I really don't like the idea that the rest of WSF> the world uses \1 and SWIG uses @1. Why don't we accept \1 ? Aren't WSF> these quoted within square brackets in any case, so the C string WSF> escaping would not be needed? Is there anything special about square brackets? AFAICS they're just parsed by the code itself and don't seem to suppress C string escaping. OTOH maybe it would indeed be a good idea to suppress backslash handling in this string, I don't see what would it be ever needed for (stuff like "\n" shouldn't occur in the identifier names). Alternatively, we could keep it but also add special case for "\1".."\9". Anyhow, I think you're right and it would be better to use "\1" instead of "@1", I just blindly applied the original patch but it doesn't seem like a good idea, I don't know of any language/program/whatever that uses @1. So unless there are objections to this, I'll change it to use backslashes. In fact I also wonder about using the square brackets now. It is used by %strip function (e.g. you can write "%(strip:[wx])s") but that's ok because there is no common convention there. For regexes OTOH using slashes is almost ingrained in my fingers so maybe we should allow "%(regex:/foo/bar/)s" instead of the current "%(regex:[foo][bar])s"? Another alternative would be to follow Perl and allow any characters to be used as delimiter, allowing both slashes and anything else (e.g. '@' is a common choice for this) or any pair of matching delimiters including '[' and ']' (and also '(' and ')', '{' and '}', '<' and '>' -- but limited to ASCII only, unlike Perl). This would be more complicated and I'm not sure it's worth it because the main motivation for using something else than slashes is using regexes with string already containing a lot of slashes which is often the case for file names but is never the case for identifiers. So IMO imposing slashes would be enough -- and better than using square brackets. What do you think? WSF> What do other C oriented languages do for escaping backslashes in WSF> regular expressions? AFAIK nothing, i.e. they just stick with backslashes meaning that using regexes in C[++] is ugly. WSF> > I could update the patches before you commit them if you prefer but it WSF> > would have to wait until tomorrow as it's getting too late to start doing WSF> > this today... WSF> No great rush, I'll leave you to commit them as you see fit. Sorry for asking so many questions about this but I just dislike this red-faced feeling you get when you commit something wrongly in a public repository, so just to be sure: would you like me to commit the updated patches directly to SWIG svn or update the patches in the SF tracker and still leave committing them to you? And in the former case, should I commit to a branch or directly on the trunk? Thanks, VZ |
From: William S F. <ws...@fu...> - 2010-06-22 22:49:55
|
Vadim Zeitlin wrote: > WSF> I don't think the rxspencer should dictate how we do things going > WSF> forwards unless it is the best option. Some upgrade documentation > in the > WSF> CHANGES file will suffice, so here we can mention to use \\ > instead of @ > WSF> if we want to go this way. I really don't like the idea that the > rest of > WSF> the world uses \1 and SWIG uses @1. Why don't we accept \1 ? Aren't > WSF> these quoted within square brackets in any case, so the C string > WSF> escaping would not be needed? > > Is there anything special about square brackets? AFAICS they're just > parsed by the code itself and don't seem to suppress C string escaping. I don't really know. Marcelo added all sorts of unusual stuff, so am not sure about this. > OTOH maybe it would indeed be a good idea to suppress backslash > handling in > this string, I don't see what would it be ever needed for (stuff like "\n" > shouldn't occur in the identifier names). Alternatively, we could keep it > but also add special case for "\1".."\9". Anyhow, I think you're right and > it would be better to use "\1" instead of "@1", I just blindly applied the > original patch but it doesn't seem like a good idea, I don't know of any > language/program/whatever that uses @1. So unless there are objections to > this, I'll change it to use backslashes. > Yes, see below. > In fact I also wonder about using the square brackets now. It is used by > %strip function (e.g. you can write "%(strip:[wx])s") but that's ok > because > there is no common convention there. For regexes OTOH using slashes is > almost ingrained in my fingers so maybe we should allow > "%(regex:/foo/bar/)s" > instead of the current "%(regex:[foo][bar])s"? Another alternative > would be > to follow Perl and allow any characters to be used as delimiter, allowing > both slashes and anything else (e.g. '@' is a common choice for this) or > any pair of matching delimiters including '[' and ']' (and also '(' and > ')', '{' and '}', '<' and '>' -- but limited to ASCII only, unlike Perl). > This would be more complicated and I'm not sure it's worth it because the > main motivation for using something else than slashes is using regexes > with > string already containing a lot of slashes which is often the case for > file > names but is never the case for identifiers. So IMO imposing slashes would > be enough -- and better than using square brackets. What do you think? > Yeah, I think so. If someone finds a use case where forward slashes don't work, we can always add in another option later. I don't think any use cases will appear though unless we start to use regex outside of %rename. Maybe we should support [] too though as that is what the other encoders use. Not sure. > > WSF> What do other C oriented languages do for escaping backslashes in > WSF> regular expressions? > > AFAIK nothing, i.e. they just stick with backslashes meaning that using > regexes in C[++] is ugly. > I think I have the solution. My plan was always to merge the c++0x support into trunk for the next release. Given that c++0x supports raw string literals, we should take advantage of this. So then we'd remove the rather odd "@1" syntax and use plain C strings, except, with raw string literals we have an alternative option where there is no need to escape the backslashes. The choices are then: "\\1" or R"(\1)" but not "@1". Note that the raw string literals changed earlier this year in N3077 to use parenthesis rather than square brackets. R"[\1]" used to be the appropriate syntax - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3077.html This way our usage is standard and provides a neater syntax. One minor problem is this raw string literal support doesn't seem to be working as well as I thought it was in the gsoc2009-matevz branch. Probably I can fix that though. > > WSF> > I could update the patches before you commit them if you > prefer but it > WSF> > would have to wait until tomorrow as it's getting too late to > start doing > WSF> > this today... > WSF> No great rush, I'll leave you to commit them as you see fit. > > Sorry for asking so many questions about this but I just dislike this > red-faced feeling you get when you commit something wrongly in a public > repository, so just to be sure: would you like me to commit the updated > patches directly to SWIG svn or update the patches in the SF tracker and > still leave committing them to you? And in the former case, should I > commit > to a branch or directly on the trunk? Ah, don't worry about messing it up... it is just source code control so anything awful can easily be reverted. If the test-suite runs, you should be fine. See the docs (Extending.html) on running the test-suite. I'd like this in trunk sooner rather than later and there seem to be enough interest/hands on help to fix up everything, so please commit direct to trunk. William |
From: Vadim Z. <vz...@ze...> - 2010-06-23 08:27:57
|
On Tue, 22 Jun 2010 23:49:44 +0100 William S Fulton <ws...@fu...> wrote: WSF> > So IMO imposing slashes would be enough -- and better than using WSF> > square brackets. What do you think? WSF> > WSF> Yeah, I think so. If someone finds a use case where forward slashes WSF> don't work, we can always add in another option later. I don't think any WSF> use cases will appear though unless we start to use regex outside of WSF> %rename. Maybe we should support [] too though as that is what the other WSF> encoders use. Not sure. The only encoder which uses them is the "strip" one. The only other existing encoder which takes any arguments at all is the "command" one but it does it differently and doesn't use any delimiter at all but expects a "<<<" to terminate the string. This seems to be some sort of generic SWIG string handling feature though as I don't see any code for handling "<<<" in Source/Swig/misc.c itself. OTOH I don't see it anywhere else neither... How/where exactly are strings parsed in SWIG? I am reading parser.y and just don't see any special handling for them there, not even for backslashes. What am I missing? WSF> > WSF> What do other C oriented languages do for escaping backslashes in WSF> > WSF> regular expressions? WSF> > WSF> > AFAIK nothing, i.e. they just stick with backslashes meaning that using WSF> > regexes in C[++] is ugly. WSF> > WSF> I think I have the solution. My plan was always to merge the c++0x WSF> support into trunk for the next release. Given that c++0x supports raw WSF> string literals, we should take advantage of this. This looks like a good idea if we need escaping in these strings at all. But do we really? Once again, these are not arbitrary strings but something that should produce a valid identifier (albeit in target language, which might have somewhat different rules for identifier names than C). Why should escape sequences be handled at all here? I.e. I'd really just prefer to write %rename("%(regex:/^wx(?!EVT)(.*)/\1/)s") ""; literally, without neither doubling the backslashes not using raw strings. WSF> Note that the raw string literals changed earlier this year in N3077 to WSF> use parenthesis rather than square brackets. R"[\1]" used to be the WSF> appropriate syntax - WSF> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3077.html This is completely OT but I really wonder why had this been done. Using R[] seemed better to me for several reasons... WSF> I'd like this in trunk sooner rather than later and there seem to be WSF> enough interest/hands on help to fix up everything, so please commit WSF> direct to trunk. My problem is that I'm leaving at the end of the week so I must either commit this tomorrow at latest (to be able to fix any horrible problems introduced by me Friday) or wait for ~2 weeks. If using just "\1" is ok I should be able to do the former but if we want to use the C++0x raw strings it's probably not doable as I need to wait until you merge the C++0x branch first as it doesn't seem to make sense to reimplement parsing of these string here, even if it's not very complicated. Anyhow, please let me know what do you think about this, thanks, VZ |
From: William S F. <ws...@fu...> - 2010-06-23 22:31:03
|
Vadim Zeitlin wrote: > On Tue, 22 Jun 2010 23:49:44 +0100 William S Fulton <ws...@fu...> wrote: > > WSF> > So IMO imposing slashes would be enough -- and better than using > WSF> > square brackets. What do you think? > WSF> > > WSF> Yeah, I think so. If someone finds a use case where forward slashes > WSF> don't work, we can always add in another option later. I don't think any > WSF> use cases will appear though unless we start to use regex outside of > WSF> %rename. Maybe we should support [] too though as that is what the other > WSF> encoders use. Not sure. > > The only encoder which uses them is the "strip" one. The only other > existing encoder which takes any arguments at all is the "command" one but > it does it differently and doesn't use any delimiter at all but expects a > "<<<" to terminate the string. This seems to be some sort of generic SWIG > string handling feature though as I don't see any code for handling "<<<" > in Source/Swig/misc.c itself. OTOH I don't see it anywhere else neither... > How/where exactly are strings parsed in SWIG? I am reading parser.y and > just don't see any special handling for them there, not even for > backslashes. What am I missing? > From what I can see it is STRING in parser.y which comes from SWIG_TOKEN_STRING in expr.c and scanner.c. Not sure about the "<<<", didn't even know about it. > WSF> > WSF> What do other C oriented languages do for escaping backslashes in > WSF> > WSF> regular expressions? > WSF> > > WSF> > AFAIK nothing, i.e. they just stick with backslashes meaning that using > WSF> > regexes in C[++] is ugly. > WSF> > > WSF> I think I have the solution. My plan was always to merge the c++0x > WSF> support into trunk for the next release. Given that c++0x supports raw > WSF> string literals, we should take advantage of this. > > This looks like a good idea if we need escaping in these strings at all. > But do we really? Once again, these are not arbitrary strings but something > that should produce a valid identifier (albeit in target language, which > might have somewhat different rules for identifier names than C). Why > should escape sequences be handled at all here? > > I.e. I'd really just prefer to write > > %rename("%(regex:/^wx(?!EVT)(.*)/\1/)s") ""; > > literally, without neither doubling the backslashes not using raw strings. > Not sure you can do that as all strings are C strings in the parser. You might end up rewriting chunks of the parser just to handle non-standard C strings just for this case. Look at 'rename_directive' in parser.y, for example you'd need to reimplement a custom 'declarator'. I think the plan was to one day unify %feature and %rename into one being a special case of the other, so this special handling makes this even trickier. Not that I think this will happen any time soon, if ever. But probably only the occasional \1 needs escaping and given that reading regular expressions always requires some concentration, one more backslash isn't a big deal. Given C symbols are simple anyway, there is nothing within the symbol name that will need escaping, so I don't think it is a big deal. The renamed symbol still needs to be a valid C symbol name as it gets used in C code, so that kind of rules out some potentially exotic symbol names in target languages. > WSF> I'd like this in trunk sooner rather than later and there seem to be > WSF> enough interest/hands on help to fix up everything, so please commit > WSF> direct to trunk. > > My problem is that I'm leaving at the end of the week so I must either > commit this tomorrow at latest (to be able to fix any horrible problems > introduced by me Friday) or wait for ~2 weeks. If using just "\1" is ok I > should be able to do the former but if we want to use the C++0x raw strings > it's probably not doable as I need to wait until you merge the C++0x branch > first as it doesn't seem to make sense to reimplement parsing of these > string here, even if it's not very complicated. Maybe, I've got this wrong, but I'm not sure how you will parse just "\1" instead of "\\1" without loads of parser work as mentioned earlier. I would accept "\\1" is required for now and when c++0x is merged in then the raw literal strings are an alternative option, NOT the only option. So whatever you do is independent of the C++0x merge and all your tests can use regular C strings using escaping. It'll take me more than 2 weeks to do this c++0x so it won't hold up a release. If you think commiting to trunk is too much of a risk, put in a temporary feature branch for us to use. William |
From: Vadim Z. <vz...@ze...> - 2010-07-16 20:39:24
|
On Wed, 23 Jun 2010 23:30:54 +0100 William S Fulton <ws...@fu...> wrote: WSF> Maybe, I've got this wrong, but I'm not sure how you will parse just WSF> "\1" instead of "\\1" without loads of parser work as mentioned earlier. WSF> I would accept "\\1" is required for now and when c++0x is merged in WSF> then the raw literal strings are an alternative option, NOT the only WSF> option. So whatever you do is independent of the C++0x merge and all WSF> your tests can use regular C strings using escaping. Hello again, I'm finally ready to commit the patches using %regex:/pattern/subst/ syntax with "\\1" used for back-references in the substitution part. I've tested the native build under Linux and cross-compiling for MinGW from Linux too. The test suite passes for csharp, java and python for the native build, and I hope the other languages should work too as there is nothing language-specific in my changes but I didn't test them as I don't have the necessary support for all of them. Concerning configure, --with-pcre is on by default and --without-pcre must be explicitly given to disable it, as discussed. And the changes to configure are finally very simple as the standard AX_PATH_GENERIC() does work, it was my own fault for using it incorrectly. Finally I also added display of configured options to `swig -version` and fixed running the test-suite from a different directory. I intend to commit all these changes to svn trunk soon, please let me know if there any objections. Thanks! VZ |
From: William S F. <ws...@fu...> - 2010-07-17 09:20:38
|
Vadim Zeitlin wrote: > On Wed, 23 Jun 2010 23:30:54 +0100 William S Fulton <ws...@fu...> wrote: > I intend to commit all these changes to svn trunk soon, please let me know > if there any objections. > Go for it. William |
From: Vadim Z. <vz...@ze...> - 2010-07-22 17:21:36
|
On Sat, 17 Jul 2010 10:20:27 +0100 William S Fulton <ws...@fu...> wrote: WSF> Vadim Zeitlin wrote: WSF> > On Wed, 23 Jun 2010 23:30:54 +0100 William S Fulton <ws...@fu...> wrote: WSF> WSF> > I intend to commit all these changes to svn trunk soon, please let me know WSF> > if there any objections. WSF> > WSF> Go for it. After waiting for a few more days for a reply in the test suite thread I've finally committed everything I had, including both regex and makefile patches. AFAICS the latter shouldn't make things worse and at least like this test suite can be used from a separate build directory. Please let me know about any (new) problems, thanks, VZ |
From: William S F. <ws...@fu...> - 2010-07-23 20:17:39
|
Vadim Zeitlin wrote: > On Sat, 17 Jul 2010 10:20:27 +0100 William S Fulton <ws...@fu...> wrote: > > WSF> Vadim Zeitlin wrote: > WSF> > On Wed, 23 Jun 2010 23:30:54 +0100 William S Fulton <ws...@fu...> wrote: > WSF> > WSF> > I intend to commit all these changes to svn trunk soon, please let me know > WSF> > if there any objections. > WSF> > > WSF> Go for it. > > After waiting for a few more days for a reply in the test suite thread > I've finally committed everything I had, including both regex and makefile > patches. AFAICS the latter shouldn't make things worse and at least like > this test suite can be used from a separate build directory. > > Please let me know about any (new) problems, thanks, Do you know how come with --without-pcre it still goes and looks for sed? Seems like a bug in the implementation of the disabling of it. Just wondering as not very important. I think if pcre is not installed, we need to be more helpful in getting it installed. For example list the ways to install it on debian, cygwin, redhat, suse, bsd, solaris and other common operating systems. Failing that provide a wget command to run which puts the source in a directory and then SWIG should find this and build it when running configure again. Maybe even not mention that it can be disabled in the configure error message, as it should be dead easy to get it built and installed. It isn't good enough if it isn't easy to build and install pcre as part of the swig build and so disabling pcre support shouldn't be a good option to take. This is how to install required packages on Ubuntu: sudo apt-get install libpcre3-dev Can we please test different operating systems and reply back to this thread with the results. I'd like confidence in this before releasing. The documentation on advanced renaming is long overdue, thanks very much for this, it really is useful. What would really complete this would be to put the list of %rename predicates from swig.swg into the html docs with a description and perhaps a C++ example of the declarations (from a new testcase) that they operate on and to also mention %$not. Also a few more regex examples will make the regex support more accessible, eg document the examples used in the test-suite, CHANGES.current and misc.c. Finally, thanks ever so much for your work (and Torsten's) in this area, polishing off regex support is long overdue. William |
From: William S F. <ws...@fu...> - 2010-07-23 20:10:22
|
William S Fulton wrote: > I think if pcre is not installed, we need to be more helpful in getting > it installed. For example list the ways to install it on debian, cygwin, > redhat, suse, bsd, solaris and other common operating systems. Failing > that provide a wget command to run which puts the source in a directory > and then SWIG should find this and build it when running configure > again. Maybe even not mention that it can be disabled in the configure > error message, as it should be dead easy to get it built and installed. > It isn't good enough if it isn't easy to build and install pcre as part > of the swig build and so disabling pcre support shouldn't be a good > option to take. I was just thinking about this some more with the Windows build in mind. Our cross compile for Windows will fail as we can't install prebuilt libcre dlls like we install it for linux. So how about automatically running wget (if available) to obtain the install package into a known location in the SWIG source tree and then configure and build it from source all automatically. I've seen this done in other projects and the solution might work for all operating systems, not just windows. William |
From: Vadim Z. <vz...@ze...> - 2010-07-23 20:40:26
|
On Fri, 23 Jul 2010 20:47:47 +0100 William S Fulton <ws...@fu...> wrote: WSF> Do you know how come with --without-pcre it still goes and looks for WSF> sed? Seems like a bug in the implementation of the disabling of it. Just WSF> wondering as not very important. This appears to be due to AC_REQUIRE([AC_PROG_SED]) in ax_path_generic.m4. As http://www.gnu.org/software/hello/manual/autoconf/Prerequisite-Macros.html explains, AC_REQUIRE(macro) expands "macro" _before_ the body of the macro in which AC_REQUIRE itself appears and it seems it's also pushed out of the containing AS_IF(), and the bottom sentence of the link above seems to imply that this is done intentionally although I don't really know why. Anyhow, I suggested adding a macro combining --without-foo option definition and xxx-config check to autoconf archive to the current maintainer of AX_PATH_GENERIC and I'll try to address this issue when doing it which should happen eventually although I don't know when exactly. WSF> I think if pcre is not installed, we need to be more helpful in getting WSF> it installed. For example list the ways to install it on debian, cygwin, WSF> redhat, suse, bsd, solaris and other common operating systems. Honestly, I think it's a battle lost in advance, you simply can't know how to install it everywhere. Presumably users of the given OS/environment know will know how to do it anyhow and if they don't, or can't, they can always use --without-pcre. WSF> Failing that provide a wget command to run which puts the source in a WSF> directory and then SWIG should find this and build it when running WSF> configure again. And then what if wget is not available (some systems only ship with curl out of the box and some, of course, don't come with either)? Once again, it's MHO only but I think it's hopeless to try to turn SWIG configure into a cross-platform installation program, there are special projects which try to do this and none of them is exactly simple (nor widespread presumably because people just prefer to use apt-get/pkg-get/fink/pkg-add/...). To summarize, I'm very sceptical about this, it's a great recipe for spending a lot of time for not much gain. The only modern system without sane package management is Windows and for it I'd guess 99% of SWIG users just use a precompiled binary anyhow and don't build it themselves. WSF> I was just thinking about this some more with the Windows build in mind. WSF> Our cross compile for Windows will fail as we can't install prebuilt WSF> libcre dlls like we install it for linux. The simple solution that I use for my own builds is to link it statically. It's much simpler this way and IMHO the advantages of dynamic linking are just not worth the trouble. Just notice that you must point configure to the correct pcre-config when cross-compiling using static library as otherwise you wouldn't get the -DPCRE_STATIC flag which is required. WSF> This is how to install required packages on Ubuntu: WSF> sudo apt-get install libpcre3-dev WSF> Can we please test different operating systems and reply back to this WSF> thread with the results. I'd like confidence in this before releasing. FWIW it's the same under Debian. Although obviously it wasn't always called libpcre3, I think it was libpcre2 in the previous Debian release. And it will become libpcre4 one day presumably... WSF> The documentation on advanced renaming is long overdue, thanks very much WSF> for this, it really is useful. What would really complete this would be WSF> to put the list of %rename predicates from swig.swg into the html docs WSF> with a description and perhaps a C++ example of the declarations (from a WSF> new testcase) that they operate on and to also mention %$not. Unfortunately some of them just plain don't work currently (including one of those I gave) so I'm not sure if it's worth doing this before fixing this. I don't know if you noticed my message but because of the problem in order of setting of the parent node pointer, many interesting things are actually impossible right now. And as usual when dealing with SWIG internals I'm quite thoroughly lost so I don't promise to fix this myself, at least not without (heavy) hints from you or somebody else. WSF> Also a few more regex examples will make the regex support more WSF> accessible, eg document the examples used in the test-suite, WSF> CHANGES.current and misc.c. I thought the examples I gave were representative but I'll try to reread the text again and maybe add some more, notably including %$not which probably does work and is indeed quite useful. WSF> Finally, thanks ever so much for your work (and Torsten's) in this area, WSF> polishing off regex support is long overdue. Thanks to Torsten for starting the ball rolling, I wouldn't think about working on this if I hadn't seen his patches and thought that it would be really great to have them in official SWIG. I also use SWIG relatively extensively (as you might have guesses by the amount of my questions ;-) and am glad to do something to contribute to the project which is so useful to me. Regards, VZ |
From: William S F. <ws...@fu...> - 2010-07-26 20:57:23
|
Vadim Zeitlin wrote: > On Fri, 23 Jul 2010 20:47:47 +0100 William S Fulton <ws...@fu...> wrote: > > WSF> Do you know how come with --without-pcre it still goes and looks for > WSF> sed? Seems like a bug in the implementation of the disabling of it. Just > WSF> wondering as not very important. > > This appears to be due to AC_REQUIRE([AC_PROG_SED]) in ax_path_generic.m4. > As http://www.gnu.org/software/hello/manual/autoconf/Prerequisite-Macros.html > explains, AC_REQUIRE(macro) expands "macro" _before_ the body of the macro > in which AC_REQUIRE itself appears and it seems it's also pushed out of the > containing AS_IF(), and the bottom sentence of the link above seems to > imply that this is done intentionally although I don't really know why. > > Anyhow, I suggested adding a macro combining --without-foo option > definition and xxx-config check to autoconf archive to the current > maintainer of AX_PATH_GENERIC and I'll try to address this issue when doing > it which should happen eventually although I don't know when exactly. > Okay, thanks. > > WSF> I think if pcre is not installed, we need to be more helpful in getting > WSF> it installed. For example list the ways to install it on debian, cygwin, > WSF> redhat, suse, bsd, solaris and other common operating systems. > > Honestly, I think it's a battle lost in advance, you simply can't know how > to install it everywhere. Presumably users of the given OS/environment know > will know how to do it anyhow and if they don't, or can't, they can always > use --without-pcre. > Well, in my experience users who's jobs are to install software don't know the first thing to do if it doesn't work straight off, especially in large companies that consume software where this is seen as a menial task outsourced to 3rd parties who get people with next to no experience in anything to do it. Rant over! It really will be better for everyone in the long run if it works straight off or if not has clear instructions as to what to do to make it work. > WSF> Failing that provide a wget command to run which puts the source in a > WSF> directory and then SWIG should find this and build it when running > WSF> configure again. > > And then what if wget is not available (some systems only ship with curl > out of the box and some, of course, don't come with either)? Once again, > it's MHO only but I think it's hopeless to try to turn SWIG configure into > a cross-platform installation program, there are special projects which try > to do this and none of them is exactly simple (nor widespread presumably > because people just prefer to use apt-get/pkg-get/fink/pkg-add/...). > > To summarize, I'm very sceptical about this, it's a great recipe for > spending a lot of time for not much gain. I appreciate this philosophy, but we need to put in the minimum amount for maximum gain. That should at least have a system whereby a notice points out that pcre was not found and the user is to download and put it in a certain folder and rerun configure so it can be built. At an absolute minimum, this needs doing so that I can make the Windows release and I don't see the task of implementing regex as anywhere near complete until I can do this. Are you able to modify Tools/mkwindows.sh to get this working again? I run Tools/mkrelease.py on Linux which calls mkwindows.sh when creating the release. I can have curl or wget or anything similar installed and working. > The only modern system without > sane package management is Windows and for it I'd guess 99% of SWIG users > just use a precompiled binary anyhow and don't build it themselves. > > WSF> I was just thinking about this some more with the Windows build in mind. > WSF> Our cross compile for Windows will fail as we can't install prebuilt > WSF> libcre dlls like we install it for linux. > > The simple solution that I use for my own builds is to link it statically. > It's much simpler this way and IMHO the advantages of dynamic linking are > just not worth the trouble. > > Just notice that you must point configure to the correct pcre-config when > cross-compiling using static library as otherwise you wouldn't get the > -DPCRE_STATIC flag which is required. > A static link for windows is preferable, it certainly will create fewer problems when users decide to copy the executable about the place. > WSF> The documentation on advanced renaming is long overdue, thanks very much > WSF> for this, it really is useful. What would really complete this would be > WSF> to put the list of %rename predicates from swig.swg into the html docs > WSF> with a description and perhaps a C++ example of the declarations (from a > WSF> new testcase) that they operate on and to also mention %$not. > > Unfortunately some of them just plain don't work currently (including one > of those I gave) so I'm not sure if it's worth doing this before fixing > this. I don't know if you noticed my message but because of the problem in > order of setting of the parent node pointer, many interesting things are > actually impossible right now. And as usual when dealing with SWIG > internals I'm quite thoroughly lost so I don't promise to fix this myself, > at least not without (heavy) hints from you or somebody else. > I got around to replying to that now. > WSF> Also a few more regex examples will make the regex support more > WSF> accessible, eg document the examples used in the test-suite, > WSF> CHANGES.current and misc.c. > > I thought the examples I gave were representative but I'll try to reread > the text again and maybe add some more, notably including %$not which > probably does work and is indeed quite useful. > There is just one 'regex:/pattern/subst/' example in the html docs and I was just thinking it would be good to have a few more, like the ones in the files I mentioned above. > > WSF> Finally, thanks ever so much for your work (and Torsten's) in this area, > WSF> polishing off regex support is long overdue. > > Thanks to Torsten for starting the ball rolling, I wouldn't think about > working on this if I hadn't seen his patches and thought that it would be > really great to have them in official SWIG. I also use SWIG relatively > extensively (as you might have guesses by the amount of my questions ;-) > and am glad to do something to contribute to the project which is so useful > to me. It's certainly a big feature for the release notes. William |
From: Vadim Z. <vz...@ze...> - 2010-08-14 14:34:41
|
On Mon, 26 Jul 2010 21:57:11 +0100 William S Fulton <ws...@fu...> wrote: WSF> > WSF> Also a few more regex examples will make the regex support more WSF> > WSF> accessible, eg document the examples used in the test-suite, WSF> > WSF> CHANGES.current and misc.c. WSF> > WSF> > I thought the examples I gave were representative but I'll try to reread WSF> > the text again and maybe add some more, notably including %$not which WSF> > probably does work and is indeed quite useful. WSF> > WSF> There is just one 'regex:/pattern/subst/' example in the html docs and I WSF> was just thinking it would be good to have a few more, like the ones in WSF> the files I mentioned above. I've added a couple of examples but they don't look very inspirational, even to me. The problem is that currently regex replacement is explained before the %rename limitation to some symbols only and I guess we should exchange these 2 sections as being able to use (the currently not yet described) regexmatch would allow to give more realistic and interesting examples. While doing this I also found an embarrassing bug in the regex code itself which I fixed. But while doing this I discovered something else that I hadn't realised before: apparently there can be only a single %rename attached to a declaration. So if you want to rename stuff globally and try to do e.g. %rename("%(regex/^Set(.*)/put\\1/)s") ""; %rename("%(regex/^Get(.*)/get\\1/)s") ""; it doesn't work as expected because the first rename is just lost, i.e. it's overwritten by the second one. This seems very unexpected to me, especially because it happens silently. Is this really intentional? Thanks, VZ |
From: William S F. <ws...@fu...> - 2010-08-18 20:23:34
|
Vadim Zeitlin wrote: > On Mon, 26 Jul 2010 21:57:11 +0100 William S Fulton <ws...@fu...> wrote: > > WSF> > WSF> Also a few more regex examples will make the regex support more > WSF> > WSF> accessible, eg document the examples used in the test-suite, > WSF> > WSF> CHANGES.current and misc.c. > WSF> > > WSF> > I thought the examples I gave were representative but I'll try to reread > WSF> > the text again and maybe add some more, notably including %$not which > WSF> > probably does work and is indeed quite useful. > WSF> > > WSF> There is just one 'regex:/pattern/subst/' example in the html docs and I > WSF> was just thinking it would be good to have a few more, like the ones in > WSF> the files I mentioned above. > > I've added a couple of examples but they don't look very inspirational, > even to me. The problem is that currently regex replacement is explained > before the %rename limitation to some symbols only and I guess we should > exchange these 2 sections as being able to use (the currently not yet > described) regexmatch would allow to give more realistic and interesting > examples. > > While doing this I also found an embarrassing bug in the regex code itself > which I fixed. But while doing this I discovered something else that I > hadn't realised before: apparently there can be only a single %rename > attached to a declaration. So if you want to rename stuff globally and try > to do e.g. > > %rename("%(regex/^Set(.*)/put\\1/)s") ""; > %rename("%(regex/^Get(.*)/get\\1/)s") ""; > For normal %rename one would never use "" as that would rename all symbols to the one identifier. But it looks like the regex or function call matching for %rename needs a bit more thought here. Perhaps we need a way to detect these function type matches by using somespecial syntax in the name, eg "*" instead of "": %rename("%(regex/^Set(.*)/put\\1/)s") "*"; %rename("%(regex/^Get(.*)/get\\1/)s") "*"; Just some ideas. William > it doesn't work as expected because the first rename is just lost, i.e. > it's overwritten by the second one. This seems very unexpected to me, > especially because it happens silently. Is this really intentional? > > Thanks, > VZ > > > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------------ > This SF.net email is sponsored by > > Make an app they can't live without > Enter the BlackBerry Developer Challenge > http://p.sf.net/sfu/RIM-dev2dev > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swig-devel mailing list > Swi...@li... > https://lists.sourceforge.net/lists/listinfo/swig-devel |
From: Vadim Z. <vz...@ze...> - 2010-08-18 22:05:16
|
On Wed, 18 Aug 2010 21:23:23 +0100 William S Fulton <ws...@fu...> wrote: WSF> Vadim Zeitlin wrote: WSF> > But while doing this I discovered something else that I hadn't WSF> > realised before: apparently there can be only a single %rename WSF> > attached to a declaration. So if you want to rename stuff globally WSF> > and try to do e.g. WSF> > WSF> > %rename("%(regex/^Set(.*)/put\\1/)s") ""; WSF> > %rename("%(regex/^Get(.*)/get\\1/)s") ""; WSF> > WSF> For normal %rename one would never use "" as that would rename all WSF> symbols to the one identifier. But it looks like the regex or function WSF> call matching for %rename needs a bit more thought here. Perhaps we need WSF> a way to detect these function type matches by using somespecial WSF> syntax in the name, eg "*" instead of "": WSF> WSF> %rename("%(regex/^Set(.*)/put\\1/)s") "*"; WSF> %rename("%(regex/^Get(.*)/get\\1/)s") "*"; Another possibility could be to use %rename("put\\1", regextarget=1) "^Set(.*)"; %rename("get\\1", regextarget=1) "^Get(.*)"; This is a bit ugly though as it separates the pattern and the replacement and, of course, doesn't work currently. I actually thought that this: %rename("%(regex:/^Set(.*)/put\\1/)s", regextarget=1) "^Set"; %rename("%(regex:/^Get(.*)/get\\1/)s", regextarget=1) "^Get"; would work but it doesn't neither. I don't know why yet but in any case I don't think duplicating the regex like this is a good idea. Concerning your possibility of using "*" instead of "" I don't really see what would it change? I can't say I really understand the code in naming.c but it seems that in this case the %renames are all simply put into the global rename_list in LIFO order. It looks to me that what really needs to change is the retrieval from this list: instead of taking just the first matching element, all of the matches should be tried in turn until one of them returns something non-empty. Or would doing this be wrong for some other reason? Thanks, VZ |
From: Vokuhila-Oliba <ka5...@on...> - 2010-08-19 04:10:09
|
Vadim Zeitlin wrote: > %rename("put\\1", regextarget=1) "^Set(.*)"; > %rename("get\\1", regextarget=1) "^Get(.*)"; > I didn't follow the 'regex' completely - so this might be slightly OT (and you might already have discussed it) But what about a special keyword for regex-renames like "rxrename": %rxrename("put(.+)", "Set$1"); %rxrename("get(.+)", "Get$1"); The syntax is very close to substitution regexes in Perl. And I think that looks very naturally and is very easy to read. |
From: Vokuhila-Oliba <ka5...@on...> - 2010-08-19 04:20:02
|
Vokuhila-Oliba wrote: > Vadim Zeitlin wrote: > >> %rename("put\\1", regextarget=1) "^Set(.*)"; >> %rename("get\\1", regextarget=1) "^Get(.*)"; >> >> > I didn't completely follow the 'regex' discussion - so this might be slightly OT > (and you might already have discussed it) > > But what about a special keyword for regex-renames like "rxrename": > > %rxrename("put(.+)", "Set$1"); > %rxrename("get(.+)", "Get$1"); > > The syntax is very close to substitution regexes in Perl. And I think that looks very naturally and is very easy to read. > > > ... and BTW - this allows more than one substitution and is still easy to read/understand as in: %rxrename("put(.+)foo(.+)", "Set$1Bar$2"); %rxrename("get(.+)foo(.+)", "Get$1Bar$2"); |
From: William S F. <ws...@fu...> - 2010-08-19 18:36:06
|
Vadim Zeitlin wrote: > On Wed, 18 Aug 2010 21:23:23 +0100 William S Fulton <ws...@fu...> wrote: > > WSF> Vadim Zeitlin wrote: > WSF> > But while doing this I discovered something else that I hadn't > WSF> > realised before: apparently there can be only a single %rename > WSF> > attached to a declaration. So if you want to rename stuff globally > WSF> > and try to do e.g. > WSF> > > WSF> > %rename("%(regex/^Set(.*)/put\\1/)s") ""; > WSF> > %rename("%(regex/^Get(.*)/get\\1/)s") ""; > WSF> > > WSF> For normal %rename one would never use "" as that would rename all > WSF> symbols to the one identifier. But it looks like the regex or function > WSF> call matching for %rename needs a bit more thought here. Perhaps we need > WSF> a way to detect these function type matches by using somespecial > WSF> syntax in the name, eg "*" instead of "": > WSF> > WSF> %rename("%(regex/^Set(.*)/put\\1/)s") "*"; > WSF> %rename("%(regex/^Get(.*)/get\\1/)s") "*"; > > Another possibility could be to use > > %rename("put\\1", regextarget=1) "^Set(.*)"; > %rename("get\\1", regextarget=1) "^Get(.*)"; > > This is a bit ugly though as it separates the pattern and the replacement > and, of course, doesn't work currently. > > I actually thought that this: > > %rename("%(regex:/^Set(.*)/put\\1/)s", regextarget=1) "^Set"; > %rename("%(regex:/^Get(.*)/get\\1/)s", regextarget=1) "^Get"; > > would work but it doesn't neither. I don't know why yet but in any case I > don't think duplicating the regex like this is a good idea. > > Concerning your possibility of using "*" instead of "" I don't really see > what would it change? I can't say I really understand the code in naming.c > but it seems that in this case the %renames are all simply put into the > global rename_list in LIFO order. It looks to me that what really needs to > change is the retrieval from this list: instead of taking just the first > matching element, all of the matches should be tried in turn until one of > them returns something non-empty. Or would doing this be wrong for some > other reason? There must be some iteration through the list somewhere in the code as there is priority as to what is a better match, eg: struct Foo { void y(); }; void y(); The following %rename(Y1) y; will match Foo::y and ::y, but %rename(Y2) Foo::y; Will match only Foo::y and will have precedence over the first %rename. Also %rename(Y3) Foo::y(); will have a higher precedence than all of the aobve. There is also a wildcard rule documented in http://www.swig.org/Doc2.0/SWIGPlus.html#SWIGPlus_ambiguity_resolution_renaming so that: %rename(Y4) *::y; will match all methods called Y in all classes. I don't recall the priority of this three valid matches. Anyway, given that there is a priority lookup, surely we can extend this to the regular expressions, either by detecting the regular expression in the new name, or by some new syntax like "*". Probably better to just detect whenever a the %rename contains a printf like formatter, like: %rename("myprefix_%s") ""; or the more advanced ones using regex. William |
From: Vadim Z. <vz...@ze...> - 2010-11-12 13:45:10
|
On Thu, 19 Aug 2010 19:28:15 +0100 William S Fulton <ws...@fu...> wrote: WSF> > WSF> Vadim Zeitlin wrote: WSF> > WSF> > But while doing this I discovered something else that I hadn't WSF> > WSF> > realised before: apparently there can be only a single %rename WSF> > WSF> > attached to a declaration. So if you want to rename stuff globally WSF> > WSF> > and try to do e.g. WSF> > WSF> > WSF> > WSF> > %rename("%(regex/^Set(.*)/put\\1/)s") ""; WSF> > WSF> > %rename("%(regex/^Get(.*)/get\\1/)s") ""; ... WSF> There must be some iteration through the list somewhere in the code as WSF> there is priority as to what is a better match You were right, of course, there is already an iteration over all rename_list elements in Swig_name_nameobj_lget() in naming.c. And the problem lies in this function too: it currently assumes that if there is no "target name", then the %rename always matches. This is, however, not true in the above case as a %(regex)s expansion may fail. So I'd like to propose the following patch: --- a/Source/Swig/naming.c +++ b/Source/Swig/naming.c @@ -1302,7 +1302,13 @@ Hash *Swig_name_nameobj_lget(List *namelist, Node *n, String *prefix, String *na : Swig_name_match_value(tname, sname); Delete(sname); } else { - match = 1; + /* Applying the renaming rule may fail if it contains a %(regex)s expression that doesn't match the given name. */ + String *sname = NewStringf(Getattr(rn, "name"), name); + if (sname) { + if (Len(sname)) + match = 1; + Delete(sname); + } } } if (match) { It makes the above example work for me and, AFAICS, shouldn't break anything else. Do you have any objections against committing this? Thanks, VZ P.S. Thanks a lot for adding swigprint gdb command, it made debugging this incomparably easier. |
From: Torsten L. <t.l...@gm...> - 2010-07-27 18:09:42
|
Hi Vadim, On Fri, 2010-07-23 at 22:40 +0200, Vadim Zeitlin wrote: > I also use SWIG relatively > extensively (as you might have guesses by the amount of my > questions ;-) > and am glad to do something to contribute to the project which is so > useful to me. BTW: I see you are also very active on wx-dev. Did you ever try to build wxPython with current SWIG by chance? I'd really like to use wx 2.9 for development at work, as 2.8 is quite old already and almost everything I add manually to 2.8 is already available in 2.9. Any insights? Thanks, Torsten |
From: Vadim Z. <vz...@ze...> - 2010-06-23 10:29:29
|
On Mon, 21 Jun 2010 19:28:30 +0100 William S Fulton <ws...@fu...> wrote: WSF> SWIG_LIB_PCRE in patch (5) - couldn't this be named more generically and WSF> submitted to the autoconf archive? I looked into doing this and it turns out that there is already an AX_PATH_GENERIC() macro in the archive which is supposed to test for any library with "library-config" script. So it should have been really easy to test for PCRE with it. Unfortunately the macro seems to be broken. I posted a question about it and I'd prefer to wait for a reply before starting reimplementing it. Anyhow, I think we can switch to using AX_LIB_CURL() later, using SWIG_LIB_PCRE() should work for now and IMHO isn't critical for committing these patches, do you agree? I'd like to change just one thing before committing: currently configure defines HAVE_PCRE_H in swigconfig.h which seems strange to me, as it's not just the header file that we need but the library itself. So I'd like to rename this to just HAVE_PCRE. Any objections to doing this? Thanks, VZ |