From: Daniel J. <dan...@gm...> - 2016-08-31 11:57:21
First, a word on GSoC: I'm of course sad that I didn't make it through the final evaluation, but it was definitely the right decision. I had not achieved the main goal (making a release, updating gnulib) and thus would have decided the same way (there was a checkbox asking for my own impression of the project status, where I noted exactly this myself, too).

> BUFSIZ is a pretty standard constant for all string buffers.

Eh, can you support that? It's a standard constant for STREAM buffers. This is even defined by the C (C99) standard. Using it for anything that's not directly related to a stream thus seems wrong to me. The only thing that might be interpreted as "standard for [..] string buffers" is this quote from the glibc documentation:

> Sometimes people also use BUFSIZ as the allocation size of buffers
> used for related purposes, such as strings used to receive a line of
> input with fgets (see Character Input). There is no particular
> reason to use BUFSIZ for this instead of any other integer, except
> that it might lead to doing I/O in chunks of an efficient size.

This, too, does not say whether BUFSIZ will be small enough to be put on the stack, though. Moreover, it's just the documentation of a single libc; there might be systems that have a huge BUFSIZ but only provide limited stack space.

> Let us revisit this issue at a later date.
> I think the with_string_0 mechanism is good enough.
> If disagree, you will have to argue for it to be changed pervasively
> throughout CLISP.

with_string_0 is not involved here. I'm concerned by this (from regexi.c):

    begin_system_call();
    ret = (regmatch_t*)alloca((re->re_nsub+1)*sizeof(regmatch_t));
    end_system_call();

re->re_nsub is the number of subexpressions. If the regex can be influenced in any way by a malicious actor (e.g. via a POST parameter for a search field), then that actor could pass a regex with lots of subexpressions, causing the above alloca to produce a stack overflow (in the best case).
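To make the concern concrete, here is a minimal sketch of what a heap-based variant could look like. It is not the code in regexi.c: the function names alloc_matches and exec_with_heap_matches are hypothetical, the CLISP-specific begin_system_call()/end_system_call() wrappers are left out, and only the POSIX regex API is assumed.

```c
#include <regex.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of CLISP): allocate the match array on the
   heap instead of the stack. Returns NULL if re_nsub+1 wraps around, if the
   byte count would overflow size_t, or if malloc fails. */
regmatch_t *alloc_matches(const regex_t *re)
{
    size_t count = re->re_nsub + 1;   /* slot 0 holds the whole match */
    if (count == 0 || count > SIZE_MAX / sizeof(regmatch_t))
        return NULL;
    return malloc(count * sizeof(regmatch_t));
}

/* Hypothetical wrapper: run regexec() with heap-allocated match storage.
   Returns 0 on a match, REG_NOMATCH on no match, REG_ESPACE if the
   allocation failed. On 0, *out owns the array and the caller must free()
   it; on any other result *out is NULL. */
int exec_with_heap_matches(const regex_t *re, const char *s, regmatch_t **out)
{
    regmatch_t *m = alloc_matches(re);
    int status;

    *out = NULL;
    if (m == NULL)
        return REG_ESPACE;
    status = regexec(re, s, re->re_nsub + 1, m, 0);
    if (status != 0) {
        free(m);
        return status;
    }
    *out = m;
    return 0;
}
```

With this shape, a regex with an absurd number of subexpressions leads to a failed allocation that can be reported as an error, rather than to an unchecked stack pointer adjustment. Whether CLISP would prefer malloc, its own allocator, or a size cap in front of alloca is a separate design question.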
> This is why we don't run gnulib-tool in that directory!
> We only ever run it in src.

Hm, I'm afraid that this is not a good idea, at least not a scalable one. Let me explain:

We have modules because we don't want to have their code in core CLISP, and we want to be able to provide modules (or let the user provide them) to extend CLISP at will, even at runtime with dynamic loading. And adding a new module should not require changes to core CLISP, right? A user should be able to write some module and let clisp-link from the installed CLISP do its magic.

Now assume a CLISP module needs (for example) access to some_function, but core CLISP does not. The gnulib module the_module provides some_function on systems where it's not available. Should we add the_module to core CLISP? I think not, because that would bloat core CLISP (and we'd need special linker flags to actually have it in the resulting binary). Thus we add it to the CLISP module, and everything's fine.

Until we extend the CLISP module and it suddenly needs another_function. Core CLISP happens to need this function, too, so the gnulib module another_module which provides it is already included in CLISP. Now we could just use that and be done with it. Until we need another version of it (newer, older, using xalloc-die instead of xalloc, ...). Or another_module and the_module both depend on a gnulib module lowlevel_thing; another_module only works with lowlevel_thing from a year ago, but the_module needs a recent one.

My point is: gnulib is not designed to be something "shareable" across projects (and core CLISP and a module are basically that: two separate projects). Thus, IMO, the correct approach to using gnulib is to have one gnulib checkout for core CLISP, containing only the gnulib modules needed by core CLISP, and to let each CLISP module that wants or needs gnulib maintain its own gnulib checkout with only the gnulib modules needed by that particular CLISP module.
This adds some bloat when functionality overlaps (rawsock and socket.d, and basic stuff like file IO), but probably only on "gnulib intensive" platforms (Windows?): https://sourceforge.net/p/clisp/bugs/634/

Also, I think complaining about a missing libgnu.so won't help. That's just not the way gnulib is supposed to be used.

> what was the problem [updating gnulib for core CLISP]?

I'm not entirely sure. Makefile.devel tried to update (?) something (configure?) for all modules first, but this failed for all modules. IMO, updating gnulib should, in the end, be no more effort than running gnulib-tool --update in the correct directories, that is, the top-level directory and every module directory that has its own gnulib checkout. We could then put that into a script or into Makefile.devel.

> Fine. This means that the change necessary for a release is actually
> quite small:
> [snip]
> Right?

Unfortunately it's not that easy: rawsock fails to build on Windows (MinGW) because the gnulib code (from core CLISP) is too old and makes a (now) false assumption about the internals of the MinGW header files. Thus, the necessary changes are:

1) Either update gnulib for core CLISP, or give rawsock its own gnulib (that's what I did and, due to the reasoning above, what I'd argue for)
2) Remove the Windows-specific code from rawsock.c: the typedef, the includes of Windows headers, and the parts guarded by #if defined(WIN32_NATIVE)

At this point it will compile, but with reduced functionality, because gnulib has what is IMHO a design flaw: if there's no (for example) netdb.h, then the corresponding gnulib module provides one. But it does not #define HAVE_NETDB_H, so our code would not use it (because of the source parts conditional on #if defined(HAVE_NETDB_H)). Therefore we need to either remove the #if defined(..) stuff or (better) find a way to determine whether gnulib provides a replacement or not.
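As a rough illustration of the second variant, the guard could test an additional macro that says "a gnulib replacement header is present". The macro RAWSOCK_GNULIB_NETDB and the functions below are hypothetical and do not exist in rawsock.c or gnulib; in practice the module's configure script would have to define the macro, whereas here it is set by hand so the sketch compiles on a system that has netdb.h.

```c
#include <stddef.h>

/* Hypothetical macro: would be defined by the module's configure script when
   the gnulib netdb module is in use. Defaulted to 1 here only so that the
   sketch compiles on its own. */
#ifndef RAWSOCK_GNULIB_NETDB
# define RAWSOCK_GNULIB_NETDB 1
#endif

/* Use netdb.h if the system has it OR if gnulib supplies a replacement. */
#if defined(HAVE_NETDB_H) || RAWSOCK_GNULIB_NETDB
# include <netdb.h>
# define RAWSOCK_HAVE_NETDB 1
#else
# define RAWSOCK_HAVE_NETDB 0
#endif

/* Reports whether the netdb-dependent code paths were compiled in. */
int rawsock_has_netdb(void)
{
    return RAWSOCK_HAVE_NETDB;
}

/* Returns the protocol number for a name such as "tcp", or -1 if the name is
   unknown or netdb support was compiled out. */
int rawsock_proto_number(const char *name)
{
#if RAWSOCK_HAVE_NETDB
    struct protoent *pe = getprotobyname(name);
    return pe != NULL ? pe->p_proto : -1;
#else
    (void)name;
    return -1;
#endif
}
```

The other variant, dropping the guard and including netdb.h unconditionally, would reduce this to a plain #include, at the cost of failing to compile wherever neither the system nor gnulib provides the header.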
The first option is straightforward, but it will lead to issues on platforms that are not supported by gnulib or where gnulib does not provide a replacement (should we target them again).

> PLAN:
>
> -1- fix rawsock on windows and make a release (2.50)
> 2/3 *.d --> *.c rename
> 2/3 switch to autotools (dropping generated files)
> -4- update gnulib
> -5- your proposed regexp changes
> -6- release (3.0)

Given all of the above, I think the first and most important thing to do is to find a consensus on how we use gnulib, and then update it (otherwise rawsock will not work).

> Okay, so you want to go the way of Emacs - the developers have to
> install autotools and the generated files are excluded from VCS.
> Fine.
> Let us do that after the release.
>
> Note, however, that you should use "hg mv" for configure.in -->
> configure.am transition and make changes to configure.am only after
> committing the "mv" operation (same for _all_ renaming).

Ok :)