Activity for Jörg Höhle

  • Jörg Höhle committed [208836]

    CLISP_UNICODE is always defined in modules, so use #if, not #ifdef.
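The difference this commit message relies on can be sketched in a few lines of C. DEMO_UNICODE is a hypothetical stand-in, and the assumption that such a macro is defined as 0 or 1 is mine; the commit only says the macro is always defined.

```c
/* Hypothetical feature macro that is always defined, here with value 0. */
#define DEMO_UNICODE 0

/* #ifdef only asks whether the name is defined, so this returns 1
   even though the feature is switched off. */
int feature_seen_by_ifdef(void) {
#ifdef DEMO_UNICODE
    return 1;
#else
    return 0;
#endif
}

/* #if evaluates the macro's value, so this returns 0 for DEMO_UNICODE 0. */
int feature_seen_by_if(void) {
#if DEMO_UNICODE
    return 1;
#else
    return 0;
#endif
}
```

With a macro that is always defined, only the #if form can tell the enabled and disabled configurations apart.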

  • Jörg Höhle committed [93e6e8]

    regexp: Compute character string index based on the current O(misc_encoding).

  • Jörg Höhle posted a comment on ticket #667

    Quite on the contrary, my current thoughts are strongly influenced by "how can we immediately bind with the correct values (not wasting time for other initializations)?" I'll write down my proposal ASAP, but have no time now. But the original bug report, unlike bug #375, is about FINALLY, not initialisation. In clisp, currently both are mixed up, because the order of FOR clauses currently influences the stepping in weird and unforeseen ways.

  • Jörg Höhle posted a comment on ticket #723

    I can't remember any LISP (even predating Common Lisp) where assoc was implemented as the paper describes. In papers, one often ignores \bottom and that some functions are partial. At the implementation level, one wants to distinguish the case "not found" -> NIL from the case "found key, value NIL". For decades, the obvious solution has been for ASSOC to return the CONS whose CAR is the found key. Decades later (i.e. in Java times), people would devise the API to signal an error when the key is not...

  • Jörg Höhle posted a comment on ticket #375

    Reverting was TRT. The old clisp code already has all the pieces to evaluate initialisation forms in an outer environment. It just needs to assemble the pieces differently, in some (yet unknown exactly) circumstances. The difference can be seen in (macroexpand-1'(loop for x across X for y across Y)) and (macroexpand-1'(loop repeat (random 2) repeat (random 5))) The first clause nicely uses LET, the second introduces SETQ and completely changes the scope.

  • Jörg Höhle posted a comment on ticket #375

    Regarding for-as-equals-then and section 6.1.1.4, I'm more and more convinced that there should be an amendment to ANSI-CL. It should distinguish for-as-equals-then from for-as-equals! for-as-equals: Clearly, the environment must encompass all variables, as the code ("form1") can refer to values from a previous iteration (depending on the order of for-clauses). for-as-equals-then: Here the change to ANSI-CL wording should be that only form2 is required to be evaluated in a lexical environment encompassing...

  • Jörg Höhle posted a comment on ticket #720

    The best I've come up with so far is (multiple-value-bind (run args) (cmd-args) (let ((se (socket:socket-server))) ;; Sleep between socket-connect and close prevents (:APPEND "foo" :APPEND T ...) most of the time (on MacOS) (ext:run-program run :arguments (append args (list "-q" "-q" "-x" (format nil "(close (progn (socket:socket-connect ~D) (sleep .01s0)))" (socket:socket-server-port se)))) :wait nil :input nil :output nil) (unwind-protect (with-open-stream (so (socket:socket-accept se)) (list (check-os-error...

  • Jörg Höhle committed [bbe38f]

    Fix --without-unicode tests.

  • Jörg Höhle posted a comment on ticket #722

    With the attached patch, make check-tests check-sacla-tests passes (minus bug #720). I should submit a bug report to SF because their preview does not always match the final outcome...

  • Jörg Höhle committed [14ce05]

    (destructure-type): In LOOP destructuring, type NIL originally meant: "Skip type declaration for this variable" (patch#53).

  • Jörg Höhle modified ticket #722

    build failures --without-unicode

  • Jörg Höhle created ticket #722

    build failures --without-unicode

  • Jörg Höhle modified a comment on ticket #691

    I believe pcre can cope with NUL characters This is completely unimportant. YMMV. That TRE library explicitly mentions support for NUL on its title page (section "binary pattern and data support" in README.md).

  • Jörg Höhle created ticket #53

    Handle NIL as a type in LOOP as it once was intended

  • Jörg Höhle posted a comment on ticket #667

    There clearly are lots of diverging opinions about values within FINALLY. I just came across Robert Strandh's A modern implementation of the LOOP macro (loop for i from 0 below 10 sum i finally (print i)) "we have an example of a non-conforming implementation as explained in section 1.5.1. The reason is that the standard clearly stipulates that every implementation must print 9, whereas MIT loop prints 10." In his eyes, it must look like one bogus implementation (MIT loop) nevertheless became successful...

  • Jörg Höhle posted a comment on ticket #691

    I believe pcre can cope with NUL characters This is completely unimportant. YMMV. That TRE library explicitly mentions support for NUL on its title page (section "binary pattern and data support" in README.md).

  • Jörg Höhle posted a comment on ticket #691

    Both patches are in the repository. Shall we close this bug report, or wait until a further patch enables compilation in a build --without-unicode? (Note that it didn't compile previously either, so it's arguably a different bug). Also, perhaps I should have added a testcase, however there are already tests that fail (IIRC ext-clisp.tst) in --without-unicode because of embedded non-ASCII character literals, and #-UNICODE "..." is no help.

  • Jörg Höhle committed [0e756e]

    regexp: Use UTF-8 to communicate with gnulib/regexp, but keep O(misc_encoding) for locale-dependent error messages.

  • Jörg Höhle committed [e46fe3]

    regexp: Compute character string index based on the current O(misc_encoding).

  • Jörg Höhle posted a comment on ticket #691

    Please apply this patch (#2). It forces UTF-8 upon the string, because that's what gnulib's regexp recognizes (it has special code for it and handles multibyte UTF-8 correctly). For the error message from regerror(), it still uses O(misc_encoding), because for the time being, one can see it as the user overridable locale_encoding. So we can postpone the debate upon whether to introduce O(locale_encoding). The regexp module still does not support --without-unicode.
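For readers unfamiliar with the error path mentioned here, the following is a minimal sketch of how POSIX regcomp() and regerror() fit together. The helper name compile_or_explain is hypothetical and this is not the code of the clisp regexp module; the point is only that the message text comes from the C library, in whatever encoding the current locale implies.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile an extended regular expression and, on
   failure, copy the library's error message into buf.
   Returns the regcomp() error code, 0 on success. */
int compile_or_explain(const char *pattern, char *buf, size_t buf_size) {
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0) {
        /* The message is produced by the C library for the current locale. */
        regerror(rc, &re, buf, buf_size);
        return rc;
    }
    if (buf_size > 0)
        buf[0] = '\0';   /* no error: leave an empty message */
    regfree(&re);
    return 0;
}
```

A caller that hands the message back to Lisp has to decode buf with some encoding, which is where the choice between misc_encoding and a locale encoding discussed above comes in.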

  • Jörg Höhle posted a comment on ticket #691

    Please apply this patch (#1). It fixes the initial bug report in all configurations compiled with ENABLE_UNICODE (presumably >95% these days). The code (past and present) doesn't work in a build --without-unicode.

  • Jörg Höhle committed [760a66]

    CLISP_UNICODE is always defined in modules, so use #if, not #ifdef.

  • Jörg Höhle posted a comment on ticket #720

    For instance, with two interactive shells, (close (socket-accept #)) causes the first shell to see :APPEND from a following socket-status (then EPIPE 32 from write-line). IOW, among the expected result list, even the first :OUTPUT is a bogus expectation. There's too much concurrency. I don't know whether (close (prog1 (socket-connect d) (sleep 1))) would help, that's still heuristic.

  • Jörg Höhle created ticket #720

    Flaky test on MacOS: socket-server with run-program

  • Jörg Höhle posted a comment on ticket #691

    Here's a patch to compute the Lisp-side string index on the way back, according to the encoding used previously. My feeling is that it's only step 1, the second one being: force UTF-8 as encoding, because that's the only one that gnulib regexp knows about, and the third would possibly export something like locale_encoding, instead of misc_encoding, because that's actually what gnulib's regex considers (searching for "UTF[-]8" in the locale setting). Bruno, I don't understand your reference to libunistring...
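As an illustration of the index conversion described in this comment, here is a minimal sketch that maps a byte offset, as a C regexp library reports it, to a character index. It assumes the buffer is valid UTF-8; the function name utf8_char_index is hypothetical and this is not the patch itself, which works with whatever encoding was used for the conversion.

```c
#include <stddef.h>

/* Count the characters (code points) contained in the first byte_offset
   bytes of a valid UTF-8 string. Continuation bytes have the bit pattern
   10xxxxxx and do not start a new character, so they are skipped. */
size_t utf8_char_index(const char *s, size_t byte_offset) {
    size_t chars = 0;
    for (size_t i = 0; i < byte_offset; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the UTF-8 bytes of "Jörg", a match ending at byte offset 3 (just after the two-byte "ö") corresponds to character index 2 on the Lisp side.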

  • Jörg Höhle posted a comment on ticket #375

    6.1.1.4 Expanding Loop Forms http://clhs.lisp.se/Body/06_aad.htm "the form1 and form2 in a for-as-equals-then form includes the lexical environment of all the loop variables." Hence the second x in for x = x clearly cannot refer to an outside environment, in ANSI-CL. This exception causes for-as-equals-then to be a strange citizen, IMHO. How can you maintain global left to right evaluation when the lexical environments differ?

  • Jörg Höhle modified a comment on ticket #375

    note that (loop for x = x then (1+ x)) still does not work Maybe clisp should bail out in all such cases, instead of having half the initialisation forms evaluated in the outside environment and the other half in an environment encompassing all loop variables? The bug is about this CLHS issue: Issue LOOP-INITFORM-ENVIRONMENT: VAGUE http://clhs.lisp.se/Issues/iss222.htm Please update doc/impissue.xml if clisp decides to follow the (reportedly much debated) recommendation. I think it's actually a very...

  • Jörg Höhle posted a comment on ticket #375

    note that (loop for x = x then (1+ x)) still does not work Maybe clisp should bail out in all such cases, instead of having half the initialisation forms evaluated in the outside environment and the other half in an environment encompassing all loop variables? The bug is about this CLHS issue: Issue LOOP-INITFORM-ENVIRONMENT: VAGUE http://clhs.lisp.se/Issues/iss222.htm Please update doc/impissue.xml if clisp decides to follow the (reportedly much debated) recommendation. I think it's actually a very...

  • Jörg Höhle posted a comment on ticket #667

    So if I concede that end-test and stepping can be distinct, what about the CLtL2 example? (loop for i from 1 to 10 thereis (> i 11) finally (print i)) ; 11 CLHS specifically defines the behaviour of the arithmetic FOR/AS: 6.1.2.1.1. http://clhs.lisp.se/Body/06_abaa.htm "That is, iteration continues until the value var is stepped to the exclusive or inclusive limit supplied by form2." IOW, the variable is first stepped, then the end test is performed. So the value is well-defined within the epilogue,...

  • Jörg Höhle posted a comment on ticket #667

    The use of loop variables in the epilogue is IMHO not the problem. It is legal code, because IMHO the scope of all variables encompasses the epilogue, witness numerous examples using RETURN-FROM. So what's the problem? The exact value of i. (loop for c from 0 to 1 for i on '(1 2 3 4 5) finally (return i)) => (3 4 5) ; wrong IMHO The relevant piece is 6.1.2.1 Iteration Control http://clhs.lisp.se/Body/06_aba.htm "If multiple iteration clauses are used to control iteration, variable initialization...

  • Jörg Höhle posted a comment on ticket #691

    Pascal is right, I'm rusty and forgot that no naïve regexp handles multibyte correctly, e.g. .{n,m} might stop amid multibyte sequences. However, I recall some regexp APIs do have a multibyte flag. pcre: I'm surprised that cpcre.c doesn't set the flag pcre_utf8 automatically, since it unconditionally uses utf_8 as encoding. Setting that flag would put us back on the safe side. PCRE:PCRE-NAME-TO-INDEX side note: It's illogical and buggy that pcre_get_stringnumber gets called with misc_encoding,...

  • Jörg Höhle posted a comment on ticket #691

    "wide chars" is no solution if we mean the same thing. IIRC regexp could be compiled with something like 16 bit wide characters. That changes nothing (1:2 instead of 1:1) and would even waste space when given UTF-8. Using UTF-16 or UCS-2 doesn't appeal as a solution either, because not every Unicode character fits into 16 bits. I forgot whether CLISP uses UCS-2 internally for its characters and strings. If yes, compiling regexp.c with a 16bit character type would indeed be a good match. Otherwise, as I said...
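The 16-bit limitation mentioned in this comment can be made concrete with a small sketch of standard UTF-16 arithmetic. The function names are hypothetical and nothing here is taken from clisp or from the regexp library.

```c
#include <stdint.h>

/* Number of 16-bit code units UTF-16 needs for one Unicode code point:
   code points above U+FFFF do not fit into a single unit. */
int utf16_units(uint32_t code_point) {
    return code_point > 0xFFFF ? 2 : 1;
}

/* Split a supplementary code point (U+10000..U+10FFFF) into its high and
   low surrogate. The caller must not pass code points below U+10000. */
void utf16_surrogates(uint32_t code_point, uint16_t *high, uint16_t *low) {
    uint32_t v = code_point - 0x10000;        /* 20 significant bits */
    *high = (uint16_t)(0xD800 + (v >> 10));   /* upper 10 bits */
    *low  = (uint16_t)(0xDC00 + (v & 0x3FF)); /* lower 10 bits */
}
```

U+1F600, for example, becomes the pair D83D DE00, so a 16-bit character type still does not give a 1:1 mapping between characters and units, and UCS-2 cannot represent such a character at all.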

  • Jörg Höhle posted a comment on ticket #691

    Citizen X buys a cheap house near an airport. It's cheap, because nobody wants to live & sleep near an airport. Then he complains about the noise at night. In some cases, the jurisdiction is such that indeed, the airport is condemned to restrict flights at night. Here, instead of implementing a regexp in Lisp (e.g. cl-pcre) which would not suffer from that bug, a poorly advised idea was to interface to an external C library, which, as was clear from day 1, would lead to problems outside the 1:1 area....

  • Jörg Höhle modified a comment on ticket #53

    because these directives cannot be buffered in a string I don't think it's a good idea, for the exact same "within-stream" vs "out of band" argument. Suppose one performs a refactoring of the form (format "...~=" #) to (format "Query: ~A" (format nil #)). An explicit FINISH-OUTPUT is clearly visible and must come last. A ~= to a string output stream would be a NOP, and the functionality would silently and subtly get lost! Well, perhaps an argument along "makes it too easy to subtly break one's SW"...

  • Jörg Höhle posted a comment on ticket #53

    because these directives cannot be buffered in a string I don't think it's a good idea, for the exact same "within-stream" vs "out of band" argument. Suppose one performs a refactoring of the form (format "...~=" #) to (format "Query: ~A" (format nil #)). An explicit FINISH-OUTPUT is clearly visible and must come last. A ~= to a string output stream would be a NOP, and the functionality would silently and subtly get lost! Well, perhaps an argument along "makes it too easy to subtly break one's SW"...
