
#122 serious problem with fillParagraph and comments

9.0b3
closed
None
Bug
AlphaTcl
critical
2017-05-03
2017-04-08
No

There is a serious problem with fillParagraph in connection with comments: if the paragraph filled is
preceded immediately by a commented block of text, sometimes the filling messes up the two blocks,
either resulting in some commented text being part of the non-commented text, or some
non-commented text ending up inside the comment.

Here is a rather small example (must be in TeX mode):

% The ojiijiji
% defines in this way a
that is, $R_\bullet:\Deltagen\op\to\Grpd$.
The pullback construction

Place the insertion point on the third line and do Cmd-I.

Curiously enough, the following example in Text mode has no problem:

The ojiijiji
defines in this way a
that is, $R_\bullet:\Deltagen\op\to\Grpd$.
The pullback construction

I promise to look into this, hopefully over Easter. At the moment I just want to warn everybody.

Discussion

  • Joachim Kock

    Joachim Kock - 2017-04-16

    Hi Bernard,

    I have nailed down the fillParagraph problem.
    It is a difference in how [search] matches in
    AlphaX and in AlphaCocoa:

    set startPara "(^|[^\])%"
    search -w $win -n -f 0 -r 1 -- "$startPara" $pos

    In AlphaX the ^ takes precedence so that if there
    is a leading % on a line then only that % is matched.
    In AlphaCocoa, a longer match is found, namely
    including the \n before the %. That \n matches the
    character range in the square bracket, which is
    alternative to ^.

    This has nothing to do with the fact that we are
    searching backwards (usually a good source of confusion).
    The discrepancy is exactly the same when searching
    forward.

    I think the best convention is the one of AlphaCocoa,
    because it conforms to the regular-expression rule that
    between branches separated by | the longest match
    should take precedence

    (as I learned many years ago from this very nice book:
    https://www.amazon.fr/Introduction-expressions-r%C3%A9guli%C3%A8res-Bernard-Desgraupes/dp/2711786803
    )

    If you agree, I will change the [^\] to [^\\r\n]
    in the definition of TeX::startPara (as well as TeX::endPara).
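
A minimal sketch of the effect of the proposed change, using Tcl's own [regexp] rather than Alpha's [search] (which is a different engine, so this is only an analogy). The `-lineanchor` switch is assumed here to stand in for a search where ^ matches after a newline while bracket expressions may still match the newline itself. The sample text and variable names are made up for illustration. Note that `regexp -indices` reports an inclusive end index.

```tcl
# A text line followed directly by a commented line.
set text "intro\n% comment"

# Original branch: any character except a backslash (this includes \n).
set oldPat {(^|[^\\])%}
# Proposed branch: additionally exclude \r and \n.
set newPat {(^|[^\\\r\n])%}

# -lineanchor: ^ also matches just after a newline;
# bracket expressions are left unchanged and can still match \n.
regexp -lineanchor -indices -- $oldPat $text oldMatch
regexp -lineanchor -indices -- $newPat $text newMatch

puts "old pattern: $oldMatch"   ;# 5 6 (the match swallows the \n before %)
puts "new pattern: $newMatch"   ;# 6 6 (only the % itself is matched)
```

With the old pattern the leftmost possible match starts at the newline, so the reported paragraph boundary lands one character too early. Excluding \r and \n from the bracket expression leaves the ^ branch as the only way to match a % at the start of a line.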

    I searched AlphaTcl for the pattern ^| to see if there were
    other obvious potential victims of the AlphaX convention.
    There are some similar patterns in [quote::Undisplay] and in
    [TeX::convertDoubleDollarSigns] and friends, but they are
    used in regexps and regsubs, not in searches, and they
    are not part of the patterns actually being substituted,
    so I think it is best to leave those expressions as they are.

    Cheers,
    Joachim.

     
  • Joachim Kock

    Joachim Kock - 2017-04-16

    On 16/04/2017 13:33, Bernard Desgraupes wrote:

    Could this have to do with the ?m flag option which is automatically
    set by the search command (see the -ml option to disable it) ?

    Yes! That's it.

    I was not aware that [search] had this flag set
    automatically.

    So the issue has nothing to do with longest matches.

    Then perhaps it is AlphaX who is right!

    In any case the confusion is that

    • the ?m flag in Tcl regexp does TWO things:
      (1) it lets ^ and $ match at line boundaries (just after and just before \n)
      (2) it prevents . and [^a-z] from matching \n

    • the -ml flag in AlphaCocoa search does ONE thing:
      (1) it lets ^ and $ match at line boundaries (just after and just before \n)

    • AlphaX does not have such a flag for search,
      but its behaviour agrees with Tcl regexp with
      the ?m flag.

    As an illustration, try a file with two lines

    a
    b

    and do this from the status line:

    search -r 1 {a[^c]b} 0

    In AlphaCocoa, there is a match in position 0 3
    In AlphaX there is no match.

    Now try this:

    search -r 1 {^b} 0

    Both AlphaX and AlphaCocoa find a match in
    position 2 3.
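
The two effects listed above can be separated in plain Tcl, where [regexp] offers `-lineanchor` (effect 1 only), `-linestop` (effect 2 only) and `-line` (both). The following is a sketch on the same two-line text. Treating `-lineanchor` as the counterpart of AlphaCocoa's -ml and `-line` as the counterpart of AlphaX's behaviour is an assumption made for illustration, since [search] does not go through Tcl's regexp engine.

```tcl
# The two-line file from the example: "a", newline, "b".
set text "a\nb"

# No switch: newline is an ordinary character, so [^c] can match it.
set plain [regexp {a[^c]b} $text]
# -lineanchor: only ^ and $ become newline-aware (effect 1).
set anchorOnly [regexp -lineanchor {a[^c]b} $text]
# -linestop: only . and [^...] refuse to match newline (effect 2).
set stopOnly [regexp -linestop {a[^c]b} $text]
# -line: both effects together.
set both [regexp -line {a[^c]b} $text]

# ^b can only match at the start of the second line if effect (1) is active.
set caretPlain [regexp {^b} $text]
regexp -lineanchor -indices {^b} $text caretPos

puts "a\[^c\]b  plain=$plain lineanchor=$anchorOnly linestop=$stopOnly line=$both"
puts "^b      plain=$caretPlain lineanchor at: $caretPos"
```

Here `a[^c]b` matches without a switch and with `-lineanchor`, and fails with `-linestop` and `-line`. `^b` matches only once line anchoring is switched on, at inclusive indices 2 2, which corresponds to the position 2 3 reported by [search].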

    The AlphaCocoa behaviour is exactly as documented
    in the Alpha Commands file, since the -ml flag
    specification does not mention (2).

    However, one could consider whether it would be better
    design to modify Alpha's -ml flag so that it agrees
    completely with the regexp ?m flag, so that
    Alpha's behaviour is consistent with AlphaX's.

    On one hand it is good to stay as close as possible
    to Tcl conventions, and on the other hand, there
    could be other occurrences in AlphaTcl that assume
    AlphaX's behaviour.

    Cheers,
    Joachim.

     
    • Bernard Desgraupes

      I don't think we should bother about Tcl regexp here. The search command does not rely on Tcl at all. It is implemented (in AlphaCocoa) using Cocoa's NSRegularExpression class and methods, which provide Perl-compatible regexp syntax.
      When AlphaTcl calls the regexp command there may be idiosyncrasies, but they do not concern the search command.

       
  • Joachim Kock

    Joachim Kock - 2017-04-16

    OK, I have just committed the fixed regular expression in TeX mode then.
    Indices must be rebuilt to apply the fix.

     
    • Bernard Desgraupes

      Great! Thanks!
      To force a rebuild of the indices on everyone automatically you may increase the hardCodedCounter in SystemCode/Init/AlphaVersionInfo.tcl and commit.


       
  • Joachim Kock

    Joachim Kock - 2017-04-16
    • status: open --> fixed
     
  • Bernard Desgraupes

    • status: fixed --> closed
    • Version: 9.0b1 --> 9.0b3
     
