#141 Add an option to show only binary changes in a C or C++ file

Status: open
Owner: nobody
Labels: Plugins (27)
Priority: 2
Updated: 2004-03-24
Created: 2003-09-02
Creator: Matt Ball
Private: No

It would be useful if the difference engine could identify
changes in a C or C++ file that would not cause a
change to the compiled object file. By doing so,
someone could use the differencing tool to only show
the important changes to a source file. For instance, if
a developer changes an entire file from K&R brace style
to Microsoft style, the person performing the code
review would prefer to ignore all the curly bracket
changes and only look at the real functional change.
This feature would be similar to the existing 'ignore
whitespace' option, except that it would also ignore
changes that do not affect the order of token parsing
for the compiler.

This change doesn't seem too hard to implement, since
WinMerge already performs syntax highlighting, which
requires token parsing.
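
The request can be illustrated with a minimal sketch that compares two C/C++ snippets by token sequence instead of by line. It is an assumption-laden approximation, not WinMerge code: `TOKEN_RE`, `tokens`, and `same_tokens` are hypothetical names, the lexer is a rough regular expression, and it does no preprocessing (macros and `#include` lines are treated as ordinary tokens, and adjacent string literals are not merged).

```python
import re

# Rough C/C++ lexer (assumption: good enough for illustration, not a full lexer).
# Alternatives are tried in order: comments, literals, identifiers, numbers, punctuation.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<number>\.?\d(?:[eEpP][+-]|[\w.])*)
  | (?P<punct>->\*|<<=|>>=|\.\.\.|::|->|\+\+|--|<<|>>|<=|>=|==|!=|&&|\|\||[-+*/%&|^]=|[^\s\w])
""", re.VERBOSE | re.DOTALL)


def tokens(source):
    """Return the token texts of a source string, dropping comments and whitespace."""
    return [m.group() for m in TOKEN_RE.finditer(source) if m.lastgroup != "comment"]


def same_tokens(left, right):
    """True when both snippets produce the same token sequence."""
    return tokens(left) == tokens(right)


if __name__ == "__main__":
    kr = "int f(int x) {\n    return x;\n}\n"
    other = "int f(int x)\n{\n    return x;\n}\n"
    print(same_tokens(kr, other))
```

A brace-style change leaves the token sequence untouched, so the two snippets compare equal, while any edit to an identifier, literal, or operator changes the sequence.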

Discussion

  • Perry
    2003-09-03

    Logged In: YES
    user_id=60964

    WinMerge consists of several modules. The diff engine is gnu
    diff code, and that does the diff'ing. The editor is crystal
    text editor code, and that does the syntax highlighting. So
    there is no existing infrastructure for what you describe,
    unfortunately.

     
  • Kimmo Varis
    2003-09-04

    • priority: 5 --> 2
     
  • Kimmo Varis
    2003-09-04

    Logged In: YES
    user_id=631874

    And even if we could use that syntax highlight information, it
    wouldn't be very useful. Syntax highlighting just looks for
    keywords. So it wouldn't notice if a bracket is moved to the next
    line. It only sees that there is a bracket.

    Lowering priority.

     
  • Kimmo Varis
    2003-09-04

    Logged In: YES
    user_id=631874

    Oops. Poor example - that would be what is requested. :)
    Anyway, just looking for keywords does not help.

     
  • Perry
    2003-09-04

    Logged In: YES
    user_id=60964

    I think the problem with this is that you need to run the
    code through a preprocessor, and also collapse all
    whitespace down to a single space (in order to map all
    variants of the same code to a canonical form). Then you
    wind up with a file with no line returns at all, and not
    very much fun to look at.

    Maybe you need a smart tokenizer to run after the
    preprocessor, which has some canonical way of breaking up
    line returns.

    That would be interesting, but that definitely doesn't exist
    in WinMerge (the syntax highlighter doesn't really tokenize
    C++, which is a fairly involved job, and definitely doesn't
    do the work of a preprocessor, such as expanding macros from
    system headers).

     
  • ganier
    2003-09-07

    Logged In: YES
    user_id=804270

    If I understand what you mean, it is very ambitious. For me,
    an important point for such a tool is to detect changes to
    variable names. And that is very difficult.

    The bracket problem, if we limit ourselves to it, is more approachable.
    I think what Puddle proposes may work: when there is a {
    (or }), collapse all whitespace before this character down to
    a single space, and insert an EOL after it if there is none. A
    preprocessor can do this.

    If you are interested, we are working on the code to allow
    preprocessing with scriptlets. You could help by writing such a
    scriptlet :).
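
A minimal sketch of the brace rule described in this comment, written in Python rather than as a WinMerge scriptlet. The function name `normalize_braces` is hypothetical, and the sketch assumes braces never appear inside string literals or comments, which a real implementation would have to handle.

```python
import re


def normalize_braces(text):
    """Put every brace into one canonical position, regardless of brace style."""
    # Collapse any whitespace run (including newlines) before a brace to one space.
    text = re.sub(r"\s+([{}])", r" \1", text)
    # Insert a newline after a brace that is followed by more text on the same line.
    text = re.sub(r"([{}])[ \t]*(?=\S)", r"\1\n", text)
    return text


if __name__ == "__main__":
    kr = "if (x) {\n    y();\n} else {\n    z();\n}\n"
    allman = "if (x)\n{\n    y();\n}\nelse\n{\n    z();\n}\n"
    print(normalize_braces(kr) == normalize_braces(allman))
```

Note that this transform joins and splits lines, so its output no longer corresponds line by line to the original file.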

     
  • Matt Ball
    2003-09-08

    Logged In: YES
    user_id=679162

    (Just a quick clarification to the summary: The first sentence
    should read "... that would cause a change..." with the
    word "not" removed).

    After better understanding WinMerge architecture, it's clear
    that this change isn't trivial.

    I could probably help with some code development, but before
    diving into it, we should probably examine the best place to
    put this change. Since WinMerge uses 'diff' to perform the
    differencing, maybe this feature is best added to the 'diff' tool
    as an optional parameter. If the diff team accepts this as an
    option, it could perhaps be added to WinMerge afterwards.
    Supporting only brackets could be a useful stepping stone,
    but I was hoping to have a change that could go all the way.

     
  • Kimmo Varis
    2003-10-21

    • labels: --> 572064
     
  • Kimmo Varis
    2003-10-21

    Logged In: YES
    user_id=631874

    Laoran has landed his plugins-patch to CVS. I think this can
    be done using these preprocessing plugins (like many other
    neat things).

    A C/C++ compiler knows best about changes. So maybe a plugin can
    use a C/C++ parser to preprocess the files before WinMerge
    compares them.

     
  • Perry
    2003-10-21

    Logged In: YES
    user_id=60964

    Yes, I meant what Kimmo said, I just didn't say it clearly.
    I meant a C++ compiler preprocessor. That is, a preprocessor
    that not only understands C++, but that has a set of system
    includes to process, because lots of macros are found in
    system includes, and these are handled during the preprocess
    phase of traditional C++ compiling.

    I don't think the gnu diff team will want anything to do
    with this (surely I would not want this if I were maintainer
    of gnu diff, because it is very specific to C++), but I
    suppose it never hurts to ask :)

    I think the best way to do this would be to call a real C++
    preprocessor -- that is, MSVC6 or gcc. MSVC6 has a
    commandline parameter to only invoke the preprocessor,
    although I forget what it is. gcc has one also (I also
    forget that one).

    Anyway, my thought is that this RFE requires real C++
    tokenizing, not little simple word breaking like the
    tokenizers inside of WinMerge.

     
  • Matt Ball
    2003-10-24

    Logged In: YES
    user_id=679162

    gcc uses the -E option to perform preprocessing only.
    However, the preprocessed output still maintains the original
    whitespace. For the preprocessing option to work, the
    preprocessor would need to uniformly place white space
    between the operators. Since this preprocessed file would be
    unreadable, there would need to be a way to display the
    original file with differences based on the pre-processed file.

     
  • ganier
    2003-10-25

    Logged In: YES
    user_id=804270

    This is the goal of preprocessing plugins. Currently files may
    be changed as long as the line structure is preserved
    (deleting/inserting/changing text inside a line is allowed, but not
    deleting/inserting/moving lines).

    The gcc preprocessor probably needs to change the line layout.
    That requires work in WinMerge and work in the plugin.
    WinMerge must transform the list of differences based on the
    pre-processed files into a list of differences in the original files. So
    the plugin must inform WinMerge about the changes in the
    line structure. In other words, the plugin must create and
    return a structure of deleted/moved/inserted lines.

    First a compromise has to be found between the simplicity of
    this structure and the complexity of the preprocessing.
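
One very simple shape such a structure could take is sketched below: each transformed line carries the list of original line numbers it came from. This is only an illustration under assumptions, not the WinMerge plugin interface; `MappedLine`, `join_lone_braces`, and `to_original_lines` are hypothetical names, and the only transform shown is merging a line that contains just `{` into the previous line.

```python
from dataclasses import dataclass, field


@dataclass
class MappedLine:
    text: str
    origins: list = field(default_factory=list)  # 0-based original line numbers


def join_lone_braces(lines):
    """Merge lines holding only '{' into the previous line, recording their origin."""
    out = []
    for number, line in enumerate(lines):
        if line.strip() == "{" and out:
            out[-1].text = out[-1].text.rstrip() + " {"
            out[-1].origins.append(number)
        else:
            out.append(MappedLine(line.rstrip(), [number]))
    return out


def to_original_lines(mapped, changed_indexes):
    """Translate changed line indexes in the transformed file to original line numbers."""
    result = set()
    for index in changed_indexes:
        result.update(mapped[index].origins)
    return sorted(result)


if __name__ == "__main__":
    mapped = join_lone_braces(["if (x)", "{", "    y();", "}"])
    print([m.text for m in mapped], to_original_lines(mapped, [0]))
```

The trade-off mentioned above shows up directly: a per-line origin list is easy to consume, but it cannot describe moved lines or text that has no single source line.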

     
  • Perry
    2003-10-25

    Logged In: YES
    user_id=60964

    Laurent, I'm not sure that we can do this and meet the
    requirements you gave ? Details:

    gcc -E will add lines (it must include lots of lines from
    header files into the compilation module). These cannot be
    mapped back to the source files, because they are new lines.

    What if, in this case, we abandoned being able to map back
    to the original file ?

    I think that with lossy character conversion something
    similar can already happen. If you open a unicode file with
    WinMerge.exe, or if you open a CP936 file with WinMerge.exe
    and ask for it as CP1252, because that is your default
    codepage and you didn't change the GUI to ask for CP936
    (there is no such GUI anyway, right now), then you wind up
    looking at a file in the editor that we have to make
    readonly, because we can't let you edit it (because we can't
    save it back to the original form).

    I'm looking at it like this:

    If I diff two CP936 files in WinMerge.exe, then WinMerge.exe
    shows me two files which are lossy portraits of the
    originals. So, I just want to see differences between the
    lossy portraits, and I don't care how they really correspond
    to the originals. (Hm, in this case, they correspond
    line-by-line, so actually I know exactly how they compare,
    so my analogy is not so good.)

    If I diff two cpp files, and choose the
    preprocessor/normalize plugin(s), then I want to see
    differences between two preprocessed & normalized portraits.

    Ok, my reasoning breaks down here, I guess; of course, in
    practice, I really *do* want to know how the differences in
    the portraits correspond to the original source files -- in
    fact, the primary thing I care about is probably differences
    in the original source files.

     
  • ganier
    2003-10-25

    Logged In: YES
    user_id=804270

    > I want to see differences between two preprocessed &
    > normalized portraits
    That makes four files ?

    In my opinion :
    * if you care about visualization of the original files: you use
    preprocessing (and we must find a way to translate the list of
    differences).
    * if you care about visualization of the transformed file (like
    the CP example): you use unpacking. No problem for diffing. And
    the plugin is responsible for packing again.

    > These cannot be mapped back to the source files, because
    > they are new lines
    All inserted lines are inserted because of a command in one
    line (called i, like #include "winmerge.h") in the original file. If
    all the lines inserted because of line i find a match, then
    line i is identical; otherwise line i is a difference.
    diff flag of line i = logical OR (diff flags of all lines inserted
    because of i).
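
The rule above can be sketched as follows, assuming the preprocessing step can report, for every generated line, which original line caused it. The names `fold_diff_flags`, `origin_of`, and `generated_flags` are hypothetical and not part of WinMerge.

```python
def fold_diff_flags(origin_of, generated_flags, original_line_count):
    """OR together the diff flags of all generated lines that share an original line.

    origin_of[k] is the 0-based original line that produced generated line k;
    generated_flags[k] is True when generated line k was reported as different.
    """
    flags = [False] * original_line_count
    for origin, changed in zip(origin_of, generated_flags):
        flags[origin] = flags[origin] or changed
    return flags


if __name__ == "__main__":
    # Original line 1 is an #include that expanded into three generated lines,
    # one of which differs between the two files.
    origin_of = [0, 1, 1, 1, 2]
    generated_flags = [False, False, True, False, False]
    print(fold_diff_flags(origin_of, generated_flags, 3))
```

An original line is marked as different as soon as any line generated from it differs, which matches the "logical OR" formulation.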

     
  • Perry
    2003-10-25

    Logged In: YES
    user_id=60964

    >> I want to see differences between two preprocessed &
    >> normalized portraits
    >That makes four files ?

    Hm. I'm getting confused also now. There are two original
    files, and two displays, and the original files aren't
    exactly what is in the display. Never mind my earlier weird
    terminology :)

    re: preprocessing and unpacking

    I'm not sure I understand what these mean.

    Preprocessing is some changes introduced between, say,
    file1.cpp and _LT11.tmp, and then unpacking is loading
    _LT11.tmp into the display buffer ?

    (I use file1.cpp for the actual file being compared, and
    _LT11.tmp for the temporary copy that WinMerge creates.)

    > all the lines inserted because of the line i find a match,
    > then the line i is identical, else the line i is a difference.
    > diff flag line i = logical or (diff flags from all lines
    > inserted because of i).

    That is neat if you can do that!

    That requires help from the plug-in, to tell you which new
    lines came from which original lines, yes ?

     
  • ganier
    2003-10-26

    Logged In: YES
    user_id=804270

    > Preprocessing.
    Maybe prediffing is better (although not very English).
    Preprocessing means some changes to the temp files created in
    Rescan, just before these files are given to diffutils.
    I think I like prediffing because it is explicit.

    > Unpacking/packing
    As we would do for a zipped file: unpack it to load it into the display
    buffer, and repack it when the user saves a modified file.
    The file is unpacked in the display buffer, so it is unpacked for diffutils too.

    > That is neat if you can do that!
    I agree :)
    BTW, I am no longer sure about moved lines; the current
    WinMerge version does not display moved lines.

     
  • Perry
    2003-11-05

    Logged In: YES
    user_id=60964

    I suggest a resolution that (#1) we will not do this in
    WinMerge, and (#2) a good RFE would be a plugin to do this.

    Therefore, I suggest closing this and opening a new RFE for
    this functionality as a plugin. Although, there is lots of
    discussion here that it might be a shame to lose.

     
  • Perry
    2003-11-05

    Logged In: YES
    user_id=60964

    Or, if we add a new RFE Category (I just submitted RFE for
    that), we could simply reclassify this into the new RFE
    category of "Plugin".

     
  • Bill Binder
    2003-11-13

    Logged In: YES
    user_id=553342

    For this kind of thing why not use GNU indent from a file
    pack/unpack plugin to get the files formatted the same way?
    Keep the indent options in registry or wire them into the
    plugin?

    Indent is really for C rather than C++, but it usually manages
    something plausible.

     
  • ganier
    2003-11-14

    Logged In: YES
    user_id=804270

    Good idea. At worst, some lines will be badly formatted, and
    differences will be shown where we want to ignore them.

    > The gcc preprocessor (or indent) probably needs to change
    > lines layout.
    > That requires work in WinMerge and work in the plugin.
    > (WinMerge must transform the list of differences based on
    > the pre-processed to a list of differences in the original file).
    Important :
    > So the plugin must inform WinMerge about the changes in
    > the line structure. In other words, the plugin must create
    > and return a structure of deleted/moved/inserted lines.
    Very important indeed, as this must be applied to indent
    code. Or is there an option to produce something similar ?

     
  • Kimmo Varis
    2004-03-24

    • labels: 572064 --> Plugins
     
  • Kimmo Varis
    2004-03-24

    Logged In: YES
    user_id=631874

    Category --> Plugins.

     
  • ganier
    2004-03-24

    Logged In: YES
    user_id=804270

    OK, we probably cannot get the structure of changed lines
    from gcc. So no prediffing plugin for this.

    There are two possibilities:
    A) unpacker plugins: the two files are formatted before being
    displayed
    (and they are of course formatted before diffing).

    B) use normal diffing, and postprocess the result:
    ignore diffs that differ only by spaces, tabs, or newlines (N/L).

    see http://xxdiff.sourceforge.net/doc/xxdiff-secrets.html,
    paragraph "Per-Hunk Ignore Whitespace"

    Not as powerful as gcc, but it may be used for a lot of files,
    not only C/C++ ones.
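
Possibility B can be sketched with Python's standard `difflib`, as a stand-in for the GNU diff engine that WinMerge actually uses. The helper names `strip_ws` and `significant_hunks` are hypothetical. The sketch removes all whitespace before comparing a hunk, so it is cruder than real tokenizing: it would also hide a change such as `int x` versus `intx`.

```python
import difflib


def strip_ws(lines):
    """Concatenate lines and remove every space, tab, and newline."""
    return "".join("".join(lines).split())


def significant_hunks(left, right):
    """Return the diff hunks whose two sides differ by more than whitespace."""
    matcher = difflib.SequenceMatcher(None, left, right, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if strip_ws(left[i1:i2]) == strip_ws(right[j1:j2]):
            continue  # whitespace-only hunk: ignore it
        hunks.append((tag, i1, i2, j1, j2))
    return hunks


if __name__ == "__main__":
    kr = ["if (x) {\n", "    y();\n", "}\n"]
    allman = ["if (x)\n", "{\n", "    y();\n", "}\n"]
    print(significant_hunks(kr, allman))
```

One limitation: when a whitespace-only change sits next to a real change, the diff engine may report them as a single hunk, and the whole hunk is then kept.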