Essentially remove the ASSERT about lines without
zero-termination. This allows me to load a binary text
file successfully in debug mode.
Altered and original files (GhostTextBuffer.cpp & MergeDoc.cpp)
Logged In: YES
Also I set the readonly flag if the file had embedded zeros.
Note that both files appear to be readonly in this case,
although my code is only setting the readonly flag for one
side -- I've not figured out why the other gets marked as
So, now if there was a way to ask the editor to show the
first line without eol characters, then we could tell people
who can't load binary files, to load them this way and find
out where the embedded zeros are...
Logged In: YES
Why should we care to load files with zeros inside (or double
zeros when Unicode) ?
This character is invalid for text files (we can live we that),
and it is a very special character for all strings (very very
For example, my registry export has some 0x4 characters
inside, as there are binary data. I suppose there may be
some 0x0 even.
First I don't really know if there may be some zero in a
exported registry file, but if it is possible, I would just say
that it is one more bug from Microsoft.
What should we do with a zero ? truncate the file : that is
not what I want. Display as read-only ? But if I want to
merge. Propose to change the offending zero first ? But if I
want to keep it.
And also, according to the bug you found, a new line is
created when a zero is found. But why ? It is not an EOL. It
is an invalid character, not a control character (or yes it is a
control character, the end of string one).
Also lot of problems with strings : if I want to paste a text
with a zero inside, the text will be truncated.
Proposed solution : recognize file as binary and refuse to load
(I suppose just the unicode code need to be updated). The
message could be more explicit too.
PS : for example, notepad replace the zero with a space.
Why ? It would just change the binary data in my file. And
not even a warning.
re: pasting text with zero inside
Right now, the zeros are treated as empty line termination,
My proposal is to just display the file contents, more or
less -- this isn't an exact display, because there is no
obvious way to show zeros. So this just shows approximately
how the file looks. And you can't edit it, because we're not
showing it exactly and anyway, we can't handle editing files
with zeros in them.
This is really for only one use -- when the user has a file
that WinMerge says is binary, but the user knows it is
really mostly text, and they'd like to see it compared.
Although the other thing I'd like in this case, is a way to
tell them (or better, show them) where the offending zeros are.
In summary, my proposal:
> What should we do with a zero ?
I suggest what I did already, which is throw it away and
pretend the line ended.
> But if I want to merge ?
Too bad, not allowed.
Anyway, this is mostly because we can do this right now,
AFAICT, just by removing that ASSERT and adding the code I
added here to make it read-only. However, this might cause
some other problems -- I didn't test this any further than
that it will display the file.
> just by removing that ASSERT and adding the code I
> added here to make it read-only
Some ASSERT are really useful to detect bugs. And the code
has kept evoluting since three months (EOL, undo/redo,
MAC/DOS/UNIX) so I prefer to keep this ASSERT.
> Right now, the zeros are treated as empty line termination,
> but UniMemFile.
Rather replace them with a space (like notepad does). This
will for sure break nothing.
Also I want to know there is a zero, I won't remind the
reason why the read-only flag is set. So a dialog is really
A cleaner solution is to develop an unpacker to replace the
zero with another character. Then the user may compare the
files, but only if he makes this choice.
Even better the plugin could search a character replacement
for the zero. If it finds none it replaces with space and sets
the read-only flag.
(needs a change to the plugin interface to set the read-only
flag, but that is easy).
> Rather replace them with a space (like
> notepad does). This will for sure break nothing.
What does pretending they are line terminations break ? I
mean, I have no hopes of editing such a file anyway -- I
don't care to even try. I'm not in a hurry to change it to
space, because that doesn't fit as well into the UniMemFile
interface -- or at least right at the moment, I'm not sure
how to do that. Right now you can tell there was a zero by
what happens from the ReadString call (it has no eol, but it
isn't done); I don't really want ReadString to silently
change them to spaces without telling the caller.
> Also I want to know there is a zero, I won't
> remind the reason why the read-only flag is
> set. So a dialog is really necessary.
I think this brings up a problem we actually have already --
you can't see why the file is read-only. Perhaps the source
had the read-only attribute. Perhaps the representation
conversion (ie, codeset conversion) was lossy. (I don't know
if there are any other possibilities.) We should file this
as a -- hmm, a bug or an RFE ? An RFE I guess -- some way to
tell why the file is read-only.
> (unpacker idea)
Over my head, so I can't comment.
Actually I don't really understand what an unpacker is,
still -- is the concept that it is unpacking the file into
the display buffer, or is it something else ? My problem is
that I'm not sure what the "unpacking" is. I think of
someone opening a box (unpacking), but I'm not sure what the
box represents, or what the object we're taking out is. That
is, I just haven't followed along the plugin stuff, and am
not yet understanding the basic concept -- which is probably
a simple analogy that I just don't yet get.
Never mind, I read the Plugins/readme.txt, and unpacking
means transforming the file into a file (or a buffer) for
display. Pretending that is a zip archive makes the word
"unpack" make sense to me, and you mentioned that in the
readme, so I think you may have had that in mind.
So, I think I can understand that we could make a plugin to
do this, but, it seems like a lot more work than just
removing this one ASSERT (which is what this patch does, to
make it work) ?
I suppose I could change this patch to instead manufacture
an actual line termination for the line ending where the
zero was, and then I wouldn't even have to remove the ASSERT
-- you want to keep the ASSERT, correct ?
Logged In: YES
Why to add some workarounds for binary files. We can't
compare or edit them. There are many editor capable of
showing binary files and many binary edit tools. I agree
messagebox for user should be clearer. Unfortunately I
couldn't modify texts or add new texts just before 2.0
release so I used existing one. But it's 2.1 now and all is
possible. Just tell user file cannot be opened because there
are embedded zeroes in it yadda yadda.
Implement real binary view. One nice binary view/edit
control is here:
I wasn't proposing adding a workaround; I was proposing
removing an ASSERT, because we *can* compare binary files,
right now, if they are mostly text -- that is, we can
compare the textual parts fine. At least, so it seems to me,
when I tried it with the sample files I posted for the bug
(but, they are *very* small sample files!)
My basic point is that AFAICT, we actually can compare
binary files right now by treating them as text files. This
is of course only useful if they have text sections, or if
they are largely text.
I do not insist on this however :)
If (as I suspect :)) there is no more discussion on this in
a couple of days, I'll close this as Rejected.
> My basic point is that AFAICT, we actually can compare
> binary files right now by treating them as text files.
Ok. But how useful this half-way binary support really is?
Reliable? If we open door for binary file compares, people
start using it and sending reports about cases it does not
work; we get a lot of bug reports about feature we are not
supporting... Who has time to look for those? For me this is
binary thing (no pun intented): we have real binary file
support or no support.
Hmm. Of course this debends on how we define binary files.
:) Current algorithm searches for zeros.
Lets move this discussion to developer forum. I'll go post
Closing this as Rejected.
Its TTL (time-to-live) field expired :)