#1289 Copy truncates string at null byte

Bug
closed-fixed
Neil Hodgson
Scintilla (789)
5
2014-07-29
2012-02-02
Eric Promislow
No

In SciTE, select a string that contains an embedded null byte. Paste it into another app. The string
will be truncated at the null byte.

It's hard to say how other apps handle copying such a string. VS2010 lets me copy a string with
an embedded null. I can paste it back into a VS binary editor, but when I try to paste it
into a text view, the embedded nulls are dropped, although the string isn't truncated.

Ref Komodo bug http://bugs.activestate.com/show_bug.cgi?id=89557

Discussion

  • Neil Hodgson
    2012-02-02

    • assigned_to: nobody --> nyamatongwe
    • status: open --> open-invalid
     
  • Neil Hodgson
    2012-02-02

    This depends on platform. Windows defines CF_TEXT and CF_UNICODETEXT as ending at a null character.
    http://msdn.microsoft.com/en-us/library/windows/desktop/ff729168(v=vs.85).aspx

    This topic has a history and some people have strong opinions. In the past, you would sometimes receive large buffers with rubbish after the null. If you really want to include null characters in text then you should define a new clipboard format for this purpose.
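
    To illustrate why the text is cut short, here is a minimal sketch of a consumer that treats a clipboard buffer as null-terminated text, which is how CF_TEXT is defined. The function name ReadAsNullTerminated is hypothetical and only stands in for whatever the receiving application does; it is not part of any Windows or Scintilla API.

```cpp
#include <cstring>
#include <string>

// Hypothetical consumer: reads a buffer the way a null-terminated
// text format is defined, stopping at the first NUL byte.
std::string ReadAsNullTerminated(const char *buffer) {
    return std::string(buffer, std::strlen(buffer));
}

// Builds a 7-byte selection "abc\0def" and returns what such a
// consumer would see after a paste.
std::string PasteExample() {
    const std::string selection("abc\0def", 7);
    return ReadAsNullTerminated(selection.c_str());
}
```

    With a 7-byte selection containing a NUL in the fourth position, the consumer sees only the first three bytes; everything after the NUL is ignored even though it was placed in the buffer.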

     
  • Eric Promislow
    2012-02-02

    I resolved the Komodo bug as Limitation, and personally don't understand the use case for wanting to manipulate null bytes -- most programming languages have a way of representing troublesome characters. As I wrote above, even Visual Studio supports copying and pasting null bytes only within its binary editor.

     
  • Eric Promislow
    2012-02-02

    • status: open-invalid --> closed-wont-fix
     
  • I use SciTE to look at data. SciTE is able to load files with embedded null characters. But if I copy a string with an embedded null character and paste it into another document or another editor (Notepad++, UltraEdit, Excel), everything after the null character is stripped.

    This is of course a bit confusing and at first glance looks like a bug.

    What other text editors (Notepad, UltraEdit) do in this situation is replace the null character 0x00 with a space 0x20 when filling the clipboard.

    This solution is not perfect because the copy is then not identical to the original, but:
    - most of the copy is not stripped and
    - most of the content is correct.

    Do you think this could be an acceptable solution for Scintilla?

    If so I'll be glad to submit a patch.

    Best regards,
    Vivian.

     
  • Eric Promislow
    2013-01-05

    Vivian, I have no problem with your proposed patch. Probably better to have the null bytes replaced by spaces than to lose some of the text.

     
  • Neil Hodgson
    2013-01-05

    Conversion of NULs to spaces for copy should be OK.

    For editing data files, adding an explicit binary bytes format and a "Paste Binary" command would allow for more accurate work in some cases.

     
  • Neil Hodgson
    2013-01-05

    • status: closed-wont-fix --> open-accepted
    • milestone: --> Bug
     
  • Neil, Eric,

    Thanks for your quick answer. Attached you'll find a patch against the 'tip' of the Mercurial repository.

    The patch replaces null characters with spaces in the clipboard so that its content is not truncated by the paste operation.

    This already lets me copy the clipboard to other applications (other text editors, Excel, etc.).

    Cheers,
    Vivian.

    PS: If you are interested, I can also work on a second patch that creates an additional clipboard format to be more accurate when pasting. For that I would have to investigate which clipboard formats already exist and how they are used in other programs; otherwise the patch would only be useful for copying from SciTE to SciTE.

     
    Attachments
    • Neil Hodgson
      2013-01-09

      Having an extra format would be more work since a simple implementation would double the size of the clipboard. This could be avoided by 'promising' to deliver one or the other format when required by handling (on Windows) WM_RENDERFORMAT and WM_RENDERALLFORMATS.
      http://msdn.microsoft.com/en-us/library/windows/desktop/ms649030(v=vs.85).aspx
      If it was implemented, it should probably be an option for cut and copy since it may not be wanted by all applications and "Paste Binary" should be a separate command.

      It also needs a new clipboard type which is easy on Windows and OS X but more difficult on Linux where the clipboard is asynchronous.

       
  • Neil Hodgson
    2013-01-06

    Adding this to ScintillaWin.cxx means it only works on Windows and has to be written twice for narrow and wide characters. Adding it to SelectionText::Set and SelectionText::Copy in Editor.h would make it work on all platforms and would only need a char implementation.
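
    As a rough sketch of that suggestion, the conversion can live in one place in platform-independent code and operate on char only. The class below is a hypothetical stand-in, not Scintilla's actual SelectionText; the names SelectionTextSketch, FixForClipboard and Data are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>

// Hypothetical stand-in for a selection holder; not Scintilla's real class.
class SelectionTextSketch {
    std::string s;

    // Replace embedded NUL bytes with spaces so that consumers of
    // null-terminated clipboard formats do not truncate the text.
    void FixForClipboard() {
        std::replace(s.begin(), s.end(), '\0', ' ');
    }

public:
    // Store a copy of the selected bytes, then sanitise them once,
    // independently of the platform layer that fills the clipboard.
    void Copy(const std::string &text) {
        s = text;
        FixForClipboard();
    }

    const std::string &Data() const { return s; }
};
```

    Because the replacement is byte for byte, the length of the stored text is unchanged; only the NUL bytes differ from the original selection.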

     
  • I was not sure whether the same problem appears on the other platforms.

    I'll propose another patch that fixes SelectionText::Set and SelectionText::Copy.

    Cheers,
    Vivian.

     
  • Neil,

    Attached you'll find a patch for SelectionText::Set and SelectionText::Copy, as you proposed.

    Vivian.

     
    Attachments
  • Neil Hodgson
    2013-01-09

    Committed as [4b0f3e] with spelling fixed, method capitalized to match Scintilla convention and bug number added to commit message.

     

    Related

    Commit: [4b0f3e]

  • Neil Hodgson
    2013-01-09

    • status: open-accepted --> open-fixed
     
  • I'll take more care with the conventions next time :-)

    Thanks for your help.

     
  • Neil Hodgson
    2013-01-17

    • status: open-fixed --> closed-fixed