How does one delete NUL chars?

2008-05-16
2012-11-13
  • So, I am still using Kedit as my text editor (some simple reasons like speed, the ALL command, command line support, and KEXX), but I also use Notepad++.

    I *still* cannot find a simple way to load a file in Notepad++ which contains NUL chars, so it looks like:

    U{NUL}s{NUL}e{NUL}r{NUL}

    which should just read "User".

    In Kedit, I search and replace all NUL chars with a regex, use the ALL command to select all blank lines, delete them, and then reset the ALL selection.

    How would I do this in Notepad++? Does it support removing NUL chars from a regex command?

    Thanks, I am sure someone has already figured this out. Oh, and sorry, but the version of the ALL command built into TextFX - sorry, fail!! Great stuff in TextFX, but the ALL command clones just do not do it for me.

    Michael

     
    • Airdrik
      2008-05-20

      This would seem to be an encoding issue where the file being read is really unicode, but is being read as ascii.  Unicode is 2 bytes/char, whereas Ascii is 1 byte/char, and the first 256 characters in the Unicode character set correspond to the ascii character set. 

      If it is the case that the files are in unicode (UTF-8/UCS-2 -- I don't know much about these) but being opened as if they were Ascii, then why isn't the file getting recognized as unicode, and what do you need to do to get it recognized as such?

      That's about as much as I know about unicode.  One of the developers who knows more on the subject should pitch in and see what's going on.
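A minimal Python sketch of the effect described above, assuming the file is UTF-16 little-endian text that is being read as a single-byte encoding (Latin-1 is used here only as a stand-in for "ANSI"; the actual code page is an assumption):

```python
# Encode "User" as UTF-16 little-endian: each ASCII character becomes
# its ASCII byte followed by a NUL byte.
raw = "User".encode("utf-16-le")

# Misread the same bytes as a single-byte encoding. Latin-1 is an
# assumption standing in for whatever 8-bit code page the editor picks.
misread = raw.decode("latin-1")

# Make the NUL characters visible, the way the editor displays them.
shown = misread.replace("\x00", "{NUL}")
print(shown)
```

Every second byte is NUL because the upper byte of each 16-bit code unit is zero for plain ASCII text.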

       
    • Fool4UAnyway
      2008-05-16

      The easiest way to do this is by using the advanced Find/Replace dialog.

      Select a [NUL] character.
      Press CTRL+R. The [NUL] will be copied into the Find field.

      Press Find.
      Press Replace & Find Again.

      Uncheck Selection.
      (Uncheck Wrap mode.)

      Press Find.
      Press Replace Rest.

      Unfortunately, you can't uncheck Selection from the start: it would skip the (first) selected [NUL] character.

       
    • Harry
      2008-05-16

      Press ctrl-f for find dialog, select the extended option, type
      \0
      in the find field and nothing in the replace field, then replace all
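The same replacement can be sketched outside the editor in Python, working on raw bytes. The function name `strip_nuls` is made up for illustration. Note that this only makes sense for stray NULs; if the file is really UTF-16, stripping bytes this way would mangle any non-ASCII characters.

```python
def strip_nuls(data: bytes) -> bytes:
    """Return a copy of data with every NUL byte (0x00) removed."""
    return data.replace(b"\x00", b"")


# Hypothetical sample with a single stray NUL in the middle.
cleaned = strip_nuls(b"abc\x00def")
print(cleaned)
```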

       
    • Ok, thanks for the help, everyone!

      The speed difference between Kedit and Notepad++ is huge: this search and replace action takes a second or so in Kedit and well over a minute in Notepad++, but it did work...

       
  • petke himself
    2010-10-25

    Notepad++ is great. However.

    This is still an issue with the most recent version, 5.8.2.

    Ultraedit does not have this issue. But it's not freeware.

    We have a lot of these UTF 16 bit files at work. I would like to continue using notepad++, but will have to switch if this is not fixed. If the specific problem can't be solved, then maybe a more general quick fix would be an option to turn off all symbols.

    Thank you.

     
  • If it's really UTF-16 or UCS-2, and you want to delete the NULs because you've got {NUL}a{NUL}b{NUL}c, then as said above, the correct way is to use the Encoding menu, click "Encode in UCS-2", then "Convert to UTF8" (or whatever you would prefer).
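For files that really are UTF-16, the reinterpret-then-convert step can be sketched in Python. This is an illustration under assumptions, not what Notepad++ does internally: the function name `utf16_to_utf8` is made up, and BOM-less input is assumed to be little-endian.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes and re-encode the text as UTF-8."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM, picks the byte
        # order from it, and drops it from the decoded text.
        text = data.decode("utf-16")
    else:
        # Assumption: no BOM means little-endian.
        text = data.decode("utf-16-le")
    return text.encode("utf-8")


converted = utf16_to_utf8("User".encode("utf-16-le"))
print(converted)
```

Unlike deleting the NUL bytes, this keeps non-ASCII characters intact, since every 16-bit code unit is decoded as a whole.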

    If you've got the odd NUL, so "abc{NUL}def" that you want to delete, then just use the Ctrl-R solution described above.

    Incidentally, the statement "Unicode uses 2 bytes" is wholly incorrect.  Go read http://www.joelonsoftware.com/articles/Unicode.html which is an excellent explanation of what unicode is and is not.
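A quick way to see why "2 bytes per character" does not hold in general is to count the encoded bytes of a few characters. The sample characters below are arbitrary picks for illustration.

```python
# Sample characters: ASCII letter, accented letter, euro sign, emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to its byte length in (UTF-8, UTF-16, UTF-32).
sizes = {
    ch: (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )
    for ch in samples
}

for ch, (u8, u16, u32) in sizes.items():
    print(f"U+{ord(ch):04X}: utf-8={u8} utf-16={u16} utf-32={u32}")
```

The byte count depends on the encoding and on the character: UTF-8 uses 1 to 4 bytes, UTF-16 uses 2 or 4, and UTF-32 always uses 4.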

    Cheers,
    Dave.

     
  • cchris
    2010-10-25

    The initial post suggests a UCS-2 LE encoded file. As Dave said, changing to ANSI (any flavour) or UTF-8 will do the trick, faster than a search and replace.

    \0 or \x00 will address occasional NULs in Extended mode, NOT in Regular expression mode.

    Btw, Notepad++ does not support Unicode 32-bit formats. Is there an actual need for it?

    CChris

     
  • @CChris: If he sees the NULs in Notepad++, the file encoding is not correctly detected (without a BOM, the current Utf8_16_Read::determineEncoding method is not very good). The problem is that "Encode in UCS-2 …" does not behave as it should; it does not reinterpret the bytes from the encoded file, and thus will not correctly read a UCS-2 file which is not detected as such. Is this a known issue?
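To illustrate what BOM-less detection could look like, here is a rough Python heuristic that counts NUL bytes at even and odd offsets. It is only a sketch and not how Utf8_16_Read::determineEncoding works; the function name `guess_utf16` and the thresholds are assumptions, and it only helps for text that is mostly ASCII or Latin.

```python
import codecs
from typing import Optional


def guess_utf16(data: bytes) -> Optional[str]:
    """Guess 'utf-16-le' or 'utf-16-be', or return None if unsure."""
    # A BOM settles the question when it is present.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"

    sample = data[:4096]
    pairs = len(sample) // 2
    if pairs == 0:
        return None

    # For mostly-ASCII text, the high byte of each code unit is NUL:
    # at odd offsets for little-endian, at even offsets for big-endian.
    even_nuls = sample[0::2].count(0)
    odd_nuls = sample[1::2].count(0)

    # Thresholds are arbitrary illustrative values.
    if odd_nuls > 0.4 * pairs and even_nuls < 0.1 * pairs:
        return "utf-16-le"
    if even_nuls > 0.4 * pairs and odd_nuls < 0.1 * pairs:
        return "utf-16-be"
    return None


print(guess_utf16("User".encode("utf-16-le")))
```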

     
  • cchris
    2010-10-26

    In 18 months, I may have seen 3 or 4 such reports, so the actual problems seem scarce enough, and may not have been properly identified.

    CChris