How to save an utf-8 file without BOM

2005-05-23
2012-11-13
  • Hi,

    This question may sound stupid (and really is), but I searched all over NPP for a way to save a UTF-8 file without a BOM (for obvious PHP header reasons) and can't seem to find one.

    Also, when I open a file saved in SciTE in the UTF-8 cookie format, NPP reads it as ANSI. I can only check "Display as UTF-8", but it still remains an ANSI file to NPP.

    I know I am missing something here. Thanks, and please help me out!
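
For readers unsure what "without BOM" means at the byte level, here is a minimal Python sketch. It assumes only that the UTF-8 BOM is the three bytes EF BB BF at the start of the file; the helper names `strip_utf8_bom` and `encode_utf8` are hypothetical and are not part of Notepad++. The PHP concern is that those three bytes sit in front of `<?php` and are emitted as output before any header can be sent.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def encode_utf8(text: str, with_bom: bool = False) -> bytes:
    """Encode text as UTF-8, optionally prefixing the BOM signature."""
    # The "utf-8-sig" codec writes the BOM when encoding; plain "utf-8" never does.
    return text.encode("utf-8-sig" if with_bom else "utf-8")
```

Apart from the optional three-byte prefix, the two variants produce identical bytes, so removing the BOM does not alter the text itself.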

     
    • Don HO
      2005-05-23

      Hmm, it seems the menu item name "Display as UTF-8" is misleading...

      > I searched all over NPP to find a way to save a utf-8 file without BOM

      Just choose "Encoding in ANSI" and check "Display as UTF-8" then save your file.

      > Also, when I open a file saved in SciTE in the UTF-8 cookie format,
      > NPP reads it as ANSI. I can only check "Display as UTF-8", but it still remains an ANSI file to NPP.

      It should display correctly if you just check "Display as UTF-8".
      Without a BOM, there's no way to know whether the loaded file is encoded in UTF-8.
      That's why, IMO, a file without a BOM should be an ANSI file, and it can be interpreted as a UTF-8 file.
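
The idea that the same BOM-less bytes can be read either as ANSI or as UTF-8 can be illustrated with a short Python sketch. It assumes "ANSI" means the Windows-1252 code page (`cp1252`), which is only one of several possible system code pages, and the sample bytes are made up for illustration.

```python
# Bytes of a BOM-less file containing the word "café" stored as UTF-8.
raw = b"caf\xc3\xa9"

# Assumption: "ANSI" is Windows-1252; each byte becomes one character.
as_ansi = raw.decode("cp1252")

# Interpreted as UTF-8, the two bytes C3 A9 form a single character.
as_utf8 = raw.decode("utf-8")

# Re-encoding the ANSI view with the same code page gives back the same bytes.
round_trip = as_ansi.encode("cp1252")
```

Here `as_ansi` shows the familiar two-character garbage in place of the accented letter, `as_utf8` shows the intended text, and `round_trip` equals `raw`, so in this sketch only the interpretation differs, not the stored bytes.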

      I'd like to change the name of the item "Display as UTF-8" to "UTF-8 without BOM" or "UTF-8 without signature". If you have a better idea, please let me know.

      Don

       
      • Hi Don

        Thank you for your answer. Now I understand what "Display as UTF-8" means ;-) It's actually UTF-8 without BOM (in combination with ANSI), if I get it correctly this time.

        You see that little checkbox in the open file dialog box that says "Open as read-only"? (Of course you do, you created it!)

        I'd really like to have the same kind of option for saving UTF-8 files: when the 'Save as...' dialog box appears, I'd like to see a "Use signature (BOM)" checkbox below, unchecked by default.

        Also, NP++ doesn't remember the checked 'Display as UTF-8' option; every time I open my file I have to check it again.

        But thanks again for telling me exactly what "Display as UTF-8" is. It does what I wanted: saving UTF-8 without BOM.

         
        • I also have an idea for 'Display as UTF-8':

          Why not drop the "Display as UTF-8" option and just add two different UTF-8 entries to the Format menu:

          Encode with UTF-8
          Encode with UTF-8 (NO BOM)

          It would then be very clear which one we should choose.

          I hope you implement this idea.

           
      • Dave
        2005-06-27

        >> I searched all over NPP to find a way to save a utf-8 file without BOM
        >
        > Just choose "Encoding in ANSI" and check "Display as UTF-8" then save your file.

        But I am assuming that the poster would also want to preserve any UTF-8 characters in his UTF-8 formatted (without BOM) file, which is not what happens according to the following post to the forum:

        http://sourceforge.net/forum/message.php?msg_id=2957099

        > It should display correctly if you just check "Display as UTF-8".
        > Without a BOM, there's no way to know whether the loaded file is encoded in UTF-8.

        It is usually possible to tell by applying some 'Unicode detection' heuristics, although I don't know if it can be done with 100% certainty.  However, if the user can indicate to Notepad++ that a file is UTF-8, that's not a problem.  The issue is whether the encoding is preserved when the file is saved with the "Display as UTF-8" option selected.
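
One common form of such a heuristic can be sketched in Python: check for a BOM, then try a strict UTF-8 decode and fall back to "ANSI" if it fails. This is only an illustrative sketch, not how Notepad++ or SciTE detect encodings; the function name `guess_encoding` and its return labels are hypothetical.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess how a byte string is encoded, using a strict-decode heuristic."""
    # A leading EF BB BF signature is the only unambiguous marker.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Pure 7-bit data is identical in ASCII, ANSI code pages, and UTF-8.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Multi-byte UTF-8 sequences follow strict bit patterns, so text in a
    # legacy code page with high bytes rarely passes this check by accident.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "ansi"
```

The guess is not 100% certain: a file written in an ANSI code page can happen to contain only byte sequences that are also valid UTF-8, in which case the function reports "utf-8" anyway.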

        > That's why, IMO, a file without BOM should be an ansi file, and it can be interpreted as UTF-8 file.

        I'm a little bit confused about what you mean by the term "ANSI" in this context; do you mean ASCII?

        > I'd like to change the name of the item "Display as UTF-8" to "UTF-8 without BOM" or "UTF-8 without signature". If you have a better idea, please let me know.

        How about some sort of "document is Unicode" option?

         
        • Don HO
          2005-06-27

          > But I am assuming that the poster would also want to
          > preserve any UTF-8 characters in his UTF-8 formatted
          > (without BOM) file, which is not what happens according
          > to the following post to the forum:

          For me, if there's no BOM, then it's an ANSI file (or an ASCII file, if you want).

          > It is usually possible to tell by applying some 'Unicode
          > detection' heuristics, although I don't know if it is
          > possible to do with 100% certainty.

          If there's a way to detect a "UTF-8 without BOM" file, I'll consider implementing it.

          > I'm a little bit confused with what you mean by the term
          > "ANSI" in this context; do you mean ASCII?

          Here, I meant an ANSI file as a "non-Unicode file".

          > How about some sort of "document is Unicode" option?

          In the latest version, 3.0, this item is called "UTF-8 without BOM".

          Don