
#142 INI Edits/tweaks must be in ANSI (UTF-8 with BOM fails)

Current
closed
nobody
5
2014-08-16
2012-01-16
afaern
No

I don't know how much of an issue this really is, since most INI tweaks target Oblivion.ini and its variable names are all ANSI. But if you save an INI tweak as UTF-8 with a BOM, you end up with what you see in the attached image. (The inability to cut and paste from that pane may be the subject of a future feature request.)

There are two reasons this likely slipped through the Unicode conversion. The first is a pair of circumstances related to the encoding itself:
1) Most INI files use only plain ASCII characters (the range that ANSI and UTF-8 share)
2) A UTF-8 file without a BOM that contains only ASCII characters is byte-for-byte identical to the same file saved as ANSI
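The second point can be checked directly. This is a minimal sketch, assuming "ANSI" here means Windows-1252 (Python's `cp1252` codec), which is an assumption about the system code page; the sample text reuses the settings from this report and the variable names are only illustrative.

```python
# Assumption: "ANSI" means Windows-1252 (cp1252).
import codecs

text = u"[General]\nuGridDistantCount=25\n"

as_ansi = text.encode("cp1252")
as_utf8 = text.encode("utf-8")            # no BOM
as_utf8_bom = codecs.BOM_UTF8 + as_utf8   # BOM prepended by hand

# ASCII-only content: both encodings produce identical bytes.
same_without_bom = (as_ansi == as_utf8)

# With a BOM the file gains three leading bytes that an ANSI reader treats as text.
leading = as_utf8_bom[:len(codecs.BOM_UTF8)]
seen_by_ansi_reader = leading.decode("cp1252")
```

Read as cp1252, those three bytes come out as the characters U+00EF, U+00BB and U+00BF, which is the usual sign of an unhandled UTF-8 BOM.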

The other is a pair of Python issues:
1) A BOM is not inserted at the beginning of a Unicode file Python writes unless you tell it to (i.e. file.write( codecs.BOM_UTF8 ))
2) More importantly, Python screws up when handed a UTF-8 stream with a BOM. Try this:

import codecs
codecs.BOM_UTF16.decode("utf16")

The result is the BOM properly stripped (i.e. u''). Doing the same with UTF-8, i.e. codecs.BOM_UTF8.decode("utf8"), results in u'\ufeff'
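For comparison, here is a small sketch of the decodes side by side, written so that it runs on Python 3 as well as 2.7. It also includes the standard library's `utf-8-sig` codec, which is not part of the original test; the variable names are only illustrative.

```python
import codecs

# The UTF-16 codec consumes the BOM in order to detect the byte order.
utf16_result = codecs.BOM_UTF16.decode("utf-16")

# The plain UTF-8 codec treats the BOM as an ordinary character (U+FEFF).
utf8_result = codecs.BOM_UTF8.decode("utf-8")

# The "utf-8-sig" codec strips a leading BOM if one is present.
utf8_sig_result = codecs.BOM_UTF8.decode("utf-8-sig")

print(repr(utf16_result), repr(utf8_result), repr(utf8_sig_result))
```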

I created a test INI programmatically in Python to verify that it wasn't a bug in Notepad++:

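import codecs

# Raw string, so that the '\t' in the path is not read as a tab character.
path = r'C:\games\Oblivion\Data\INI Tweaks\testfile.ini'
ini = open(path, 'wb')
ini.write(codecs.BOM_UTF8)  # the BOM has to be written explicitly
ini.write(u'[General]\n'.encode('utf-8'))
ini.write(u'uGridDistantCount=25\n'.encode('utf-8'))
ini.write(u'uGridDistantTreeRange=25\n'.encode('utf-8'))
ini.close()  # close() also flushes; note the parentheses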

Results were the same as the picture

Discussion

  • afaern

    afaern - 2012-01-16

    BOM characters at beginning

     
  • afaern

    afaern - 2012-01-16

    err, should have mentioned, using r2208 on Win7, Python 2.7.2

     
  • Jacob Lojewski

    Jacob Lojewski - 2012-01-16

    Actually, Python can handle UTF-8 with BOM fine (reading and writing), you just have to use the "utf-8-sig" codec when opening it for writing:

    import codecs

    with codecs.open("testfile.ini", "w", "utf-8-sig") as ini:
        ini.write(u"[General]\n")  # stand-in for "do stuff"; the codec emits the BOM first

    The thing with INI tweaks is - Oblivion.ini/Skyrim.ini must NOT be in UTF-8 format. They need to be in ANSI (ASCII + extended characters) for Skyrim/Oblivion to read them correctly. Yes, UTF-8 without BOM is almost the same thing, until you start using the extended characters (accented vowels, for example).
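To illustrate where the two encodings part ways, here is a minimal sketch. It assumes "ANSI" means Windows-1252 (`cp1252`), and the setting name `sTestName` is made up for the example rather than taken from either game's INI.

```python
# Assumption: "ANSI" is Windows-1252 (cp1252); sTestName is a hypothetical key.
line = u"sTestName=caf\u00e9\n"

ansi_bytes = line.encode("cp1252")
utf8_bytes = line.encode("utf-8")

# The accented vowel is one byte in cp1252 but two bytes in UTF-8.
differs = (ansi_bytes != utf8_bytes)

# What an ANSI-only reader would see if handed the UTF-8 bytes.
misread = utf8_bytes.decode("cp1252")
```

Here `misread` ends in two characters (U+00C3, U+00A9) where the original had a single accented vowel, which is the kind of corruption an ANSI-only reader would produce.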

    Anyway, the point is: currently the same code used for parsing Oblivion.ini/Skyrim.ini is the code used for parsing the INI Tweaks. So I'll have to update this to work slightly differently.

     
  • Jacob Lojewski

    Jacob Lojewski - 2012-01-16

    Also, if the file is opened for reading using the 'utf-8-sig' codec, the BOM will be stripped automatically if present.
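A short sketch of that reading behaviour, using temporary files so it can be run anywhere. `read_ini_text` is a hypothetical helper written for this example, not a function from the tool's code base.

```python
import codecs
import io
import os
import tempfile

def read_ini_text(path):
    """Read a file as UTF-8, dropping a leading BOM if there is one."""
    with io.open(path, "r", encoding="utf-8-sig") as handle:
        return handle.read()

# Write the same content twice, once with a BOM and once without.
body = u"[General]\nuGridDistantCount=25\n".encode("utf-8")
results = []
for prefix in (codecs.BOM_UTF8, b""):
    fd, path = tempfile.mkstemp(suffix=".ini")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(prefix + body)
        results.append(read_ini_text(path))
    finally:
        os.remove(path)
```

Both entries in `results` hold the same text, so a reader built this way would not care whether the tweak was saved with a BOM.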

     
  • Arthmoor

    Arthmoor - 2012-12-23
    • labels: Other --> Other, ini tweaks
    • status: open --> closed
    • milestone: --> Current
     
  • Arthmoor

    Arthmoor - 2012-12-23

    Closing. INI Tweaks in UTF-8 will need to be supplied without a BOM marker.
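For tweak authors whose editor adds a BOM anyway, one way to meet this requirement is to strip it from the file's bytes before distributing the tweak. This is a minimal sketch; `strip_utf8_bom` is a hypothetical helper, not something provided by the tool.

```python
import codecs

def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM; all other bytes are untouched."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

with_bom = codecs.BOM_UTF8 + b"[General]\nuGridDistantCount=25\n"
cleaned = strip_utf8_bom(with_bom)
```

Calling the helper on data that has no BOM returns it unchanged, so it is safe to apply to every tweak file.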

     
  • tox2ik

    tox2ik - 2013-06-30

    [quote]
    The thing with INI tweaks is - Oblivion.ini/Skyrim.ini must NOT be in UTF-8 format. They need to be in ANSI (ASCII + extended characters) for Skyrim/Oblivion to read them correctly.
    [unquote]

    What makes you say that? I changed the default encoding of class OblivionINI to utf-8 in revisions 3001-3002. As a sanity check beforehand, I saved skyrim.ini and oblivion.ini with Japanese glyphs in them. The games read the UTF-8 INI files and loaded "just fine". If INI files really cannot be in UTF-8, then any INIs containing Unicode must be rejected by the install function of class InstallersData in bosh.py.

     
