Re: [sqlmap-users] Back-end DBMS charset encoding
From: Miroslav S. <mir...@gm...> - 2011-01-19 14:30:47
addendum: the simplest explanation for the "priority among all charsets is the encoding of the web page" is that, as we need to choose one, let it be the most obvious one :)))

On Wed, Jan 19, 2011 at 3:25 PM, Miroslav Stampar <mir...@gm...> wrote:
> hi all.
>
> as i was really interested in this issue i had to set up a testing
> environment to find out what's going on :)))
>
> i've chosen the simplest (disposable) testing environment: XAMPP
>
> two tables: users_utf8 & users_latin
> two vulnerable GET pages: get_int_utf8.php & get_int_latin.php
>
> well, conclusion and my answer to the given question: "What should
> be the general "consensus" for data retrieval":
>
> priority among all charsets is the encoding of the web page, and
> that's because of three reasons:
>
> 1) connection from the web server to the backend DBMS will almost
> certainly be set to some charset "compatible" with the one at the page
> itself - that means that all the data from DBMS to the web server will
> be automatically converted to connection's charset
> 2) once the web server has replied with the data, in case that the
> data is not compatible with its current character set it will in most
> cases just do a simple replacement with '?' for problematic characters
> (like in case from latin1 -> utf8) - which means a big screw up for
> our data in "error" and "union" techniques as the data is irreversibly
> lost
> 3) finding out "proper" collation is futile in the sense that in MySQL
> for example you can set a collation on everything (column, table,
> connection, user, ...), and there is no "magic" bullet to know the
> final collation of the retrieved data in a "time constrained" manner.
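A minimal sketch of point 2 above, assuming Python 3 and a made-up sample value; it is not sqlmap code. Python's `errors="replace"` handler stands in for the '?' substitution described in the quoted message, and the lossy direction is assumed to be Cyrillic text squeezed into a latin1 connection. The variable names are hypothetical.

```python
# Hypothetical illustration, not sqlmap code.
original = "Иван"  # assumed sample value stored in a utf8 table

# A latin1 connection cannot represent Cyrillic, so every character is
# replaced with '?' before the page is even built.
over_latin1 = original.encode("latin-1", errors="replace")
print(over_latin1)  # b'????'

# Whatever charset we pick when decoding, the original text is gone.
for charset in ("latin-1", "utf-8", "cp1251"):
    assert over_latin1.decode(charset) != original

# When the connection and the page agree on a charset (cp1251 here, the
# encoding mentioned elsewhere in this thread), the bytes survive the trip
# and decoding with the page charset recovers the data.
over_cp1251 = original.encode("cp1251")
recovered = over_cp1251.decode("cp1251")
print(recovered == original)  # True

# Decoding the same bytes with a different charset gives wrong text,
# which is why the page encoding is tried first.
wrong = over_cp1251.decode("latin-1")
assert wrong != original
```

The first half shows why "error" and "union" output cannot be repaired afterwards: the replacement happens before the data reaches us. The second half shows why treating retrieved data with the page's own encoding is the most likely guess to work.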
>
> interesting thing that should be pointed out is that you'll most
> probably have problems with character sets of retrieved data here and
> there for one obvious reason:
> web page's connection to the backend DBMS dictates character set used
> for retrieved data, we "violently" use it in sql injection attacks for
> different tables with different character sets/collations which were
> most probably not "meant" to be "compatible" with web page itself,
> hence you'll lose information irreversibly during the conversion
> process.
>
> kr
>
> On Tue, Jan 18, 2011 at 12:13 PM, mitchell <mit...@tu...> wrote:
>> Will do :)
>>
>> # mitchell
>>
>> On 18 Jan 2011 13:11, "Miroslav Stampar" <mir...@gm...> wrote:
>>> hi mitchell.
>>>
>>> thank you for your answer. i thought that nobody would :)
>>>
>>> we've done some serious work these days in this field and would like
>>> to have it "stabilized". plz report any "strange" behavior in this
>>> field if you encounter it.
>>>
>>> kr
>>>
>>> On Tue, Jan 18, 2011 at 12:01 PM, mitchell <mit...@tu...> wrote:
>>>> Hi Miroslav,
>>>>
>>>> In say 80% of the cases I dealt with Bulgarian sites, the data in the
>>>> database used the same encoding as the encoding announced on the webpage,
>>>> usually CP-1251. The rest use UTF.
>>>>
>>>> # mitchell
>>>>
>>>> On 17 Jan 2011 16:52, "Miroslav Stampar" <mir...@gm...>
>>>> wrote:
>>>>> Hi all.
>>>>>
>>>>> I have a general question to all those pentesters that are retrieving
>>>>> data from sites with "funny" charset encodings (...russian, chinese...).
>>>>>
>>>>> What should be the general "consensus" for data retrieval:
>>>>>
>>>>> A) assume that the backend DBMS uses the "utf8" charset encoding
>>>>> or
>>>>> B) treat data retrieved with the same encoding as used in the page
>>>>> or
>>>>> C) find out the proper collation used and use that one?
>>>>> (i am not a fan of this one :)
>>>>> or
>>>>> D) don't care (some people tend to use mixed collations which is quite
>>>>> romantic)
>>>>>
>>>>> Also, I would like to ask you all to try out the latest revision with
>>>>> cases that could be problematic and report impressions.
>>>>>
>>>>> Kind regards

--
Miroslav Stampar

E-mail / Jabber: miroslav.stampar (at) gmail.com
Mobile: +385921010204 (HR 0921010204)
PGP Key ID: 0xB5397B1B
Location: Zagreb, Croatia