Re: [Modeling-users] type('') and unicode


Yannick Gingras <yan...@sa...> writes:

> I stumbled upon some "if type(foo)==type(''):" in the code
> (grep -r -E "type\(''\)" .).  This fails to match unicode, which behaves
> like a string but is not the same type:
>
>   >>> type(u'') == type('')
>   0
>
> Is this some kind of obscure feature?
>
> This is used in the Qualifier code and it seems likely to me that
> someone will eventually try to make a fetch with unicode.  In fact I
> might just try that right now...
>
>   UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> Argh!

Funny... There's a message from you in the archives (20 Apr 2003, in the
thread named "Working with unicode"; I remember Mario also discussed this
there) suggesting that this was working... Did I misunderstand what you
were saying?

  In fact, this surprised me a lot, since I've never done anything in
  particular to support unicode. Python's unicode support was not
  particularly wonderful when I looked at it a year and a half ago: I
  had to dive into the code to find the encoders/decoders, the
  documentation was almost nonexistent, and in the end everything was
  muddled in my mind. So I just didn't care, and to be honest I never
  needed it (except for xml models, because we were putting latin1
  characters in them at that time).

> there is "type(foo) in (type(''), type(u''))" or I could encode my
> query or who knows what.  Since some RDBMS (aka MySQL) choke on
> unicode, maybe it would be best to have every query encoded in utf-8
> but I prefer to have your opinion 1st.

  As you can see, my opinion is that I have no opinion :/ I don't even
  know how the different databases *and* the different python db-adaptors
  behave, and I must admit that I do not really want to look into that.
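
  For reference, here is a minimal sketch of the "encode the query in
  utf-8" idea, not something the framework does today. The helper name
  encode_for_db is hypothetical, and whether a given db-adaptor actually
  wants utf-8 byte strings is an assumption that would have to be checked
  adaptor by adaptor. The sketch also shows what is probably behind the
  quoted error: encoding non-ASCII text with the ascii codec.

```python
# Hypothetical helper for illustration only; not part of the framework.
try:
    text_type = unicode      # Python 2: unicode is a separate type
except NameError:
    text_type = str          # Python 3: str is already unicode text


def encode_for_db(value, encoding='utf-8'):
    """Encode unicode text to bytes; pass any other value through unchanged."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


name = u'S\xe9bastien'       # contains one non-ASCII character

# Encoding with the ascii codec fails for non-ASCII characters.
try:
    name.encode('ascii')
    ascii_failed = False
except UnicodeError:
    ascii_failed = True

# Explicit utf-8 encoding succeeds and yields a byte string.
utf8_name = encode_for_db(name)
```

  Encoding explicitly at one well-defined place (assumed here to be just
  before the value reaches the adaptor) avoids relying on the implicit
  ascii conversion.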

  You're right that such tests for strings should be made against both
  regular and unicode strings, but I suspect this is only the easiest
  part of it. That's just my opinion, though... I've been bitten by
  unicode too hard to be really objective about it.
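
  A minimal sketch of such a test, assuming a small helper (the name
  is_string is hypothetical, not something in the framework) that could
  replace the "type(foo)==type('')" checks:

```python
# Hypothetical helper for illustration only; not part of the framework.
try:
    string_types = (str, unicode)   # Python 2: regular and unicode strings
except NameError:
    string_types = (str,)           # Python 3: str is the only text type


def is_string(value):
    """Return True for regular and unicode strings, including subclasses."""
    return isinstance(value, string_types)
```

  Unlike comparing type() results, isinstance also accepts subclasses of
  the string types. On Python 2.3 and later, isinstance(value, basestring)
  gives the same answer without the tuple.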

So if you feel like looking at these things and summarizing them (either
by proposing a procedure for using unicode w/ the framework, or by
submitting patches), I'll be happy to collaborate to the best of my
knowledge. Again, that means mainly my knowledge of the framework,
because my unicode background is something like... empty...

  Others interested in this topic may react here too.

        Regards,

-- Sébastien.