segfault during _mysql.escape w/ Latin-1 data

Help
2001-12-22
2012-09-19
    Skip Montanaro - 2001-12-22

    I'm using MySQLdb 0.9.0 w/ Python 2.2 and MySQL 3.23.41 on Mandrake Linux 8.1.  I'm getting a segfault at cursors.py, line 67:

      r = self._query(query % escape(args, qc))

    when there is Latin-1 data in the args.  I can reliably segfault the Python interpreter with this code:

    import MySQLdb
    db = MySQLdb.Connect(...)
    c = db.cursor()
    c.execute("select min(id) from venues where venue=%s",
              (u'Caff\xe8 Lena',))

    Is this a known problem?  I don't really understand Unicode manipulation very well (read: not at all), so it's quite possible that I'm doing (or not doing) something that would avoid this problem.  Still, the interpreter shouldn't crash.

    If you can't reproduce this, I can rebuild Python and MySQLdb with -g so I can get a gdb stack trace.

    Skip Montanaro
    skip@pobox.com

     
    • Skip Montanaro

      Skip Montanaro - 2001-12-22

      Quick followup.  Here's a rather suspicious
      bit of code in _mysql_string_literal:

          s = PyObject_Str(o);
          in = PyString_AsString(s);

      If I run pyo on o in gdb, I see that it is a Unicode
      string:

          (gdb) pyo o
          object  : u'Caff\xe8 Lena'
          type    : unicode
          refcount: 4
          address : 0x81e4370

      Calling PyString_AsString on s is probably
      not a good idea.

      Skip

       
      • Andy Dustman

        Andy Dustman - 2001-12-22

        My understanding is that s = PyObject_Str(o) effectively calls str(o) on the object and thus should return a string s. Therefore it should be safe to call PyString_AsString(s).

        I can reproduce this with your data, Python 2.2b2, and MySQLdb-0.9.1, but I need debugging symbols myself. However:

        >>> u=u'Caff\xe8 Lena'
        >>> str(u)
        Traceback (most recent call last):
          File "<stdin>", line 1, in ?
        UnicodeError: ASCII encoding error: ordinal not in range(128)

        So PyObject_Str(o) can fail and return NULL, and I will need a check in _mysql_string_literal to prevent core dumps. This will not completely solve your problem, because str() will still not work and thus string_literal() will fail with an exception instead of crashing.

        >>> u.encode('latin1')
        'Caff\xe8 Lena'

        It is unfortunate that you cannot set the encoding for str() on a unicode object.

        One possibility is to add a unicode converter to MySQLdb's conversion dictionary. Another is to u.encode('latin1') before passing this value to MySQLdb. The following patch will prevent a core dump but not fix your problem.

        Index: _mysql.c

        RCS file: /cvsroot/mysql-python/MySQLdb/_mysql.c,v
        retrieving revision 1.16
        diff -u -r1.16 _mysql.c
        --- _mysql.c    2001/10/17 03:21:22    1.16
        +++ _mysql.c    2001/12/22 19:20:14
        @@ -434,6 +434,7 @@
             int len, size;
             if (!PyArg_ParseTuple(args, "O|O:string_literal", &o, &d)) return NULL;
             s = PyObject_Str(o);
        +    if (!s) return NULL;
             in = PyString_AsString(s);
             size = PyString_GET_SIZE(s);
             str = PyString_FromStringAndSize((char *) NULL, size*2+3);
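
        The second workaround mentioned above, encoding the value before it reaches MySQLdb, could look roughly like the sketch below. encode_args is a hypothetical helper, not part of MySQLdb, and the latin1 default is only an assumption that matches the data in this report. Characters that do not fit the chosen encoding will still raise UnicodeError.

```python
def encode_args(args, encoding="latin1"):
    """Return args as a tuple, with Unicode strings encoded to byte strings.

    Hypothetical helper for illustration only; not part of MySQLdb.
    """
    try:
        text_type = unicode  # Python 2: unicode is a separate type
    except NameError:
        text_type = str      # Python 3: str is already Unicode text
    encoded = []
    for a in args:
        if isinstance(a, text_type):
            # Encode explicitly instead of relying on str(), which uses ASCII
            encoded.append(a.encode(encoding))
        else:
            # Numbers, None, and byte strings pass through unchanged
            encoded.append(a)
    return tuple(encoded)


args = encode_args((u'Caff\xe8 Lena',))
# The encoded tuple would then be passed in place of the original one:
# c.execute("select min(id) from venues where venue=%s", args)
```

        This keeps the encoding decision in the caller's hands, so str() is never asked to guess it.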

         
        • Skip Montanaro

          Skip Montanaro - 2001-12-23

          I just submitted a patch that seems to work if
          the data is just Latin-1.

           
          • Andy Dustman

            Andy Dustman - 2001-12-23

            Try my patch, attached to yours.

             
