MySQL for Python / Discussion / Help: segfault during _mysql.escape w/ Latin-1 data

Skip Montanaro - 2001-12-22

I'm using MySQLdb 0.9.0 w/ Python 2.2 and MySQL 3.23.41 on Mandrake Linux 8.1. I'm getting a segfault at cursors.py, line 67:

r = self._query(query % escape(args, qc))

when there is Latin-1 data in the args. I can reliably segfault the Python interpreter with this code:

import MySQLdb

db = MySQLdb.Connect(...)
c = db.cursor()
c.execute("select min(id) from venues where venue=%s",
          (u'Caff\xe8 Lena',))

Is this a known problem? I don't really understand Unicode manipulation very well (read: not at all), so it's quite possible that I'm doing (or not doing) something that would avoid this problem. Still, the interpreter shouldn't crash.

If you can't reproduce this, I can rebuild Python and MySQLdb with -g so I can get a gdb stack trace.

Skip Montanaro
skip@pobox.com

- Skip Montanaro - 2001-12-22
  
  Quick followup. Here's a rather suspicious
  bit of code in _mysql_string_literal:
  
      s = PyObject_Str(o);
      in = PyString_AsString(s);
  
  If I run pyo o in gdb, I see that it is a
  Unicode string:
  
      (gdb) pyo o
      object : u'Caff\xe8 Lena'
      type    : unicode
      refcount: 4
      address : 0x81e4370
  
  Calling PyString_AsString on s is probably
  not a good idea.
  
  Skip
  
  - Andy Dustman - 2001-12-22
    
    My understanding is that s = PyObject_Str(o) effectively calls str(o) on the object and thus should return a string s. Therefore it should be safe to call PyString_AsString(s).
    
    I can reproduce this with your data and Python2.2b2 and MySQLdb-0.9.1, but I need debugging symbols myself. However:
    
    >>> u=u'Caff\xe8 Lena'
    >>> str(u)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    
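    So PyObject_Str(o) fails and returns NULL here, and _mysql_string_literal needs to check for that before using s, to prevent core dumps. That check will not completely solve your problem, though: str() still fails on this value, so string_literal() will raise an exception instead of crashing.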
    
    >>> u.encode('latin1')
    'Caff\xe8 Lena'
    
    It is unfortunate that you cannot set the encoding for str() on a unicode object.
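
The difference between the two calls above can be reproduced on a current interpreter. This is a minimal sketch that assumes Python 3, where `str` plays the role of the Python 2 `unicode` type and the result of encoding is a `bytes` object; the variable names are only for illustration.

```python
# Python 3 sketch: str here corresponds to what Python 2 called unicode.
name = 'Caff\xe8 Lena'

# Encoding with the ASCII codec fails, much like the implicit str(u) above.
try:
    name.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with an explicit Latin-1 codec succeeds: one byte per character.
latin1_bytes = name.encode('latin-1')

print(ascii_ok)
print(latin1_bytes)
```

The point is that the failure comes from the choice of codec, not from the data itself: the same string encodes cleanly once Latin-1 is named explicitly.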
    
    One possibility is to add a unicode converter to MySQLdb's conversion dictionary. Another is to call u.encode('latin1') yourself before passing the value to MySQLdb. The following patch will prevent the core dump but will not fix your problem.
    
    Index: _mysql.c
    
    RCS file: /cvsroot/mysql-python/MySQLdb/_mysql.c,v
    retrieving revision 1.16
    diff -u -r1.16 _mysql.c
    --- _mysql.c    2001/10/17 03:21:22    1.16
    +++ _mysql.c    2001/12/22 19:20:14
    @@ -434,6 +434,7 @@
         int len, size;
         if (!PyArg_ParseTuple(args, "O|O:string_literal", &o, &d)) return NULL;
         s = PyObject_Str(o);
    +    if (!s) return NULL;
         in = PyString_AsString(s);
         size = PyString_GET_SIZE(s);
         str = PyString_FromStringAndSize((char *) NULL, size*2+3);
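
The second workaround mentioned above, encoding the value before it reaches MySQLdb, might look roughly like the following. This is a sketch under assumptions, written for Python 3 so that it runs on a current interpreter: `encode_params` is a hypothetical helper, not part of MySQLdb, and Latin-1 is assumed to match the encoding the table expects.

```python
# Hypothetical helper, not part of MySQLdb: encode text parameters up front
# so that the driver only ever sees byte strings.
def encode_params(params, encoding='latin-1'):
    """Return a tuple with every text value encoded to bytes."""
    return tuple(
        p.encode(encoding) if isinstance(p, str) else p
        for p in params
    )


# Non-text values such as integers pass through unchanged.
params = encode_params(('Caff\xe8 Lena', 42))
print(params)
```

In the Python 2 setup from this thread, the equivalent would be calling `u.encode('latin1')` on each unicode argument before `c.execute(...)`. A value containing characters outside Latin-1 would still raise an encoding error at this step, which is at least an exception rather than a crash.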
    
    - Skip Montanaro - 2001-12-23
      
      I just submitted a patch that seems to work if
      the data is just Latin-1.
      
      - Andy Dustman - 2001-12-23
        
        Try my patch, attached to yours.
        