
#126 crash in gdk_rgb_convert_0888

Milestone: v1.6
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-11-25
Created: 2015-08-18
Private: No

Since about version 1.5 I have been getting regular crashes when zooming and panning with a Mapnik map layer. I can reproduce this with several versions: 1.5.1, 1.6, and current git head.

This is on OpenBSD-current/amd64 with the following libraries:

  • gtk+2-2.24.28
  • gdk-pixbuf-2.30.8p1
  • pango-1.36.8
  • cairo-1.14.2
  • mapnik-2.2.0p3

Seems to be reproducible on Debian as well, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=792457 . The backtrace is the same as for me.

Here is my backtrace with current git version:

#0  gdk_rgb_convert_0888 (image_info=Variable "image_info" is not available.
) at gdkrgb.c:2146
2146              p[0] = bp2[2];
(gdb) bt
#0  gdk_rgb_convert_0888 (image_info=Variable "image_info" is not available.
) at gdkrgb.c:2146
#1  0x00001e1a1a219f57 in gdk_draw_rgb_image_core (image_info=0x1e19e2816400, drawable=0x1e19a23a6060,
    gc=0x1e1a54066240, x=0, y=87, width=70, height=256, buf=0x1e19bd81022e <Address 0x1e19bd81022e out of bounds>,
    pixstride=3, rowstride=768, conv=0x1e1a1a216600 <gdk_rgb_convert_0888>, cmap=0x0, xdith=0, ydith=0)
    at gdkrgb.c:3331
#2  0x00001e1a1a21a630 in IA__gdk_draw_rgb_image_dithalign (drawable=0x1e19a23a6060, gc=0x1e1a54066240, x=0, y=87,
    width=70, height=256, dith=GDK_RGB_DITHER_NONE, rgb_buf=0x1e19bd81022e <Address 0x1e19bd81022e out of bounds>,
    rowstride=768, xdith=0, ydith=0) at gdkrgb.c:3421
#3  0x00001e1a1a204f79 in gdk_drawable_real_draw_pixbuf (drawable=0x1e19a1710b20, gc=0x1e1a54066240,
    pixbuf=0x1e1a52dba720, src_x=186, src_y=0, dest_x=0, dest_y=87, width=70, height=256,
    dither=GDK_RGB_DITHER_NONE, x_dither=0, y_dither=0) at gdkdraw.c:1828
#4  0x00001e1a1a239afd in gdk_x11_draw_pixbuf (drawable=0x1e19a23a6060, gc=0x0, pixbuf=0x1e1a52dba720, src_x=0,
    src_y=0, dest_x=-186, dest_y=87, width=256, height=256, dither=GDK_RGB_DITHER_NONE, x_dither=0, y_dither=0)
    at gdkdrawable-x11.c:1496
#5  0x00001e17826c9dba in vik_viewport_draw_pixbuf (vvp=0x1e19e0ba1000, pixbuf=0x1e1a52dba720, src_x=0, src_y=0,
    dest_x=-186, dest_y=87, w=256, h=256) at vikviewport.c:1373
#6  0x00001e17827055d3 in maps_layer_draw_section (vml=0x1e19cf17dd40, vvp=0x1e19e0ba1000, ul=0x7f7fffff1e50,
    br=0x7f7fffff1e70) at vikmapslayer.c:1386
#7  0x00001e1782705e37 in maps_layer_draw (vml=0x1e19cf17dd40, vvp=0x1e19e0ba1000) at vikmapslayer.c:1470
#8  0x00001e17826af26d in vik_layer_draw (l=0x1e19cf17dd40, vp=0x1e19e0ba1000) at viklayer.c:260
#9  0x00001e17826cb8c3 in vik_aggregate_layer_draw (val=0x1e1a6aacf1d0, vp=0x1e19e0ba1000) at vikaggregatelayer.c:351
#10 0x00001e17826b2cd9 in vik_layers_panel_draw_all (vlp=0x1e19a49bd040) at viklayerspanel.c:570
#11 0x00001e17826b8851 in draw_redraw (vw=0x1e1a1f372050) at vikwindow.c:1178
#12 0x00001e17826b835c in draw_update (vw=0x1e1a1f372050) at vikwindow.c:1096
#13 0x00001e19c059224d in _g_closure_invoke_va () from /usr/local/lib/libgobject-2.0.so.4200.1
#14 0x00001e19c05a9baf in g_signal_emit_valist () from /usr/local/lib/libgobject-2.0.so.4200.1
#15 0x00001e19c05aad61 in g_signal_emit () from /usr/local/lib/libgobject-2.0.so.4200.1
#16 0x00001e17826b1958 in idle_draw_panel (vlp=0x1e19a49bd040) at viklayerspanel.c:256
#17 0x00001e1a16d6f0a2 in g_main_context_dispatch () from /usr/local/lib/libglib-2.0.so.4200.1
#18 0x00001e1a16d7122b in g_main_context_iterate () from /usr/local/lib/libglib-2.0.so.4200.1
#19 0x00001e1a16d721a5 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.4200.1
#20 0x00001e1a16621c91 in IA__gtk_main () at gtkmain.c:1268
#21 0x00001e178267fddd in main (argc=1, argv=0x7f7fffff26a8) at main.c:259

(gdb) list
2146              p[0] = bp2[2];
2147              p[1] = bp2[1];
2148              p[2] = bp2[0];
2149              p[3] = 0xff;
2150              bp2 += 3;
2151              p += 4;
2152            }
2153          bptr += rowstride;
2154          obuf += bpl;
2155        }

(gdb) frame 3
#3  0x00001e1a1a204f79 in gdk_drawable_real_draw_pixbuf (drawable=0x1e19a1710b20, gc=0x1e1a54066240,
    pixbuf=0x1e1a52dba720, src_x=186, src_y=0, dest_x=0, dest_y=87, width=70, height=256,
    dither=GDK_RGB_DITHER_NONE, x_dither=0, y_dither=0) at gdkdraw.c:1828
1828          gdk_draw_rgb_image_dithalign (real_drawable, gc,
(gdb) print *pixbuf
$1 = {parent_instance = {g_type_instance = {g_class = 0xdfdfdfdfdfdfdfdf}, ref_count = 3755991007,
    qdata = 0xe00000003}, colorspace = 1863087848, n_channels = 2001, bits_per_sample = 58, width = 11, height = 21,
  rowstride = 0, pixels = 0xb0000003b <Address 0xb0000003b out of bounds>, destroy_fn = 0xb5,
  destroy_fn_data = 0x1300000049, has_alpha = 1}

(gdb) info threads
  14 process 8828  0x00001e199b2bb71a in poll () at <stdin>:2
  13 process 16445  0x00001e199b2bb71a in poll () at <stdin>:2
  12 process 6364  0x00001e199b2bb71a in poll () at <stdin>:2
  11 process 16471  0x00001e199b2b83aa in write () at <stdin>:2
  10 process 23762  0x00001e199b2bb71a in poll () at <stdin>:2
  9 process 12119  0x00001e199b2bb71a in poll () at <stdin>:2
  8 process 29766  0x00001e199b2bb71a in poll () at <stdin>:2
  7 process 28707  0x00001e199b2bb71a in poll () at <stdin>:2
  6 process 30317  0x00001e199b2bb71a in poll () at <stdin>:2
  5 process 17933  0x00001e199b2bb71a in poll () at <stdin>:2
  4 process 3940  0x00001e199b29a6fa in kevent () at <stdin>:2
  3 process 30526  0x00001e199b2bb71a in poll () at <stdin>:2
  2 process 23171  0x00001e199b2bb71a in poll () at <stdin>:2
* 1 process 20101  gdk_rgb_convert_0888 (image_info=Variable "image_info" is not available.
) at gdkrgb.c:2146

This is the debug output (from viking -Vd) for this core:

* Connection #0 to host tile.openstreetmap.org left intact
** (viking): DEBUG: curl_download_uri: uri=http://tile.openstreetmap.org/17/70242/43858.png
* Found bundle for host tile.openstreetmap.org: 0x1e1a524aef40
* Re-using existing connection! (#0) with host tile.openstreetmap.org
* Connected to tile.openstreetmap.org (144.76.70.77) port 80 (#0)
> GET /17/70242/43858.png HTTP/1.1
Host: tile.openstreetmap.org
User-Agent: viking/1.6 libcurl/7.43.0 LibreSSL/2.0.0 zlib/1.2.3 libidn/1.32
Accept: */*

* Connection #0 to host tile.openstreetmap.org left intact
** (viking): DEBUG: curl_download_uri: uri=http://tile.openstreetmap.org/14/8778/5480.png
* Found bundle for host tile.openstreetmap.org: 0x1e1a524aefe0
* Re-using existing connection! (#0) with host tile.openstreetmap.org
* Connected to tile.openstreetmap.org (144.76.70.77) port 80 (#0)
> GET /14/8778/5480.png HTTP/1.1
Host: tile.openstreetmap.org
User-Agent: viking/1.6 libcurl/7.43.0 LibreSSL/2.0.0 zlib/1.2.3 libidn/1.32
Accept: */*

* Connection #0 to host tile.openstreetmap.org left intact
** (viking): DEBUG: curl_download_uri: uri=http://tile.openstreetmap.org/13/4389/2738.png
Segmentation fault (core dumped)

Discussion

  • Rob Norris

    Rob Norris - 2015-08-18
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -9,6 +9,8 @@
     - mapnik-2.2.0p3
    
     Seems to be reproducible on Debian as well, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=792457 . The backtrace is the same as for me.
    +
    +~~~~
    
     Here is my backtrace with current git version:
    
    @@ -117,4 +119,4 @@
     ** (viking): DEBUG: curl_download_uri: uri=http://tile.openstreetmap.org/13/4389/2738.png
     Segmentation fault (core dumped)
    
    -
    +~~~~
    
     
  • Rob Norris

    Rob Norris - 2015-08-18

    Try to make Markdown ignore the debugger dump text

     
  • Rob Norris

    Rob Norris - 2015-08-31

    I can't reproduce this and I don't think it is a GTK+ bug.

    One possibility could be side effects from other bugs.

    But without more ways to force the failure, this will have to stay on a 'watching brief' until more evidence is collected to pinpoint the root cause.

     
  • Guilhem BONNEFILLE

    Could it be related to #121?

     
  • Ralf Horstmann

    Ralf Horstmann - 2015-09-19

    I can still reproduce the crash with the latest code from git, which includes the patch from #121.

     
  • Szymon Bigos

    Szymon Bigos - 2015-09-22

    Hi,

    I can reproduce something very similar. The cause (in my case, and I believe here too) is the reference counter on cached pixel buffers. It always equals one even though two (or possibly more) tasks use the buffer. So when the cache is flushed the memory is deallocated, but sometimes another task is still using it.

    The attachment contains a fix for this problem. It would be good if someone reviewed it. It fixed my crash, but if the cache-related functions are used improperly, memory leaks may occur. I hope it helps.
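
The ownership rule described in this comment can be illustrated with a minimal, single-threaded sketch in plain C. It is not the attached patch and does not use Viking's or GdkPixbuf's real types: `TileBuf`, `cache_get`, `cache_flush` and the other names are hypothetical, and a real multi-threaded cache would need thread-safe counting. The idea shown is only that a lookup hands out an extra reference, so a flush cannot free memory that a drawing task still holds.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached pixel buffer (not Viking's real type). */
typedef struct {
    int refs;               /* number of current owners */
    unsigned char *pixels;  /* tile data */
} TileBuf;

static int live_bufs = 0;   /* buffers allocated and not yet freed */

static TileBuf *tile_new(size_t size) {
    TileBuf *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->pixels = calloc(size, 1);
    if (!t->pixels) { free(t); return NULL; }
    t->refs = 1;            /* the creator owns the first reference */
    live_bufs++;
    return t;
}

static TileBuf *tile_ref(TileBuf *t) {
    assert(t->refs > 0);
    t->refs++;
    return t;
}

static void tile_unref(TileBuf *t) {
    assert(t->refs > 0);
    if (--t->refs == 0) {   /* last owner gone: only now is it safe to free */
        free(t->pixels);
        free(t);
        live_bufs--;
    }
}

/* One-slot cache, enough to show the ownership rule. */
static TileBuf *cache_slot = NULL;

/* Flushing drops only the cache's own reference. */
static void cache_flush(void) {
    if (cache_slot) { tile_unref(cache_slot); cache_slot = NULL; }
}

/* The cache takes over the caller's reference. */
static void cache_put(TileBuf *t) {
    cache_flush();
    cache_slot = t;
}

/* Lookup hands out a NEW reference; the caller must tile_unref() it. */
static TileBuf *cache_get(void) {
    return cache_slot ? tile_ref(cache_slot) : NULL;
}

/* Returns 1 if a tile stays usable across a flush and is freed afterwards. */
static int demo_flush_while_in_use(void) {
    TileBuf *made = tile_new(256 * 256 * 3);
    if (!made) return 0;
    cache_put(made);
    TileBuf *drawing = cache_get();   /* refs == 2 */
    cache_flush();                    /* refs == 1, memory still valid */
    int usable = (live_bufs == 1 && drawing->refs == 1);
    drawing->pixels[0] = 0xff;        /* would be a use-after-free without the extra ref */
    tile_unref(drawing);              /* refs == 0, freed here */
    return usable && live_bufs == 0;
}
```

Calling `demo_flush_while_in_use()` walks through the sequence from the report: the buffer is looked up for drawing, the cache is flushed, and the buffer is released only after the drawer drops its reference.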

     
  • Rob Norris

    Rob Norris - 2015-09-23

    Great work and analysis, Szymon. This seems to be another subtle flaw in the in-memory tile cache that you've managed to understand and work out.

    Anyway IMHO some memory leakage is better than randomly crashing.

    I'll continue using the patch this week and over the weekend, and if there are no problems (I can't foresee any) I'll commit it then.
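
The trade-off mentioned in this thread (a possible leak instead of a crash) can be made concrete with a small plain-C sketch. All names here (`Buf`, `buf_ref`, `buf_unref`, `leaked_after_use`) are hypothetical and are not taken from Viking or the patch. The sketch only shows that when a user of a reference-counted buffer forgets its release call, the buffer stays allocated rather than being freed while still in use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer; names are illustrative only. */
typedef struct { int refs; unsigned char *pixels; } Buf;

static int live = 0;  /* buffers allocated minus buffers freed */

static Buf *buf_new(size_t n) {
    Buf *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->pixels = malloc(n);
    if (!b->pixels) { free(b); return NULL; }
    b->refs = 1;
    live++;
    return b;
}

static void buf_ref(Buf *b) { assert(b->refs > 0); b->refs++; }

static void buf_unref(Buf *b) {
    assert(b->refs > 0);
    if (--b->refs == 0) { free(b->pixels); free(b); live--; }
}

/* Simulates one user of a buffer. If forget_unref is non-zero, the user
 * never releases its reference. Returns the number of buffers still
 * allocated after the owner (for example a cache) has dropped its own
 * reference, or -1 if allocation failed. */
static int leaked_after_use(int forget_unref) {
    Buf *b = buf_new(16);
    if (!b) return -1;
    buf_ref(b);                       /* user takes a reference */
    b->pixels[0] = 1;                 /* user works with the data */
    if (!forget_unref) buf_unref(b);  /* the well-behaved user releases it */
    buf_unref(b);                     /* owner drops its reference */
    int remaining = live;
    if (remaining) buf_unref(b);      /* clean up so the demo itself does not leak */
    return remaining;
}
```

With a balanced ref/unref pair the function reports no remaining buffers; with the release omitted it reports one buffer still allocated, which is the leak case rather than a use of freed memory.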

     
  • Ralf Horstmann

    Ralf Horstmann - 2015-09-23

    The patch fixes the crash for me. I'm not able to reproduce anymore with the patch applied. Thanks a lot!

     
  • Rob Norris

    Rob Norris - 2015-09-26

    Patch applied to master code repository

     
  • Rob Norris

    Rob Norris - 2015-09-26
    • status: open --> pending-fixed
     
  • Rob Norris

    Rob Norris - 2015-11-25

    Crash in gdk_rgb_convert_0888 due to use of deallocated memory.

    Cached pixel buffers carry a reference counter.
    Previously it always equalled one, although multiple tasks could be using the buffer.
    Thus when the cache was flushed, the memory was always deallocated, but another task could then attempt to use it and crash.

    Now each use of a cached pixel buffer is tracked properly, with unref() called after use,
    so the buffer is deallocated automatically only when the reference count reaches zero.

     
  • Rob Norris

    Rob Norris - 2015-11-25
    • status: pending-fixed --> closed-fixed
     
