From: Brian W. <bri...@gm...> - 2009-11-14 05:21:55
|
On Sat, Nov 14, 2009 at 1:44 AM, Gustavo Sverzut Barbieri <bar...@pr...> wrote: > On Fri, Nov 13, 2009 at 3:13 PM, Brian Wang <bri...@gm...> wrote: >> On Fri, Nov 13, 2009 at 8:12 PM, Gustavo Sverzut Barbieri >> <bar...@pr...> wrote: >>> On Fri, Nov 13, 2009 at 4:49 AM, Brian Wang <bri...@gm...> wrote: >>>> [snip] >>>>> OK. With the help of gdbserver/gdb, I am able to find where it's looping. >>>>> However, I don't know what causes it. Here it goes: >>>>> [svn r43601] >>>>> evas/src/lib/canvas/evas_object_textblock.c:3152 >>>> >>>> The line number is off... I've put some of my stuff in the front of >>>> the source code... My bad. >>>> >>>>> evas_object_textblock_text_markup_get() calls >>>>> escape = _escaped_char_match(p, &adv); >>>>> _escaped_char_match() returns "" and 'adv' is set to 0 >>>>> thus the looping. >>>>> >>>>> Here is the gdb backtrace: >>>>> -------------------------------------- >>>>> (gdb) bt >>>>> #0 _escaped_char_match (s=0x780a1 "追 ", adv=0xbec8c490) >>>>> at evas_object_textblock.c:2732 >>>>> #1 0x4008338c in evas_object_textblock_text_markup_get (obj=0x75380) >>>>> at evas_object_textblock.c:3152 >>>>> #2 0x40826be0 in _edje_part_recalc_single (ed=0x0, ep=0x0, desc=0x75ae0, >>>>> chosen_desc=0x40300000, rel1_to_x=0x0, rel1_to_y=0x40884c2c, >>>>> rel2_to_x=0x3, rel2_to_y=0x751c0, confine_to=0x0, params=0x749ac, flags=3) >>>>> at edje_calc.c:651 >>>>> #3 0x408271c8 in _edje_part_recalc (ed=0x10, ep=0x748bc, flags=1) >>>>> at edje_calc.c:1721 >>>>> #4 0x40828e10 in _edje_recalc_do (ed=0x780a1) at edje_calc.c:224 >>>>> #5 0x408473b4 in edje_object_size_min_restricted_calc ( >>>>> obj=<value optimized out>, minw=0x0, minh=0x4025e27c, restrictedw=0, >>>>> restrictedh=-1) at edje_util.c:2362 >>>>> #6 0x40847618 in edje_object_size_min_calc (obj=0x780a1, minw=0xbec8c490, >>>>> minh=0x40144860) at edje_util.c:2311 >>>>> #7 0x40210a6c in _sizing_eval (obj=0x5d060) at elm_label.c:55 >>>>> #8 0x40210e10 in elm_label_label_set (obj=0x5d060, label=0x8c88 " 追 ") >>>>> at elm_label.c:121 >>>>> #9 0x00008b1c in elm_main (argc=1, argv=0xbec8cd04) at elm-label-bug-test.c:36 >>>>> #10 0x00008bac in main (argc=1, argv=0xbec8cd04) at elm-label-bug-test.c:64 >>>>> -------------------------------------- >>>>> >>>>> I don't know what's special about the string that makes it end up the condition. >>>>> Checking if (strlen(escape)==0 && adv==0) seems to terminate the loop. >>>>> But I totally have no clue what's going on here... Fixing it without >>>>> knowing what's causing the condition is wrong. >>>>> >>>>> The string "追 " in UTF-8 is of value: 0xe8 0xbf 0xbd 0x20 >>>>> >>>>> Is the info above enough to track down the problem? >>>> >>>> I've tracked it down a bit. >>>> On x86, via gdb, >>>> ------------------------ >>>> (gdb) p escape_strings[sizeof(escape_strings)] >>>> $4 = 102 'f' >>>> ------------------------ >>>> >>>> On my arm, via gdb, >>>> ------------------------ >>>> (gdb) p escape_strings[sizeof(escape_strings)] >>>> $12 = 0 '\0' >>>> ------------------------ >>> >>> As you said below, it is invalid read. >>> >>> >>>> At the very last check of >>>> while ((*mc) && (*sc)), >>>> map_itr is equal to map_end and the difference is the on x86 *mc is >>>> non-zero and on my arm, *mc is zero. >>>> Therefore, on arm, the while loop is skipped all together and 'match' >>>> is still 1. And hence the looping... >>>> >>>> sizeof(escape_strings) on both platforms are the same: 1551. >>>> I would say that it is invalid read for both cases. The x86 case gets >>>> lucky and got away with it. >>>> I'm surprised that valgrind did not catch this. Or maybe I'm wrong... >>>> char a[] = "s"; >>>> sizeof(a) == 2 >>>> accessing a[2] is out of bounds >>> >>> this is correct. As why it does not warn, maybe there is another valid >>> string allocated right after, thus you end reading something valid and >>> valgrind will not warn you. >> >> I guess so too. I thought valgrind is pretty good at catching >> out-of-bounds accesses.. > > well, there are some flags to make GCC produce code with garbage > before and after arrays, I guess it's fortify source or something like > that. mudflap should help as well. > > >>>> We may check if (map_itr < map_end) after the first >>>> _advance_after_end_of_string() inside _escaped_char_match() >>>> or we may decrease map_end by 1 since every escape character is >>>> already terminated by a null character. >>>> Thus, my proposed patch (sorry for the bad part at the front, which is >>>> only for my own usage): >>>> ------------------------------------------------ >>>> Index: src/lib/canvas/evas_object_textblock.c >>>> =================================================================== >>>> --- src/lib/canvas/evas_object_textblock.c (revision 43601) >>>> +++ src/lib/canvas/evas_object_textblock.c (working copy) >>>> @@ -1608,6 +1608,18 @@ >>>> _layout_word_start(char *str, int start) >>>> { >>>> int p, tp, chr = 0; >>>> + >>>> +#if 1 >>>> + // >>>> + // coolbrian: break if the word is not within the ASCII range >>>> + // @note This is good for breaking up Chinese words, which are >>>> made of Chinese characters. >>>> + // Chinese characters do not look strange if they are not >>>> grouped to form a 'phrase'. >>>> + // Also, Chinese 'phrases' are of too many patterns and >>>> probably need a dictionary to look up. >>>> + // That would be too much. >>>> + // >>>> + if (((unsigned char)str[start]) >= 0x80) >>>> + return start; >>>> +#endif >>>> >>>> p = start; >>>> chr = evas_common_font_utf8_get_next((unsigned char *)(str), &p); >>>> @@ -2709,7 +2721,7 @@ >>>> const char *map_itr, *map_end, *mc, *sc; >>>> >>>> map_itr = escape_strings; >>>> - map_end = map_itr + sizeof(escape_strings); >>>> + map_end = map_itr + sizeof(escape_strings) - 1; >>>> >>>> while (map_itr < map_end) >>> >>> It's weird: >>> >>> map_end = map_itr + sizeof(escape_strings) >>> >>> is fine, as we compare for less than, so this address would never be >>> used. The last run should be on escape_strings: >>> "⊥\0" "\xe2\x8a\xa5\0" >>> >>> Thus: >>> >>> while (map_itr < map_end) >>> >>> --> yes, we're at map_itr = ⊥\0 >>> >>> >>> { >>> const char *escape; >>> int match; >>> >>> escape = map_itr; >>> >>> --> escape = ⊥\0 >>> >>> >>> _advance_after_end_of_string(&map_itr); >>> mc = map_itr; >>> >>> --> map_itr = \xe2\x8a\xa5\0 >>> >>> >>> sc = s; >>> match = 1; >>> while ((*mc) && (*sc)) >>> { >>> >>> --> enters >>> >>> >>> if ((unsigned char)*sc < (unsigned char)*mc) return NULL; >>> >>> --> I have no clue why this >> >> I don't either... >> >>> >>> >>> if (*sc != *mc) match = 0; >>> >>> --> fails, thus match = 0 >>> >>> >>> mc++; >>> sc++; >>> } >>> if (match) >>> >>> --> skipped, as match = 0 >>> >>> { >>> *adv = mc - map_itr; >>> return escape; >>> } >>> _advance_after_end_of_string(&map_itr); >>> >>> --> map_itr was \xe2\x8a\xa5\0, thus it's now one byte after, thus >>> map_itr == escape_strings + sizeof(escape_strings) and thus the next >>> while (map_itr < map_end) will fail. >> >> No. map_itr == escape_strings + sizeof(escape_strings) -1 >> Thus the next while (map_itr < map_end) will pass. > > So that's the bug. If map_itr was \xe2\x8a\xa5\0 and > _advance_after_end_of_string(&map_itr); looks for the first \0 (the > last char) then increments one, we clearly should be correct! :-( > > >> If we simplify the string a bit: >> char escape_strings[] = "\xe2\x8a\0"; // sizeof it == 4 >> say escape_strings is at address 0x0 >> map_itr = 0; >> map_end = 0 + 4 = 4; >> _advance_after_end_of_string(&map_itr); >> map_itr=3; > > what?! how advance after of string is returning 3??? > it should read 3 times, stop since it found 0, then increment one... > please debug that one :-) static void _advance_after_end_of_string(const char **p_buf) { while (**p_buf != 0) (*p_buf)++; (*p_buf)++; } After the while loop, *p_buf=2 (read 3 times but the 3rd time does not increment it). The second line would make it 3. Please have a look at the attached file. > > > -- > Gustavo Sverzut Barbieri > http://profusion.mobi embedded systems > -------------------------------------- > MSN: bar...@gm... > Skype: gsbarbieri > Mobile: +55 (19) 9225-2202 > -- brian ------------------ Cool-Karaoke - The smallest recording studio, in your palm, open-sourced http://cool-idea.com.tw/ iMaGiNaTiOn iS mOrE iMpOrTaNt tHaN kNoWlEdGe |