From: Дмитрий Д. <di...@gm...> - 2012-03-24 17:54:41
Hello!
I want to add four new function replacements, wcscmp/wcsncmp/wcslen/wcsnlen, to Memcheck.
I have some questions:
1. Would such an addition be accepted? Or are there caveats with
wide-character (wchar_t) functions?
2. Is mc_replace_strmem.c the correct file for such an addition?
3. Would it be correct to use 20360, 20370, 20380 and 20390 for
wcscmp/wcsncmp/wcslen/wcsnlen?
The comment in mc_replace_strmem.c says:
-- begin --
Assignment of behavioural equivalence class tags: 2NNNP is intended
to be reserved for Memcheck. Current usage:
20010 STRRCHR
...skip...
20350 STRCASESTR
-- end --
A naive implementation of wcscmp() works under Fedora 16/x64/gcc-4.6.3 (at
least it looks that way :) ):
#include <wchar.h>
#define WCSCMP(soname, fnname) \
   int VG_REPLACE_FUNCTION_EZU(20360,soname,fnname) \
          ( const wchar_t* s1, const wchar_t* s2 ); \
   int VG_REPLACE_FUNCTION_EZU(20360,soname,fnname) \
          ( const wchar_t* s1, const wchar_t* s2 ) \
   { \
      register wchar_t c1; \
      register wchar_t c2; \
      while (True) { \
         c1 = *(wchar_t *)s1; \
         c2 = *(wchar_t *)s2; \
         if (c1 != c2) break; \
         if (c1 == 0) break; \
         s1++; s2++; \
      } \
      if ((wchar_t)c1 < (wchar_t)c2) return -1; \
      if ((wchar_t)c1 > (wchar_t)c2) return 1; \
      return 0; \
   }
#if defined(VGO_linux)
 WCSCMP(VG_Z_LIBC_SONAME, wcscmp)
 WCSCMP(VG_Z_LIBC_SONAME, __GI_wcscmp)
#elif defined(VGO_darwin)
 // WCSCMP(VG_Z_LIBC_SONAME, wcscmp)
#endif
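The macro body can also be checked outside Valgrind as an ordinary function. The following is only a minimal sketch of that idea: ref_wcscmp is a hypothetical name used for illustration, not a Valgrind symbol, and the loop is the same one as in WCSCMP above with the casts dropped.

```c
#include <wchar.h>

/* Hypothetical stand-alone copy of the WCSCMP loop, for testing outside
   Valgrind. ref_wcscmp is an illustrative name, not part of Valgrind. */
int ref_wcscmp(const wchar_t *s1, const wchar_t *s2)
{
    wchar_t c1, c2;

    for (;;) {
        c1 = *s1;
        c2 = *s2;
        if (c1 != c2) break;   /* first differing element */
        if (c1 == 0) break;    /* both strings ended together: equal */
        s1++;
        s2++;
    }
    if (c1 < c2) return -1;
    if (c1 > c2) return 1;
    return 0;
}
```

Calling ref_wcscmp(L"a", L"1234") should return 1, because L'a' has a larger value than L'1'.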
Thank you,
Dmitry
From: Julian S. <js...@ac...> - 2012-03-26 18:07:48
Dmitry,
Do you have some examples of the false errors that this solves?
It would be good to see some examples.
> I have some questions:
> 1. Would such an addition be accepted? Or are there caveats with
> wide-character (wchar_t) functions?
Depends on correctness, lack of bad side effects, and presence of
test cases. Especially the last one; memcheck/tests/str_tester.c
is the right place to add them.
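As a rough illustration of the kind of cases meant here, the sketch below checks only the sign of the result, since that is all the C standard specifies for wcscmp. The helper names sign_of and check_wcscmp are hypothetical and are not taken from str_tester.c; real test cases would need to follow that file's own conventions.

```c
#include <assert.h>
#include <wchar.h>

/* Reduce a comparison result to -1, 0 or 1 so that only the sign matters. */
int sign_of(int r)
{
    return (r > 0) - (r < 0);
}

/* Hypothetical checks, not taken from str_tester.c.
   Returns the number of checks that were run. */
int check_wcscmp(void)
{
    int n = 0;

    assert(sign_of(wcscmp(L"", L"")) == 0);        n++;  /* both empty */
    assert(sign_of(wcscmp(L"a", L"a")) == 0);      n++;  /* equal */
    assert(sign_of(wcscmp(L"abc", L"abd")) < 0);   n++;  /* last element differs */
    assert(sign_of(wcscmp(L"abd", L"abc")) > 0);   n++;  /* same, reversed */
    assert(sign_of(wcscmp(L"ab", L"abc")) < 0);    n++;  /* proper prefix */
    assert(sign_of(wcscmp(L"a", L"1234")) > 0);    n++;  /* different lengths */
    return n;
}
```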
> 2. Is mc_replace_strmem.c the correct file for such addition?
Yes
> 3. Would it be correct to use 20360, 20370, 20380 and 20390 for
> wcscmp/wcsncmp/wcslen/wcsnlen?
> The comment in mc_replace_strmem.c says:
> -- begin --
> Assignment of behavioural equivalence class tags: 2NNNP is intended
> to be reserved for Memcheck. Current usage:
>
> 20010 STRRCHR
> ...skip...
> 20350 STRCASESTR
> -- end --
Yes, fine. Just don't reuse any existing numbers.
> Naive implementation for wcscmp() works under Fedora16/x64/gcc-4.6.3 (at
> least it looks so :) )
>
> #include <wchar.h>
> #define WCSCMP(soname, fnname) \
>    int VG_REPLACE_FUNCTION_EZU(20360,soname,fnname) \
>           ( const wchar_t* s1, const wchar_t* s2 ); \
>    int VG_REPLACE_FUNCTION_EZU(20360,soname,fnname) \
>           ( const wchar_t* s1, const wchar_t* s2 ) \
>    { \
>       register wchar_t c1; \
>       register wchar_t c2; \
>       while (True) { \
>          c1 = *(wchar_t *)s1; \
>          c2 = *(wchar_t *)s2; \
>          if (c1 != c2) break; \
>          if (c1 == 0) break; \
>          s1++; s2++; \
>       } \
>       if ((wchar_t)c1 < (wchar_t)c2) return -1; \
>       if ((wchar_t)c1 > (wchar_t)c2) return 1; \
>       return 0; \
>    }
>
> #if defined(VGO_linux)
>  WCSCMP(VG_Z_LIBC_SONAME, wcscmp)
>  WCSCMP(VG_Z_LIBC_SONAME, __GI_wcscmp)
> #elif defined(VGO_darwin)
>  // WCSCMP(VG_Z_LIBC_SONAME, wcscmp)
> #endif
Looks OK. It's just a clone of strcmp, yes?
Do wide character comparisons involve locale stuff? That has
caused problems in the past. Or are they locale independent?
J
From: Дмитрий Д. <di...@gm...> - 2012-03-28 08:45:23
Julian,
Thanks for the explanation.
> Do you have some examples of the false errors that this solves?
> It would be good to see some examples.
Here is an artificial example:
[ valgrind]$ cat wcscmp.c
#include <wchar.h>
int main()
{
  const wchar_t w1[] = L"a";
  const wchar_t w2[] = L"1234";
  return
    wcscmp(w1, w2);
}
With unmodified Valgrind 12468/2269:
[ valgrind]$ gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out
==6096== Memcheck, a memory error detector
==6096== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==6096== Using Valgrind-3.8.0.SVN and LibVEX; rerun with -h for copyright info
==6096== Command: ./a.out
==6096==
==6096== Conditional jump or move depends on uninitialised value(s)
==6096== at 0x34DA2A150A: wcscmp (wcscmp.S:429)
==6096== by 0x400508: main (wcscmp.c:7)
==6096==
==6096== Conditional jump or move depends on uninitialised value(s)
==6096== at 0x34DA2A1A92: wcscmp (wcscmp.S:795)
==6096== by 0x400508: main (wcscmp.c:7)
==6096==
==6096== Conditional jump or move depends on uninitialised value(s)
==6096== at 0x34DA2A1A97: wcscmp (wcscmp.S:797)
==6096== by 0x400508: main (wcscmp.c:7)
==6096==
==6096==
==6096== HEAP SUMMARY:
==6096== in use at exit: 0 bytes in 0 blocks
==6096== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==6096==
==6096== All heap blocks were freed -- no leaks are possible
==6096==
==6096== For counts of detected and suppressed errors, rerun with: -v
==6096== Use --track-origins=yes to see where uninitialised values come from
==6096== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 2 from 2)
With modified Valgrind 12468/2269:
[ m_valgrind]$ gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out
==8297== Memcheck, a memory error detector
==8297== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==8297== Using Valgrind-3.8.0.SVN and LibVEX; rerun with -h for copyright info
==8297== Command: ./a.out
==8297==
==8297==
==8297== HEAP SUMMARY:
==8297== in use at exit: 0 bytes in 0 blocks
==8297== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==8297==
==8297== All heap blocks were freed -- no leaks are possible
==8297==
==8297== For counts of detected and suppressed errors, rerun with: -v
==8297== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
> Depends on correctness, lack of bad side effects, and presence of
> test cases. Especially the last one; memcheck/tests/str_tester.c
> is the right place to add them.
OK
> Yes fine. Just don't re-use any existing numbers.
OK
> Looks OK. It's just a clone of strcmp, yes?
Yes :)
1. char --> wchar_t
2. strcmp --> wcscmp
3. remove ld.so part
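Applying the same steps to the other three functions might look like the sketch below. These are plain stand-alone functions with hypothetical ref_ names, assuming the byte-string versions follow the usual simple loops; inside mc_replace_strmem.c the bodies would have to be wrapped in VG_REPLACE_FUNCTION_EZU macros in the same way as WCSCMP.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-alone sketches; the ref_ names are illustrative only. */

/* Number of wide characters before the terminating L'\0'. */
size_t ref_wcslen(const wchar_t *s)
{
    size_t i = 0;
    while (s[i] != 0) i++;
    return i;
}

/* Like ref_wcslen, but never examines more than n elements. */
size_t ref_wcsnlen(const wchar_t *s, size_t n)
{
    size_t i = 0;
    while (i < n && s[i] != 0) i++;
    return i;
}

/* Compare at most n wide characters; returns -1, 0 or 1. */
int ref_wcsncmp(const wchar_t *s1, const wchar_t *s2, size_t n)
{
    for (; n > 0; n--, s1++, s2++) {
        if (*s1 != *s2) return (*s1 < *s2) ? -1 : 1;
        if (*s1 == 0) break;   /* both strings ended together */
    }
    return 0;
}
```

Checking i < n before reading s[i] in ref_wcsnlen matters under Memcheck: it keeps the function from touching memory beyond the limit the caller gave.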
> Do wide character comparisons involve locale stuff? That has
> caused problems in the past. Or are they locale independent?
>
Hmm...
As far as I know, wchar_t comparisons and length calculations do not
involve locale.
All four of these runs produce identical output:
(LANG=ru_RU.cp1251 gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out)
(LANG=ru_RU.utf8 gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out)
(LANG=C gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out)
(LANG=en_EN.utf8 gcc -g -O0 -Wall -Wextra wcscmp.c && valgrind ./a.out)
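One caveat about the runs above: a program that never calls setlocale() stays in the "C" locale whatever LANG is set to, and in those commands LANG is set only for the gcc step. A run-time check therefore needs an explicit setlocale() call. The following is a minimal sketch of such a check; wcscmp_sign_is_locale_stable is a hypothetical helper name. The C standard defines wcscmp as a comparison of wchar_t values (wcscoll is the locale-aware function), so the helper is expected to return 1.

```c
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if wcscmp(a, b) has the same sign in the "C" locale and in the
   locale selected by the environment (LANG / LC_*), otherwise 0. */
int wcscmp_sign_is_locale_stable(const wchar_t *a, const wchar_t *b)
{
    int in_c, in_env;

    setlocale(LC_ALL, "C");
    in_c = wcscmp(a, b);

    /* "" selects the locale from the environment. If that locale is not
       installed, setlocale fails, the "C" locale stays in effect, and the
       check is trivially true. */
    setlocale(LC_ALL, "");
    in_env = wcscmp(a, b);

    setlocale(LC_ALL, "C");   /* leave the process in a known state */

    return ((in_c > 0) - (in_c < 0)) == ((in_env > 0) - (in_env < 0));
}
```

Running a small program that calls this helper under each LANG value, with LANG set for the program itself rather than for gcc, would exercise the locale-dependent path if there were one.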
I'll look into the wchar documentation more deeply.
Can you point me to the problems you mentioned?
Dmitry