From: SourceForge.net <no...@so...> - 2005-11-03 07:08:21
Read and respond to this message at:
https://sourceforge.net/forum/message.php?msg_id=3410876
By: tml1024

Sigh. I have never used BSTRs, but I have some idea what they are. This code is full of problems. Where should I start?

> use this code however you like, but don't
> remove the comment at the beginning,

Considering that the relevant snippet in your code is only three lines, even if those were correct and bug-free, I think you are overestimating the difficulty of the posted problem, and the value and cleverness of your solution.

> #include <Windows.h>

The convention is to use <windows.h>. Not even Microsoft recommends using MixedCase names for the header files in #include statements, as far as I know. Using Windows.h will break cross-compilation from Unix, where file names are case-sensitive.

> const unsigned int stringLength = lstrlenW(bstr);

Isn't the point of BSTRs that they are *counted* wide-char strings that can contain embedded zero wide characters? Using lstrlenW is wrong. Instead, get the real length of the BSTR using SysStringLen(bstr). lstrlenW only counts how many non-zero initial wide characters the BSTR points to. It might return too low a value (if the BSTR has embedded zero wide characters), or it might very well overrun the BSTR and return too high a value (if there doesn't happen to be a zero wide character, i.e. two zero bytes, after the last wide character of the BSTR). This might even cause a crash if lstrlenW happens to wander into an unallocated or read-protected page.

> char *const ascii = new char [stringLength + 1];

From a stylistic point of view, using the identifier "ascii" is highly confusing, as BSTRs are in Unicode, not at all restricted to ASCII, or even "ANSI". The real problem is that in general you might need twice as many bytes (C chars) as the length of the BSTR when you represent it as a Windows multi-byte char string. (Remember double-byte code pages.)
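To make the length and sizing points above concrete, here is a minimal portable sketch, not the original poster's code. It uses std::wstring as a stand-in for a BSTR (both carry an explicit length and may hold embedded zero wide characters), and the helper name counted_wide_to_multibyte is made up for illustration. It walks the stored length instead of scanning for a terminator, and lets the output grow per character instead of guessing a byte count up front. On Windows, the rough counterparts would be SysStringLen for the length and WideCharToMultiByte for the conversion.

```cpp
#include <cassert>
#include <climits>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a *counted* wide string to a multi-byte
// string in the current C locale. std::wstring stands in for a BSTR here.
std::string counted_wide_to_multibyte(const std::wstring &wide)
{
    std::string out;
    std::mbstate_t state = std::mbstate_t();
    char buf[MB_LEN_MAX];  // large enough for any single converted character

    // Iterate over the stored length, so embedded zero wide characters
    // are converted too instead of ending the string early.
    for (std::size_t i = 0; i < wide.size(); ++i) {
        const std::size_t n = std::wcrtomb(buf, wide[i], &state);
        if (n == static_cast<std::size_t>(-1))
            throw std::runtime_error("wide character not representable in the current locale");
        assert(n <= sizeof buf);
        out.append(buf, n);  // one wide char may yield several bytes
    }
    return out;
}
```

With a five-character input such as std::wstring(L"ab\0cd", 5), std::wcslen reports only 2, while the stored length (and the converted result) keeps all five characters.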
And if you use a new Microsoft C runtime, I think the codepage used in the C runtime's locale might even be 65001 (UTF-8), where each wide char in the BSTR might take up to three chars when converted to a multi-byte string (a surrogate pair, i.e. two wide chars, takes four).

> const unsigned int stringLength = lstrlenA(ascii);

Again you use the name "ascii", even though "char *" strings in C and C++ aren't just ASCII, but in general multi-byte strings in the current locale's codepage. lstrlenA will count the bytes, not the characters, so for strings with multi-byte characters the allocated BSTR will be too long, as each multi-byte character takes only one wide character. (OK, or two, for Unicode code points outside the BMP, which need two wide characters, i.e. a surrogate pair.)

> wcstombs(bstr, ascii, stringLength + 1);

Surely you mean mbstowcs and not wcstombs. But even then it is still wrong. Why are you using stringLength+1, when you allocated the BSTR to contain only stringLength worth of wide characters? This will overwrite whatever happens to be in memory after the BSTR.
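For the reverse direction, a minimal sketch along the same lines (again portable standard C++, not the original poster's code; the helper name multibyte_to_wide is made up for illustration): ask the conversion routine how many wide characters the input produces, allocate exactly that many, and pass that same count as the write limit. On Windows, I believe the rough counterpart is calling MultiByteToWideChar with a zero output size to get the required count first.

```cpp
#include <cassert>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a NUL-terminated multi-byte string (in the
// current C locale) to a wide string sized from a measuring pass.
std::wstring multibyte_to_wide(const char *mb)
{
    assert(mb != nullptr);

    // Pass 1: with a null destination, mbsrtowcs only reports how many
    // wide characters the input converts to (terminator not included).
    std::mbstate_t state = std::mbstate_t();
    const char *src = mb;
    const std::size_t count = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (count == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multi-byte sequence for the current locale");

    // Pass 2: the write limit equals the number of elements allocated,
    // never that number plus one.
    std::wstring out(count, L'\0');
    state = std::mbstate_t();
    src = mb;
    std::mbsrtowcs(&out[0], &src, count, &state);
    return out;
}
```

The count comes from the conversion itself, so it is in wide characters and not in bytes, which is the distinction lstrlenA misses.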