Hi,
Recently my team migrated our project from an old version of OWL to OWL 7.0.12 (with UNICODE character set). After the migration we encountered a crash when calling TOcDocument::LoadParts function.
After some investigation we found out that the reason for this crash is that in new version of OWL, the TOcPart name is written and read as TCHAR array instead of char array.
Please, take a look at part of TOcDocument::SaveParts function:
TOcPart& part = *i.Current();
int16 len = int16(part.GetName().Length());
// Write the part name string, pascal style [len]+chars, no 0
//
ok = ToBool(HRSucceeded(stream.Write(&len, sizeof len, &count)) &&
HRSucceeded(stream.Write((_TCHAR *)part.GetName(), len, &count)) &&
part.Save(sameAsLoaded, remember));
The issue is that len is equal to a number of wchars (if the UNICODE character set is being used), however ocf::TOcStream::Write method expects a number of BYTES (it calls ISequentialStream::Write). This causes only half of the name to be written (and later read).
Of course, the TOcDocument::LoadParts also has to be adjusted.
News: 2024/04/owlnext-7013-and-64423-updates
Wiki: OWLNext_Stable_Releases
Last edit: Vidar Hasfjord 2024-03-29
@pralkopodobny wrote:
Your analysis and solution look good to me. But please clarify with comments in the code. And I suggest renaming "len" to "byteCount", as length is often referring to string length (unit count in the character encoding).
I have assigned this ticket to you as well, so that you can take control of fixing it. Since this is an internal issue and fix, not affecting the API, I have set milestone to 7.
Again, thanks for your contribution!
On second review, I spotted a potential issue with your fix.
Presumably, "len" here should remain the length of the string in number of _TCHARs (not byte count), so make sure your fix doesn't change that. Which in turn means that LoadParts needs to also convert the length read back in, into byte count, to correctly read the string.
Note that beyond a simple minimalistic patch, there is an opportunity to improve the code here, thereby also fixing exception-safety issues with this code. Since these functions should return bool on failure, they should handle exceptions within. For example:
Last edit: Vidar Hasfjord 2024-03-27
@vattila wrote:
I'll take that into account when I prepare a fix for this issue.
@vattila wrote:
Good idea. I'll do something like what you suggested in a separate commit.
Last edit: Marceli Sokólski 2024-03-25
@pralkopodobny wrote:
Yes, it is wise of you to separate the fix for the Unicode issue from further code improvement. We probably should only make the latter changes on the trunk, since they will subtly change the API (by changing which exceptions are propagated), thereby possibly affecting client code. The 7 branch is now stable and should not have any changes that may require client code rewrite.
Improving exception-safety is applicable to the core OWL library as well, so my Try wrapper may belong in "owl/except.h" for wider reuse. As written, it also requires C++20 (due to the use of std::source_location), which is only supported by the Microsoft compiler, so far.
For more about exception-safety, see Exceptions and OWLNext.
Related
Wiki: Exceptions_and_OWLNext
Hi,
I committed [r6901].
It's my first time working with svn so could you please check if everything is ok?
I locally merged this change with 7.0.12 and tested it in our application. It seems to work.
Related
Commit: [r6901]
@pralkopodobny wrote:
Good, your commit looks OK here. Your revision [r6901] has been committed on the branch correctly, and your commit message is following our conventions (except for missing the period/full stop we like to use, to indicate that the message is not truncated). Congratulations on your first commit — may there be many more!
Here's further code review:
len >= sizeof(name)
should belen >= std::size(name)
orlen >= COUNTOF(name)
(for compatibility with 6.44).Related
Commit: [r6901]
@vattila wrote:
Thank you :)
I believe the name limitation is (or at least was) 32 characters (here's old post complaining about this).
@vattila wrote:
You're right, I accidentally doubled number of allowed characters in name when unicode character set is used.
Could you tell me what is preferred convention for named constants (I was thinking about something like this
and then
Looks good to me, if defined in a class. If as a local variable, start with a lower-case letter. If as an implementation detail in file scope, don't use the static-keyword, but use an anonymous namespace instead, and add a trailing underscore.
And add a code comment explaining the 32 character limit. Also, document the limit it in the SaveParts function documentation (a Doxygen-style comment, using "\note").
Last edit: Vidar Hasfjord 2024-03-25
For more about our conventions, see Contributing to the OWLNext Project.
Related
Wiki: Contributing
The len check should be fixed in commit [r6902].
Related
Commit: [r6902]
Quick review (including nitpicking, so you get the hang of our conventions):
Additional notes about unrelated issues:
Related
Commit: [r6902]
Last edit: Vidar Hasfjord 2024-03-25
@pralkopodobny, it just occurred to me that we should check the size of the written part name as well, so that the code never writes an invalid file (with a part name length larger that 32). Truncation is not an option, so we'll have to introduce an error here (return 'false')? If so, a WARN statement should be added, also.
Last edit: Vidar Hasfjord 2024-03-25
I checked on svn website how to change message of a commit. I found out that I should use
where N is a revision number. So I tried to add a full stop at the end of message by running
But I can not see the change on sourceforge. Have I made a mistake?
@pralkopodobny wrote:
It is a quirk of the Allura software at SourceForge: The Commit pages are static and based on the original log messages. However, you can see your edited message in the Code Commit Log (press the History icon in the Code section).
So, no mistake, you are doing very well!
PS. I highly recommend TortoiseSVN, if you are working on Windows.
Last edit: Vidar Hasfjord 2024-03-29
@pralkopodobny, regarding tickets, another convention you should follow is to document your changes to the code here in the ticket comments by referring to the revisions and the branches updated and merged. See my recent tickets for how I do it.
Alternatively, you can add this documentation in the ticket description itself, which is especially beneficial if the comment section is long and it is hard to find the revision documentation in amongst all the posts.
That said, you are already doing pretty good on this!
Last edit: Vidar Hasfjord 2024-03-29
Small refactor regarding convention was committed to trunk in [r6903].
Related
Commit: [r6903]
@pralkopodobny wrote:
To eliminate warnings, this requires a static_cast. The statement should also be extended to check for
len < 0
. And it would be a good thing to add a WARN statement here as well. And the buffer can be created after the check, thus closer to use.For example:
Note that I removed the old comment here, since this can happen (the file may be corrupt).
In the future suggested overhaul for exception-safety, the WARN and return statements can be replaced by throwing an exception.
PS. Note that I presume
len == 0
is allowed here. But is it? If an empty string is not allowed, update my range check above. Also, if an empty string is not allowed, it should also never be written, so SaveParts needs to be updated to check for empty string, as well.Last edit: Vidar Hasfjord 2024-04-11
I spotted yet another issue in LoadParts: It performs suspect exception handling for catching the case where TOcStream construction fails. Apparently, in this case, failure is OK and just means that there are no parts (if we trust the code comment). However, the exception handler is too broad. It catches all TXObjComp exceptions and treats them all as OK.
It may be that this is just fine here, but I doubt it. In any case, it would be better to only handle the exception due to stream open failure and let any other exceptions through unhandled. Fortunately, TXObjComp has an ErrorCode that can be checked:
Note that this code improvement needs to be reviewed and tested, to ensure that stream construction failure actually throws TXObjComp with the xStreamOpenError code.
Another issue: File format.
Currently, and with the committed fix [r6901], the string encoding, and hence file format, used for saving and restoring TOcDocument changes between ANSI and UNICODE build modes. It may be argued that the file format should be stable and unaffected by the build mode, in which case we should do string conversion, if needed, when writing and reading strings.
To me it would make sense to always read and write Unicode (UTF-8, preferably), since that makes the file format consistent. Writing ANSI text makes the file format dependent on the particular ANSI code page in effect at runtime on the user's particular system.
However, using Unicode consistently would be a breaking change for client applications using the ANSI build mode.
What do you think?
PS. Is there currently any kind of indication to the OCF application about the ANSI code page used to save a document when you try to read it back in?
Related
Commit: [r6901]
Last edit: Vidar Hasfjord 2024-04-11
@vattila said:
This is actually a real problem I encountered in our application.
In OWL 6.20 the document format was always in char.
After a migration, I have to detect if the old stored data is in old format and manually convert it to a new format if necessary.
@pralkopodobny wrote:
That is convincing evidence to me that changing the string encoding, depending on the build mode, is the wrong thing to do.
If we chose to use UTF-8 consistently, we should get pretty good backwards compatibility for applications sticking primarily to ASCII. Does your application use characters outside the ASCII set for its documents?
It would be good to hear from other OCF users on this (@elotter and @sebas_ledesma, in particular).
Edit: Here is my proposal for how to deal with this issue:
Last edit: Vidar Hasfjord 2024-03-30
See also open ticket "TStorageDocument::GetProperty fails in Unicode build mode" [bugs:#330].
Related
Bugs: #330