The current behavior of SPrintf seems to be that the std::string used as a write-buffer has new data appended instead of having existing data replaced with new as the equivalent C function sprintf does. Consequently, I have to do "<string instance>.clear()" in order to reuse the same string during repeated SPrintf calls.
I'll be happy to submit a patch if you agree with my observation that SPrintf should mimic sprintf and always start at position 0 in its write-buffer and replace any existing data.
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The rationale for that behavior was that appending is "more primitive" than clearing and then appending, i.e., if all you have is a SPrintf that always clear()s, you cannot build a string efficiently with multiple calls to SPrintf. In contrast, if all you have is a SPrintf that never clears, it's easy to clear the target from the outside. Incidentally, you _can_ build a string efficiently with multiple calls to sprintf if you preallocate your buffer accordingly and do the appropriate pointer arithmetic (unpleasant but efficient).
That being said, it's hard to tell which behavior is better and closer to sprintf. I'm curious to hear what other people think.
Andrei
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Andrei, thanks for your reply. (Note that I have loved your Printf family since I first read your article introducing it--have you proposed it as a Boost library?)
Regarding which behavior is better, all I can say is that I used sprintf (and snprintf) for years (until SPrintf came along) and have always used the, to my perhaps naive view, "default" behavior where I passed it the beginning address of my buffer. I have never thought of using it to append because, for that operation, I usually used strcat with a formatted string (probably made with another sprintf call using another buffer).
If it were up to me, I think I would make the "clearing" behavior the default and expect the user to use something STL-like such as "SPrintf(buf.end(), fmt)(arg);" to append.
But again, I was never smart enough to see the appending possibility of sprintf, so most others may disagree.
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
About the quality of SafeFormat - in order to really be up to snuff, FPrintf should do its own file locking, which is not portable. I never got around to doing it, and as a result SPrintf is slower than fprintf and less thread-safe (lines might get fragmented).
I, too, would like to minimize surprise to people used to the printf family, but it's hard to do so against efficiency concerns. (BTW the SPrintf(buf.end(), fmt)(arg) and SPrintf(buf.begin(), fmt)(arg2) in your later message won't work, as you noted. It would be interesting to print into insert_iterators though.)
I'd be glad to hear input from others as well.
Andrei
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I have put together a test suite to see how the various versions of SPrintf we've discussed compare speed-wise. I added two new versions for testing which have the, IMHO, more sprintf-like behavior:
1. SPrintf which uses { string::assign } under the hood and,
2. SPrintf which uses { string::clear; string::append }.
Then I ran several tests 100,000,000 times each:
{
// test 1 - tests current behavior using two different string buffers
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
{
// test 2 - tests current behavior the way I very often use SPrintf in a loop situation
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
// reuse buf for a new string
buf.clear();
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
}
{
// test 3 - same as test 2 but with an automatically overwriting version of SPrintf
// SPrintf which uses { string::assign } under the hood and,
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
{
// test 4 - same as test 2 but with another automatically overwriting version of SPrintf
// SPrintf which uses { string::clear; string::append }.
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
Results:
test 1: 238.95 sec.
test 2: 186.59 sec.
test 3: 181.49 sec.
test 4: 193.32 sec.
I modified:
SafeFormat.cpp
test/SafeFormat/main.cpp
test/SafeFormat/Makefile
I added:
test.pl (my Perl test script)
test/SafeFormat/test_sprintf.cpp
It seems to me that efficiency is not a huge problem no matter how you define SPrintf, it's more a matter of convenience and how one uses it most often.
I'm happy to send the test suite and all the modified Loki files if you're interested. I'm also happy to add more tests.
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Well, I got in too much of a hurry--disregard my "test" results. After I sent the mail I went to get ready for work and kept pondering SPrintf and suddenly realized why the "append" member is being used--duh!
Then I went back and looked at the results of test/SafeFormat/main (which hadn't been aborting because of the default -DNDEBUG define to the compiler) and the truth was there in all its ugliness.
Back to the drawing board.
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Now I have a version of SPrintf that works to clear the string at first write. I tried my best to make a derived class of PrintfState but couldn't make it work. So I copied SafeFormat.h and SafeFormat.cpp and made a new PrintfState2 which is the same as the original except:
It only defines the SPrintf2 function (same as SPrintf except it clears the string buffer at each use).
It has another version of the write function with a fourth argument to allow notice of first use.
It has a new data member "first_write" for knowing when to clear the string (actually I use assign under the hood).
I have added SPrintf2 to the tests, and it works!
If there is any interest I will send all.
If I had my way, I would call the new function "SPrintf" and the current function something like SPrintfAppend."
Other options are to have the user change a compiler definition for Loki lib to change SPrintf's behavior, but whatever, the new function I believe mimics sprintf.
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Andrei, after looking at SPrintf some more, with all due respect, I believe that its default behavior should be to overwrite from the beginning of the buffer presented.
1. Even though it might affect current users, I believe that is the correct design.
2. If you absolutely don't want to break the API, maybe add a new function called something like 'SPrintfo' (and using string::assign() under the hood instead of string::clear first).
In any event, from looking at the SafeFormat.h I think it might be possible to have an SPrintf signature something like
SPrintf(string& s, const string::iterator i, const string& format)(arg);
to use for appending (or overwriting) from i. Is that correct?
-Tom
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I understand. Probably the best thing to do right now is to gather more opinions. It would be of course possible to define two functions with different names or signatures that clear/don't clear the output before writing.
Andrei
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thanks for your suggestion. Although I do agree with you about the importance of efficiency, I disagree with your initial suggestion. I think if we modify SPrintf to replace the contents on every call, we lose both efficiency and functionality. Sometimes I will want to keep appending to the storage buffer.
Even if the existing SPrintf functionality was not as efficient as possible, I would urge that we keep it. Programmers have gotten use to the existing behavior, and implemented their code expecting that. If we change SPrintf's behavior, we force them to change their code, or we end up breaking their code without them knowing it.
Let's not modify SPrintf to replace every time even if that means programmers have to call buf.clear() between calls.
I find your idea for suggestion #3 intriguing. That idea may give us the best of both ways. Can you implement that idea without breaking existing behavior?
Cheers,
Rich
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The standards iostreams are notoriously slow. In addition, they're only for formatting, so we'd be probably using only the streambuf interface. The best of all worlds would be to define StreambufSprintf that adds functionality over a streambuf, without changing SPrintf or FPrintf.
Andrei
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
While pondering FPrintf, I noticed again Andrei's note about it needing buffering for the FILE* device. Isn't it true that, except for stderr, all FILE pointers will generally have buffering unless the user does something special via setbuf or setvbuf?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The current behavior of SPrintf seems to be that the std::string used as a write-buffer has new data appended instead of having existing data replaced with new as the equivalent C function sprintf does. Consequently, I have to do "<string instance>.clear()" in order to reuse the same string during repeated SPrintf calls.
I'll be happy to submit a patch if you agree with my observation that SPrintf should mimic sprintf and always start at position 0 in its write-buffer and replace any existing data.
-Tom
Thanks for the report and offer, Tom.
The rationale for that behavior was that appending is "more primitive" than clearing and then appending, i.e., if all you have is a SPrintf that always clear()s, you cannot build a string efficiently with multiple calls to SPrintf. In contrast, if all you have is a SPrintf that never clears, it's easy to clear the target from the outside. Incidentally, you _can_ build a string efficiently with multiple calls to sprintf if you preallocate your buffer accordingly and do the appropriate pointer arithmetic (unpleasant but efficient).
That being said, it's hard to tell which behavior is better and closer to sprintf. I'm curious to hear what other people think.
Andrei
Andrei, thanks for your reply. (Note that I have loved your Printf family since I first read your article introducing it--have you proposed it as a Boost library?)
Regarding which behavior is better, all I can say is that I used sprintf (and snprintf) for years (until SPrintf came along) and have always used the, to my perhaps naive view, "default" behavior where I passed it the beginning address of my buffer. I have never thought of using it to append because, for that operation, I usually used strcat with a formatted string (probably made with another sprintf call using another buffer).
If it were up to me, I think I would make the "clearing" behavior the default and expect the user to use something STL-like such as "SPrintf(buf.end(), fmt)(arg);" to append.
But again, I was never smart enough to see the appending possibility of sprintf, so most others may disagree.
-Tom
About the quality of SafeFormat - in order to really be up to snuff, FPrintf should do its own file locking, which is not portable. I never got around to doing it, and as a result SPrintf is slower than fprintf and less thread-safe (lines might get fragmented).
I, too, would like to minimize surprise to people used to the printf family, but it's hard to do so against efficiency concerns. (BTW the SPrintf(buf.end(), fmt)(arg) and SPrintf(buf.begin(), fmt)(arg2) in your later message won't work, as you noted. It would be interesting to print into insert_iterators though.)
I'd be glad to hear input from others as well.
Andrei
I have put together a test suite to see how the various versions of SPrintf we've discussed compare speed-wise. I added two new versions for testing which have the, IMHO, more sprintf-like behavior:
1. SPrintf which uses { string::assign } under the hood and,
2. SPrintf which uses { string::clear; string::append }.
Then I ran several tests 100,000,000 times each:
{
// test 1 - tests current behavior using two different string buffers
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
string buf2;
SPrintf(buf2, fmt)("quick")("jumped")("lazy")(3)(.3);
}
{
// test 2 - tests current behavior the way I very often use SPrintf in a loop situation
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
// reuse buf for a new string
buf.clear();
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
}
{
// test 3 - same as test 2 but with an automatically overwriting version of SPrintf
// SPrintf which uses { string::assign } under the hood and,
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
}
{
// test 4 - same as test 2 but with another automatically overwriting version of SPrintf
// SPrintf which uses { string::clear; string::append }.
string buf;
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
SPrintf(buf, fmt)("quick")("jumped")("lazy")(3)(.3);
}
Results:
test 1: 238.95 sec.
test 2: 186.59 sec.
test 3: 181.49 sec.
test 4: 193.32 sec.
I modified:
SafeFormat.cpp
test/SafeFormat/main.cpp
test/SafeFormat/Makefile
I added:
test.pl (my Perl test script)
test/SafeFormat/test_sprintf.cpp
It seems to me that efficiency is not a huge problem no matter how you define SPrintf, it's more a matter of convenience and how one uses it most often.
I'm happy to send the test suite and all the modified Loki files if you're interested. I'm also happy to add more tests.
-Tom
Well, I got in too much of a hurry--disregard my "test" results. After I sent the mail I went to get ready for work and kept pondering SPrintf and suddenly realized why the "append" member is being used--duh!
Then I went back and looked at the results of test/SafeFormat/main (which hadn't been aborting because of the default -DNDEBUG define to the compiler) and the truth was there in all its ugliness.
Back to the drawing board.
-Tom
Now I have a version of SPrintf that works to clear the string at first write. I tried my best to make a derived class of PrintfState but couldn't make it work. So I copied SafeFormat.h and SafeFormat.cpp and made a new PrintfState2 which is the same as the original except:
It only defines the SPrintf2 function (same as SPrintf except it clears the string buffer at each use).
It has another version of the write function with a fourth argument to allow notice of first use.
It has a new data member "first_write" for knowing when to clear the string (actually I use assign under the hood).
I have added SPrintf2 to the tests, and it works!
If there is any interest I will send all.
If I had my way, I would call the new function "SPrintf" and the current function something like SPrintfAppend."
Other options are to have the user change a compiler definition for Loki lib to change SPrintf's behavior, but whatever, the new function I believe mimics sprintf.
-Tom
Andrei, after looking at SPrintf some more, with all due respect, I believe that its default behavior should be to overwrite from the beginning of the buffer presented.
1. Even though it might affect current users, I believe that is the correct design.
2. If you absolutely don't want to break the API, maybe add a new function called something like 'SPrintfo' (and using string::assign() under the hood instead of string::clear first).
In any event, from looking at the SafeFormat.h I think it might be possible to have an SPrintf signature something like
SPrintf(string& s, const string::iterator i, const string& format)(arg);
to use for appending (or overwriting) from i. Is that correct?
-Tom
I understand. Probably the best thing to do right now is to gather more opinions. It would be of course possible to define two functions with different names or signatures that clear/don't clear the output before writing.
Andrei
Another thought. Which is more efficient:
{ // 1
string buf;
SPrintf(buf, fmt)(arg);
string buf2;
SPrintf(buf2, fmt)(arg2);
}
{ // 2
string buf;
SPrintf(buf, fmt)(arg);
buf.clear();
SPrintf(buf, fmt)(arg2);
}
{ // 3
string buf;
SPrintf(buf, fmt)(arg);
SPrintf(buf.begin(), fmt)(arg2);
}
-Tom
Hi Tom,
Thanks for your suggestion. Although I do agree with you about the importance of efficiency, I disagree with your initial suggestion. I think if we modify SPrintf to replace the contents on every call, we lose both efficiency and functionality. Sometimes I will want to keep appending to the storage buffer.
Even if the existing SPrintf functionality was not as efficient as possible, I would urge that we keep it. Programmers have gotten use to the existing behavior, and implemented their code expecting that. If we change SPrintf's behavior, we force them to change their code, or we end up breaking their code without them knowing it.
Let's not modify SPrintf to replace every time even if that means programmers have to call buf.clear() between calls.
I find your idea for suggestion #3 intriguing. That idea may give us the best of both ways. Can you implement that idea without breaking existing behavior?
Cheers,
Rich
I'll give it a try.
Another point while looking at Printf, shouldn't it be using std::cout instead of std::stdout?
-Tom
The standards iostreams are notoriously slow. In addition, they're only for formatting, so we'd be probably using only the streambuf interface. The best of all worlds would be to define StreambufSprintf that adds functionality over a streambuf, without changing SPrintf or FPrintf.
Andrei
Hm, method 3 probably won't work with SPrintf's present behavior because I don't think it really changes the append function underneath the hood.
-Tom
While pondering FPrintf, I noticed again Andrei's note about it needing buffering for the FILE* device. Isn't it true that, except for stderr, all FILE pointers will generally have buffering unless the user does something special via setbuf or setvbuf?