Thread: std::vector::clear (was: [GD-Windows] VS.net rants, was Re: VC++ lag)
From: Rich <leg...@xm...> - 2003-06-13 15:21:20
In article <001d01c331b6$36c64c00$0100a8c0@r0x0rz>, "Neil Stewart" <ne...@r0...> writes:

> For some reason, they have changed the functionality of vector::clear() to
> actually deallocate memory, rather than simply doing erase(begin(), end()),
> which leaves the memory allocated.

It sounds to me like you were relying on an implementation dependency. My copy of "C++ Standard Library" by Josuttis just says that both of these will remove all the elements of the vector. AFAIK, the standard makes no requirements about allocation/deallocation behavior for these operations, but I would expect both to potentially deallocate memory when I delete the whole vector. If you want to clear the vector without changing its allocation, you should do vector::resize(0) IMO.
--
"The Direct3D Graphics Pipeline" -- code samples, sample chapter, FAQ:
<http://www.xmission.com/~legalize/book/>
izfree: Open source tools for Windows Installer
<http://izfree.sourceforge.net>
From: Colin F. <cp...@ea...> - 2003-06-14 00:08:28
> > For some reason, they have changed the functionality of vector::clear() to
> > actually deallocate memory, rather than simply doing erase(begin(), end()),
> > which leaves the memory allocated.
>
> It sounds to me like you were relying on an implementation dependency.
> My copy of "C++ Standard Library" by Josuttis just says that both of
> these will remove all the elements of the vector. AFAIK, the standard
> makes no requirements about allocation/deallocation behavior for these
> operations, but I would expect both to potentially deallocate memory
> when I delete the whole vector. If you want to clear the vector
> without changing its allocation, you should do vector::resize(0) IMO.

Suppose the author of a popular library changed the implementation of the sin() function to include Sleep(10000). Ah, it was a mistake for the programmer to DEPEND on the timing of the sin() function! ;-)

Still, I like the idea of vector::clear() deallocating memory. I think that should have been the explicit specification all along. Using vector::resize(0) would have become a well-known optimization for avoiding future malloc() calls when the same vector is drained and filled to a similar level of capacity numerous times.

If vector::clear() didn't guarantee deallocation, then would the only alternative be to clobber the object that contained the vector? (Wouldn't calling the destructor explicitly be bad?)

Anyhow, unless I'm misunderstanding something, I'm glad for the change. I never liked the idea of memory sticking around, with the allocation persisting at the maximum size of all previous uses of that vector. Sure, I can clobber the object that owns the vector, but sometimes that doesn't seem appropriate.

The weird thing is that DEPENDING on the NEW implementation of vector::clear() is just as bad as depending on the old implementation. It's a shame, because vector::clear() becomes a useless method -- unless users of vector::clear() can hope that the implementation will do what is statistically best for the platform. Could it be said that doing deallocation is an improvement under the Windows platform? Apparently not for some people. But maybe the dominant use of vector benefits from the deallocation behavior.

Still, I could imagine a persisting vector getting a spike of stuff to fill it, getting cleared, and yet holding on to a huge amount of memory. Maybe it's bad design to keep the vector around after it has been drained following a big spike. I suppose profiling memory and speed performance will always reveal what is really important for an application, and the design can be changed to respond to the reality of the current circumstances.

--- Colin
cp...@ea...
From: Neil S. <ne...@r0...> - 2003-06-14 01:28:37
> It sounds to me like you were relying on an implementation dependency.
> My copy of "C++ Standard Library" by Josuttis just says that both of
> these will remove all the elements of the vector. AFAIK, the standard
> makes no requirements about allocation/deallocation behavior for these
> operations, but I would expect both to potentially deallocate memory
> when I delete the whole vector. If you want to clear the vector
> without changing its allocation, you should do vector::resize(0) IMO.

I don't agree with this, for several reasons:

1. The standard says that clear() is simply erase(begin(), end()), with a post-condition of size()==0. It does not say that you can implement it any other way, yet this implementation (a Dinkumware variant) does exactly that: it calls _Tidy() instead, so it is not following the standard on that score.

2. While the standard does not appear to say that erase() must not deallocate memory, it very explicitly states that insert() will allocate memory if required, and it is also very clear that resize() and reserve() should never reduce capacity() (i.e. deallocate memory). In other words, I believe the standard tells you what *will* happen during any operation, and if it does not mention something, it means that it *will not* happen, not that "anything goes". You would have to be pretty anal to take it any other way, which is what I think Dinkumware have done.

3. Stroustrup says explicitly in "The C++ Programming Language" that a vector will never shrink, only grow. If the standard disagrees with this, then why, and why is it still in his book?

4. It is far more consistent to make erase() (and therefore clear(), due to point 1) operate in exactly the same manner as everything else in the vector, including reserve() and resize(), none of which reduce the memory used by the vector. Every implementation I have seen does indeed treat erase() in the same, consistent manner, and only the Dinkumware one differs when it comes to clear(). It seems that everyone else has read the standard to mean that a vector will not shrink.

5. The original Dinkumware STL used in VC6 uses erase(begin(), end()), and is consistent with everything else in this respect. The fact that they have since changed this defies belief. The only reason I can think of is that they observed that many programmers were wasting a lot of memory on vectors and thought they could "optimise" applications by making clear() deallocate. The ironic thing is that they have actually hurt a lot of applications that were relying on non-deallocation for performance reasons.

6. From an implementation perspective, there is very little to gain from deallocating on clear(), because you cannot assume that it *will* deallocate on all implementations, so you must always use other methods (e.g. swap()) to ensure deallocation when you require it. All they have achieved is to change the performance characteristics of the STL, which is deplorable, because clearly defined performance characteristics are one of the main features of the library.

7. Every implementation I have ever seen maintains capacity() on a clear(). For Dinkumware to say "sod what every other implementation has done for years" is quite offensive in my book and is another reason people will be able to spout about not using the STL. They have actually *damaged* standard C++ by doing this. The standard _may_ be unclear on this situation, but common practice is not, so, if in doubt, they should have stuck with that.

8. Many, many people (including Stroustrup, it would appear) work on the assumption that no STL implementation ever deallocates on a clear(). Even if Dinkumware could argue that the standard does not actually say this, they cannot ignore the fact that a lot of people are assuming that this is the case. Doing so was very irresponsible IMO.

I could go on, but I think the important point is that Dinkumware have done *no good* by taking this step, and clear() should be changed back to its original form in future releases.

- Neil.
From: Colin F. <cp...@ea...> - 2003-06-14 05:16:45
I think ::swap() is a lame way to "deallocate" a vector's memory. It requires creating a temporary vector, swapping contents with the vector whose buffer you want to throw away, and then clobbering the temporary vector. I find the idea of having to summon up a trashcan, trading its emptiness with the trash of another, and then destroying the trashcan nutty!

The vector class should have had a ::deallocate() method from the very beginning. Since it did not, and the ::swap() technique is lame, and clobbering the owner of vector instances is lame, and adding a new method to vector is lame, the least-lame option is to exploit the lack of implementation specification of ::clear().

Sure, existing applications will suffer a speed hit when recompiled, but the fix is as easy as searching for "clear()" and replacing with "resize(0)". I know that the "ease of fixing" is not a justification for changing the historical implementation!

I think the new implementation of clear() has a few benefits:

(1) Reduce memory waste due to vectors keeping up to (2 * max_bytes_needed) over their lifetimes;

(2) Make crashing due to stale pointers and references more likely! As it was(/is), when one did a clear() of a vector, pointers to elements would still work until the vector grew! With a ::clear() that deallocates, the probability of discovering stale pointers and references increases.

(3) Custom allocators and deallocators will now have a better sense of actual required memory on an ongoing basis.

As far as I'm concerned, the idea of something only growing and never shrinking sounds like a black hole or a memory leak!

I think Microsoft faced tough choices:

(a) Do nothing, and suffer the ever-expanding memory consumption of vectors scattered all over code;

(b) Add a ::deallocate() method to vector, and experience the paranoia and wrath of cross-platform users of STL;

(c) Change ::clear() to do deallocation, and suffer the wrath and paranoia of cross-platform users of STL who relied on the *implementation* of ::clear() operating a certain way.

Perhaps Microsoft will one day employ AIs in their workforce, and so some poor AI can take the fall when the public gets enraged. This would bring new meaning to employee "termination"! Of course, they would just reboot the employee to give it a new process ID, and thus flush away liability without affecting productivity. "Sure, PID 1337 put spyware and crash-hooks in IE 10.0, but PID 1337 was... fired."

--- Colin
cp...@ea...
From: Neil S. <ne...@r0...> - 2003-06-14 11:23:24
> I think ::swap() is a lame way to "deallocate" a vector's
> memory. It requires creating a temporary vector, swapping
> contents with the vector whose buffer you want to throw away,
> and then clobbering the temporary vector. I find the idea
> of having to summon up a trashcan, trading its emptiness
> with the trash of another, and then destroying the trashcan
> is nutty!
>
> The vector class should have had a ::deallocate() method from
> the very beginning.

Yes, it is lame, and a deallocate() method, or something like it, would have been better.

> Since it did not, and the ::swap() technique is lame, and
> clobbering the owner of vector instances is lame, and adding
> a new method to vector is lame, the least-lame option
> is to exploit the lack of implementation specification of
> ::clear().

My other post gives several reasons why this is a bad thing and, as I said, I don't think there is a lack of specification in the standard on this matter, something which is reflected in all the STLs I have seen, except for the one in VC7. The least lame option would be to leave things as they are for now, and fix the problem in the forthcoming new standard, but by adding something like deallocate() and not by changing clear().

> Sure, existing applications will suffer a speed hit when
> recompiled, but the fix is as easy as searching for
> "clear()" and replacing with "resize(0)".
>
> I know that the "ease of fixing" is not a justification for
> changing the historical implementation!

You're right, it's not a justification, so it isn't acceptable to just change things and hope everyone realises that what they thought was true is actually not the case. Think of all the C++ web pages out there, all the books, and all the programmers that are working to an assumption (actually, a specification) which is not being followed properly. It seems to me that getting all of them to change is far more of an issue than sticking to the accepted implementation and the well-known swap() method for releasing the memory.

> I think the new implementation of clear has a few benefits:
>
> (1) Reduce memory waste due to vectors keeping up to
> (2 * max_bytes_needed) over their lifetimes;

Well, apart from the fact that people already know how to free the memory used by a vector (the swap method), this won't work, because only one implementation does this. It's a worthless feature, because you cannot rely on it.

> (2) Make crashing due to stale pointers and references
> more likely! As it was(/is), when one did a clear()
> of a vector, pointers to elements would still work
> until the vector grew! With a ::clear() that
> deallocates, the probability of discovering stale
> pointers and references increases.

I think you already know that this is a pretty ropey justification. ;)

> (3) Custom allocators and deallocators will now have a
> better sense of actual required memory on an
> ongoing basis.

What makes you think that?

> As far as I'm concerned, the idea of something only growing
> and never shrinking sounds like a black hole or a memory
> leak!

Well, you shouldn't be using vector then. That's what it is supposed to do, and it's a very useful idiom for many applications.

> I think Microsoft faced tough choices:
> (a) Do nothing, and suffer the ever-expanding memory
> consumption of vectors scattered all over code;

Or do nothing and - shock, horror - remain identical to every other implementation.

- Neil.
From: Colin F. <cp...@ea...> - 2003-06-14 14:06:29
> You're right, it's not a justification, so it isn't acceptable to just
> change things and hope everyone realises that what they thought was true is
> actually not the case. Think of all the C++ web pages out there, all the
> books, and all the programmers that are working to an assumption (actually,
> a specification) which is not being followed properly. It seems to me that
> getting all of them to change is far more of an issue than sticking to the
> accepted implementation and well-known swap() method for releasing the
> memory.

Things are changing all the time. I bought a new book on Linux driver development two years ago, and the very first program in the entire book, the "Hello, World!" of kernel modules, failed to compile due to the relocation of header files. Starting with a certain version of Red Hat Linux, the technique of getting network stuff to work radically changed -- lots of configuration files, etc., had been moved around. HTTP is changing. Java is changing. PHP is changing. Even Stroustrup's 2nd edition C++ book is worse than worthless now! I have a book called "The STL <PRIMER>" which is similarly obsolete. (I think Josuttis's book is great, by the way!) Think of all the C++ STL demo code without "using namespace std;"!

Ideally, changes would never affect existing code. The idea of Microsoft adding a "::deallocate()" method may have been a preferable alternative to changing "::clear()". Microsoft would depart from the STL in either case (if one accepts that ::clear() should not deallocate), but at least a new method is something that would not affect existing code, and there's always the chance that "STL 2" would one day appear, and Microsoft could advocate their "::deallocate()" method.

But maybe the intention was to affect existing code! Maybe making ::clear() do deallocation is a pragmatic solution to a serious problem plaguing 99% of the typical uses of vector. I'm not saying I would support something like that, but it would be the same kind of thinking that would lead a car company to lower safety standards and accept the increase in lawsuits because it ultimately lowered their costs. If changing ::clear() reduces disk access due to swapping, and thus generally improves the performance of applications running under Windows, "it's worth giving developers one more gotcha to be aware of".

Speaking of awareness, Stroustrup's 3rd edition C++ book does not mention the "well-known swap() method for releasing the memory" of a vector. Stroustrup does propose that one can free up memory by assigning a new value to a vector, but how lame is his example:

    v = vector<int>(4,99);

? It's worse than the ::swap() idea -- unless creating and initializing a new vector to replace the existing vector makes additional sense to one's particular application. Josuttis's book does mention the ::swap() trick briefly, and I imagine that if I did a Google search for "STL vector deallocate" or something like that, I would discover the ::swap() trick. But, at the same time, one would discover that one cannot rely on "::clear()" NOT doing deallocation anymore...!

Anyhow, if it is really critical to an application that a core function, class, or data structure work or look a certain way, then it should be wrapped! Maybe it's lame to wrap something that itself was meant to be the end-all and be-all of wrappers in the first place, like the STL, but one has to dial up the countermeasures to address the paranoia.

To summarize, I agree that changes cost humanity a lot of time and effort. I am intrigued by what may have motivated Microsoft to change ::clear()... Was it to address an internal need? At the same time, I am wondering what application does so much vector filling and clearing that malloc() takes 10% of the CPU time. Polygon buffers that are adapting every frame?

I'm glad this topic came up. In my opinion, it is further support for defensive programming. I think it's safe to assume "Sleep(10000)" won't be added to the implementation of cos() any time in the future, but I don't see the same kind of absurdity in allowing ::clear() to deallocate, so I think it was kind of irresponsible to *rely* on clear() not deallocating! Maybe most people didn't so much "rely" on not deallocating as become surprised by the slowdown when it does start deallocating!

--- Colin
cp...@ea...
From: Neil S. <ne...@r0...> - 2003-06-14 19:15:57
> Even Stroustrup's 2nd edition C++ book is worse than worthless now!
> I have a book called "The STL <PRIMER>" which is similarly
> obsolete. (I think Josuttis's book is great, by the way!)
> Think of all the C++ STL demo code without "using namespace std;"!

True, things change all the time, but the current C++ standard will not. Stroustrup's 3rd edition is based on the final draft of the standard, and he clearly states that a vector never shrinks, so it seems that he doesn't think the standard is vague on this matter either.

> Ideally changes would never affect existing code. The idea of
> Microsoft adding a "::deallocate()" method may have been a
> preferable alternative to changing "::clear()". Microsoft
> would depart from STL in either case (if one accepts that
> ::clear() should not deallocate), but at least a new method
> is something that would not affect existing code, and there's
> always the chance that "STL 2" would one day appear, and
> Microsoft could advocate their "::deallocate()" method.

It would affect any code written using their non-standard extension which you then tried to build on another compiler. That's what standards are for. If they are not going to stick to them, why implement them in the first place? The second option, advocating the addition of the method to the new standard, is the way to go.

> But maybe the intention was to affect existing code! Maybe
> making ::clear() do deallocation is a pragmatic solution to
> a serious problem plaguing 99% of the typical uses of vector.
> I'm not saying I would support something like that, but it
> would be the same kind of thinking that would lead a car
> company to lower safety standards and accept the increase in
> lawsuits because it ultimately lowered their costs. If
> changing ::clear() reduces disk access due to swapping, and
> thus generally improves the performance of applications
> running under Windows, "it's worth giving developers one more
> gotcha to be aware of".

Yes, I'm pretty sure it was something like this that prompted them to make the change. They probably thought that the average developer didn't understand the no-shrink policy on vectors, and so decided it would be best for most people if they changed it. Unfortunately, they didn't show any concern for developers who took the time to understand exactly how it worked, or developers who wanted to write portable code, or in fact anyone who thinks that standards are worth a toss. They didn't even bother to mention it as a change to watch out for when migrating from VC6.

> absurdity in allowing ::clear() to deallocate, so I think it
> was kind of irresponsible to *rely* on clear() to not do deallocate!
> Maybe most people didn't so much "rely" on not deallocating
> as much as become surprised by the slowdown when it does start
> deallocating!

No. What was irresponsible was changing the behaviour to something which is illegal according to the standard. If anything, myself and others have been complete saps for relying on standards to be worth anything.

- Neil.