From: Steve W. <swa...@no...> - 2005-08-18 21:18:07
DAVID GAMEY wrote:
> Steve,
>
> --- Steve Wampler <swa...@no...> wrote:
>
>>Hi David,
>>
>>DAVID GAMEY wrote:
>>
>>>...
>>>Steve,
>>>
>>>Just a hunch as this sounds strangely familiar. Can
>>>you determine how many strings you're creating and
>>>concatenating?
>>>
>>>I ran into a problem some years back where I was
>>>returning map(c,s1,s2) where c was a 1 character
>>>string. For each character I would do this 9 times
>>>creating 9 new strings. As I was concatenating the
>>>result with the previous function a character at a
>>>time, it got pathological. Even for a few hundred
>>>characters of message I overflowed the Icon GC
>>>counter.
>>
>>Hmmm, *that* is interesting. I do create a *lot* of
>>temporary strings. But the GC shouldn't see most
>>of them as the GC *should* only be following
>>referenced links and I'm not getting any errors or
>>crash, so an internal overflow seems unlikely.
>>However, I haven't come up with any other explanation,
>>so...
>>
>
> I don't think it works that way.
>
> If I do:
>
> s2 := ""
> while s2 ||:= f(read())
>
> where f creates a new string, say map to force lower
> case, then read() creates a string, f creates a new
> string right after it (no space saving optimization).
> And concatenation does the real damage, since now the
> entire string is recopied to the end of the area.
> Think about this on a file of 1M records!

I understand that (I think) but, unless things have drastically
changed, the old string (previously referenced by s2) is not
referenced anymore and so shouldn't be seen as the GC walks the
string qualifiers. So during the reclamation phase the storage for
the old string should be reclaimed. You will end up calling the GC
more often since there's no space optimization available, but the
actual storage itself shouldn't grow out of hand.

If this *isn't* how it works anymore, then I think I'll claim that
Icon's GC is now broken...

In any event, I'm creating less than n! strings (my head hurts
whenever I try to really work it out, so that's a guess) where n is
the length of the input string. Far fewer than that are ever
'referenceable' at any one time, of course, somewhere well less than
2^n. At the point I suddenly see the leftover storage after GC,
n = 22. The maximum length of any string is n characters, so the
total referenced string space should never be greater than n*(2^n)
bytes (ignoring the number of qualifiers for the moment). So I max
out at less than 92MB of string storage in use, and by the end of the
program that should be back to a *very* small value (as seen in the
program when n=21).

-Steve
--
Steve Wampler -- swa...@no...
The gods that smiled on your birth are now laughing out loud.
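
The n*(2^n) bound quoted in the message can be checked with a few lines of arithmetic. What follows is only a rough sketch in Python (not Icon, and not part of the program under discussion). The function name `string_bound_bytes` is made up for illustration, and one byte per character is assumed, as the message itself does.

```python
# Worst-case bound from the message: at most 2**n strings referenced at once,
# each at most n characters long. Assumes one byte per character and ignores
# string qualifiers, as the message does.
def string_bound_bytes(n: int) -> int:
    return n * 2 ** n


# Compare the two input lengths mentioned in the message.
for n in (21, 22):
    bound = string_bound_bytes(n)
    print(f"n={n}: {bound} bytes ({bound / 2**20:.1f} MiB)")
```

For n = 22 the bound works out to 92,274,688 bytes (88 MiB), in line with the roughly 92MB figure. The real peak should be lower, since well under 2^n strings are referenceable at any one time.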