You can subscribe to this list here.
2005 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(2) |
Aug
(1) |
Sep
|
Oct
(16) |
Nov
(6) |
Dec
(5) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2006 |
Jan
|
Feb
|
Mar
(1) |
Apr
(1) |
May
|
Jun
|
Jul
|
Aug
|
Sep
(7) |
Oct
|
Nov
|
Dec
(8) |
2007 |
Jan
|
Feb
|
Mar
(14) |
Apr
|
May
(57) |
Jun
(4) |
Jul
(6) |
Aug
(2) |
Sep
(16) |
Oct
|
Nov
(3) |
Dec
(12) |
2008 |
Jan
(16) |
Feb
(3) |
Mar
|
Apr
|
May
(11) |
Jun
(1) |
Jul
|
Aug
|
Sep
|
Oct
(2) |
Nov
(2) |
Dec
(2) |
2009 |
Jan
(1) |
Feb
(13) |
Mar
(9) |
Apr
|
May
(16) |
Jun
(4) |
Jul
(5) |
Aug
|
Sep
(15) |
Oct
|
Nov
(3) |
Dec
(4) |
2010 |
Jan
|
Feb
|
Mar
|
Apr
(2) |
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
(14) |
2011 |
Jan
(13) |
Feb
(61) |
Mar
(5) |
Apr
(5) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
(8) |
Nov
(6) |
Dec
(2) |
2012 |
Jan
|
Feb
|
Mar
|
Apr
(13) |
May
(3) |
Jun
|
Jul
(2) |
Aug
(14) |
Sep
(1) |
Oct
(36) |
Nov
(37) |
Dec
(1) |
2013 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
(2) |
Jun
|
Jul
(3) |
Aug
|
Sep
|
Oct
(4) |
Nov
(5) |
Dec
|
2014 |
Jan
(1) |
Feb
(35) |
Mar
(3) |
Apr
|
May
(7) |
Jun
(2) |
Jul
|
Aug
(1) |
Sep
(3) |
Oct
|
Nov
|
Dec
|
2015 |
Jan
|
Feb
|
Mar
(7) |
Apr
(4) |
May
(1) |
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2016 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(3) |
Jun
|
Jul
|
Aug
(10) |
Sep
|
Oct
|
Nov
|
Dec
|
2017 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(1) |
Aug
|
Sep
(9) |
Oct
|
Nov
(3) |
Dec
(4) |
2020 |
Jan
(5) |
Feb
|
Mar
(2) |
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
(4) |
Dec
|
From: Alan S. <aj...@fr...> - 2007-12-14 22:18:12
|
>> easier to use APIs, > I need help on that one. Please make suggestions. I am out of ideas. Indeed, it can be abstract and tricky. I'm presently making lots of great use of libJudy in a software project. But even as one of the developers of the library, and even using the macros rather than the functions, I can get confused about how exactly to manage the pointers; especially for array-of-array, etc. I'm not sure what you could do about that, it's inherently abstract passing pointers to methods/functions that might modify them, when one array's value word is a pointer to another array. I believe that in C++ with operator overloading you might be able to use libJudy more transparently, but I've never tried it. Alan Silverstein |
From: Toni C. <tca...@to...> - 2007-12-14 10:52:19
|
One suggestion on the API side would be to remove all (public) macros and then use the Win32 API approach: Example (using macros from winnt.h): DECLARE_HANDLE(HZIP); HZIP OpenZip(const char *fn, const char *password); DWORD UnzipItem(HZIP hz, int index, const char * fn); Etc. Externally HZIP is just a named structure so cannot be passed to other different routines (unlike void*) and the correct level of indirection works etc. Internally you cast it to whatever you want: in this case, it's just a ptr to a struct. When I first looked at Judy, the most frustrating issues I had were dealing with the preprocessor macros and the lack of type safety in the use of (void*). A secondary issue is the error handling which needs to be a single check on each function return - personally, I am not a fan of returning error codes via arguments and prefer to just have a FALSE or NULL return from the function directly. It should be obvious to me why the error occurred and if not, there's always the Judy source to look at. Regards, Toni. From: jud...@li... [mailto:jud...@li...] On Behalf Of Doug Baskins Sent: 14 December 2007 09:37 To: zooko Cc: jud...@li... Subject: Re: Save the array to a file? Zooko: > Glad to hear it! Please don't waste your time trying to make Judy > multithread-safe. Ok, I will leave that one on the back burner. > , my wishlist includes faster performance > (in the single-threaded case), That is my plan. > simpler code base, That is my plan. > easier to use APIs, I need help on that one. Please make suggestions. I am out of ideas. > and possibly extensions such as a pluggable sort order (even though > that would probably be a big performance loss if it were used). I believe there only 2 practical ways with the method used with Judy. 1) "Numeric" sort as in numbers 2) "Dictionary" sort as in alphabetic. > Also > improving performance on 32-bit machines would be good. That's the plan, but there is really little difference from 32 and 64 bit. > 32-bit > machines are not going away -- they're just becoming smaller, > cheaper, lower-power, and more ubiquitous. Thanks for your inputs. Doug ----- Original Message ---- From: zooko <zo...@zo...> To: Doug Baskins <dou...@ya...> Cc: Jimi Xenidis <ji...@po...>; jud...@li... Sent: Monday, December 10, 2007 5:50:34 AM Subject: Re: Save the array to a file? On Dec 9, 2007, at 7:09 PM, Doug Baskins wrote: > Sorry for a delayed response. I have just settled down in my > Thailand home after > being on the road for about 6 weeks. I plan on doing some intense > work on Judy > for the first time in about a year and a half. Glad to hear it! Please don't waste your time trying to make Judy multithread-safe. Instead, my wishlist includes faster performance (in the single-threaded case), simpler code base, easier to use APIs, and possibly extensions such as a pluggable sort order (even though that would probably be a big performance loss if it were used). Also improving performance on 32-bit machines would be good. 32-bit machines are not going away -- they're just becoming smaller, cheaper, lower-power, and more ubiquitous. By the way, you might be interested in the modern work on cache- friendly data structures e.g. http://citeseer.ist.psu.edu/569573.html Regards, Zooko |
From: Doug B. <dou...@ya...> - 2007-12-14 10:10:15
|
Zooko: I have heard that one loud and clear. I am sorry I ever considered JudySL with null terminated strings. As Alan mentions in another post, it is a very difficult problem. However, I think I have found a solution and trying to find the time to implement it. The current proposed API is to just add a string length parameter to the JudySL API and change the name slightly. It looks kind of ugly because of the need to return the "length" of the string in the Next/Prev API's. Speaking of API's, this is apparently not my forte'. I would like some suggestions. This change would allow all characters (including null) to be in the index string. Does that address your questions? Doug PS. When Judy was invented, passing parameters was a big extra overhead to a call. That is no longer true in a modern processor. Also, a subroutine call was a big hit on performance, that also is no longer true. So, one more advantage of a macro is gone. ----- Original Message ---- From: zooko <zo...@zo...> To: Doug Baskins <dou...@ya...>; jud...@li... Sent: Monday, December 10, 2007 2:00:47 PM Subject: length-associated strings By the way, one specific detail of extending the API to Judy Trees is that I would like is to use keys which are variable-length bytestrings associated with a 32-bit length. I prefer not to use null-termating strings nowadays when explicit- length strings are safer, faster, and easier for my typical uses. It's unfortunate that I can't use these strings as keys in a sorted Judy Tree. Regards, Zooko |
From: Doug B. <dou...@ya...> - 2007-12-14 09:43:54
|
Mike: If my memory serves me correctly, there were API problems with that one. My current plan is to come up with a new JudySL/JudyHS. Also my hope is that the new Judy will be so fast that a Hash method will become pointless. Please send me a proposed API for that problem. The code is fairly easy to write. Thanks for the input, and sorry about that. Doug ----- Original Message ---- From: Mike Eynon <gon...@ya...> To: jud...@li... Sent: Monday, December 10, 2007 8:45:36 AM Subject: JudyHS iterator? Hi all. Is there a way to iterate through all the entries in a JudyHS hash? I understand why the entries cannot be returned to me in sorted order, but I still require a method for hitting all of the indexes in the hash without knowing (or carrying) what they are. Is there a way to do this? Going through the docs on SourceForge suggests the answer is, "no". Thanks, Mike. ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs ------------------------------------------------------------------------- SF.Net email is sponsored by: Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://sourceforge.net/services/buy/index.php _______________________________________________ Judy-devel mailing list Jud...@li... https://lists.sourceforge.net/lists/listinfo/judy-devel |
From: Doug B. <dou...@ya...> - 2007-12-14 09:36:57
|
Zooko: > Glad to hear it! Please don't waste your time trying to make Judy > multithread-safe. Ok, I will leave that one on the back burner. > , my wishlist includes faster performance > (in the single-threaded case), That is my plan. > simpler code base, That is my plan. > easier to use APIs, I need help on that one. Please make suggestions. I am out of ideas. > and possibly extensions such as a pluggable sort order (even though > that would probably be a big performance loss if it were used). I believe there only 2 practical ways with the method used with Judy. 1) "Numeric" sort as in numbers 2) "Dictionary" sort as in alphabetic. > Also > improving performance on 32-bit machines would be good. That's the plan, but there is really little difference from 32 and 64 bit. > 32-bit > machines are not going away -- they're just becoming smaller, > cheaper, lower-power, and more ubiquitous. Thanks for your inputs. Doug ----- Original Message ---- From: zooko <zo...@zo...> To: Doug Baskins <dou...@ya...> Cc: Jimi Xenidis <ji...@po...>; jud...@li... Sent: Monday, December 10, 2007 5:50:34 AM Subject: Re: Save the array to a file? On Dec 9, 2007, at 7:09 PM, Doug Baskins wrote: > Sorry for a delayed response. I have just settled down in my > Thailand home after > being on the road for about 6 weeks. I plan on doing some intense > work on Judy > for the first time in about a year and a half. Glad to hear it! Please don't waste your time trying to make Judy multithread-safe. Instead, my wishlist includes faster performance (in the single-threaded case), simpler code base, easier to use APIs, and possibly extensions such as a pluggable sort order (even though that would probably be a big performance loss if it were used). Also improving performance on 32-bit machines would be good. 32-bit machines are not going away -- they're just becoming smaller, cheaper, lower-power, and more ubiquitous. By the way, you might be interested in the modern work on cache- friendly data structures e.g. http://citeseer.ist.psu.edu/569573.html Regards, Zooko |
From: Alan S. <aj...@fr...> - 2007-12-10 22:48:03
|
Since you used the phrase "length-associated", I gather you found and read my treatise on the subject of length-associated versus length-terminated strings versus Judy? Yes, length-associated + any kind of sorting other than length-first, is a hard problem. Alan Silverstein |
From: Alan S. <aj...@fr...> - 2007-12-10 22:43:20
|
> ...possibly extensions such as a pluggable sort order (even though > that would probably be a big performance loss if it were used)... Bear in mind that libJudy is NOT primarily a sort engine. It just HAPPENS to sort, in a very specific way, because it is a radix tree => radix sort. Alan Silverstein |
From: zooko <zo...@zo...> - 2007-12-10 21:01:03
|
By the way, one specific detail of extending the API to Judy Trees is that I would like is to use keys which are variable-length bytestrings associated with a 32-bit length. I prefer not to use null-termating strings nowadays when explicit- length strings are safer, faster, and easier for my typical uses. It's unfortunate that I can't use these strings as keys in a sorted Judy Tree. Regards, Zooko |
From: Mike E. <gon...@ya...> - 2007-12-10 15:45:46
|
Hi all. Is there a way to iterate through all the entries in a JudyHS hash? I understand why the entries cannot be returned to me in sorted order, but I still require a method for hitting all of the indexes in the hash without knowing (or carrying) what they are. Is there a way to do this? Going through the docs on SourceForge suggests the answer is, "no". Thanks, Mike. ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs |
From: zooko <zo...@zo...> - 2007-12-10 12:51:00
|
On Dec 9, 2007, at 7:09 PM, Doug Baskins wrote: > Sorry for a delayed response. I have just settled down in my > Thailand home after > being on the road for about 6 weeks. I plan on doing some intense > work on Judy > for the first time in about a year and a half. Glad to hear it! Please don't waste your time trying to make Judy multithread-safe. Instead, my wishlist includes faster performance (in the single-threaded case), simpler code base, easier to use APIs, and possibly extensions such as a pluggable sort order (even though that would probably be a big performance loss if it were used). Also improving performance on 32-bit machines would be good. 32-bit machines are not going away -- they're just becoming smaller, cheaper, lower-power, and more ubiquitous. By the way, you might be interested in the modern work on cache- friendly data structures e.g. http://citeseer.ist.psu.edu/569573.html Regards, Zooko |
From: Doug B. <dou...@ya...> - 2007-12-10 02:09:30
|
Jimi: No this is not a RTFM. The Judy team often pondered this idea. Alan even wrote some code to "load" and "unload" a Judy Array. My experiments have shown that Judy is so fast compared to a disk access that doing just one element at a time in a "for loop" is just as adequate and gives you the opportunity to store the data in your format on the disk. However, I am open to suggestions. Sorry for a delayed response. I have just settled down in my Thailand home after being on the road for about 6 weeks. I plan on doing some intense work on Judy for the first time in about a year and a half. Thank you for your interest. Doug Baskins <dou...@ya...> ----- Original Message ---- From: Jimi Xenidis <ji...@po...> To: jud...@li... Sent: Wednesday, November 21, 2007 9:05:15 AM Subject: Save the array to a file? I hope this is not a RTFM, I looked at is was not obvious and the SF mail archive search is bogus :( I have a dynamic array that is seeded with out about 1000 entries from a data file. Is there any way to save (or pre-compile?) this initial set so I don't suffer the overhead of populating it from a parser? It would be fine if the result is static and I need to use another dynamic array for additions. -JX ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Judy-devel mailing list Jud...@li... https://lists.sourceforge.net/lists/listinfo/judy-devel |
From: Alan S. <aj...@fr...> - 2007-11-21 16:42:33
|
> I have a dynamic array that is seeded with out about 1000 entries from > a data file. Is there any way to save (or pre-compile?) this initial > set so I don't suffer the overhead of populating it from a parser? It > would be fine if the result is static and I need to use another > dynamic array for additions. One of the topics we discussed at length before the project was cancelled in 2002 was "persistent Judy arrays", meaning disk I/O. There wasn't any good quick answer. However, I believe that buried in the library is a batch-insertion set of functions... Maybe undocumented. If you can't find them, let me know and I'll look. In lieu of that, you can first/next and write out the data in any form you like, including some efficient binary form that takes minimal parsing, and J1S/JLI/JSLI it back in. Alan Silverstein |
From: Jimi X. <ji...@po...> - 2007-11-21 16:05:42
|
I hope this is not a RTFM, I looked at is was not obvious and the SF mail archive search is bogus :( I have a dynamic array that is seeded with out about 1000 entries from a data file. Is there any way to save (or pre-compile?) this initial set so I don't suffer the overhead of populating it from a parser? It would be fine if the result is static and I need to use another dynamic array for additions. -JX |
From: skaller <sk...@us...> - 2007-11-01 03:03:04
|
I've been trying to track down a bug in my system, which uses Judy .. I decided to try Valgrind.. I am getting a LOT of these: ==22122== Conditional jump or move depends on uninitialised value(s) ==22122== at 0x432F6E: JudyLPrev (flx_judy.pak:3662) which refers to { SEARCHLEAFNATIVE(Word_t, Pjlw, LeafPop1, Index); } ==22122== Conditional jump or move depends on uninitialised value(s) ==22122== at 0x438950: JudyLGet (flx_judy.pak:3662) ==22122== by 0x428478: JudyLLast (flx_judy.pak:34209) Line 34209 reads: if ((PValue = JudyLGet(PArray, *PIndex, PJError)) == PPJERR) I guess this means a jump based on PValue, which isn't initialised until after the JudyLGet() .. [Judy seems to work so this report for information only] -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: skaller <sk...@us...> - 2007-09-17 20:49:54
|
On Mon, 2007-09-17 at 11:43 -0700, Doug Baskins wrote: > John: > > > If you're right, I can remove the -1 check. (But I tried that, > > and stuff failed which works with the check .. if my code > > was bugged it wouldn't work, even with the check..) > > NO, the -1 return is an "error flag" that says "check the error > structure" > for the type of error. Hmm .. you're right. I put in a debug print, and there are no errors in my code. -1 is never returned, only NULLs :) I must have been confused from when I did have an error (passed array/pointer to array when other was expected). -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: skaller <sk...@us...> - 2007-09-17 20:30:16
|
On Mon, 2007-09-17 at 11:43 -0700, Doug Baskins wrote: > > The functions are the critical thing. The standard judy header > > shouldn't contain the 'convenience' macros, put them in > > another header so they can be used or not at the client's > > discretion. > ... > > Good suggestion, perhaps you should read the error section in Judy_3x > man > page. (also Judy_3x.htm#ERRORS). It starts off: "A lot of thought > (and time) > went into making error handling in Judy simple, ...". Yes, but you then document Judy in terms of the macros I will never use, so most of the documentation is useless to me. For the Felix binding, I don't need any macro crud, since Felix provides its own strongly typed wrapper. For example: type JLArray = "void**"; proc JudyLGet: JLArray * word * JError_t * ptr[ptr[word]] = "*$4=(Word_t*)JudyLGet(*$1,$2,$3);"; proc JudyLFirst: JLArray * ptr[word] * JError_t * ptr[ptr[word]] = "*(Word_t**)$4=(Word_t*)JudyLFirst(*$1,$2,$3);"; [The notation X * Y just means a pair x,y as in Cartesian Product] Since Felix does NO automatic conversions at all, there is no way to make an error calling the Felix functions (as you can in C). of course.. the wrappers can be bugged.. :) As to the Felix gc, the number of calls to Judy is small, and the 'convenience' of error handling etc the macros purport to provide actually gets in the way: I'd rather call the library functions directly so I know exactly what I'm getting into in terms of performance and error handling. Just as an example of the problems with the macros: JLI( PValue, PJLArray, Index); // JudyLIns() Word_t Index, Index1, Index2, Nth; Ok, so the third argument is a Word_t.. but what you don't say is that it has to be an lvalue. In C++ you'd call that Word_t& but it isn't mentioned anywhere which arguments are values and which are lvalues. In the C library functions it is explicit in the type: C doesn't have references so the argument is given as Word_t* It's obvious to you as the author but initially I didn't realise that functions like JudyLNext actually modify the Index argument as well as returning a pointer to the slot containing the data. The plain C interface makes that clear .. the macros simply *obscure* the semantics. IMHO of course :) > > I forget what that is? > > Word_t *abc = (void *)JudyLGet(); > > is an error in C++, even though both are pointers. I thought void * > was a > pointer to anything you wanted to make it. Ah. Yes, C++ banned *implicit* conversion FROM a void*. You can implicitly convert any pointer TO a void*. The argument is that converting TO a void* is always safe, but converting FROM a void* to a type should require an explicit cast to say what type you put in it. So you have to write: void *pv; Word_t *abc = (Word_t*)pv; This case is a bit unfortunate, having to repeat the type name. But 'implicit conversion' also applies to void f(Word_t*); f(pv); // ERROR f(*Word_t*)pv); // OK This is particularly important in C++ because of overloading: if the conversion were implicit, you might accidentally call the wrong function with a void* argument because ANY function accepting a pointer to anything would accept it -- so the rule requires an explicit cast. The same thing applies to integer<-->pointer casts: in K&R C I think this was implicit and explains why HUGE amounts of C code are bugged by Unix programmers. [I have a pet hatred of Unix nerds, because I originally worked on 16 bit segmented DOS code and got sick of the uppity 'so superior' attitude these people had .. so now I laugh, because DOS programmers never made the mistake of assuming pointers and integers were the same size .. or even that pointers were linear .. :] -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: skaller <sk...@us...> - 2007-09-17 19:52:40
|
On Mon, 2007-09-17 at 11:43 -0700, Doug Baskins wrote: > John: > > If you're right, I can remove the -1 check. (But I tried that, > > and stuff failed which works with the check .. if my code > > was bugged it wouldn't work, even with the check..) > > NO, the -1 return is an "error flag" that says "check the error > structure" > for the type of error. My assumption was that no ordinary call can lead to an error. That is, calls like 'JudyLNext' 'JudyLGet' always return NULL for 'not found' case. So what you're telling me is my original assumption is right and there is a programming error in my code. The problem is .. if I DO check, my code works. How can that be? If there is a programming error something should break, but it doesn't. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: skaller <sk...@us...> - 2007-09-17 18:55:35
|
On Mon, 2007-09-17 at 09:30 -0700, Doug Baskins wrote: > John: Sorry for the length of the replies, and also the fact they're not deterministic (you're making me think, so this is a thought in progress response .. :) > Please comment on what the next prototype of Judy has in Judy.h: > > > #define JLN(PValue, PArray, Index) > \ > { > \ > (Index)++; > \ > if ((PArray) && (Index)) > \ > (PValue) = (Pvoid_t) JudyLFirst(PArray, &(Index), PJE0); > \ > else > \ > (PValue) = NULL; > \ > } ouch .. I guess that will work, but it's a hack.. get_next_or_equal is changed to get_next by incrementing the argument pointer? :)) The problem with that is it modifies the argument, even in the case there is no 'next'. You can fix that by copying the value, and only setting the original Index on success. BTW: C++ and most C compilers have 'inline' functions now. 'inline' is ISO C99 too. These would be better than macros. However in C, you cannot write a function taking a reference argument (lvalue), so you have to use macros if you want to do that, instead of the "C way" which is to use a pointer. Personally, having done much functional programming, I prefer the pointer anyhow. In the example macro above, 'Index' is an lvalue = reference. > Perhaps JudyLGet, JudyLNext and JudyLFirst should return a NULL with > a corruption error. I am leaning that way too. But NULL isn't an error! It just means 'not found' which is a normal result. I would use the hook function idea. By default, corruption should core dump (call abort()) with a message to stderr. Basically, if the function fails it shouldn't return. However, malloc failure isn't a hard error. The problem is if you return, the user has to decide what to do.. if you have no memory left, your screwed anyhow :) If Judy is in the middle of doing some modifications to the Judy array and malloc fails, then the array may be in an inconsistent and unrecoverable state. Unless you can guarantee to ALWAYS leave the Judy array in a consistent state on malloc failure .. there's no point returning an error code: the user program MUST abort -- it can't even delete the Judy array. OTOH the hook function CAN allow for a retry on malloc failure. It isn't clear this is a good idea though, since you already provide a malloc hook? In that case, malloc_hook() returning 0 should call the error hook, since if the user wanted to try to recover, they could set the hook function, call malloc inside it, and free up memory and retry the malloc, all inside the hook. So there's a good argument to never return ANY error code, not even on malloc failure. The main problem is that in a multi-threaded C environment, the error hook is global, so it has to do Posix per-thread memory operations to get, for example, a jumpbuf to jump to an per thread error handler. C++ has no such problem, the hook can just throw an exception. Note an interesting problem for ME: Felix is using Judy to IMPLEMENT a garbage collector .. so if you're calling the garbage collector when you run out of memory .. well, the collector still has to work! So technically I actually *have* to use an allocation hook and some reserved memory for Judy, to actually free up memory for the client application. The problem here is that Judy isn't re-entrant with respect to the allocator. in other words, there is no where to PUT the allocation object, you can only store the hook into global memory. The only solution to that is to pass an extra argument to every Judy function, which is a pointer to an object containing the allocator hook and data. Since Felix could run multiple collectors in multiple threads, this is technically necessary. The problem is, the extra argument will cost on every call, especially on 32 bit machines. [The alternative is very ugly -- a mutex and/or Posix per thread memory: passing a pointer is better] Felix itself always passes such context around with a pointer. We do not use global variables at all (because they destroy thread safety and re-entrancy). But that's a cost I'm prepared to pay and you may not be. > > [BAD .. never call the public interface of a routine inside > > any routine at the same level! Doing so makes it impossible > > to wrap the public interface without interfering with the > > implementation] > > Please give me an example how JudyLNext() should be written > given the above api for JLN() and assume that JudyLFirst() > never returns a PPJERR. You would just define JudyLFirst_private, which can be called by other routines then wrap it in JudyLFirst. If you make it an inline function it should be optimised away. > > BTW: using ~0UL is a bad idea. This is NOT -1 on XP64, because > > long is only 4 bytes, and when cast to 8 bytes it may lead to > > How about a ~0ULL ? will C++ complain? g++ wont. However, MSVC++ might, I'm not sure. Neither C89 nor C++ require long long. But the point is, you shouldn't need that if you don't ever return an error code. > What your asking for is to combine: I make multiple suggestions, which doesn't help .. :) My design paradigm could be different from yours. I try to write functions which can't fail. If a failure is detected internally, they just abort the program. This doesn't mean you abort on 'end of file' when reading, because end of file isn't an error, its an *expected* condition. Now the problem is, in some application a Judy Array corruption isn't a hard error either. For example a web server loading a plugin which fails, would just kill use of that facility .. you don't want the whole webserver to be taken down by a bug in a plugin. Thus .. I can't say my approach is necessarily the only correct one. Just to say again another way -- my basic philosophy is that if an error is detected, the program should just abort (or call a user hook which defaults to doing that). Anything less assumes the code being executed isn't an essential part of the program. The thing is, this can be the case.. real code DOES contain non-essential enhancements. > Given the apparent lack of use of these error returns, I tend to agree > with you. Please suggest the semantics for 32 bit Judy1Count() return > when the Array is full (2**32 and 2**32 - 1 entries). Note: 64 bit > Judy1Count() does not have this problem -- yet, perhaps in the 23rd > century. You could return a 64 bit integer. Either long or long long, or, easier: struct Judy64 { uint32 hi; uint32 lo; }; It's a bit messy, but most users will just go: int n = Judy1Count (..).lo; and ignore the high word, because they know they didn't put 2^32 entries in the array :) It does cost a bit more to return the value though. > Also, should an error return due to malloc(2) fail allow the Array to > be > corrupt. It is very difficult to keep the array in-tact with a malloc > fail. Yes, I expected that. > > Corrupted Judy structure should instantly abort(). > > If you have corruption, you need the earliest possible core dump. > > I have already been thru that. People think it is a bug in Judy if it > core dumps inside of the Judy code. Please comment. But as you see from my own 'bugs in Judy' which were actually bugs in MY code .. the converse can happen too. It's worse, because I can fail to check the error code. The thing with aborting is you can fprint(stderr a message which makes it impossible for the user to miss it. The problem with that isn't "users complaining Judy aborted" -- well that may be a 'human' problem for you, but it isn't technically the issue: the technical issue is more like what I described above: if this part of the code is a 'non-essential' feature which fails in a contest where you don't want the whole program to fail. In C++ the hook function is the right solution, and indeed that is precisely what the C++ standard library actually uses in the same context. The hook can throw an exception, which will terminate the program, or if we have the 'non-essential feature' the user can catch it. In C, the only way to emulate that is with setjmp/longjmp, otherwise you do need to return an error code so the application can take appropriate action. The 'right' solution is two level interface: the fully nasty one returning the error codes, and a more abstract wrapper that is easier to use and safer. Of course that's what you're trying to provide with the macros. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: Doug B. <dou...@ya...> - 2007-09-17 18:43:57
|
John:=0A=0A----- Original Message ----=0A=0A>> Judy returning a -1 in the "= pointer to value" word do so because:=0A>> =0A>> 1) There was a malloc(2) f= ailure.=0A>> 2) There was a Judy array corruption.=0A>> 3) There was a prog= ramming mistake.=0A=0A> Are you sure? Because I have correct code, which do= esn't work=0A> UNLESS I also check for the "-1 in a pointer".=0A=0A> That w= as the problem I had. I was checking ONLY for NULL,=0A> and I was getting a= segfault .. it turned out to be the -1=0A> was returned.. =0A...=0A=0A> If= you're right, I can remove the -1 check. (But I tried that,=0A> and stuff = failed which works with the check .. if my code=0A> was bugged it wouldn't = work, even with the check..)=0A=0ANO, the -1 return is an "error flag" that= says "check the error structure"=0Afor the type of error.=0A...=0A> One so= lution is to provide an error function HOOK, this is a user=0A> supplied fu= nction like:=0A...=0A=0AThe macro interface has the hooks for adjusting how= errors are handled.=0ADuring development, I have it print out all errors, = and then during=0Aproduction code, just malloc(2) failures.=0A...=0A=0A> Yo= u already have an allocator hook, right?=0A=0AYes, take a look at JudyMallo= c, JudyMallocVirtual, JudyFree and=0AJudyFreeVirtual in test/Judy1LHTime.= c All memory allocation=0Agoes thru these routines. This is how Judy1LHT= ime calculates=0Ahow much memory Judy uses.=0A=0A> The functions are the cr= itical thing. The standard judy header=0A> shouldn't contain the 'convenien= ce' macros, put them in =0A> another header so they can be used or not at t= he client's=0A> discretion.=0A...=0A=0AGood suggestion, perhaps you should = read the error section in Judy_3x man=0Apage. (also Judy_3x.htm#ERRORS). = It starts off: "A lot of thought (and time)=0Awent into making error handl= ing in Judy=0Asimple, ...". It violates my 1 screen full=0Aof stuff -- it = is 2 screen fulls. Please re-write it if it is not clear to you and I will= =0Apublish it.=0A=0A>> The latest wrinkle with C++ not compatible with Judy= is troublesome=0A>> for me.=0A=0A> I forget what that is?=0A=0A Word_t= *abc =3D (void *)JudyLGet();=0A=0Ais an error in C++, even though both are= pointers. I thought void * was a=0Apointer to anything you wanted to mak= e it.=0A=0A> -- =0A> John Skaller <skaller at users dot sf dot net>=0A> Fel= ix, successor to C++: http://felix.sf.net=0A=0ADoug Baskins=0A=0A=0A=0A=0A= =0A |
From: skaller <sk...@us...> - 2007-09-17 17:20:17
|
On Mon, 2007-09-17 at 07:15 -0700, Doug Baskins wrote: > John: > > > Word_t *pshape= (Word_t*)JudyLGet(j_shape,(Word_t)memory,&je); > > if(pshape==NULL||pshape == (Word_t*)(void*)PPJERR) abort(); > > I have been considering on getting rid of the PJError_t parameter in > the > Judy routines that do not malloc(2). It seems that nobody uses it. Yeah, I know. I suspect the main use is when using a debugger, if you set it to a static variable, you can then examine the variable easily. So it may not be entirely useless. > Judy returning a -1 in the "pointer to value" word do so because: > > 1) There was a malloc(2) failure. > 2) There was a Judy array corruption. > 3) There was a programming mistake. Are you sure? Because I have correct code, which doesn't work UNLESS I also check for the "-1 in a pointer". That was the problem I had. I was checking ONLY for NULL, and I was getting a segfault .. it turned out to be the -1 was returned.. .. but I could be confused because I ALSO did initially have another bug in my code (passing a pointer to a Judy array instead of the actual Judy array, which is itself just a pointer). If you're right, I can remove the -1 check. (But I tried that, and stuff failed which works with the check .. if my code was bugged it wouldn't work, even with the check..) > Word_t *pshape= (Word_t*)JudyLGet(j_shape,(Word_t)memory,NULL); > > Judy was designed to have a zero (NULL) in the error parameter place. OK, but my Judy calls in the Felix RTL are all done by C++ member functions of the 'garbage collector' object, so having a slot for the judy error code is no problem. however it is never examined. Probably it should be to check for malloc failure (but I'm lazy .. :) > One of the original (perhaps misguided) reasons for the -1 return was > to avoid the "core dumps" from within the Judy code. I got tired of > explaining to programmers that it was their error in the calling > parameters. Hehe.. I understand :) One solution is to provide an error function HOOK, this is a user supplied function like: void bug(int code, void *judy_array, void *client_data) { fprintf(stderr, "Got Judy Error %d on %p\n",code, judy_array); abort(); } In C++ you might throw an exception. The default should be to print a diagnostic on stderr and core dump. If this is a malloc failure, the bug() routine could return, in which case the Judy should retry the malloc(). The user routine could free up some memory. [This is only a crude sketch .. don't implement it directly as described!] You already have an allocator hook, right? > As for your mistake with JudyLGet(), we all have done it and I have > considered > changing interface. No: the interface is correct. I mean, the way the types are handled is quite messy and might be improved, although C makes it hard, however .. > The design of JudyLIns() requires the "&" so that the > "allocate the array with NULL pointer" property of Judy can be used. Indeed. And the others don't have the & which is also correct, because they don't add or delete from the array, in particular not an empty array. In Felix binding to Judy (not in the run time system) I wrapped away this quirk, but that was a design choice, and one I could make because Judy DIDN'T try to be nice. > Again, this was not done originally because of the additional overhead > of a subroutine call -- no longer significant in a modern processor. Rot :) An extra indirection can easily matter in a time critical routine like a garbage collector. But the REAL reason you did it the way you did was because you got it right... :) > The Judy macros (JLG() etc.) were an attempt to avoid that mistake, > but > it seems that programmers don't like using them for reasons I don't > understand. Macros suck. You macros are horrid .. :) The thing is you didn't document them correctly, because it was 'obvious' to you. In some case the macros require an lvalue, not a value, and that isn't explained anywhere. The man pages document the macros, and even the website does them first, then shows which functions they use. The functions are the critical thing. The standard judy header shouldn't contain the 'convenience' macros, put them in another header so they can be used or not at the client's discretion. > On another subject, you suggest the JudyLNext() return some > stateful information -- for performance. I studied that very > extensively 6 years ago to improve the performance on Next/Prev. > Again, on a modern processor, these routines almost always run when > the > data is "in the cache" and they almost disappear from the profiles. > (Having stateful information does not help in avoiding cache misses). Yes, good point! And the state object will also use up cache, and may interfere with performance. > I am still actively working on a new Judy. Improved speed was the > highest objective, but it looks like greatly improved memory > efficiency > will be the biggest improvement. Cool! > I take your suggestions very seriously, so keep them coming. Sure .. I love complaining :) > Nobody likes the Judy semantics, huh? The semantics are right. Its the docs and the types that aren't so hot. Even the names of the routines are confusing to me. I wrote the same set of routine MANY times in the past when designing indexed files (in the old days before databases). You have first, last, next, previous, next_or_equal, previous_or_equal, and get_equal which correspond to the ordering terms: 0 inf > < >= <= == so I would have called the routines more like: first, last, gt, lt, ge, le, eq but this along with most of my other complaints are presentation issues NOT semantics. > The latest wrinkle with C++ not compatible with Judy is troublesome > for me. I forget what that is? -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: Doug B. <dou...@ya...> - 2007-09-17 16:30:35
|
John:=0A=0APlease comment on what the next prototype of Judy has in Judy.h:= =0A=0A=0A#define JLN(PValue, PArray, Index) = \=0A{ = \=0A (Index)++; = \=0A if ((PArray) && (Index)= ) \=0A (PValue) = =3D (Pvoid_t) JudyLFirst(PArray, &(Index), PJE0); \=0A else = = \=0A (PValue) =3D NULL; = \=0A}=0A=0APerhaps JudyLGet, JudyLNext and JudyLFirst should= return a NULL with=0Aa corruption error. I am leaning that way too.=0A=0A= > [BAD .. never call the public interface of a routine inside=0A> any routi= ne at the same level! Doing so makes it impossible=0A> to wrap the public i= nterface without interfering with the=0A> implementation]=0A=0APlease give = me an example how JudyLNext() should be written=0Agiven the above api for J= LN() and assume that JudyLFirst()=0Anever returns a PPJERR.=0A=0A> BTW: usi= ng ~0UL is a bad idea. This is NOT -1 on XP64, because=0A> long is only 4 b= ytes, and when cast to 8 bytes it may lead to=0A=0AHow about a ~0ULL ? wil= l C++ complain?=0A=0A> In any case, only JudyLIns/Del can legitimately retu= rn such a code,=0A> and it always means 'out of memory', in which case NULL= would do.=0A> That is: NULL is the only publicly required error code for J= udyL,=0A> and -1 for Judy1.=0A=0AWhat your asking for is to combine:=0A=0A = 1) Out of memory=0A=0A 2) Programmer errors - calling parameters=0A 3) Judy= Array corruption.=0A Which is actually, according to Judy.h=0A=0A = JU_ERRNO_NULLPPARRAY =3D 3, // see above.=0A JU_ERRNO_NONNU= LLPARRAY =3D 10, // see above.=0A JU_ERRNO_NULLPINDEX =3D 4, = // see above.=0A JU_ERRNO_NULLPVALUE =3D 11, // see above.= =0A JU_ERRNO_NOTJUDY1 =3D 5, // PArray is not to a Judy1 ar= ray.=0A JU_ERRNO_NOTJUDYL =3D 6, // PArray is not to a Judy= L array.=0A JU_ERRNO_NOTJUDYSL =3D 7, // PArray is not to a = JudySL array.=0A JU_ERRNO_UNSORTED =3D 12, // see above.=0A= =0A into ONE common error.=0A=0AGiven the apparent lack of use of these err= or returns, I tend to agree=0Awith you. Please suggest the semantics for 3= 2 bit Judy1Count() return=0Awhen the Array is full (2**32 and 2**32 - 1 ent= ries). Note: 64 bit =0AJudy1Count() does not have this problem -- yet, per= haps in the 23rd century.=0AAlso, should an error return due to malloc(2) f= ail allow the Array to be=0Acorrupt. It is very difficult to keep the arra= y in-tact with a malloc fail.=0A=0A> Corrupted Judy structure should instan= tly abort().=0A> If you have corruption, you need the earliest possible cor= e dump.=0A=0AI have already been thru that. People think it is a bug in Ju= dy if it=0Acore dumps inside of the Judy code. Please comment.=0A=0A =0ADo= ug Baskins <dou...@ya...>=0A=0A=0A=0A----- Original Message ----= =0AFrom: skaller <sk...@us...>=0ATo: judy <judy-devel@lis= ts.sourceforge.net>=0ACc: felix-impl <fel...@li...>=0AS= ent: Monday, September 17, 2007 2:38:51 AM=0ASubject: errors=0A=0AI'm havin= g one hell of a time trying to figure out how to detect=0Awhen JudyL routin= es got an error or not.=0A=0AIt seems JudyLGet can return BOTH PPJERR and N= ULL independently,=0Aindeed JudyLFirst contains this code:=0A=0A PPvoid_t= PValue;=0A if ((PValue =3D JudyLGet(PArray, *PIndex, PJError)) =3D=3D PP= JERR)=0A return(PPJERR);=0A if (PValue !=3D (PPvoid_t) NULL) retur= n(PValue); // found *PIndex.=0A return(JudyLNext(PArray, PIndex, PJError= ));=0A=0A[BAD .. never call the public interface of a routine inside=0Aany = routine at the same level! Doing so makes it impossible=0Ato wrap the publi= c interface without interfering with the=0Aimplementation]=0A=0AThe problem= is I'm getting BOTH conditions returned to a client=0Aand it is just nonse= nse to have to test both PPJERR and NULL.=0A=0ALogically, JudyLGet should r= eturn a pointer to the value=0Aslot, or NULL if the key doesn't exist. No e= rror is possible=0Aon any of the JudyL reading routines: only JudyLIns/Del = can=0Apossibly have an error (because they're the only ones that=0Aallocate= memory, and memory allocation failure is the only=0Aerror Judy can possibl= e have).=0A=0ACorrupted Judy structure should instantly abort().=0AIf you h= ave corruption, you need the earliest possible core dump.=0A=0ABTW: using ~= 0UL is a bad idea. This is NOT -1 on XP64, because=0Along is only 4 bytes, = and when cast to 8 bytes it may lead to=0A=0A 0x0000_0000_FFFF_FFFF=0A= =0Awhich is not what you wanted .. depending on how the cast is done.=0AIf = you cast to 'long long =3D Word_t' first, you'll get all FF bytes,=0Abut if= you cast to void* first, you'll only get a half word.=0A=0AIn any case, on= ly JudyLIns/Del can legitimately return such a code,=0Aand it always means = 'out of memory', in which case NULL would do.=0A=0AThat is: NULL is the onl= y publicly required error code for JudyL,=0Aand -1 for Judy1.=0A=0A=0A=0A--= =0AJohn Skaller <skaller at users dot sf dot net>=0AFelix, successor to C+= +: http://felix.sf.net=0A=0A-----------------------------------------------= --------------------------=0AThis SF.net email is sponsored by: Microsoft= =0ADefy all challenges. Microsoft(R) Visual Studio 2005.=0Ahttp://clk.atdmt= .com/MRT/go/vse0120000070mrt/direct/01/=0A_________________________________= ______________=0AJudy-devel mailing list=0AJ...@li...= =0Ahttps://lists.sourceforge.net/lists/listinfo/judy-devel=0A=0A=0A=0A=0A |
From: Doug B. <dou...@ya...> - 2007-09-17 14:15:43
|
John:=0A=0A> Word_t *pshape=3D (Word_t*)JudyLGet(j_shape,(Word_t)memory,&je= );=0A> if(pshape=3D=3DNULL||pshape =3D=3D (Word_t*)(void*)PPJERR) abort()= ;=0A=0AI have been considering on getting rid of the PJError_t parameter in= the =0AJudy routines that do not malloc(2). It seems that nobody uses it.= =0AJudy returning a -1 in the "pointer to value" word do so because:=0A=0A1= ) There was a malloc(2) failure.=0A2) There was a Judy array corruption.=0A= 3) There was a programming mistake.=0A=0AThe NULL return simply means "not = found". In JudyLGet() this is=0Anot an error (unless the programmer does n= ot expect it).=0A=0AFor example, in the above code, the "je" error structur= e is never used.=0AIf you were lucky and looked at the error code number, i= t may of said: =0A"not a Judy array" or some clue like that.=0A=0ASo, it mi= ght as well of been written:=0A=0AWord_t *pshape=3D (Word_t*)JudyLGet(j_sha= pe,(Word_t)memory,NULL);=0A=0AJudy was designed to have a zero (NULL) in th= e error parameter place.=0A=0AIn Judy routines that do not call malloc(2), = then the -1 return means a=0A(2 Judy array corruption or a (3) programming = mistake -- both of which=0Ashould be gone with production code. (32 bit Jud= y1Count() is an exception).=0A=0AIn using Judy for the last 6 years, I have= very seldom run into a Judy array=0Acorruption -- usually the result of a = (random) RAM failure.=0A=0AOne of the original (perhaps misguided) reasons= for the -1 return was=0Ato avoid the "core dumps" from within the Judy cod= e. I got tired of =0Aexplaining to programmers that it was their error in = the calling parameters.=0A=0AAs for your mistake with JudyLGet(), we all ha= ve done it and I have considered=0Achanging interface. The design of JudyL= Ins() requires the "&" so that the =0A"allocate the array with NULL pointer= " property of Judy can be used.=0AThis is very important in "applications" = such as JudySL(). However, it is=0Apossible (and perhaps desirable) to do = that with an additional code outside=0Athe JudyLIns() code and code that ca= lls malloc(2)/free(2).=0AAgain, this was not done originally because of the= additional overhead=0Aof a subroutine call -- no longer significant in a m= odern processor.=0AThe Judy macros (JLG() etc.) were an attempt to avoid th= at mistake, but=0Ait seems that programmers don't like using them for reaso= ns I don't understand.=0A=0AOn another subject, you suggest the JudyLNext()= return some=0Astateful information -- for performance. I studied that ver= y=0Aextensively 6 years ago to improve the performance on Next/Prev. =0AAg= ain, on a modern processor, these routines almost always run when the=0Adat= a is "in the cache" and they almost disappear from the profiles.=0A(Having = stateful information does not help in avoiding cache misses).=0A=0AI am sti= ll actively working on a new Judy. Improved speed was the=0Ahighest object= ive, but it looks like greatly improved memory efficiency=0Awill be the big= gest improvement.=0A=0AI take your suggestions very seriously, so keep them= coming.=0ANobody likes the Judy semantics, but I have not found a better= =0Asolution or even had suggestions -- with code examples.=0A=0AThe latest = wrinkle with C++ not compatible with Judy is troublesome=0Afor me.=0A=0A=0A= > The typing and docs on the interface are not so good.. ouch.=0A=0AI accep= t suggestions with examples on that too. I do not like a=0A"man" page long= er than 1 screen full.=0A=0AThanks for your interest and inputs.=0A=0ADoug= =0A =0ADoug Baskins <dou...@gm...>=0A=0A=0A=0A----- Original Messa= ge ----=0AFrom: skaller <sk...@us...>=0ATo: aj...@fr...= =0ACc: jud...@li...; fel...@li...= =0ASent: Monday, September 17, 2007 6:53:26 AM=0ASubject: Re: [Felix-impl] = judy bug=0A=0AOn Sun, 2007-09-16 at 22:08 -0600, Alan Silverstein wrote:=0A= > > Ah, found bug in my code -- how I hate that moronically stupid C=0A> > = language. Was using &a instead of a for JudyLGet.=0A> =0A> All too easy. = I've made recent use of libJudy myself in my contract=0A> job, and even as = one of the authors I find it surprisingly difficult to=0A> keep the pointer= expressions correct. In particular when *Pvalue return=0A> by JudyL or Ju= dySL, which it only knows as a word -- no forced=0A> interpretation of the = contents -- is known to the caller to contain a=0A> pointer. So you must f= irst check for Pvalue being null, etc:=0A> =0A> foo_t * Pfoo;=0A> J= LG(Pvalue, ...);=0A> =0A> Pfoo =3D ((Pvalue =3D=3D NULL) ? NULL : (foo_= t *) (*Pvalue));=0A> =0A> It can get confusing, especially when dealing wit= h array-of-array, etc.=0A=0AI'm not using the macros .. so it is much worse= : Ihave this, =0Afor example:=0A=0A Word_t *pshape=3D (Word_t*)JudyLGet(j_= shape,(Word_t)memory,&je);=0A if(pshape=3D=3DNULL||pshape =3D=3D (Word_t*)= (void*)PPJERR) abort();=0A=0A> But it sure has sped up some slow programs d= ramatically.=0A=0AMy experience is slightly different. The Felix gc is actu= ally slightly=0Aslower with Judy than the doubly linked list I used before.= =0AMemory consumption is about the same I think .. not sure.=0A=0AFor me th= e BIG difference was in another area: functionality.=0A=0APreviously, the F= elix gc used memory blocks like:=0A=0A [ shape pointer ]=0A [ array c= ount ]=0A [ flags for gc ]=0A [ previous block ]=0A [ next block ]= =0A [ padding for alignement]=0A=3D=3D=3D=3D> [ USER HEAP BLOCK ]=0A= =0AWith a 'header' in front of every block on the heap to retain RTTI=0Afor= the garbage collector, array repeat count, and flags etc,=0Aas well as the= prev and next links, overhead 48 bytes on 64 bit box.=0AFor list nodes of = two words (16 bytes) that's a lot of overhead.=0A=0AFurthermore pointers we= re like:=0A=0A struct { void * heap_block; offset_t offset; }=0A=0Abecau= se the garbage collector could only work with pointers to the=0Ablock, not = with pointers INTO the block.=0A=0AJudy changed all that. The gc now has ZE= RO header, i.e. there is =0Ano header at all. Instead a JudyL array maps po= inters to RTTI=0Ashape data. A separate map for array count is kept for ent= ries=0Awith lengths NOT equal to 1 -- so non-arrays don't have an entry=0Ai= n the JudyL array (that's 99% of most objects).=0A=0AAnd using JudyL I can = find the key equal to or just before any=0Apointer .. so the gc now works w= ith interior pointers.=0A=0AThe ability of the gc is further enhanced, sinc= e it can now=0Adetect 'foreign pointers' in O(1) [pointers not being manage= d=0Aby the gc] =0A=0AFurthermore the user could malloc() some data, play ab= out,=0Aand then *register* the object for gc management .. impossible=0Aif = there had to be a header block on managed memory.=0A=0AAnd Felix pointers a= re now plain C pointers. This actually=0Asimplified the programming languag= e and standard library=0Aquite significantly (forgetting about the performa= nce=0Aimprovement and simplification of the run time library ..)=0A=0ASo th= e result is: roughly the same amount of memory is wasted=0Aon meta-data. Ju= dy saves here and there on the constant block=0Aoverhead, but using separat= e arrays means the keys use up=0Aspace several times. Functionality/capabil= ity is vastly =0Aimproved without sacrificing performance, which is critica= l=0Afor a garbage collector.=0A=0AFurthermore, it is possible to scan the R= TTI of all the=0Aobjects in memory without scanning the actual memory.=0AWh= en following pointers, it is only necessary to examine=0Aobjects that actua= lly contain them .. so the gc can =0Afind all the reachable objects WITHOUT= actually paging=0Ain every object (only ones with pointers in them need=0A= to be paged in). This should improve cache/VM locality=0Aa bit, though I ha= ve done no measurements.=0A=0AI read somewhere that a standard Hashtable is= as fast=0Aas Judy.. so why bother. Well, I have given one answer.=0AHashta= ble can't give a sorted list of keys. Judy Can.=0A=0AHashtbl can't find 'ne= xt after' or 'next after or equal to'=0Aor 'Next empty' and Judy can.=0A=0A= The typing and docs on the interface are not so good.. ouch.=0ABut the actu= al *design* of the interface is close to perfect.=0A=0A=0A-- =0AJohn Skalle= r <skaller at users dot sf dot net>=0AFelix, successor to C++: http://felix= .sf.net=0A=0A--------------------------------------------------------------= -----------=0AThis SF.net email is sponsored by: Microsoft=0ADefy all chall= enges. Microsoft(R) Visual Studio 2005.=0Ahttp://clk.atdmt.com/MRT/go/vse01= 20000070mrt/direct/01/=0A_______________________________________________=0A= Judy-devel mailing list=0AJ...@li...=0Ahttps://lists.= sourceforge.net/lists/listinfo/judy-devel=0A=0A=0A=0A=0A |
From: zooko <zo...@zo...> - 2007-09-17 13:13:10
|
On Sep 17, 2007, at 1:34 AM, skaller wrote: > On Mon, 2007-09-17 at 00:08 -0600, zooko wrote: >> On Sep 16, 2007, at 11:53 PM, skaller wrote: >> >>>> But it sure has sped up some slow programs dramatically. >>> >>> My experience is slightly different. The Felix gc is actually >>> slightly >>> slower with Judy than the doubly linked list I used before. >>> Memory consumption is about the same I think .. not sure. >> >> What chip architecture are you using? Other people's reports suggest >> that if your machine has 64-byte cache lines then Judy will perform >> much better than with other cache line sizes. > > I personally have a pair of AMD64s at the moment, but Felix is > a portable compiler generating portable ISO C++, so generated > code runs on all architectures and all OS with a C++ compiler. Well that makes sense, but I wanted to know on what machine(s) you were basing your observation that the new Felix gc (with Judy) is slightly slower than the old (with the doubly linked list). Regards, Zooko |
From: skaller <sk...@us...> - 2007-09-17 07:40:23
|
On Mon, 2007-09-17 at 00:08 -0600, zooko wrote: > On Sep 16, 2007, at 11:53 PM, skaller wrote: > > >> But it sure has sped up some slow programs dramatically. > > > > My experience is slightly different. The Felix gc is actually slightly > > slower with Judy than the doubly linked list I used before. > > Memory consumption is about the same I think .. not sure. > > What chip architecture are you using? Other people's reports suggest > that if your machine has 64-byte cache lines then Judy will perform > much better than with other cache line sizes. I personally have a pair of AMD64s at the moment, but Felix is a portable compiler generating portable ISO C++, so generated code runs on all architectures and all OS with a C++ compiler. > http://permalink.gmane.org/gmane.comp.lib.judy.devel/44 > > Thanks for the nice description of your felix gc algorithm. That is > a very cool trick! Yup, the calculation is actually: Word_t cand = (Word_t)p; // candidate pointer Word_t fp=cand; Word_t *w = (Word_t*)JudyLLast(j_shape,&fp,&je); // equal or less registered pointer if(w == NULL || w == (Word_t*)(void*)PPJERR) return; // no such pointer, candidate is a foreigner, ignore it if( (*w & 1UL) == reachable) return; // already handled or handling this object, // don't recursively scan it gc_shape_t *shape = (gc_shape_t*)(*w & ~1UL); // shape of lower object unsigned long n = get_count((void*)fp) * shape->count * shape->amt; // length of the object: also uses another JudyL array if(cand >= (Word_t)(void*)((unsigned char*)(void*)fp+n)) return; // not interior to the lower object: candidate is a foreigner *w = (*w & ~1uL) | reachable; // we own the object, it's reachable // use the RTTI to find the pointers in the object and // recursively examine them This uses 2 1/2 JudyL arrays, one for the registered pointers, one for the RTTI, and 1/2 for the array count (if the object is an array). It's cute, there is no special test for NULL either: NULL is 'just another foreign pointer'. Now, the scan on all objects following which checks the low bit of the address (malloc always returns even addresses) removes garbage from the registered pointer array into the 'to be deleted array', then the finalisers for them is run by scanning that array, and the array is bulk cleared. The sequential scans could *probably* be improved if JudyL returned a 'state' object which could be used as an argument to a 'get_next' function (unsafe, would only work if the array wasn't modified). C actually makes that hard. If you're recursing down a tree you want to 'yield' the current thing, and then 'resume' where you left off in the visitation. You not only have to use an array to keep state instead of the stack, you also have to keep track of the current code location so you can jump back to it. Felix actually automates precisely that mechanism, which I call control inversion. It's the same as using a pthread with a lock to cause context switching, except there's no pthread, just a heaped C++ class object with a resume() method that uses a 'switch' to jump to where it last yielded from. If Judy was actually able to do this it would be even faster, since it would save 'recursing' down the bytes of the last key to find the place to start searching for the next key. On a 64 bit machine, addresses are 8 bytes, and 8 lookups per pointer isn't nearly as 'invisible' as 4 on a 32 bit machine, even if the high order byte nodes are already in the cache. [And yes in *w = (*w & ~1uL) | reachable; the ~1uL isn't portable, probably won't work on win64 ..] -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net |
From: zooko <zo...@zo...> - 2007-09-17 06:08:17
|
On Sep 16, 2007, at 11:53 PM, skaller wrote: >> But it sure has sped up some slow programs dramatically. > > My experience is slightly different. The Felix gc is actually slightly > slower with Judy than the doubly linked list I used before. > Memory consumption is about the same I think .. not sure. What chip architecture are you using? Other people's reports suggest that if your machine has 64-byte cache lines then Judy will perform much better than with other cache line sizes. http://permalink.gmane.org/gmane.comp.lib.judy.devel/44 Thanks for the nice description of your felix gc algorithm. That is a very cool trick! Regards, Zooko |