From: john s. <sk...@us...> - 2012-04-02 21:04:32
|
[Changed topic, old one is SPAMing the conversation: the 'gratulations is suggests i 'on a 'illion 'ollars .. :] On 03/04/2012, at 4:35 AM, Alan Silverstein wrote: > Gang, > >> Gcc at least has a way to "hint" to the compiler whether a branch is >> likely to be taken. Felix actually emits these hints. However I have >> not noticed gcc doing anything with them. > > Amusing, it reminds me that in the last year I carefully marked some new > source functions "inline", but later discovered that gcc was basically > ignoring my advice and doing whatever the heck it wanted. Turns out > "inline" is merely a suggestion, and I don't know of a way to force it > ("I know what I'm doing and I want to waste space for performance") > other than reverting to macros. (Maybe there's some gcc option or > #pragma I didn't look for...) Well .. you can try clang 3.0 :) Or if you want to use a "real" language, use Felix, where "inline" isn't a hint. > >> A compromise API would be like: "here is an array size 100, fill it >> with 100 entries starting at key such-and-such". You could then >> optimally process the 100 entries, then get the next 100. > > I vaguely recall (this was > 10 years ago) discussing multi-index > operations like this. I know I actually WROTE code that batch-created > (inserted) a Judy1/L array rather faster than by simple iteration, and > it worked fine, but I don't know where it is today. (Given a simple > array, or was it parallel arrays, of sparse indexes and values, libJudy > internally-smart code can create a valid Judy tree pretty quickly, I > think maybe 3X the one-at-a-time mode?) Actually that would be quite useful. I actually use Judy as a temporary data structure. The GC algorithm is basically: find all reachable objects, then delete everything not reachable. All-reachable is a set of pointers (Judy) constructed by scanning. Everything-not-reachable is a set of pointers. Also Judy. This is constructed because the practice is to run finalisers on all these objects *first* before deallocation. >> Generally, a map of word to "anything" would be good. This can be >> done by map to a word, which is a pointer. At the cost of an >> indirection and a memory management problem. > > Agreed. The solution, which we also toyed with but did not implement, > is to have libJudy offer APIs like JudyL but with arbitrary-sized value > areas whose memory is managed by libJudy. In which case don't just > return a pointer to the caller, have them give you a pointer to the > data, and its size, and the library does the mem-create and copy-in. Just an option for two words would be useful. Why? Well, there is a serious bug/misconception in the Judy build. Actually, the Felix clone makes the same mistake. I'm running on a 32 bit machine. I build only 32 bit Judy. Stupid. I still need 64 bit integers. I'm running on a 64 bit machine. I build 64 bit Judy. Stupid. I still need "int". That's only 32 bits. Heck, "long" is only 32 bits on Win64. What should be built is: 32 and 64 bit arrays. On BOTH 32 and 64 bit machines. I think this can be done now. In addition we should have 32->64 and 64->32 maps (JudyL). And 64->128 (64 bit machine only). >> A trivial example: an IP address is 4 bytes, some of which can be 0. >> Of course this can be done now, with a JudyL instead of S. > > Right. C strings are what I call "length-terminated" by the NUL bytes. In the C++ Standard they're called NTBS: Null terminated byte strings. So that's a technical, formal, and standardised name for "C strings" :-) > Perl strings for example are "length-associated," and any bit pattern > can appear in the string. Ditto C++ strings, and strings in almost every sane language other than C. Actually .. *including* C, since C also has length controlled buffers, just that you have to manage the length yourself. > You can easily store them in a JudyL > meta-trie where the top level is the length, but you end up with them > sorted by length first, then lexicographically, not just > lexicographically. You might not care. This is similar to the property naive Hash-table has. You lose ordering in return for O(1) operations. -- john skaller sk...@us... http://felix-lang.org |