From: Doug B. <dou...@ya...> - 2012-04-01 03:39:18
|
Phil: Thanks for the complement. Wow, this is the first (that I remember) verification of somebody reading the the application notes on Judy. I had forgotten all about them. They were designed to be innovative and interesting -- or at least I thought. I have never been asked a question about any of them. <http://judy.sourceforge.net/examples/count_by_value.pdf > That application note you call "striped" is the simplest and unintuitive one. I believe it was written by Bob Montgomery, or at least suggested by him. I don't remember why his name is not on the paper. At the time, I could not think of a use for that algorithm, but it seems you have found one. I would like to understand more of the details. If you need confidentiality, email me privately. I see your data seems to compare two methods -- JudyL(native) and Judy1(striped). The striped version is unbelievable fast (addition) if the sums cover a lot of numbers. I have used the analogy of "faster than the speed of light" I.E. addition (+) done faster than light could travel across the CPU chip. Of course the "real" partial additions were done at the time of insertion. I assume that your data sets do not have a lot of numbers (Values) to sum, since you are comparing with JudyL. I am still working on Judy and could spend time of improving areas such as the "counting" speed. That has always been on my list, but at a low priority. I have had very little feedback on people using that feature of Judy. Thanks for your interest, Doug Baskins Doug Baskins <dou...@ya...> >________________________________ > From: Phil Stanhope <sta...@gm...> >To: aj...@fr... >Cc: dou...@ya... >Sent: Thursday, March 29, 2012 5:45 AM >Subject: Re: I'd like to congratulate you and the team that worked on Judy Arrays > > >See below. When I say "striped" I'm referring specifically to this: http://judy.sourceforge.net/examples/count_by_value.pdf > >Naive is a staight JUDYL implementation. > >I'm using network address ranges (a subset of a potential of 4GB IPV4 addresses). The range is not random across the 4GB space. But nonetheless it's "sparse" in terms of total potential # of keys. > >More below. > > > >On Thu, Mar 29, 2012 at 5:58 AM, Alan Silverstein <aj...@fr...> wrote: > >Phil, >> >> >>> My point about doing things in software in the cloud is that the >>> reason for this initial work is to build what amounts to a real-time >>> IP filter/throttle. That's stuff that's been solved for years. >> >>Do you mean, using dedicated HW? >> >> >> > >Exactly. > > >> But in a cloud hosting environment -- you don't get to put a h/w >>> device in front of your infrastructure of rented machines that you >>> spin up/down on demand. Sure, the hoster may have that type of >>> infrastructure -- but you as the lessee don't get access to it. >> >>Let me see if I follow you... Cloud computing, for example, is using >>modern computing speed to virtualize features that were formerly in HW. >>Somewhat ironically, problems previously solved in dedicated HW are now >>re-instantiated in SW too. Is that it? >> >> >> > >Exactly. > > >> In my physical data centers -- I can rely on physical devices for this >>> type of work. But in my cloud data centers -- I need to incorporate >>> this as a last line of defense against overloading some of my service >>> endpoints. This is but one use case. >> >>OK. >> >> >>> Note -- I've done no tuning whatsoever. Just built libjudy (on either >>> OSX 64-bit or CentOS 6.2 64-bit). Worked unchanged as you'd >>> hope/expect. >>> CPU caches are bigger than they used to be. And there are two-levels >>> of cache. I wonder if this could be tweaked to improve performance >>> even more. >> >>I suppose so! >> >> >>> Since you raise IPV6 -- it's 128-bit ... so it would take two striped >>> Judy1 arrays (well 64 Judy1 I guess) to have a IPV6 sparse array. >> >>What do you mean by striped in this context? I think in terms of >>array-of-array where each level decodes a byte (within libJudy code) or >>a word (32/64) at the client level. On a 64-bit system (which pretty >>much everything is these days, 32 is consider a microprocessor), JudyL >>array of JudyL array should suffice, just two levels. >> >> >> > >Striped array as used in this => http://judy.sourceforge.net/examples/count_by_value.pdf > >> We don't accept IPV6 for our service yet. I'll hold off on that >>> implementation for now. But I do want to create a simple library for >>> tracking in real-time a set of stats per key (count, min/max, etc). >>> It'd be pretty simple to pack these types of counters into a small >>> collection of these arrays. Track 4-5 key metrics on working expanse >>> of 256MB keys in < 64MB of memory. Yeah -- I remember when that was >>> an extraordinary amount of memory. But these days -- tiny fragment (I >>> just ordered a under-desktop test server today with 64GB of memory). >> >>Yup. About 33 years ago in very nearly this same spot (I'm back in >>Building 1 at the HP/Avago Fort Collins site), we wrote business >>software (like accounts receivable and order entry) for a computer with >>32K BYTES of main memory. >> >> >> > >My first paying job as a programmer was to write an inventory control system on an HP-250 in late 1981. http://www.hpmuseum.net/display_item.php?hw=249 > >:) > > >> Here's the updated test stats -- think I got my timing counters right >>> this time. The last one had an incorrect printf -- showed count & >>> init time value. Added cost to iterate through every Index value as >>> well as calcuate average retrieval. Nothing is random at this point. >>> But don't expect much if any change given what's going on under the >>> covers. >> >>Groan, a data dump. :-) OK, what does this mean? >> >> >>> ~/Downloads/ystud (master)> ./counter_test 1.0.0/4 >>> CIDR 1.0.0/4 has 0x10000000 (268435456) addrs netmask=F0000000 from >>> 00000000 (0.0.0.0) to FFFFFF0F (15.255.255.255) >> >>This is the range, not the current population, right? >> >> >> > >The range became the population. I'll be updating. But I took a separate program that was validating some network range calculations & masking capabilities and added the judy stuff to it. > > >> JUDY1 (striped) keys=268435456 size=0016.07MB init=26.909740s >>> count=0.000006s get=0.119964us iter=32.202682s >>> JUDYL (naive) keys=268435456 size=2128.07MB init=42.306259s >>> count=0.001110s get=0.119445us iter=32.063248s >> >>Not sure what striped and naive mean here. Are the key counts real >>(actually loaded) or potential? Anyway yeah, small and fast is the name >>of the game... >> >> > >Actually loaded. > > >And have started to use this in a different way ... but been stalled this week now that I'm back from Beijing and dealing with other stuff. > >The modern equivalent -- sort of -- is something called LevelDB from Google. >http://code.google.com/p/leveldb/ > > >Anyway, in real life in recent projects I've actually created some >>fairly elaborate libJudy structures and had them work well, like JudyL >>=> JudyL => Judy1, and inverse lookups (map and back-map). >> >>Cheers, >>Alan >> > > > |
From: john s. <sk...@us...> - 2012-04-01 13:20:50
|
On 01/04/2012, at 1:39 PM, Doug Baskins wrote: [] FYI: Judy is used in the Felix Garbage collector to keep track of every allocated object, its length, and also a run time description .. all non-invasively. It's also used to implement sparse arrays. Coming soon .. a string -> T associative array .. aka "dictionary" or "hash". Only hassle with that is the null terminator. It may be interesting to consider the "cursor" used in the modified implementation. If you consider a typical sequential scan you have key = next_key(key) ... which implies looking up the whole array to find the old key, before finding the next one. In a 64 bit implementation that can imply 8 lookups .. even when the next key is in the same cache line as the previous one. The other implementation achieved better performance than Judy by retaining the last position, making a sequential scan much faster. However that's wrong IMHO, a single inbuilt cursor in the public interface is a bad design. However there are two alternatives: just using it as a cache (hiding the details) is one, the other is to provide a way to save the current position in a separate object. Either way, sequential scans are common enough to think about it. For example when Felix uses Judy as an application data structure the garbage collector has to do a sequential scan of the values as part of the usual "mark-sweep" processing (which itself also uses Judy for temporary storage!!) -- john skaller sk...@us... http://felix-lang.org |
From: Doug B. <dou...@ya...> - 2012-04-02 13:32:15
|
John: Glad to hear of your successes with Judy. I have often thought of how to do a faster Judy[1L]Next(). It has always fallen below my priorities list -- partly because I believed that modern processor caches would lessen the overhead of a complete re-walking of the tree to almost the same place (thus same cache-lines). However, I have recently discovered another problem with modern processors that may be a mitigating factor. That is "branch prediction" miss-match of the compiler with the processor. So here is my best shot on how I would approach the problem: Note: Key & Index words are used interchangeably and mean the same thing. A walk of the Judy tree often terminates with an object called a Leaf. ALeaf has 1..256 Key/Value items in sorted order for an average length of around 20+ elements. If Judy[1L]NextX() returned the whole Leaf (or partial if required), then that array of 20+ Keys/Values could be scanned very quickly. It would be up to the caller to determine how long this data would be valid (I.E. no Inserts or Deletes.) The caller would have to supply the memory for storing the Keys/Values along with how many were returned and the current Key. The caller would also haveto remember what was the last Key accessed (cursor) out of the memory todetermine the next Key. The distance from Key (Leaf population) to the Value would also be required. There a few other minor details I haven't mentioned. Actually, it would be fairly easy to implement, but the API wouldbe ugly. Let me know if this seems worthwhile. It seems it has the potential of being much faster. About the Null terminator problem with JudySL. I think a student in perhaps Brazil has written a *Next() to JudyHS(). If I remember correctly, a simple #define will change JudyHS() into a sorted (numeric?) order store of the tree of trees. If your interested, I can research it further. Thanks for your interest, doug Doug Baskins <dou...@ya...> >________________________________ > From: john skaller <sk...@us...> >To: Doug Baskins <dou...@ya...> >Cc: "aj...@fr..." <aj...@fr...>; judy <jud...@li...>; Phil Stanhope <sta...@gm...> >Sent: Sunday, April 1, 2012 8:20 PM >Subject: Re: I'd like to congratulate you and the team that worked on Judy Arrays > > >On 01/04/2012, at 1:39 PM, Doug Baskins wrote: >[] > >FYI: Judy is used in the Felix Garbage collector >to keep track of every allocated object, its length, >and also a run time description .. all non-invasively. > >It's also used to implement sparse arrays. > >Coming soon .. a string -> T associative array .. aka "dictionary" >or "hash". Only hassle with that is the null terminator. > >It may be interesting to consider the "cursor" used in the modified >implementation. If you consider a typical sequential scan you have > > key = next_key(key) ... > >which implies looking up the whole array to find the old key, before >finding the next one. > >In a 64 bit implementation that can imply 8 lookups .. even when the >next key is in the same cache line as the previous one. > >The other implementation achieved better performance than Judy >by retaining the last position, making a sequential scan much faster. >However that's wrong IMHO, a single inbuilt cursor in the public >interface is a bad design. However there are two alternatives: >just using it as a cache (hiding the details) is one, the other is to >provide a way to save the current position in a separate object. > >Either way, sequential scans are common enough to think about it. >For example when Felix uses Judy as an application data structure >the garbage collector has to do a sequential scan of the values >as part of the usual "mark-sweep" processing (which itself also uses >Judy for temporary storage!!) > > >-- >john skaller >sk...@us... >http://felix-lang.org > > > > >------------------------------------------------------------------------------ >This SF email is sponsosred by: >Try Windows Azure free for 90 days Click Here >http://p.sf.net/sfu/sfd2d-msazure >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |
From: john s. <sk...@us...> - 2012-04-02 14:28:30
|
On 02/04/2012, at 11:32 PM, Doug Baskins wrote: > John: > > Glad to hear of your successes with Judy. > > I have often thought of how to do a faster Judy[1L]Next(). It has > always fallen below my priorities list -- partly because I believed > that modern processor caches would lessen the overhead of a > complete re-walking of the tree to almost the same place (thus same > cache-lines). I believe that is probably right, *provided* the code is in a tight enough loop. > However, I have recently discovered another problem with > modern processors that may be a mitigating factor. That is > "branch prediction" miss-match of the compiler with the processor. Yes. Branch prediction causes speculative cache loads of both instructions and data. Gcc at least has a way to "hint" to the compiler whether a branch is likely to be taken. Felix actually emits these hints. However I have not noticed gcc doing anything with them. What it is *supposed* to do is reorder the blocks so that branch prediction will be good > > So here is my best shot on how I would approach the problem: > > Note: Key & Index words are used interchangeably and mean the same thing. > > A walk of the Judy tree often terminates with an object called a Leaf. > A Leaf has 1..256 Key/Value items in sorted order for an average > length of around 20+ elements. If Judy[1L]NextX() returned the > whole Leaf (or partial if required), then that array of 20+ Keys/Values > could be scanned very quickly. It would be up to the caller to > determine how long this data would be valid (I.E. no Inserts or Deletes.) > The caller would have to supply the memory for storing the Keys/Values > along with how many were returned and the current Key. The caller would > also have to remember what was the last Key accessed (cursor) out of the > memory to determine the next Key. The distance from Key (Leaf population) > to the Value would also be required. There a few other minor details > I haven't mentioned. Right. I understand I think. So you'd do a call like next_block ... and you'd get a single entry (as usual) OR if you hit a leaf, an array. > Actually, it would be fairly easy to implement, but the API would be ugly. Hmmm .. > Let me know if this seems worthwhile. It seems it has the potential of being > much faster. There's a way to hide this by control inversion: for_each (Jarray, function); perhaps more generally: map ... fold_left fold_right These are traditional functional programming things. Because you have to supply a callback, it's a lot more of a pain to use: you are the slave of the iteration, instead of the master. BUT .. the big advantage here is that it could use faster sequential scanning technology .. without exposing the details like the "get a block" API mentioned above. A compromise API would be like: "here is an array size 100, fill it with 100 entries starting at key such-and-such". You could then optimally process the 100 entries, then get the next 100. > About the Null terminator problem with JudySL. I think a student in perhaps > Brazil has written a *Next() to JudyHS(). If I remember correctly, a simple > #define will change JudyHS() into a sorted (numeric?) order store of the tree of trees. > If your interested, I can research it further. Well, Felix at least has a use for: set of words map of word to word set of strings map of string to word map of string to string Generally, a map of word to "anything" would be good. This can be done by map to a word, which is a pointer. At the cost of an indirection and a memory management problem. However Felix uses C++ strings which are 8 bit clean. Text will never contain a null: utf-8 encoded Unicode contains no nulls. ASCII contains no nulls (in human readable text). But strings also get used for binary blobs :) A trivial example: an IP address is 4 bytes, some of which can be 0. Of course this can be done now, with a JudyL instead of S. -- john skaller sk...@us... http://felix-lang.org |
From: Alan S. <aj...@fr...> - 2012-04-02 18:35:14
|
Gang, > Gcc at least has a way to "hint" to the compiler whether a branch is > likely to be taken. Felix actually emits these hints. However I have > not noticed gcc doing anything with them. Amusing, it reminds me that in the last year I carefully marked some new source functions "inline", but later discovered that gcc was basically ignoring my advice and doing whatever the heck it wanted. Turns out "inline" is merely a suggestion, and I don't know of a way to force it ("I know what I'm doing and I want to waste space for performance") other than reverting to macros. (Maybe there's some gcc option or #pragma I didn't look for...) > A compromise API would be like: "here is an array size 100, fill it > with 100 entries starting at key such-and-such". You could then > optimally process the 100 entries, then get the next 100. I vaguely recall (this was > 10 years ago) discussing multi-index operations like this. I know I actually WROTE code that batch-created (inserted) a Judy1/L array rather faster than by simple iteration, and it worked fine, but I don't know where it is today. (Given a simple array, or was it parallel arrays, of sparse indexes and values, libJudy internally-smart code can create a valid Judy tree pretty quickly, I think maybe 3X the one-at-a-time mode?) > Generally, a map of word to "anything" would be good. This can be > done by map to a word, which is a pointer. At the cost of an > indirection and a memory management problem. Agreed. The solution, which we also toyed with but did not implement, is to have libJudy offer APIs like JudyL but with arbitrary-sized value areas whose memory is managed by libJudy. In which case don't just return a pointer to the caller, have them give you a pointer to the data, and its size, and the library does the mem-create and copy-in. > A trivial example: an IP address is 4 bytes, some of which can be 0. > Of course this can be done now, with a JudyL instead of S. Right. C strings are what I call "length-terminated" by the NUL bytes. Perl strings for example are "length-associated," and any bit pattern can appear in the string. You can easily store them in a JudyL meta-trie where the top level is the length, but you end up with them sorted by length first, then lexicographically, not just lexicographically. You might not care. More generally a variable-length key whose size varies in bits, not bytes, can be stored the same way, just sorting initially by length in bits, not bytes. Of course a few bits/bytes are wasted in leaf value areas. Cheers, Alan Silverstein |
From: john s. <sk...@us...> - 2012-04-02 21:04:32
|
[Changed topic, old one is SPAMing the conversation: the 'gratulations is suggests i 'on a 'illion 'ollars .. :] On 03/04/2012, at 4:35 AM, Alan Silverstein wrote: > Gang, > >> Gcc at least has a way to "hint" to the compiler whether a branch is >> likely to be taken. Felix actually emits these hints. However I have >> not noticed gcc doing anything with them. > > Amusing, it reminds me that in the last year I carefully marked some new > source functions "inline", but later discovered that gcc was basically > ignoring my advice and doing whatever the heck it wanted. Turns out > "inline" is merely a suggestion, and I don't know of a way to force it > ("I know what I'm doing and I want to waste space for performance") > other than reverting to macros. (Maybe there's some gcc option or > #pragma I didn't look for...) Well .. you can try clang 3.0 :) Or if you want to use a "real" language, use Felix, where "inline" isn't a hint. > >> A compromise API would be like: "here is an array size 100, fill it >> with 100 entries starting at key such-and-such". You could then >> optimally process the 100 entries, then get the next 100. > > I vaguely recall (this was > 10 years ago) discussing multi-index > operations like this. I know I actually WROTE code that batch-created > (inserted) a Judy1/L array rather faster than by simple iteration, and > it worked fine, but I don't know where it is today. (Given a simple > array, or was it parallel arrays, of sparse indexes and values, libJudy > internally-smart code can create a valid Judy tree pretty quickly, I > think maybe 3X the one-at-a-time mode?) Actually that would be quite useful. I actually use Judy as a temporary data structure. The GC algorithm is basically: find all reachable objects, then delete everything not reachable. All-reachable is a set of pointers (Judy) constructed by scanning. Everything-not-reachable is a set of pointers. Also Judy. This is constructed because the practice is to run finalisers on all these objects *first* before deallocation. >> Generally, a map of word to "anything" would be good. This can be >> done by map to a word, which is a pointer. At the cost of an >> indirection and a memory management problem. > > Agreed. The solution, which we also toyed with but did not implement, > is to have libJudy offer APIs like JudyL but with arbitrary-sized value > areas whose memory is managed by libJudy. In which case don't just > return a pointer to the caller, have them give you a pointer to the > data, and its size, and the library does the mem-create and copy-in. Just an option for two words would be useful. Why? Well, there is a serious bug/misconception in the Judy build. Actually, the Felix clone makes the same mistake. I'm running on a 32 bit machine. I build only 32 bit Judy. Stupid. I still need 64 bit integers. I'm running on a 64 bit machine. I build 64 bit Judy. Stupid. I still need "int". That's only 32 bits. Heck, "long" is only 32 bits on Win64. What should be built is: 32 and 64 bit arrays. On BOTH 32 and 64 bit machines. I think this can be done now. In addition we should have 32->64 and 64->32 maps (JudyL). And 64->128 (64 bit machine only). >> A trivial example: an IP address is 4 bytes, some of which can be 0. >> Of course this can be done now, with a JudyL instead of S. > > Right. C strings are what I call "length-terminated" by the NUL bytes. In the C++ Standard they're called NTBS: Null terminated byte strings. So that's a technical, formal, and standardised name for "C strings" :-) > Perl strings for example are "length-associated," and any bit pattern > can appear in the string. Ditto C++ strings, and strings in almost every sane language other than C. Actually .. *including* C, since C also has length controlled buffers, just that you have to manage the length yourself. > You can easily store them in a JudyL > meta-trie where the top level is the length, but you end up with them > sorted by length first, then lexicographically, not just > lexicographically. You might not care. This is similar to the property naive Hash-table has. You lose ordering in return for O(1) operations. -- john skaller sk...@us... http://felix-lang.org |
From: Alan S. <aj...@fr...> - 2012-04-02 21:25:51
|
John, > Or if you want to use a "real" language, use Felix, where "inline" > isn't a hint. Can't blame you for marketing your product, but does it run under the NetBurner Eclipse IDE to produce code for a Coldfusion Freescale processor? :-) And call C language library code from NB too? And can you supply a C-to-Felix translator that's 100% bulletproof? (Wouldn't surprise me if you could. :-) And overcome the inertia here among people resistant to new platforms and languages? >> I vaguely recall (this was > 10 years ago) discussing multi-index >> operations like this. I know I actually WROTE code that batch-created >> (inserted) a Judy1/L array rather faster than by simple iteration, and >> it worked fine, but I don't know where it is today. > Actually that would be quite useful. I actually use Judy as a > temporary data structure. The GC algorithm is basically: find all > reachable objects, then delete everything not reachable. Doug, can you send him the "valid" copy of that code, or should I try to dig it up? I don't even recall what it was called... JudyLInsBatch(), etc, maybe? > All-reachable is a set of pointers (Judy) constructed by scanning. > Everything-not-reachable is a set of pointers. Also Judy. This is > constructed because the practice is to run finalisers on all these > objects *first* before deallocation. OK. > Just an option for two words would be useful. > > Why? Why stop at two words? :-) > Well, there is a serious bug/misconception in the Judy build. > Actually, the Felix clone makes the same mistake. > > I'm running on a 32 bit machine. I build only 32 bit Judy. Stupid. > I still need 64 bit integers. > > I'm running on a 64 bit machine. I build 64 bit Judy. Stupid. I > still need "int". That's only 32 bits. Heck, "long" is only 32 bits > on Win64. I see, having to bind the library to one size or the other is hobbling. > What should be built is: 32 and 64 bit arrays. On BOTH 32 and 64 bit > machines. I think you can, but not sure how they'd coexist -- namespace issues. > In the C++ Standard they're called NTBS: Null terminated byte > strings. So that's a technical, formal, and standardised name for "C > strings" :-) I'll try to remememember that. :-) > This is similar to the property naive Hash-table has. You lose > ordering in return for O(1) operations. Yup. Cheers, Alan |
From: Doug B. <dou...@ya...> - 2012-04-03 06:41:00
|
Alan: I believe it was called: ~/judy-1.0.5/src/JudyCommon/JudyInsArray.c and it was never taken out of the sourceforge.net downloads, nor was it documented, nor throughly tested. doug Doug, can you send him the "valid" copy of that code, or should I try to >dig it up? I don't even recall what it was called... JudyLInsBatch(), >etc, maybe? > > |
From: Doug B. <dou...@ya...> - 2012-04-03 07:27:16
|
John and Alan: > Just an option for two words (Values) would be useful. > Why? Our friend malloc(3) allocates buffers that are 2 word aligned. Therefore: 1 word allocates use 2 words. 2 word allocates use 4 words. 3 word allocates use 4 words 4 word allocates use 6 words, etc. (The overhead word (that describes the length) is odd word aligned) Therefore it would be nice if Judy had option for 2 word Values and not more. The cost of memory that malloc uses does not justify a larger Value area. But internally, Judy uses pointers and Values interchangeably. This would require extensive changes to the code. I am not sure the cost justifies the return. Thanks for your interest, doug |
From: Jim L. <jl...@si...> - 2012-04-02 22:05:08
|
On Mon, Apr 2, 2012 at 11:35 AM, Alan Silverstein <aj...@fr...> wrote: > Amusing, it reminds me that in the last year I carefully marked some new > source functions "inline", but later discovered that gcc was basically > ignoring my advice and doing whatever the heck it wanted. Turns out > "inline" is merely a suggestion, and I don't know of a way to force it > ("I know what I'm doing and I want to waste space for performance") > other than reverting to macros. (Maybe there's some gcc option or > #pragma I didn't look for...) > See http://gcc.gnu.org/onlinedocs/gcc/Inline.html Here is a macro that does the trick: #define force_inline inline __attribute__((always_inline)) |
From: Doug B. <dou...@ya...> - 2012-04-03 07:00:49
|
Gang: On modern processors, the problem of branch prediction is more important than subroutine call/returns, so inline code is not where your focus should be. Subroutine call/returns are fully predictable whereas a simple if <condition> statement is NOT. I have found that inlining only helps on the most trivial piece of code (on Intel 386 etc) and the compiler does a good job on doing it for you. MACRO's are a waste of time, IF used for improving performance. I now use them onlyto abbreviate/document trivial things or to make simple changes to test code in a 1000 places. doug Doug Baskins <dou...@ya...> >________________________________ > From: Jim Lloyd <jl...@si...> >To: aj...@fr... >Cc: dou...@ya...; jud...@li...; sta...@gm... >Sent: Tuesday, April 3, 2012 4:05 AM >Subject: Re: I'd like to congratulate you and the team that worked on Judy Arrays > > >On Mon, Apr 2, 2012 at 11:35 AM, Alan Silverstein <aj...@fr...> wrote: > >Amusing, it reminds me that in the last year I carefully marked some new >>source functions "inline", but later discovered that gcc was basically >>ignoring my advice and doing whatever the heck it wanted. Turns out >>"inline" is merely a suggestion, and I don't know of a way to force it >>("I know what I'm doing and I want to waste space for performance") >>other than reverting to macros. (Maybe there's some gcc option or >>#pragma I didn't look for...) >> > >See http://gcc.gnu.org/onlinedocs/gcc/Inline.html > > >Here is a macro that does the trick: >#define force_inline inline __attribute__((always_inline)) > > >------------------------------------------------------------------------------ >Better than sec? Nothing is better than sec when it comes to >monitoring Big Data applications. Try Boundary one-second >resolution app monitoring today. Free. >http://p.sf.net/sfu/Boundary-dev2dev >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |
From: john s. <sk...@us...> - 2012-04-03 11:23:39
|
On 03/04/2012, at 5:00 PM, Doug Baskins wrote: > Gang: > > On modern processors, the problem of branch prediction is more > important than subroutine call/returns, so inline code is not where > your focus should be. Subroutine call/returns are fully predictable > whereas a simple if <condition> statement is NOT. I have found > that inlining only helps on the most trivial piece of code (on Intel 386 etc) > and the compiler does a good job on doing it for you. MACRO's > are a waste of time, IF used for improving performance. I now use > them only to abbreviate/document trivial things or to make simple changes > to test code in a 1000 places. This is not quite true. Generally small subroutine calls are reasonably fast, and inlining may not appear to save much. However, the implicit assumption that overhead is in JSR/RET instructions and not other things is false in general. For small flaf leaf routines where the register allocation model allows a subroutine call with zero parameter passing overhead, then sure, the JSR/RET cost is minimal. But this is a pretty low level "macro vs inline subroutine" comparison. For "real code" the difference is utterly profound and it is the ONLY thing of any significance. Inlining does a lot more than save a JSR/RET. It saves copying arguments. It saves evaluating arguments. It saves stack space. It allows a major collection of optimisations that would not be possible otherwise. In Felix the difference between "standard model" functions (which are C++ class objects) and inlined code is something like 1000 times. The class objects, methods, etc are too hard for stupid compilers to optimise. But after inlinling an addition function taking arguments "a" and "b" to a + b gcc does a reasonable job adding them quickly .. :) -- john skaller sk...@us... http://felix-lang.org |
From: Bisht, P. <pra...@ya...> - 2012-07-26 21:38:58
|
Is it okay to do something like this in Judy arrays: JLL (value, JArray, index); if (value != NULL) { // insert/delete some entries <<< OKAY ? } // find the next overlapping index = (Word_t)someNewValue JLF (value, JArray, index); while (value != NULL) { // insert/delete some entries <<<< OKAY ? // continue searching for post overlapping range JLN (value, JArray, index); } // while (value != NULL) will I still be able to see all entries including newly inserted ones and not see the deleted ones? OR will I see only the old ones ? OR will I see unknown results ? |
From: Doug B. <dou...@ya...> - 2012-07-27 19:20:20
|
Bisht: You > will I still be able to see all entries including newly inserted ones and not see the deleted ones The only caution is what you call "value" pointer. It is only valid until the next Delete, Insert or Free (the Judy array modification routines). JLF, JLL JLN etc. examine the array as it is at the instant they are called -- I.E. they start from scratch. Thanks for your interest, doug Doug Baskins <dou...@ya...> >________________________________ > From: "Bisht, Pradeep" <pra...@ya...> >To: "jud...@li..." <jud...@li...> >Cc: Doug Baskins <dou...@ya...> >Sent: Thursday, July 26, 2012 3:38 PM >Subject: insertion/deletion during sorted order scan; > > >Is it okay to do something like this in Judy arrays: > > > JLL (value, JArray, index); > > if (value != NULL) { > // insert/delete some entries <<< OKAY ? > } > > // find the next overlapping > index = (Word_t)someNewValue > JLF (value, JArray, index); > while (value != NULL) { > // insert/delete some entries <<<< OKAY ? > > > // continue searching for post overlapping range > > JLN (value, JArray, index); > } // while (value != NULL) > > >will I still be able to see all entries including newly inserted ones and not see the deleted ones? >OR >will I see only the old ones ? >OR >will I see unknown results ? >------------------------------------------------------------------------------ >Live Security Virtual Conference >Exclusive live event will cover all the ways today's security and >threat landscape has changed and how IT managers can respond. Discussions >will include endpoint security, mobile security and the latest in malware >threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |
From: Geert De P. <ge...@de...> - 2012-08-02 20:06:16
|
Is there a reason why the Judy1SetArray and JudyLInsArray are missing from the documentation/man pages/etc ... ? Source code seems to indicate it would be much faster for inserting bulk data … but because it's missing from all docs - it's unclear if these functions are experimental or not. Thanks, — Geert |
From: Alan S. <aj...@fr...> - 2012-08-03 00:21:36
|
> Is there a reason why the Judy1SetArray and JudyLInsArray are missing > from the documentation/man pages/etc ... ? Source code seems to > indicate it would be much faster for inserting bulk data : but > because it's missing from all docs - it's unclear if these functions > are experimental or not. I wrote them, so I can speak to that. They are "not known not to work," but were never vetted to our project's high standards, so not formally documented. A larger issue we wrangled with but didn't solve was how to save and restore a Judy array to mass storage. Since they are pointer-rich, you'd have to do a lot of pointer fixups of some kind on restore, with no guarantees malloc() would give consistent results, etc. Storing data was so fast in the first place it wasn't a high priority to add save/restore. The Judy*Array() functions were a personal project to see how much faster you could do bulk insertion, still translating from a more "portable" data format, and it really was a lot faster. Cheers, Alan Silverstein |
From: Bisht, P. <pra...@ya...> - 2012-08-28 22:19:37
|
I have modified Judy to be used in kernel mode as a static library to my program ? Do I need to give away source code of my program ? Can I just give away the linkable .obj files for my program ? Please lmk. Thank you. |
From: Doug B. <dou...@ya...> - 2012-08-30 04:55:51
|
Pradeep: Judy has an LGPL license. That means you can link to it without having to share your non-Judy source code. However, you must publish the source code of the modified Judy. That is my understanding of the LGPL license. Please email me the source and I would like to take a look and may release if it if it seems useful. Thanks for your Interest. Doug Doug Baskins <dou...@ya...> >________________________________ > From: "Bisht, Pradeep" <pra...@ya...> >To: john skaller <sk...@us...>; Doug Baskins <dou...@ya...> >Cc: "jud...@li..." <jud...@li...> >Sent: Tuesday, August 28, 2012 4:19 PM >Subject: Judy license; > > >I have modified Judy to be used in kernel mode as a static library to my program ? > > >Do I need to give away source code of my program ? Can I just give away the linkable .obj files for my program ? Please lmk. > > >Thank you. >------------------------------------------------------------------------------ >Live Security Virtual Conference >Exclusive live event will cover all the ways today's security and >threat landscape has changed and how IT managers can respond. Discussions >will include endpoint security, mobile security and the latest in malware >threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |
From: Bisht, P. <pra...@ya...> - 2012-08-30 05:13:28
|
Hello Doug, I will send you the source code. Can I link judy to my program statically ? Thank you for your help. kind regards, pradeep ________________________________ From: Doug Baskins <dou...@ya...> To: "Bisht, Pradeep" <pra...@ya...>; john skaller <sk...@us...> Cc: "jud...@li..." <jud...@li...> Sent: Wednesday, August 29, 2012 9:55 PM Subject: Re: Judy license; Pradeep: Judy has an LGPL license. That means you can link to it without having to share your non-Judy source code. However, you must publish the source code of the modified Judy. That is my understanding of the LGPL license. Please email me the source and I would like to take a look and may release if it if it seems useful. Thanks for your Interest. Doug Doug Baskins <dou...@ya...> >________________________________ > From: "Bisht, Pradeep" <pra...@ya...> >To: john skaller <sk...@us...>; Doug Baskins <dou...@ya...> >Cc: "jud...@li..." <jud...@li...> >Sent: Tuesday, August 28, 2012 4:19 PM >Subject: Judy license; > > >I have modified Judy to be used in kernel mode as a static library to my program ? > > >Do I need to give away source code of my program ? Can I just give away the linkable .obj files for my program ? Please lmk. > > >Thank you. >------------------------------------------------------------------------------ >Live Security Virtual Conference >Exclusive live event will cover all the ways today's security and >threat landscape has changed and how IT managers can respond. Discussions >will include endpoint security, mobile security and the latest in malware >threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |
From: john s. <sk...@us...> - 2012-08-30 05:34:17
|
On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: > Hello Doug, I will send you the source code. Can I link judy to my program statically ? Thank you for your help. Technically if you do your program is LGPL too. Exactly what that means isn't clear, since your application source is not derived from Judy: your source (apart from Judy) is yours unencumbered but the binary program is encumbered. Many vendors such as the Ocaml vendors use what I call LGPLX, or LGPL with an added clause explicitly granting an exemption for static linkage. Doug might agree to that if you need you distribute binaries and don't wish to publish the source (except for mods to Judy). I personally claim Felix (which uses Judy) is FFAU: free for any use even when static linking. Any mods I make to Judy are published on GitHub anyhow. I doubt anyone cares (especially as I rave about how good Judy is :) -- john skaller sk...@us... http://felix-lang.org |
From: Stavros M. <mac...@al...> - 2012-08-30 18:16:02
|
On Thu, Aug 30, 2012 at 1:33 AM, john skaller <sk...@us... > wrote: > > On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: > > ... Can I link judy to my program statically ? ... > > ...Technically if you do your program is LGPL too.... > You are mistaken. Though the linked program is a "derived work" of the LGPL code (*), Pradeep's code does not have to be released under the LGPL. He just needs to make available whatever is necessary to re-link to a replacement of the LGPL code (e.g. another version of the same code, or a compatible equivalent), namely object code (.o, .jar, etc.) -- or even source code if he likes (not necessarily licensed under L/GPL). Presumably those extended LGPL licenses you mention remove even that requirement. -s (*) This is the FSF's position. Others may disagree, but let's assume it's true. |
From: john s. <sk...@us...> - 2012-08-30 20:14:07
|
On 31/08/2012, at 4:15 AM, Stavros Macrakis wrote: > On Thu, Aug 30, 2012 at 1:33 AM, john skaller <sk...@us...> wrote: > On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: > > ... Can I link judy to my program statically ? ... > > ...Technically if you do your program is LGPL too.... > > You are mistaken. Though the linked program is a "derived work" of the LGPL code (*), Pradeep's code does not have to be released under the LGPL. I am not mistaken, you just misread my comments. By "program" I meant binary executable. Thought that was clear. -- john skaller sk...@us... http://felix-lang.org |
From: john s. <sk...@us...> - 2012-08-30 21:05:30
|
On 31/08/2012, at 6:13 AM, john skaller wrote: > > On 31/08/2012, at 4:15 AM, Stavros Macrakis wrote: > >> On Thu, Aug 30, 2012 at 1:33 AM, john skaller <sk...@us...> wrote: >> On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: >>> ... Can I link judy to my program statically ? ... >> >> ...Technically if you do your program is LGPL too.... >> >> You are mistaken. Though the linked program is a "derived work" of the LGPL code (*), Pradeep's code does not have to be released under the LGPL. > > I am not mistaken, you just misread my comments. > By "program" I meant binary executable. Thought that was clear. Actually, GPL and LGPL are fraught with legal difficulties. They're bad licences because they try to make logically unsustainable distinctions .. although its not clear a naive judge would agree. In theory, when you compile your C source the resulting object file is a derived work because you included an LGPL header file (in fact the pre-processed C code is also a derived work: macro expansions and header file inclusions are derivations). If instead you copy the interface specification, your source is a derived work. However if you merely read the specification and write your own interface based on the intellectual content, that is NOT a derived work. The API itself is not copyrightable, established for US domain at least by the very recent Google vs. Oracle in respect of Google using Java library APIs in Android. For that reason I have code which uses GMP, but I wrote the interface (in Felix code) from the specs, rather than mechanically deriving it: my interface is not a derived work and so is unencumbered. I also distribute Judy with my product. Technically the claim that the product is FFAU is incorrect. As a whole, it is FFAU, but each part has a separate licence. This is the way it works for "compendium" products that are aggregations rather than derivations. The best known example of that is Linux distros: they can include GPL'd works without themselves being GPL. The individual component remains GPL. So we have the ludicrous situation where you can distribute non GPL code containing a header file which is GPL, and all is OK, but if you physically copy the header file into one of your files, that file is now GPL -- despite the two encodings being logically identical. Similarly, you can distribute your code plus some other GPL'd code and as long as the client compiles and links it into a program there's no breach of licence. But if you do that, and distribute the result, you're breaching the licence. I know of commercial companies that actually take the sources to the client's place and do the building there for just this reason. The whole concept of intellectual property is rubbish. It is nonsense in the modern world. IMHO of course :) -- john skaller sk...@us... http://felix-lang.org |
From: Stavros M. <mac...@al...> - 2012-08-30 23:47:54
|
Yes, I agree that there are all sorts of issues with L/GPL. As for the header files, I don't think you need to be so meticulous. There is an explicit provision for interface specifications, a.k.a. header files (LGPLv3, section 3). About the theoretical possibility of having two texts which are word-for-word identical, one a copyright violation, the other not, that is actually a well-established principle of copyright law in general (not just for software) as opposed to patent law, where, if you come up with the same thing independently and innocently, you can still infringe the patent. -s On Thu, Aug 30, 2012 at 5:04 PM, john skaller <sk...@us... > wrote: > > On 31/08/2012, at 6:13 AM, john skaller wrote: > > > > > On 31/08/2012, at 4:15 AM, Stavros Macrakis wrote: > > > >> On Thu, Aug 30, 2012 at 1:33 AM, john skaller < > sk...@us...> wrote: > >> On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: > >>> ... Can I link judy to my program statically ? ... > >> > >> ...Technically if you do your program is LGPL too.... > >> > >> You are mistaken. Though the linked program is a "derived work" of the > LGPL code (*), Pradeep's code does not have to be released under the LGPL. > > > > I am not mistaken, you just misread my comments. > > By "program" I meant binary executable. Thought that was clear. > > Actually, GPL and LGPL are fraught with legal difficulties. > They're bad licences because they try to make logically unsustainable > distinctions .. although its not clear a naive judge would agree. > > In theory, when you compile your C source the resulting > object file is a derived work because you included > an LGPL header file (in fact the pre-processed C code > is also a derived work: macro expansions and header > file inclusions are derivations). > > If instead you copy the interface specification, your source > is a derived work. > > However if you merely read the specification and write your > own interface based on the intellectual content, that is NOT > a derived work. The API itself is not copyrightable, established > for US domain at least by the very recent Google vs. Oracle > in respect of Google using Java library APIs in Android. > > For that reason I have code which uses GMP, but I wrote the > interface (in Felix code) from the specs, rather than mechanically > deriving it: my interface is not a derived work and so is unencumbered. > > I also distribute Judy with my product. Technically the claim that the > product is FFAU is incorrect. As a whole, it is FFAU, but each part > has a separate licence. This is the way it works for "compendium" > products that are aggregations rather than derivations. The best > known example of that is Linux distros: they can include GPL'd works > without themselves being GPL. The individual component remains GPL. > > So we have the ludicrous situation where you can distribute non GPL > code containing a header file which is GPL, and all is OK, but if you > physically > copy the header file into one of your files, that file is now GPL -- > despite > the two encodings being logically identical. > > Similarly, you can distribute your code plus some other GPL'd code > and as long as the client compiles and links it into a program > there's no breach of licence. But if you do that, and distribute > the result, you're breaching the licence. I know of commercial > companies that actually take the sources to the client's place > and do the building there for just this reason. > > The whole concept of intellectual property is rubbish. > It is nonsense in the modern world. IMHO of course :) > > -- > john skaller > sk...@us... > http://felix-lang.org > > > > |
From: Doug B. <dou...@ya...> - 2012-08-31 01:09:27
|
Stavros: Well I did not want to start up a firestorm. I am not a lawyer, so I cannot comment on L/GPL "worthiness". The original intent of using a LGPL instead of a GPL license was to explicitly allow programs to link to Judy (either statically or dynamically) without incurring any changes to their codes license, due to linking to Judy. A more liberal license was not usedto prevent people from licensing Judy (with perhaps slight changes) with a different licenseand to charge for it. I.E. HP did not want to have to pay for a license for using Judy inthe future. I am very disappointed every time I hear of someone not using Judy because of a fear of "compromising" any code that links with Judy. This is simply not the intention of the LGPL license -- in my opinion. I also feel that if someone contributes/enhances Judy, then they should make those changes public and available to the whole programming community. I am also a critic of intellectual property. Thanks for you interest, Doug Doug Baskins <dou...@ya...> >________________________________ > From: Stavros Macrakis <mac...@al...> >To: john skaller <sk...@us...> >Cc: Doug Baskins <dou...@ya...>; "jud...@li..." <jud...@li...> >Sent: Thursday, August 30, 2012 5:47 PM >Subject: Re: Judy license; > > >Yes, I agree that there are all sorts of issues with L/GPL. > > >As for the header files, I don't think you need to be so meticulous. There is an explicit provision for interface specifications, a.k.a. header files (LGPLv3, section 3). > > >About the theoretical possibility of having two texts which are word-for-word identical, one a copyright violation, the other not, that is actually a well-established principle of copyright law in general (not just for software) as opposed to patent law, where, if you come up with the same thing independently and innocently, you can still infringe the patent. > > > -s > > >On Thu, Aug 30, 2012 at 5:04 PM, john skaller <sk...@us...> wrote: > > >>On 31/08/2012, at 6:13 AM, john skaller wrote: >> >>> >>> On 31/08/2012, at 4:15 AM, Stavros Macrakis wrote: >>> >>>> On Thu, Aug 30, 2012 at 1:33 AM, john skaller <sk...@us...> wrote: >>>> On 30/08/2012, at 3:13 PM, Bisht, Pradeep wrote: >>>>> ... Can I link judy to my program statically ? ... >>>> >>>> ...Technically if you do your program is LGPL too.... >>>> >>>> You are mistaken. Though the linked program is a "derived work" of the LGPL code (*), Pradeep's code does not have to be released under the LGPL. >>> >>> I am not mistaken, you just misread my comments. >>> By "program" I meant binary executable. Thought that was clear. >> >>Actually, GPL and LGPL are fraught with legal difficulties. >>They're bad licences because they try to make logically unsustainable >>distinctions .. although its not clear a naive judge would agree. >> >>In theory, when you compile your C source the resulting >>object file is a derived work because you included >>an LGPL header file (in fact the pre-processed C code >>is also a derived work: macro expansions and header >>file inclusions are derivations). >> >>If instead you copy the interface specification, your source >>is a derived work. >> >>However if you merely read the specification and write your >>own interface based on the intellectual content, that is NOT >>a derived work. The API itself is not copyrightable, established >>for US domain at least by the very recent Google vs. Oracle >>in respect of Google using Java library APIs in Android. >> >>For that reason I have code which uses GMP, but I wrote the >>interface (in Felix code) from the specs, rather than mechanically >>deriving it: my interface is not a derived work and so is unencumbered. >> >>I also distribute Judy with my product. Technically the claim that the >>product is FFAU is incorrect. As a whole, it is FFAU, but each part >>has a separate licence. This is the way it works for "compendium" >>products that are aggregations rather than derivations. The best >>known example of that is Linux distros: they can include GPL'd works >>without themselves being GPL. The individual component remains GPL. >> >>So we have the ludicrous situation where you can distribute non GPL >>code containing a header file which is GPL, and all is OK, but if you physically >>copy the header file into one of your files, that file is now GPL -- despite >>the two encodings being logically identical. >> >>Similarly, you can distribute your code plus some other GPL'd code >>and as long as the client compiles and links it into a program >>there's no breach of licence. But if you do that, and distribute >>the result, you're breaching the licence. I know of commercial >>companies that actually take the sources to the client's place >>and do the building there for just this reason. >> >>The whole concept of is rubbish. >>It is nonsense in the modern world. IMHO of course :) >> >> >>-- >>john skaller >>sk...@us... >>http://felix-lang.org >> >> >> >> > >------------------------------------------------------------------------------ >Live Security Virtual Conference >Exclusive live event will cover all the ways today's security and >threat landscape has changed and how IT managers can respond. Discussions >will include endpoint security, mobile security and the latest in malware >threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >_______________________________________________ >Judy-devel mailing list >Jud...@li... >https://lists.sourceforge.net/lists/listinfo/judy-devel > > > |