## judy-devel — List for Judy development and users.

You can subscribe to this list here.

Showing 2 results of 2

Re: length-specified-strings -- JudySLn() From: john skaller - 2009-03-03 15:51:33

```
On 03/03/2009, at 5:39 PM, Doug Baskins wrote:

> SO, MY QUESTION TO YOU IS:
>
> Is that the correct way to implement JudySLn()?

There is only one "correct" lexicographic ordering: a string P which is
a strict prefix of a string S is lower than it. Thus XYZ is lower than
XYZ\0, which is lower than XYZ\0\0.

The NUL byte \0 is perfectly ordinary and must be treated as such;
whether it is embedded or trailing does not matter. Instead, imagine all
strings are infinite in length, by extending them after the specified
length with an infinite stream of -1s.

It is easy to store a set of such strings, but you're right, it seems
quite hard to order them (uniqueness is trivial, just prepend the
length count).

Of course there IS a way: encode the strings. For example, UTF-8
encoding sorts correctly and leaves FF free as a terminator.

The only problem with this is that the length can only be found by
scanning the whole string (same as for NUL-terminated strings!). But
this means it requires two searches to decode it.

--
john skaller
skaller@...
```
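The ordering described in this reply can be sketched as a plain comparison function. This is only an illustration of the rule "a strict prefix sorts lower, and \0 is an ordinary byte"; `lenstr_cmp` and `lenstr_cmp_selfcheck` are hypothetical names, not part of the Judy API, and the sketch says nothing about how such an ordering would be stored in a Judy tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Judy function): three-way compare of two
 * length-specified byte strings. '\0' is treated as an ordinary byte,
 * and a strict prefix sorts before any string that extends it. */
int lenstr_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;   /* length of the common span */
    int c = n ? memcmp(a, b, n) : 0;        /* bytes compared as unsigned */

    if (c != 0)
        return c < 0 ? -1 : 1;
    /* Common span is equal: the shorter string is the lower one. */
    return (alen > blen) - (alen < blen);
}

/* Checks the ordering stated in the message: XYZ < XYZ\0 < XYZ\0\0. */
void lenstr_cmp_selfcheck(void)
{
    assert(lenstr_cmp("XYZ", 3, "XYZ\0", 4) < 0);
    assert(lenstr_cmp("XYZ\0", 4, "XYZ\0\0", 5) < 0);
    assert(lenstr_cmp("XYZ\0", 4, "XYZ\0", 4) == 0);
}
```

Comparing the lengths only after the common bytes match is equivalent to padding the shorter string with a value lower than any byte, which is the "-1" extension mentioned above.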
length-specified-strings -- JudySLn() From: Doug Baskins - 2009-03-03 07:39:38

```
Massot:

Well, you got me thinking about JudySLn() (current name). I looked into
modifying JudySL to use length-specified strings instead of
null-terminated strings. It turns out to be a difficult algorithm.

A major problem seems to be the definition of "lexicographic" sorting.
My first thought was that strings differing only in their number of
trailing zeros ('\0') would be ordered by length, fewest trailing zeros
first. I.e., a string "abc\0\0" would be stored after "abc\0". However,
when writing the code, an interesting problem shows up. In a tree
hierarchy, they are synonyms in the tree and require a separate
structure to be stored. This complicates the code and makes it very
"ugly", as well as bloating the memory needed to store the strings.

Then I decided to look up the definition of lexicographic sorting. My
interpretation of the definition is that the above strings would be
considered the SAME. In "dictionary sorting", all words can be
considered the same length and padded with trailing "spaces" before
sorting.

I also believe (but have not thought it out carefully) that if someone
wanted to use JudySLn() to do number sorting (by doing byte reversal of
the string), then they would want "\0\0321" to be the same as "\0321"
and considered a synonym (the same number).

And, lastly, the code has a strong preference to work that way. (I am
not sure I can figure out all of the corner cases, nor be sure it is
possible.)

SO, MY QUESTION TO YOU IS:

Is that the correct way to implement JudySLn()?

My preferred implementation is to strip all trailing null (\0)
characters before storing the string in the array. Can anybody think of
a good case or reason to NOT do this? I do not consider recovering a
null terminating character a good enough case -- it is just expensive
to store in the array (it could possibly double the memory used in the
array).

Strings with embedded zero characters would not be affected. So a
string "abc\0def\0\0" (length == 9) would be retrieved as "abc\0def"
(length == 7) with no null terminating character.

Thanks for your interest,

Doug

Doug Baskins

________________________________
From: Doug Baskins
To: Schaeferhaus
Sent: Thursday, February 26, 2009 5:52:05 PM
Subject: Re: New Judy

Massot:

You caught me without a good answer. JudySL/HS is an "application"
using JudyL, so it is not related to what I am presently working on. My
mind is very deep into Judy1/L right now. I remember that JudySL (with
string length and dictionary sorting) is a problem that so far does not
have a clean solution. If I had a clean solution, I would be able to
write it in a week or so. I feel I have to finish the next version of
Judy before playing with JudySLL or whatever I call it. So, I will say
a year for now. I am sorry I can't be more accommodating.

Please understand that I believe that JudySL should have had the string
length from the start. We just did not have the time and priority to
work on it, and JudySL worked. Some creative thinking is necessary to
get it written. Alan wrote a paper way back (2001?) on a number of
possible ways, and I rejected all of them. If you are interested, I
will send you a copy.

Thanks for your interest,

Doug

Doug Baskins

________________________________
From: Schaeferhaus
To: Doug Baskins
Sent: Wednesday, February 25, 2009 7:23:23 PM
Subject: New Judy

Hi,

excuse me for asking, but is it possible to know when I can expect a
test version of the new Judy? 1 week, 1 month, 1 year? I'm asking this
because of the new feature "non-null-terminated strings" (not the
JudyHS).

Thanks,
Massot
```
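The stripping rule proposed in this message can be illustrated with a small normalization step applied to a key before it is stored. This is a minimal sketch under that assumption; `strip_trailing_nuls` and `strip_trailing_nuls_selfcheck` are hypothetical names, not existing Judy functions, and nothing here reflects how JudySLn() was or would actually be implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Judy function): returns the length of the
 * string after dropping trailing '\0' bytes. Embedded zeros are kept. */
size_t strip_trailing_nuls(const char *s, size_t len)
{
    while (len > 0 && s[len - 1] == '\0')
        len--;
    return len;
}

/* Checks the example from the message: "abc\0def\0\0" (length 9) is
 * stored and retrieved as "abc\0def" (length 7). */
void strip_trailing_nuls_selfcheck(void)
{
    assert(strip_trailing_nuls("abc\0def\0\0", 9) == 7);
    /* "abc\0" and "abc\0\0" become synonyms of "abc". */
    assert(strip_trailing_nuls("abc\0", 4) == 3);
    assert(strip_trailing_nuls("abc\0\0", 5) == 3);
}
```

Under this rule the original length is not recoverable, which is the trade-off the message asks readers to weigh against the memory cost of storing trailing zeros.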
