From: <ri...@dr...> - 2004-08-14 07:27:48
|
Hi!

I am wondering about the status of JDBM. Not much seems to be happening, and yet there are a number of things that could be improved in the code (the use of streams in TransactionManager is "interesting", for example). Is there a plan for getting new things done, or is JDBM officially in maintenance mode?

regards,
Rickard |
From: austin-tx-sfnet-user <lok...@ya...> - 2004-08-16 16:53:25
|
Hi, I'm not an architect or lead developer on this project, but I've contributed a little in the past. If you send specific feature requests, I am happy to try to help.

Thanks. |
From: Alex B. <boi...@in...> - 2004-08-16 20:25:24
|
Rickard,

First, thanks for bringing that up! Without an official roadmap, I guess we're unofficially in maintenance mode ;)

Cheap jokes aside, the reality is that I've done very little work on JDBM in the past years. I haven't declared maintenance mode yet for a good reason: I see plenty of work ahead! It's just that I haven't gotten to it!

I'm very open to having people participate, improve and generally inject more life into JDBM. I've granted some CVS commit access in the past to promote this... and have accepted enhancement patches, bug fixes, etc. But agreed, it's been slow and somewhat dead for a while over here.

I think the best place to start is to formulate specific goals as to what people want and what needs to be done. The use of streams in the TransactionManager doesn't bother me much because JDBM fits my needs, but if you think you can do better and want to contribute something, great! I'll make sure you (or anybody else) have the proper power to make it happen.

Things that have been discussed in the past that I think are worthy:

1) BTree index key compression
2) Implement collections interfaces to follow Java idioms
3) Improve concurrency and transaction management
   (JDBM is thread-safe but isn't exactly MP-friendly)
4) Improve I/O -- think java.nio

In all, I'd be happy to provide support and guidance for any of those projects. In following the open-source philosophy, I'd also be happy to see people scratch their own itch, since mine don't itch much (or enough) these days. <grin>

cheers,
alex |
From: <ri...@dr...> - 2004-08-16 20:35:19
|
Hi!

Alex Boisvert wrote:
> The use of streams in the TransactionManager doesn't bother me much
> because JDBM fits my needs, but if you think you can do better and
> want to contribute something, great! I'll make sure you've got (or
> anybody else has) the proper power to make it happen.

Well, I only noticed it when I had to make a batch insert of a couple of million objects. While the computer was chugging along I read the TM code and realized it could be made much more efficient. Did you know, for example, that the ObjectOutputStream constructor is dead slow? Or that serializing an object using the plain OOS will write tons of stuff just for the class descriptor? Stuff like that could be much improved. Adding a buffered stream between the OOS and the file stream would do wonders as well. Small things, but they add up pretty quickly.

On the bigger issues, I would like to do a refactoring to make JDBM more PicoContainer-friendly, which essentially just means decoupling things even more. This isn't strictly necessary, but it would make it easier to extend and compose different types of managers.
> Things that have been discussed in the past that I think are worthy:
>
> 1) BTree index key compression

Yes, that'd be nice.

> 2) Implement collections interfaces to follow Java idioms

+1 on this one too.

> 3) Improve concurrency and transaction management
> (JDBM is thread-safe but isn't exactly MP-friendly)

Yeah, and do more stuff to make backup management better.

> 4) Improve I/O -- think java.nio

Yeah, that'd be nice too.

Another biggie would be to have transactional support, but without the log file. When doing batch inserts, using transactions helps minimize the number of writes, but the log file adds a lot of unnecessary disk access.

> In all, I'd be happy to provide support and guidance for any of those
> projects. In following the open-source philosophy, I'd also be happy to
> see people scratch their own itch since mine don't itch much (or enough)
> these days. <grin>

I'll take that as a "go for it" :-)

/Rickard |
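[Editorial sketch] The serialization overhead Rickard describes can be made concrete. The class names below are illustrative, not JDBM code: a fresh ObjectOutputStream per record re-writes the stream header and the full class descriptor every time, while a single OOS over a buffered stream writes the descriptor once and back-references it afterwards.

```java
import java.io.*;

// Illustrative sketch (not JDBM code): one ObjectOutputStream reused for a
// whole batch vs. a new one per record. A new OOS re-writes the stream
// header and the full class descriptor for every record; the reused,
// buffered one writes the descriptor once and back-references it after.
public class BatchSerializer {

    // Reuse a single OOS over a buffered stream for the whole batch.
    static byte[] writeBatch(Object[] records) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ObjectOutputStream oos =
            new ObjectOutputStream(new BufferedOutputStream(sink));
        for (int i = 0; i < records.length; i++) {
            oos.writeObject(records[i]);
        }
        oos.flush();
        return sink.toByteArray();
    }

    // Naive variant: a new OOS per record, as if each insert opened its
    // own stream. Every record pays for the header and class descriptor.
    static byte[] writeNaive(Object[] records) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        for (int i = 0; i < records.length; i++) {
            ObjectOutputStream oos = new ObjectOutputStream(sink);
            oos.writeObject(records[i]);
            oos.flush();
        }
        return sink.toByteArray();
    }
}
```

For very large batches one would call oos.reset() periodically to bound the stream's handle table, at the cost of the descriptors being re-sent after each reset.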
From: <ri...@dr...> - 2004-08-16 21:57:18
|
In addition to the already mentioned improvements, I would like to perform open-heart surgery on BaseRecordManager. Today it is doing at least three things:

1) CRUD operations on byte[]
2) CRUD operations on objects, using serializers on top of 1)
3) Managing a map of named objects

I would like to split this into three different parts. 1) would become a RecordStore. 2) would become an ObjectStore holding a RecordStore and a Serializer. 3) would become a separate mapper that holds an ObjectStore.

It would also be nice to change the basic ops to deal with either streams or buffers instead of byte[]. This would make it possible to write blobs and objects of any size.

So, while I think doing minor things to improve JDBM is great, I would also like to do Big Stuff: changing APIs and similar. This is why I wanted to know whether development was alive or not. If JDBM development was considered to be in hibernation, it would probably be easier to fork it and make the changes I want, rather than disturb what is currently there to the substantial degree that I intend.

Any thoughts on that?

/Rickard |
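[Editorial sketch] The three-way split Rickard proposes could look roughly like this. Every name and signature here is hypothetical — none of these interfaces exist in JDBM — and the in-memory RecordStore is just a toy to show the byte[] contract in action.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed BaseRecordManager split; names and
// signatures are illustrative only, not existing JDBM interfaces.
public class StoreSketch {

    interface RecordStore {                  // concern 1: CRUD on byte[]
        long insert(byte[] data) throws IOException;
        byte[] fetch(long recid) throws IOException;
        void update(long recid, byte[] data) throws IOException;
        void delete(long recid) throws IOException;
    }

    interface Serializer {                   // pluggable encoding
        byte[] serialize(Object obj) throws IOException;
        Object deserialize(byte[] data) throws IOException;
    }

    interface ObjectStore {                  // concern 2: objects, built on
        long insert(Object obj) throws IOException;    // a RecordStore plus
        Object fetch(long recid) throws IOException;   // a Serializer
    }

    interface ObjectMap {                    // concern 3: named objects
        void put(String name, Object obj) throws IOException;
        Object get(String name) throws IOException;
    }

    // Toy in-memory RecordStore, just to show the contract in action.
    static class MemoryRecordStore implements RecordStore {
        private final Map recs = new HashMap();
        private long nextId = 1;
        public long insert(byte[] data) {
            long id = nextId++;
            recs.put(new Long(id), data);
            return id;
        }
        public byte[] fetch(long recid) {
            return (byte[]) recs.get(new Long(recid));
        }
        public void update(long recid, byte[] data) {
            recs.put(new Long(recid), data);
        }
        public void delete(long recid) {
            recs.remove(new Long(recid));
        }
    }
}
```

An ObjectStore implementation would then hold one RecordStore and one Serializer, and the named-object mapper would hold one ObjectStore — each concern replaceable independently.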
From: Alex B. <boi...@in...> - 2004-08-16 23:53:40
|
Rickard Öberg wrote:
> In addition to the already mentioned improvements I would like to
> perform an open-heart surgery on BaseRecordManager. Today it is doing,
> at least, three things:
> 1) CRUD operations on byte[]
> 2) CRUD operations on objects using serializers with 1)
> 3) Manage a map of named objects
>
> I would like to split this into three different parts. 1) would become a
> RecordStore. 2) would become an ObjectStore holding a RecordStore and a
> Serializer. 3) would become a separate mapper that holds an ObjectStore.

+1 on principle.

I'm assuming that RecordStore and ObjectStore would be interfaces that would belong in the jdbm.* package. Please clarify if I got this wrong.

> It would also be nice to change the basic ops to deal with either
> streams or buffers instead of byte[]. This would make it possible to
> write blobs and objects of any size.

I think that's a little less efficient than passing byte[] around. Given that you can already pass an object and get it serialized transparently, I'm not sure what this gets you.

> So, while I think doing minor things to improve JDBM is great I would
> also like to do Big Stuff. Changing APIs and similar. [...]

Big Stuff will lead to branching, yes, but I don't see that as a problem given the benefits coming from the change.

I'm thinking of labeling the current CVS version the 1.x branch, whereas any Big Stuff would go into the MAIN branch to support continued development that would eventually lead to a 2.x version.

Does that work for you?

alex |
From: <ri...@dr...> - 2004-08-17 05:23:44
|
Alex Boisvert wrote:
> +1 on principle.
>
> I'm assuming that RecordStore and ObjectStore would be interfaces that
> would belong in the jdbm.* package. Please clarify if I got this wrong.

You got that right.

>> It would also be nice to change the basic ops to deal with either
>> streams or buffers instead of byte[]. This would make it possible to
>> write blobs and objects of any size.
>
> I think that's a little less efficient than passing byte[] around. Given
> that you can already pass an object and get it serialized transparently,
> I'm not sure what this gets you.

It's a scalability vs performance tradeoff. With buffers you can store 4Gb objects even if your VM only has 1Gb of heap. You can't do that with byte[] arrays. But I'm not sure if there absolutely has to be a performance hit to it.

> Big Stuff will lead to branching, yes, but I don't see that as a problem
> given the benefits coming from the change.
>
> I'm thinking of labeling the current CVS version the 1.x branch, whereas
> any Big Stuff would go into the MAIN branch to support continued
> development that would eventually lead to a 2.x version.
>
> Does that work for you?

That works just fine :-)

One more thing. When I checked out the files from CVS I got screwed up newlines. Has anyone else encountered this?

/Rickard |
From: <ri...@dr...> - 2004-08-17 06:11:55
|
Rickard Öberg wrote:
> It's a scalability vs performance tradeoff. With buffers you can store
> 4Gb objects even if your VM only has 1Gb of heap. You can't do that with
> byte[] arrays. But I'm not sure if there absolutely has to be a
> performance hit to it.

After a quick look into the NIO APIs, it seems like using ByteBuffer instead of byte[] would work, since they can be mapped to files and hence don't have to wrap physical arrays. That way a buffer can be of any size, yet it doesn't impact the basic API negatively, which using streams instead of byte[] might do.

/Rickard |
From: <ri...@dr...> - 2004-08-17 08:33:05
|
Rickard Öberg wrote:
> After a quick look into the NIO APIs, it seems like using ByteBuffer
> instead of byte[] would work, since they can be mapped to files and
> hence don't have to wrap physical arrays.

After investigating some more, it seems that if you, for example, want to insert a file on disk into JDBM, and the API used ByteBuffers as the main transfer object, then it would be about ten times faster than using byte arrays "in the middle" (file->byte[]->file). What you'd do is map the file to be stored using a MappedByteBuffer, and then store that. If the internal code just uses buffer.write(databaseFileBuffer), there is no byte[] allocated, ever.

Another thing to consider would be to allow such buffers to be reused. It would make it possible to deserialize objects without allocating new buffers. If one is interested in serializing objects into the database, it seems possible to do so without ever creating a byte[].

I'll try to find out if these gains are actual or just theoretical.

/Rickard |
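[Editorial sketch] The file-to-file path Rickard describes can be sketched with plain JDK 1.4 NIO. The class and method names are illustrative: the source file is mapped once, and the mapped buffer is written straight to the destination channel, so no intermediate byte[] is allocated in user code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch (JDK 1.4 NIO): copy a file into another file through a
// MappedByteBuffer, with no intermediate byte[] in user code.
public class MappedCopy {
    static long copy(File src, File dst) throws IOException {
        FileChannel in = new RandomAccessFile(src, "r").getChannel();
        FileChannel out = new RandomAccessFile(dst, "rw").getChannel();
        try {
            MappedByteBuffer buf =
                in.map(FileChannel.MapMode.READ_ONLY, 0, in.size());
            long written = 0;
            while (buf.hasRemaining()) {     // write() may be partial
                written += out.write(buf);
            }
            return written;
        } finally {
            in.close();
            out.close();
        }
    }
}
```

In a record store, the destination channel would be the database file positioned at the record's slot; the principle (buffer handed directly to the channel, no byte[] copy) is the same.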
From: austin-tx-sfnet-user <lok...@ya...> - 2004-08-18 21:25:15
|
Since we're talking about splitting off of 1.x and all this refactoring, we probably ought to agree on a minimum VM level. Discussion of NIO implies at least JDK 1.4.

Personally, I have no problem with that. But it's probably worthwhile making sure a majority of the users are ready to give up support for JDK 1.3... |
From: Alex B. <boi...@in...> - 2004-08-18 22:04:07
|
Good point. I'm fine with a JDK 1.4.x requirement for JDBM 2.x.

An alternative would be to maintain two implementations of the low-level disk access classes to support JDK 1.3.x.

alex |
From: Kris L. <kl...@im...> - 2004-08-19 00:48:25
|
> Discussion of NIO implies at least JDK 1.4.

Fine with me,
Kris |
From: <ri...@dr...> - 2004-08-19 06:55:45
|
austin-tx-sfnet-user wrote:
> Since we're talking about splitting off of 1.x and all this
> refactoring, we probably ought to agree on a minimum VM level.
> Discussion of NIO implies at least JDK 1.4.
>
> Personally, I have no problem with that. But it's probably
> worthwhile making sure a majority of the users are ready to give
> up support for JDK 1.3...

Right, that is obviously an important point. Our company and products always need to work with the latest stuff, so 1.4 is mandatory for us. If the NIO APIs turn out to be as useful as they seem, that should be a big boost.

Another thing I'm doing in my own refactoring is adding better support for logging, through monitors. In other words, instead of emitting log messages directly, the code calls monitors, one popular implementation of which would be to create a log message. This opens the door to more customized monitoring of the database, and doesn't pollute the core code with this or that logging package.

/Rickard |
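[Editorial sketch] The monitor idea can be sketched in a few lines. All names here are made up (this is not Rickard's actual code): the core calls a small callback interface, and logging becomes just one possible implementation among others.

```java
// Illustrative sketch of the "monitor" idea: the core calls a callback
// interface instead of depending on a logging package. Names hypothetical.
public class Monitors {

    interface TransactionMonitor {
        void transactionBegun();
        void transactionCommitted(long bytesWritten);
    }

    // One possible implementation: adapt monitor events to log messages.
    static class LoggingMonitor implements TransactionMonitor {
        public void transactionBegun() {
            System.out.println("txn begin");
        }
        public void transactionCommitted(long bytesWritten) {
            System.out.println("txn commit, " + bytesWritten + " bytes");
        }
    }

    // Another: just collect statistics, no logging dependency at all.
    static class CountingMonitor implements TransactionMonitor {
        int commits;
        long totalBytes;
        public void transactionBegun() { }
        public void transactionCommitted(long bytesWritten) {
            commits++;
            totalBytes += bytesWritten;
        }
    }
}
```

The core code would hold a TransactionMonitor reference (a no-op by default) and never mention any logging package by name.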
From: <ri...@dr...> - 2004-11-26 15:49:43
|
Rickard Öberg wrote:
> Another thing I'm also doing in my own refactoring is adding better
> support for logging, through monitors. [...]

Just wanted to tell everyone the status of this. I've realized that there's no way in hell that I'm going to have time to complete any of these refactorings, so if anyone else has ideas that they want to play with, don't let the above stand in your way, cuz it's vaporware :-/

regards,
Rickard |
From: Alex B. <boi...@in...> - 2004-08-16 23:34:35
|
Rickard Öberg wrote:
> Well, I only noticed it when I had to make a batch insert of a couple of
> million objects. While the computer was chugging along I read the TM
> code and realized it could be made much more efficient. Did you for
> example know that the ObjectOutputStream constructor is dead slow? Or
> that serializing an object using the plain OOS will write tons of stuff
> just for the class descriptor? Stuff like that could be much improved.
> Adding a buffered stream between the OOS and the file stream would do
> wonders as well.

This sounds promising! Can't wait to get the patches!! <grin>

Just to get it out in the open... something I've been trying to do is to make you pay for serialization only if you need it. If you can find a way to serialize your objects yourself (using better encoding or just wasting fewer cycles), then you can pass byte arrays (byte[]) directly, and then JDBM ends up just passing data buckets around -- very efficient.

> On the bigger issues I would like to make a refactoring to make JDBM
> more PicoContainer friendly, which essentially just means to decouple
> things even more.

Sounds good too.

> Another biggie would be to have transactional support, but without the
> log file. When doing batch inserts, using transactions helps minimize
> the number of writes, but the log file adds a lot of unnecessary
> disk access.

I'm curious as to how you plan on getting rid of the log file and still maintain crash-proof reliability. Care to share your thoughts?

> I'll take that as a "go for it" :-)

Please do.

alex |
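[Editorial sketch] "Pay for serialization only if you need it" boils down to hand-rolling the encoding for a known record type. The class below is illustrative; its two-method shape mirrors the serializer idea discussed in this thread, but treat the exact interface as an assumption rather than JDBM's published API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Sketch: a hand-written encoder for String records. Compared to default
// Java serialization there is no stream header and no class descriptor,
// just the payload bytes. The interface shape is an assumption.
public class StringCodec {

    static byte[] serialize(String s) throws IOException {
        return s.getBytes("UTF-8");          // payload bytes only
    }

    static String deserialize(byte[] data) throws IOException {
        return new String(data, "UTF-8");
    }

    // Default Java serialization of the same value, for size comparison:
    // it prepends a 4-byte stream header plus type/length information.
    static byte[] defaultSerialize(String s) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(sink);
        oos.writeObject(s);
        oos.flush();
        return sink.toByteArray();
    }
}
```

With such an encoder the store only ever shuffles byte[] buckets around, which is exactly the "data buckets" efficiency Alex describes.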
From: <ri...@dr...> - 2004-08-17 05:20:25
|
Alex Boisvert wrote:
> Just to get it out in the open... something I've been trying to do is to
> make you pay for serialization only if you need it. If you can find a
> way to serialize your objects yourself (using better encoding or just
> wasting fewer cycles), then you can pass byte arrays (byte[]) directly
> and then JDBM ends up just passing data buckets around -- very efficient.

Yup, that's one way. Yesterday I refactored RecordManager into three interfaces. One RecordStore which only deals with byte[] and knows nothing about serializers. One ObjectStore which sits on top of a RecordStore and deals with Objects through serializers. And then an ObjectMap that uses an ObjectStore to accomplish the named map thingy. This simple refactoring split up the three core concerns of the current RecordManager quite nicely, I think.

>> Another biggie would be to have transactional support, but without the
>> log file. When doing batch inserts using transactions help minimize
>> the number of writes, but the log file makes up for a lot of
>> unnecessary disk access.
>
> I'm curious as to how you plan on getting rid of the log file and still
> maintain crash-proof reliability. Care to share your thoughts?

No, it wouldn't be crash-proof. I am interested in having a mode which is more geared towards performance than crash-resilience. Using transactions without a log file would accomplish that, I think.

regards,
Rickard |
From: Kris L. <kl...@im...> - 2004-08-16 21:49:57
|
Hi!

If you would allow my vote:

1) BTree index key compression = -1 (see itch #1 below)
2) Implement collections interfaces to follow Java idioms = 0
3) Improve concurrency and transaction management = +1
4) Improve I/O -- think java.nio = -1 (currently performs very nicely)

I have a couple of itches. <grin>

1) Compress before storing (could reduce the need for key compression)
2) Encryption of DB data (after compression) (Why? To keep end users of our applications from accessing data that they are not allowed to see.)

Thanks,
Kris

PS - I have implemented JDBM very successfully in two different applications that run 24/7/365. Thank you very much for a great piece of code. |
From: Alex B. <boi...@in...> - 2004-08-17 00:13:03
|
Kris,

Very cool that you're using JDBM 24/7! So am I!

Just to clarify: BTree index key compression is not your typical Huffman-style compression.

Key compression refers to the inherent redundancy in consecutive keys in an index. Because keys are ordered, they generally share a common prefix. And the bigger the index, the longer the prefix.

So if you have the following keys on a BPage:

AAAABCDEF, AAAACDEFG, AAAADEFGH

then you would "compress" the keys by using a prefix (usually stored on the BPage) like this:

PREFIX = AAAA
KEYS = *BCDEF, *CDEFG, *DEFGH
(where * represents the common prefix)

This typically works much better than Huffman-style compression applied to a single key and is relatively cheap CPU-wise.

You can also store the prefix on a parent BPage and achieve even greater compression ratios.

As to encryption of DB data, I guess we could provide some kind of interceptor at the RecordFile level. Did you have something specific in mind? Standard encryption or public-key crypto?

alex |
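[Editorial sketch] Alex's prefix scheme can be sketched directly, using his own example keys. The class below is a hypothetical helper, not JDBM's BPage code: find the common prefix of a page's keys, store it once, and keep only the suffixes, reconstructing full keys on demand.

```java
// Sketch of BTree prefix key compression: store the keys' shared prefix
// once per page and keep only suffixes. Illustrative, not JDBM code.
public class PrefixPage {
    final String prefix;
    final String[] suffixes;

    PrefixPage(String[] keys) {
        String p = keys[0];
        for (int i = 1; i < keys.length; i++) {
            p = commonPrefix(p, keys[i]);
        }
        prefix = p;
        suffixes = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            suffixes[i] = keys[i].substring(p.length());
        }
    }

    // Longest common prefix of two strings.
    static String commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return a.substring(0, i);
    }

    // Reconstruct the full key on demand.
    String key(int i) {
        return prefix + suffixes[i];
    }
}
```

On a real BPage the prefix would be recomputed on insert/split, and, as Alex notes, a parent page can factor out an even longer shared prefix for its children.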
From: Kris L. <kl...@im...> - 2004-08-17 05:42:22
|
Alex,

BTree index key compression is very effective if there are numerous common prefixes. Personal experience with my applications has not shown that key compression would be worth the work to put that feature in.

On the other hand, DB compression would appear to be useful. In my case, the objects are being serialized into XML; compression of the byte[] might reduce the DB size. I use the word "appear" since I have not done any tests to confirm my theory.

As for encryption, simple standard encryption is best for performance. Public-key crypto is just overkill for my needs.

Thanks for letting me put in my 2 cents.

Regards,
Kris |
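[Editorial sketch] Kris's compress-before-store idea can be sketched with the JDK's built-in zip classes (the class name below is mine). The record's serialized bytes are deflated before being handed to the store; for text-heavy payloads such as serialized XML this often shrinks records considerably. An encryption pass (e.g. a javax.crypto CipherOutputStream) could wrap the same position in the pipeline, after compression, as Kris suggests.

```java
import java.io.*;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Sketch: deflate serialized record bytes before storing, inflate after
// fetching. Illustrative helper, not JDBM code.
public class CompressingCodec {

    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DeflaterOutputStream def = new DeflaterOutputStream(sink);
        def.write(raw);
        def.finish();                 // flush remaining deflater output
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] packed) throws IOException {
        InflaterInputStream inf =
            new InflaterInputStream(new ByteArrayInputStream(packed));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = inf.read(buf)) != -1) {
            sink.write(buf, 0, n);
        }
        return sink.toByteArray();
    }
}
```

As Alex suggests, this would slot naturally into an interceptor at the RecordFile level, transparent to the rest of the store.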
From: Andreas H. <and...@de...> - 2004-08-16 22:03:29
|
Hi Rickard, Alex,

I'd be happy to participate, since some things itch quite a bit here :)

The use of streams and having lower memory requirements when adding data would be my favorite, since I'd like to be able to add 500+ MB files to the BTree.

1) BTree index key compression and 2) Implement Collections interfaces to follow Java idioms are next on the list; however, I don't know how much work would be involved. Well, for 2) I do know, since I have already started.

3) and 4) would be very nice things to happen; I wouldn't say no to performance enhancements.

Related to index compression: how could that best be accomplished? For my usage scenario, it would be necessary that keys are maintained across indexes, since I have different indexes and key combinations for the same data. Would that be possible?

Also, I'm not sure how much overhead the discussed changes would add to JDBM. One thing I really like about JDBM is its small footprint.

Regards,
Andreas.

--
http://sw.deri.ie/~aharth/ Got FOAF? |