I have a requirement to persist large blobs. I don't mind letting the store generate my IDs for me, and I don't need to find blobs by anything but ID.
I realize this doesn't really use the main benefits of JDBM (the BTree), but is there any reason why this is not recommended?
I guess it depends on how big your BLOBs are. I wouldn't recommend putting binary objects of more than 1 MB in JDBM, since that's what file systems are for; you'll get better performance, easier access, etc.
With JDBM, you basically have two choices:
1) You put your BLOBs directly in JDBM, in which case JDBM creates the IDs for you. You just have to store those IDs somewhere for retrieval. This works very well for small BLOBs.
2) You put the BLOBs on the file system and persist only Strings (the filenames of the BLOBs) in JDBM. JDBM will, once again, return a unique ID for each BLOB filename. This way, you have a very efficient indirection and your BLOB access is fast and scalable. This works well for larger BLOBs.
Note that you could very well just use a file system, generate the IDs yourself and name your BLOB according to the IDs you generate. This also works well for large BLOBs.
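The file-system-only variant could look roughly like the sketch below. It is a minimal illustration, not a JDBM feature: the class name FileBlobStore, the ".blob" suffix, and the in-memory counter are all assumptions made for the example. In particular, the counter is not persisted, so a real application would have to save the last ID it handed out and restore it on startup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: stores each BLOB as <root>/<id>.blob and generates the IDs itself.
final class FileBlobStore {
    private final Path root;
    private final AtomicLong nextId; // assumption: in-memory only, not persisted across restarts

    FileBlobStore(Path root, long firstId) {
        this.root = root;
        this.nextId = new AtomicLong(firstId);
    }

    // Writes the data to a new file and returns the ID that names it.
    long put(byte[] data) {
        long id = nextId.getAndIncrement();
        try {
            Files.createDirectories(root);
            Files.write(root.resolve(id + ".blob"), data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return id;
    }

    // Reads back the BLOB stored under the given ID.
    byte[] get(long id) {
        try {
            return Files.readAllBytes(root.resolve(id + ".blob"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Everything lands in a single directory here, which is fine for a modest number of BLOBs but runs into the directory limits mentioned next once the count grows.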
Also, keep in mind that whenever you put BLOBs on the file system, you may run into directory size limitations. In such cases, you have to distribute the BLOBs across many directories, similar to what is done with the HTree in JDBM.
Thanks Alex -
For very large blobs, my intent was to create files/directories based on ID. For example, a blob with ID 123456789 (zero-padded to ten digits and split into two-digit directory names) will reside in /01/23/45/67/89/123456789.blob.
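As a rough sketch of that mapping: the code below zero-pads the ID to ten digits and turns each pair of digits into one directory level. The class and method names (BlobPaths, pathFor) and the ten-digit upper bound are assumptions for illustration only; the path is returned relative to whatever root directory the application chooses.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper that derives a nested directory path from a numeric blob ID.
final class BlobPaths {
    private BlobPaths() {}

    // Maps e.g. 123456789 to the relative path 01/23/45/67/89/123456789.blob.
    static Path pathFor(long id) {
        // Assumption: IDs are non-negative and fit in ten decimal digits.
        if (id < 0 || id > 9_999_999_999L) {
            throw new IllegalArgumentException("id out of range: " + id);
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        String padded = String.format(Locale.ROOT, "%010d", id);
        Path dir = Paths.get(padded.substring(0, 2));
        for (int i = 2; i < padded.length(); i += 2) {
            dir = dir.resolve(padded.substring(i, i + 2));
        }
        return dir.resolve(id + ".blob");
    }
}
```

One thing this makes visible: with five two-digit levels over a ten-digit ID, every leaf directory holds exactly one blob. Dropping the last level would allow up to 100 blobs per directory while still keeping each directory small.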
I'm totally fine with this for large files, but for text and other small chunks of data, this seems wasteful to me (although I admit this perception of waste may be mostly in my head).
In any case, what I'm really looking for from JDBM is a simple, reliable way to persist chunks of text. I'm building an application that I would like to keep as database-portable as possible, and it seems every DBMS has a different way of dealing with text.
I seem to remember someone, possibly you, mentioning the goal of making JDBM an XA Resource, which would make what I am talking about (keeping some data in a DBMS and some in JDBM) much more reliable. How far off is this dream from becoming a reality?
Yes, there were plans to make an XAResource implementation for JDBM, but that hasn't materialized. Making a true XA resource (two-phase commit) out of JDBM would be a significant project, and I'm afraid it would defeat the original purposes of JDBM (simple, small) in the process.
For your specific use case, I think it would be possible to wrap JDBM as an XAResource and handle the two-phase commit by recording/rolling back operations using the current APIs.