From: Bryan T. <br...@sy...> - 2006-05-30 22:45:38
|
Colin, Just to clarify - you can obtain the current developer revisions from anonymous cvs. And if you have a patch for a packing routine for jdbm, then you email the list and use the sourceforge features to attach the patch to an issue and one of the committers will take a look at integrating it into jdbm. Thanks, -bryan -----Original Message----- From: jdb...@li... [mailto:jdb...@li...]On Behalf Of Bryan Thompson Sent: Tuesday, May 30, 2006 6:27 PM To: Colin Evans Cc: JDBM Developer listserv Subject: RE: [Jdbm-developer] Free page selection bug Colin, If you are using the developer CVS version, then all you have to do is turn on lazyInsert using the RecordManagerOptions. However, this works within the context of a transaction. Once objects get flushed from the object cache to the database their "initial" size is fixed, even when using the lazyInsert option. What lazyInsert does is defer the installation of the object into the database as long as possible, but not past the end of a transaction in any case. If, on the other hand, your records grow in size over multiple transactions and you wind up never reusing the smaller records that you created originally then jdbm does not have a good solution for you. The the lazyInsert will not help you, and neither will the page at a time installation of objects that I am working on now. Perhaps the easiest thing would be to write a custom serializer that automatically padded out the records to multiples of expected record sizes, much as you describe below. However you will still have a lot of unreclaimed space in the store from small records that are no longer in use. We have explored some ideas for reclaiming such objects. One is the merging and splitting idea that Kevin has discussed. -bryan -----Original Message----- From: Colin Evans [mailto:col...@sr...] Sent: Tuesday, May 30, 2006 5:22 PM To: Bryan Thompson Subject: Re: [Jdbm-developer] Free page selection bug Hi Bryan, Thanks for your help on this. This approach results in similar performance as before. The major problem is that the size of records in my app increase slowly over time, so JDBM is being forced to constantly reallocate records as they grow, often not at uniform size intervals. I was able to get significantly better performance by pre-allocating space in the records, and doubling the amount of pre-allocated space when existing pre-allocated space is exhaused (i.e. I pre-allocate arrays of size 1, 2, 4, 8, 16..) Doing this inside the app results in JDBM hunting for free pages much less often, and gives a worst-case wasted space limit of 2X, which is better than the 100X that I am getting now. Also, I'm not working on the lazyInsert feature -- I'm not currently a JDBM commiter, though I use the code extensively. Thanks Colin Bryan Thompson wrote: > Colin, > > Did that work for you? > > Also, are you using the lazyInsert feature? It defers installations of > serialized objects onto pages while they are in the cache. This can help > if your problem is that objects are being installed onto physical rows too > soon - ie., before they reach their stable size. > > -bryan > > -----Original Message----- > From: Colin Evans [mailto:col...@sr...] > Sent: Thursday, May 25, 2006 4:43 PM > To: Bryan Thompson > Subject: Re: [Jdbm-developer] Free page selection bug > > > Thanks Bryan, I'll try this out. > > Bryan Thompson wrote: > >> Colin, >> >> I am not seeing as much grief from the allocation strategy as you >> describe. I expect that there is an interaction with the allocations >> that are being performed by your application. The change was introduced >> since it produced smaller stores for our RDFS database without a loss in >> speed, but clearly it is a loose for you. Let me look at the code and >> see what we can change. >> >> With respect to record copying, you should be able to refactor what you >> need out of the dump utility. It has mechanisms for scanning records >> from a page perspective, but it does not attempt to do inserts with a >> pre-defined logical record id. I can look into that I suppose, but let's >> see where the update (below) gets you. >> >> I've attached a modified version of FreePhysicalRowIdPage. Please try >> it out and let me know what impact it has on your application. It passes >> the jdbm test suite, but there are no performance tests that specifically >> target this feature. It has no performance impact (time or disk space) on >> our RDFS database application. >> >> -bryan >> >> /** >> * JDBM LICENSE v1.00 >> * >> * Redistribution and use of this software and associated documentation >> * ("Software"), with or without modification, are permitted provided >> * that the following conditions are met: >> * >> * 1. Redistributions of source code must retain copyright >> * statements and notices. Redistributions must also contain a >> * copy of this document. >> * >> * 2. Redistributions in binary form must reproduce the >> * above copyright notice, this list of conditions and the >> * following disclaimer in the documentation and/or other >> * materials provided with the distribution. >> * >> * 3. The name "JDBM" must not be used to endorse or promote >> * products derived from this Software without prior written >> * permission of Cees de Groot. For written permission, >> * please contact cg...@cd.... >> * >> * 4. Products derived from this Software may not be called "JDBM" >> * nor may "JDBM" appear in their names without prior written >> * permission of Cees de Groot. >> * >> * 5. Due credit should be given to the JDBM Project >> * (http://jdbm.sourceforge.net/). >> * >> * THIS SOFTWARE IS PROVIDED BY THE JDBM PROJECT AND CONTRIBUTORS >> * ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT >> * NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND >> * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL >> * CEES DE GROOT OR ANY CONTRIBUTORS BE LIABLE FOR ANY DIRECT, >> * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES >> * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR >> * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) >> * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, >> * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) >> * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED >> * OF THE POSSIBILITY OF SUCH DAMAGE. >> * >> * Copyright 2000 (C) Cees de Groot. All Rights Reserved. >> * Contributions are Copyright (C) 2000 by their associated contributors. >> * >> * $Id: FreePhysicalRowIdPage.java,v 1.2 2005/11/08 20:58:28 thompsonbry >> > Exp > >> $ >> */ >> >> package jdbm.recman; >> >> /** >> * Class describing a page that holds physical rowids that were freed. >> */ >> final class FreePhysicalRowIdPage extends PageHeader { >> // offsets >> private static final short O_COUNT = PageHeader.SIZE; // short count >> static final short O_FREE = O_COUNT + Magic.SZ_SHORT; >> static final short ELEMS_PER_PAGE = >> (RecordFile.BLOCK_SIZE - O_FREE) / FreePhysicalRowId.SIZE; >> >> /** >> * Used to place a limit on the wasted capacity resulting in a modified >> * first fit policy for re-allocated of free records. This value is the >> * maximum first fit waste that is accepted when scanning the available >> * slots on a given page of the free physical row page list. >> */ >> static final transient long wastedCapacityLimit = 128; >> >> /** >> * Used to place a limit on the wasted capacity resulting in a modified >> * first fit policy for re-allocated of free records. This value is the >> * upper bound of waste that is accepted before scanning another page on >> the >> * free physical row page list. If no page can be found whose waste for >> > the > >> * re-allocation request would be less than this value then a new page >> > will > >> * be allocated and the requested physical row will be allocated from >> > that > >> * new page. >> */ >> static final transient long wastedCapacityLimit2 = PageHeader.SIZE/4; >> >> // slots we returned. >> FreePhysicalRowId[] slots = new FreePhysicalRowId[ELEMS_PER_PAGE]; >> >> /** >> * Constructs a data page view from the indicated block. >> */ >> FreePhysicalRowIdPage(BlockIo block) { >> super(block); >> } >> >> /** >> * Factory method to create or return a data page for the >> * indicated block. >> */ >> static FreePhysicalRowIdPage getFreePhysicalRowIdPageView(BlockIo >> > block) > >> { >> BlockView view = block.getView(); >> if (view != null && view instanceof FreePhysicalRowIdPage) >> return (FreePhysicalRowIdPage) view; >> else >> return new FreePhysicalRowIdPage(block); >> } >> >> /** Returns the number of free rowids */ >> short getCount() { >> return block.readShort(O_COUNT); >> } >> >> /** Sets the number of free rowids */ >> private void setCount(short i) { >> block.writeShort(O_COUNT, i); >> } >> >> /** Frees a slot */ >> void free(int slot) { >> get(slot).setSize(0); >> setCount((short) (getCount() - 1)); >> } >> >> /** Allocates a slot */ >> FreePhysicalRowId alloc(int slot) { >> setCount((short) (getCount() + 1)); >> return get(slot); >> } >> >> /** Returns true if a slot is allocated */ >> boolean isAllocated(int slot) { >> return get(slot).getSize() != 0; >> } >> >> /** Returns true if a slot is free */ >> boolean isFree(int slot) { >> return !isAllocated(slot); >> } >> >> >> /** Returns the value of the indicated slot */ >> FreePhysicalRowId get(int slot) { >> if (slots[slot] == null) { >> slots[slot] = new FreePhysicalRowId(block, >> > slotToOffset(slot)); > >> } >> return slots[slot]; >> } >> >> /** Converts slot to offset */ >> short slotToOffset(int slot) { >> return (short) (O_FREE + >> (slot * FreePhysicalRowId.SIZE)); >> } >> >> /** >> * Returns first free slot, -1 if no slots are available >> */ >> int getFirstFree() { >> for (int i = 0; i < ELEMS_PER_PAGE; i++) { >> if (isFree(i)) >> return i; >> } >> return -1; >> } >> >> /** >> * Returns first slot with available size >= indicated size, >> * or -1 if no slots are available. >> * >> * @param size The requested allocation size. >> **/ >> int getFirstLargerThan(int size) { >> /* >> * Tracks slot of the smallest available physical row on the page. >> */ >> int bestSlot = -1; >> /* >> * Tracks size of the smallest available physical row on the page. >> */ >> int bestSlotSize = 0; >> /* >> * Scan each slot in the page. >> */ >> for (int i = 0; i < ELEMS_PER_PAGE; i++) { >> /* >> * When large allocations are used, the space wasted by the >> first >> * fit policy can become very large (25% of the store). The >> first >> * fit policy has been modified to only accept a record with a >> * maximum amount of wasted capacity given the requested >> allocation >> * size. >> */ >> // Note: isAllocated(i) is equiv to get(i).getSize() != 0 >> long theSize = get(i).getSize(); // capacity of this free >> record. >> long waste = theSize - size; // when non-negative, record has >> suf. capacity. >> if( waste >= 0 ) { >> if( waste < wastedCapacityLimit ) { >> return i; // record has suf. capacity and not too much >> > waste. > >> } else if( bestSlotSize >= size ) { >> /* >> * This slot is a better fit that any that we have seen so >> * far on this page so we update the slot# and available >> * size for that slot. >> */ >> bestSlot = i; >> bestSlotSize = size; >> } >> } >> } >> if( bestSlot != -1 ) { >> /* >> * An available slot was identified that is large enough, but it >> * exceeds the first wasted capacity limit. At this point we check >> * to see whether it is under our second wasted capacity limit. If >> * it is, then we return that slot. >> */ >> long waste = bestSlotSize - size; // when non-negative, record >> has suf. capacity. >> if( waste >= 0 && waste < wastedCapacityLimit2 ) { >> // record has suf. capacity and not too much waste. >> return bestSlot; >> } >> /* >> * Will scan next page on the free physical row page list. >> */ >> } >> return -1; >> } >> >> } >> >> > > ------------------------------------------------------- All the advantages of Linux Managed Hosting--Without the Cost and Risk! Fully trained technicians. The highest number of Red Hat certifications in the hosting industry. Fanatical Support. Click to learn more http://sel.as-us.falkag.net/sel?cmd=lnk&kid=107521&bid=248729&dat=121642 _______________________________________________ Jdbm-developer mailing list Jdb...@li... https://lists.sourceforge.net/lists/listinfo/jdbm-developer |