I have a few questions and would appreciate it if someone could answer them.
q1: Is there a way to bulk-load data into a B-tree, or do I have to insert each tuple separately?
q2: Does the newest version of jdbm support duplicate keys? The version I am using does not; it suggests inlining or referencing an object collection (an ArrayList, for example) as the value.
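To make the suggested workaround concrete, this is roughly what I understand by keeping a collection as the value. It is only a sketch: `java.util.TreeMap` stands in for the B-tree, and `DuplicateKeyIndex`, `put`, `get` and `distinctKeys` are names I made up for illustration, not jdbm API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper: emulates duplicate keys by storing a list per key.
// TreeMap is only a stand-in for the B-tree; this is not jdbm code.
class DuplicateKeyIndex<K extends Comparable<K>, V> {
    private final TreeMap<K, List<V>> tree = new TreeMap<>();

    // Append the value to the list stored under the key,
    // creating the list on first use.
    void put(K key, V value) {
        tree.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Return all values stored under the key, or an empty list if the key is absent.
    List<V> get(K key) {
        List<V> values = tree.get(key);
        if (values == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(values);
    }

    // Number of unique keys, i.e. the number of entries the tree itself holds.
    int distinctKeys() {
        return tree.size();
    }

    public static void main(String[] args) {
        DuplicateKeyIndex<String, Long> index = new DuplicateKeyIndex<>();
        index.put("red", 1L);
        index.put("red", 7L);
        index.put("blue", 3L);
        System.out.println(index.get("red"));
        System.out.println(index.distinctKeys());
    }
}
```

With this layout the tree holds one entry per unique key, and every duplicate makes that entry's value larger.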
Before asking my third question, I will try to give you an idea of what I am doing:
I am building a secondary index on a table with more than 6 million tuples. However, the attribute I am indexing (the index key) has only 2500 unique values, so each unique key value maps to around 2400 tuples on average. This leads to a very big _values array and therefore to very large records. Such large records make a B-tree page far bigger than the block size (e.g. 8K). So my question is the following:
q3: I would appreciate it if someone could comment on how B-tree pages (BPage) are mapped to record manager blocks (BlockIo). Since my records are variable-length, the sizes of the tree pages are also variable (e.g. one page might be stored in 300 blocks while another might be stored in 500 blocks). I want to count the number of I/Os needed to scan the B-tree. What should I count? The number of pages would not be accurate, because the pages vary in size.
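To show what I have in mind, here is a rough sketch of counting blocks instead of pages. It rests on assumptions that may not match what jdbm actually does: each page starts on a block boundary, pages do not share blocks, and a stored reference takes 8 bytes. `ScanIoEstimate` and its methods are hypothetical names of mine.

```java
// Rough scan-cost estimate that counts blocks rather than pages.
// Assumption: every page occupies whole blocks and shares none with other pages.
class ScanIoEstimate {

    // Number of fixed-size blocks needed to hold pageBytes (ceiling division).
    static long blocksForPage(long pageBytes, int blockSize) {
        return (pageBytes + blockSize - 1) / blockSize;
    }

    // Block reads for a full scan: the sum of the per-page block counts.
    static long blocksForScan(long[] pageSizesInBytes, int blockSize) {
        long total = 0;
        for (long pageBytes : pageSizesInBytes) {
            total += blocksForPage(pageBytes, blockSize);
        }
        return total;
    }

    public static void main(String[] args) {
        int blockSize = 8 * 1024;            // 8K block, as in the example above
        long tuples = 6_000_000L;
        long distinctKeys = 2_500L;
        long tuplesPerKey = tuples / distinctKeys;   // 2400 on average

        long bytesPerReference = 8;          // assumption, not a measured value
        long recordBytes = tuplesPerKey * bytesPerReference;

        System.out.println("tuples per key:    " + tuplesPerKey);
        System.out.println("bytes per record:  " + recordBytes);
        System.out.println("blocks per record: " + blocksForPage(recordBytes, blockSize));

        // Two pages of different sizes, like the 300-block and 500-block example.
        long[] pages = { 300L * blockSize, 500L * blockSize };
        System.out.println("blocks for scan:   " + blocksForScan(pages, blockSize));
    }
}
```

If this is the right way to think about it, the scan cost would be the total number of blocks read, so two pages of 300 and 500 blocks would cost 800 block reads rather than 2 page reads.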
q4: How can I differentiate between sequential I/O and random I/O?
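The only approach I can think of is to log the block ids in the order they are read and classify each read by whether it directly follows the previous block. This is a sketch of that idea, not something jdbm provides: `IoTraceClassifier` and `classify` are hypothetical names, and treating "previous id + 1" as sequential assumes that consecutive block ids are adjacent on disk.

```java
// Classifies a trace of block reads as sequential or random.
// Assumption: block id n+1 is physically adjacent to block id n.
class IoTraceClassifier {

    // Returns a two-element array: { sequentialReads, randomReads }.
    // A read is sequential if its block id is exactly one more than the
    // previous read's id; the first read always counts as random.
    static long[] classify(long[] blockIds) {
        long sequential = 0;
        long random = 0;
        for (int i = 0; i < blockIds.length; i++) {
            if (i > 0 && blockIds[i] == blockIds[i - 1] + 1) {
                sequential++;
            } else {
                random++;
            }
        }
        return new long[] { sequential, random };
    }

    public static void main(String[] args) {
        long[] trace = { 10, 11, 12, 40, 41, 5 };
        long[] counts = classify(trace);
        System.out.println("sequential reads: " + counts[0]);
        System.out.println("random reads:     " + counts[1]);
    }
}
```

Is this a reasonable definition, or is there a better way to tell the two apart?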