|
From: <cu...@us...> - 2008-03-12 22:09:03
|
Revision: 5
http://bailey.svn.sourceforge.net/bailey/?rev=5&view=rev
Author: cutting
Date: 2008-03-12 15:09:08 -0700 (Wed, 12 Mar 2008)
Log Message:
-----------
Add a random query generator.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/Document.java
trunk/src/java/org/apache/bailey/Field.java
trunk/src/java/org/apache/bailey/Query.java
trunk/src/test/org/apache/bailey/Generator.java
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
|
|
From: <cu...@us...> - 2008-03-14 20:23:08
|
Revision: 8
http://bailey.svn.sourceforge.net/bailey/?rev=8&view=rev
Author: cutting
Date: 2008-03-14 13:23:14 -0700 (Fri, 14 Mar 2008)
Log Message:
-----------
Simplify search API, implement it for heap db & add a test case.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/Document.java
trunk/src/java/org/apache/bailey/Query.java
trunk/src/java/org/apache/bailey/Results.java
trunk/src/java/org/apache/bailey/heap/HeapDatabase.java
trunk/src/test/org/apache/bailey/Generator.java
trunk/src/test/org/apache/bailey/TestHeapDb.java
|
|
From: <ni...@us...> - 2008-03-17 14:19:25
|
Revision: 10
http://bailey.svn.sourceforge.net/bailey/?rev=10&view=rev
Author: ning_li
Date: 2008-03-17 07:19:28 -0700 (Mon, 17 Mar 2008)
Log Message:
-----------
Rename the protocols. Add a synchronized implementation for the protocols.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/ddb/Client.java
trunk/src/java/org/apache/bailey/ddb/Master.java
trunk/src/java/org/apache/bailey/ddb/Node.java
trunk/src/java/org/apache/bailey/ddb/Range.java
trunk/src/java/org/apache/bailey/ddb/Ring.java
trunk/src/java/org/apache/bailey/util/Pair.java
trunk/src/test/org/apache/bailey/Generator.java
Added Paths:
-----------
trunk/src/java/org/apache/bailey/ddb/ClientToMasterProtocol.java
trunk/src/java/org/apache/bailey/ddb/ClientToNodeProtocol.java
trunk/src/java/org/apache/bailey/ddb/NodeCommand.java
trunk/src/java/org/apache/bailey/ddb/NodeID.java
trunk/src/java/org/apache/bailey/ddb/NodeInfo.java
trunk/src/java/org/apache/bailey/ddb/NodeToMasterProtocol.java
trunk/src/java/org/apache/bailey/ddb/NodeToNodeProtocol.java
trunk/src/java/org/apache/bailey/ddb/simple/
trunk/src/java/org/apache/bailey/ddb/simple/SimpleClient.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleMaster.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleNode.java
trunk/src/test/org/apache/bailey/TestSimpleDb.java
Removed Paths:
-------------
trunk/src/java/org/apache/bailey/ddb/ClientMasterProtocol.java
trunk/src/java/org/apache/bailey/ddb/ClientNodeProtocol.java
trunk/src/java/org/apache/bailey/ddb/NodeMasterProtocol.java
trunk/src/java/org/apache/bailey/ddb/NodeNodeProtocol.java
|
|
From: Ning L. <nin...@gm...> - 2008-03-17 14:55:32
|
On Fri, Mar 14, 2008 at 5:04 PM, Doug Cutting <cu...@ap...> wrote:
> Heartbeats should probably include the range/version currently
> searchable. It should also report "load", perhaps its average response
> time.

I added a couple of things, but more will be needed as we move along.

> When creating a new node, a host should ask the master what its id
> should be, and the master should allocate new nodes to areas of the ring
> that have a heavy load.

Yes, the master will make that decision.

> The master should also be able to give directives to hosts, indicating
> that a node should be dropped (since it is in a cool area of the ring).
> Then the host should ask for a new id to replace this. Hosts will be
> configured to run a particular number of nodes.

> Should we have a HostToMaster protocol in addition to a NodeToMaster
> protocol, or should these be the same?

None of the host related protocols are there yet. We should decide a few things before we add those?

1 Should nodes run as threads within a host process or as separate processes on a host? Do we still need a host process in the latter case?
2 In the former case, do we have HostToMasterProtocol and ClientToHostProtocol, or do we have NodeToMasterProtocol and ClientToNodeProtocol, or both sets?
3 In the latter case, we should have NodeToMasterProtocol and ClientToNodeProtocol. And HostToMasterProtocol if we have a host process.

> Ranges might be <long,long> rather than <String,String>?

I changed range to be <long, long> for now. However, a document id is of type String, so we have to convert from String to long. It is hard if we want to maintain the order of the ids after we convert them to longs.

Cheers,
Ning
|
|
From: Doug C. <cu...@ap...> - 2008-03-17 21:13:00
|
Ning Li wrote:
> None of the host related protocols are there yet. We should decide
> a few things before we add those?
> 1 Should nodes run as threads within a host process or as separate
> processes on a host? Do we still need a host process in the latter case?

I think we'll at least initially implement these as threads. I don't see many advantages of making these separate processes, and it will complicate lots of things.

> 2 In the former case, do we have HostToMasterProtocol and
> ClientToHostProtocol, or do we have NodeToMasterProtocol and
> ClientToNodeProtocol, or both sets?

With threads, we don't need both Host and Node protocols, so, unless we think it will make implementation cleaner, we should skip this. Will each node have its own thread, or might a single thread per host send heartbeats for all nodes, with another thread fetching and applying updates, etc.? (Since Lucene itself now multithreads updates, we might not want all nodes updating in parallel.)

Given that, I think I'd opt for just HostToMasterProtocol and ClientToHostProtocol, with nodes in parameters. Does that sound reasonable to you?

> I changed range to be <long, long> for now. However, a document
> id is of type String, so we have to convert from String to long. It is
> hard if we want to maintain the order of the ids after we convert
> them to longs.

Right. We need to decide on the bits per point on the ring. Is 64 enough? We also need a Document method to get its ring coordinate. This will by default be a hash of the ID string, but we should store it in a separate field, so that folks can someday specify it explicitly.

Doug
|
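For readers following the protocol question, here is a minimal sketch of what "host-level protocol with nodes in parameters" could look like. Everything in it is hypothetical: the names NodeReport, HostToMasterSketch and RecordingMaster, the method signature, and the choice of fields are illustrations, not the interfaces that were committed to the Bailey tree. The per-node range is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-node status; the fields follow the heartbeat ideas in the thread. */
final class NodeReport {
    final long nodeId;             // which node on this host is reporting
    final long searchableVersion;  // version currently searchable on that node
    final double avgResponseMs;    // a simple stand-in for "load"

    NodeReport(long nodeId, long searchableVersion, double avgResponseMs) {
        this.nodeId = nodeId;
        this.searchableVersion = searchableVersion;
        this.avgResponseMs = avgResponseMs;
    }
}

/** Hypothetical host-level protocol: one call per host, nodes passed as parameters. */
interface HostToMasterSketch {
    void heartbeat(String hostName, List<NodeReport> nodes);
}

/** Toy master that only records what it hears; a real master would act on the load. */
final class RecordingMaster implements HostToMasterSketch {
    final List<String> seen = new ArrayList<>();

    @Override
    public void heartbeat(String hostName, List<NodeReport> nodes) {
        for (NodeReport n : nodes) {
            seen.add(hostName + "/" + n.nodeId + "@v" + n.searchableVersion);
        }
    }
}
```

With this shape a single heartbeat thread per host can report all of its nodes in one call, which is the arrangement the message above leans toward.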
|
From: Yonik S. <yo...@ap...> - 2008-03-17 22:01:32
|
On Mon, Mar 17, 2008 at 5:12 PM, Doug Cutting <cu...@ap...> wrote:
> Ning Li wrote:
> > I changed range to be <long, long> for now. However, a document
> > id is of type String, so we have to convert from String to long. It is
> > hard if we want to maintain the order of the ids after we convert
> > them to longs.
>
> Right. We need to decide on the bits per point on the ring. Is 64
> enough?

Wouldn't 32 bits be more than enough if it's just used for partitioning (as opposed to using it for a unique id)?

-Yonik
|
|
From: Ning L. <nin...@gm...> - 2008-03-17 23:39:22
|
On Mon, Mar 17, 2008 at 4:12 PM, Doug Cutting <cu...@ap...> wrote:
> Ning Li wrote:
> > None of the host related protocols are there yet. We should decide
> > a few things before we add those?
> > 1 Should nodes run as threads within a host process or as separate
> > processes on a host? Do we still need a host process in the latter case?
>
> I think we'll at least initially implement these as threads. I don't
> see many advantages of making these separate processes, and it will
> complicate lots of things.

They have their pros and cons. But yes, let's start with nodes as threads.

> > 2 In the former case, do we have HostToMasterProtocol and
> > ClientToHostProtocol, or do we have NodeToMasterProtocol and
> > ClientToNodeProtocol, or both sets?
>
> With threads, we don't need both Host and Node protocols, so, unless we
> think it will make implementation cleaner, we should skip this. Will
> each node have its own thread, or might a single thread per host send
> heartbeats for all nodes, with another thread fetching and applying
> updates, etc.? (Since Lucene itself now multithreads updates, we might
> not want all nodes updating in parallel.)
>
> Given that, I think I'd opt for just HostToMasterProtocol and
> ClientToHostProtocol, with nodes in parameters. Does that sound
> reasonable to you?

This sounds reasonable. We'd also have a logger thread (or threads) for all the nodes. I still think we should allow multiple update threads? I'll change Node protocols to Host protocols.

> > I changed range to be <long, long> for now. However, a document
> > id is of type String, so we have to convert from String to long. It is
> > hard if we want to maintain the order of the ids after we convert
> > them to longs.
>
> Right. We need to decide on the bits per point on the ring. Is 64
> enough? We also need a Document method to get its ring coordinate.
> This will by default be a hash of the ID string, but we should store it
> in a separate field, so that folks can someday specify it explicitly.

Why can't we use the string for a point on the ring? Each document can have a unique point and all the documents have an order.

Ning
|
|
From: Yonik S. <yo...@ap...> - 2008-03-18 00:28:25
|
On Mon, Mar 17, 2008 at 7:39 PM, Ning Li <nin...@gm...> wrote:
> Why cann't we use the string for a point on the ring? Each document
> can have a unique point and all the documents have an order.

Using an int/long with a good hash makes document distribution uniform on the ring, and thus makes it much easier for the master to assign ranges to nodes. For example, for a two node system the master could assign NodeA hashes 0x00000000-0x7FFFFFFF and NodeB hashes 0x80000000-0xFFFFFFFF, knowing nothing about the document ids.

How would this work if one uses the String Ids? Or did you mean represent the hash as a String?

-Yonik
|
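The two-node example above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class EvenRingPartition and the method nodeFor are hypothetical names, the ring is taken to be the unsigned 32-bit range, and the nodes are assumed to own equal-sized contiguous slices. String.hashCode() stands in for the "good hash" and is not necessarily a good one.

```java
/** Hypothetical sketch: split an unsigned 32-bit hash ring evenly among N nodes. */
final class EvenRingPartition {

    /** Returns the 0-based index of the node that owns the given 32-bit position. */
    static int nodeFor(int position, int nodeCount) {
        // Read the int as unsigned so that 0x80000000 sorts after 0x7FFFFFFF.
        long unsigned = position & 0xFFFFFFFFL;
        // Scale [0, 2^32) onto [0, nodeCount); the product stays below 2^63.
        return (int) ((unsigned * nodeCount) >>> 32);
    }

    public static void main(String[] args) {
        // With two nodes, index 0 plays NodeA and index 1 plays NodeB.
        System.out.println(nodeFor(0x7FFFFFFF, 2));
        System.out.println(nodeFor(0x80000000, 2));
        // The master needs only the hash, not the document id itself.
        System.out.println(nodeFor("some-document-id".hashCode(), 2));
    }
}
```

The point of the sketch is that the assignment depends only on the hash value, which is what lets the master hand out ranges without knowing anything about the ids.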
|
From: Ning L. <nin...@gm...> - 2008-03-18 15:01:11
|
On Mon, Mar 17, 2008 at 7:28 PM, Yonik Seeley <yo...@ap...> wrote:
> Or did you mean represent the hash as a String?

Yes. :)

Converting a number to a string while maintaining the number order is easy when we know the min and the max numbers. Converting a string to a number while maintaining the string order, on the other hand, is difficult.

Ning
|
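As an illustration of the "easy" direction described above, here is one common way to turn an int into a fixed-width string whose lexicographic order matches the numeric order. It is a sketch, not code from the Bailey tree; OrderedKeys and toSortableString are hypothetical names. It relies on knowing the min and max (the full int range), which fixes the width at eight hex digits.

```java
/** Hypothetical sketch: order-preserving int-to-String encoding. */
final class OrderedKeys {

    /** Encodes an int as 8 hex digits whose String order matches the signed int order. */
    static String toSortableString(int value) {
        // Flipping the sign bit maps Integer.MIN_VALUE..MAX_VALUE onto 0..2^32-1,
        // so negative numbers sort before positive ones.
        int shifted = value ^ 0x80000000;
        // Zero padding to a fixed width makes string comparison agree with numeric order.
        return String.format("%08x", shifted);
    }

    public static void main(String[] args) {
        System.out.println(toSortableString(Integer.MIN_VALUE));
        System.out.println(toSortableString(-1));
        System.out.println(toSortableString(0));
        System.out.println(toSortableString(Integer.MAX_VALUE));
    }
}
```

The reverse direction has no such trick in general, because arbitrary strings have no fixed length and so do not fit into a fixed number of bits without losing order information.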
|
From: Doug C. <cu...@ap...> - 2008-03-19 19:06:05
|
Ning Li wrote:
> On Mon, Mar 17, 2008 at 7:28 PM, Yonik Seeley <yo...@ap...> wrote:
>> Or did you mean represent the hash as a String?
>
> Yes. :)

I had a slightly different idea. I thought that the id string would be the external id provided by the application that we return with hits, e.g., a uri, a filename, etc. We'd also have a numeric 'position' value that places the document on the ring. The position would, by default, be the hash of the id, but an application might override that. It would be a bug for an application to ever provide different positions for the same id.

I'd imagined that positions would be longs, but Yonik has argued that they might as well be ints, and I can't think why they couldn't, if we're going to keep the string id too. That makes the default implementation in Java much easier, since it can be hashCode().

Doug
|
|
From: Ning L. <nin...@gm...> - 2008-03-19 20:03:54
|
On Wed, Mar 19, 2008 at 2:06 PM, Doug Cutting <cu...@ap...> wrote:
> I had a slightly different idea. I thought that the id string would be
> the external id provided by the application that we return with hits,
> e.g., a uri, a filename, etc. We'd also have a numeric 'position' value
> that places the document on the ring. The position would, by default,
> be the hash of the id, but an application might override that. It would
> be a bug for an application to ever provide different positions for the
> same id.

Originally, I was thinking of simply using the application-specified external id as its 'position' value on the ring. We'd have one value instead of two. No need to check if different positions are ever provided for the same id.

The ring distribution won't be uniform in this case. But we have to deal with this case anyway. So the main downside I see is the performance cost with strings - computation, memory... That's why I'm fine with a separate 'position' value.

> I'd imagined that positions would be longs, but Yonik has argued that
> they might as well be ints, and I can't think why they couldn't, if
> we're going to keep the string id too. That makes the default
> implementation in Java much easier, since it can be hashCode().

I'm not insisting on longs. But here is what I reasoned. :)

I imagined a good number of the applications which would use Bailey would be similar to an email system - the application would provide the 'position' values so that a search on a fraction of all the documents spans a relatively small number of nodes. Let's use Yonik's suggestion to assign such 'position' values:

> Of course, fixing my bug it would be (username.hashCode() << 29) |
> (id.hashCode() >>> 3)

One user may have one document. Another may have a lot. Is 29 bits for username enough? Maybe. But is 3 bits for the documents of a user enough? That means a user's documents cannot span more than 8 nodes.

Maybe I over-thought the problem. :)

Ning
|
|
From: Doug C. <cu...@ap...> - 2008-03-19 20:28:23
|
Ning Li wrote:
> The ring distribution won't be uniform in this case. But we have
> to deal with this case anyway. So the main downside I see is
> the performance cost with strings - computation, memory...
> That's why I'm fine with a separate 'position' value.

Also, having a well-known place for the application-specified external id is useful too. Lucene lacks this, which makes things like deletion and duplicate detection more complicated than they ought to be. So I think <externalId, version, position, <field>*> better than Lucene's minimalist <field>*.

> One user may have one document. Another may have a lot.
> Is 29 bits for username enough? Maybe. But is 3 bits for the
> documents of a user enough? That means a user's documents
> cannot span more than 8 nodes.

I only have 50k emails in my archives. Even if I had 500k, one node would be plenty. I've heard that gmail handles all per-user requests on a single node, and gmail allows up to around 500k messages.

On the other hand, squeezing the most out of bits is often a premature optimization that's later regretted. Long might be more future-proof.

Doug
|
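A minimal sketch of the <externalId, version, position, <field>*> layout discussed above, with the position defaulting to the hash of the id unless the application supplies one. The class DocumentSketch and its members are hypothetical and are not the project's Document class; the int position is an assumption, since the thread had not yet settled between int and long at this point.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a document record with a well-known id, version and ring position. */
final class DocumentSketch {
    final String externalId;  // application-supplied id returned with hits (uri, filename, ...)
    final long version;       // together with externalId, identifies the document
    final int position;       // where the document sits on the ring (int is an assumption)
    final Map<String, String> fields = new LinkedHashMap<>();

    /** Default: the position is the hash of the external id. */
    DocumentSketch(String externalId, long version) {
        this(externalId, version, externalId.hashCode());
    }

    /** Override: the application chooses the position, and must always choose the same one for an id. */
    DocumentSketch(String externalId, long version, int position) {
        this.externalId = externalId;
        this.version = version;
        this.position = position;
    }
}
```

Keeping the id and the position as separate members is what allows the position to be a cheap number while hits still carry the original string.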
|
From: Yonik S. <yo...@ap...> - 2008-03-19 20:33:21
|
On Wed, Mar 19, 2008 at 4:28 PM, Doug Cutting <cu...@ap...> wrote:
> On the other hand, squeezing the most out of bits is often a premature
> optimization that's later regretted. Long might be more future-proof.

Going with a long might be premature optimism that no one will need to create or track by hash :-)

Going from 4 to 8 bytes per document could be significant. 32 bits provides 4B slices of the ring!

-Yonik
|
|
From: Doug C. <cu...@ap...> - 2008-03-19 20:46:03
|
Yonik Seeley wrote:
> Going with a long might be premature optimism that no one will need to
> create or track by hash :-)

I don't follow this. You mean create an array of all positions?

> Going from 4 to 8 bytes per document could be significant.

A Document is uniquely identified by a String externalId and a long version. The position is not assumed to uniquely identify it. So I'm not sure where the size of the position will be significant.

We will be slinging around representations of the ring: the client will need to refresh its copy frequently. These will have a <position,position> range per node, so these would get bigger. But the ring also has to represent a host per node, plus each node's unique id and its ring position, so the size of the range is not dominant in the size of the ring.

> 32 bits provides 4B slices of the ring!

But, like IP addresses, having enough doesn't make them easy to divide.

Doug
|
|
From: Yonik S. <yo...@ap...> - 2008-03-19 21:06:19
|
On Wed, Mar 19, 2008 at 4:46 PM, Doug Cutting <cu...@ap...> wrote:
> A Document is uniquely identified by a String externalId and a long
> version. The position is not assumed to uniquely identify it. So I'm
> not sure where the size of the position will be significant.

I'm not exactly sure yet either... but it seems like a node does need to identify all documents within certain arbitrary ranges at some point for rebalancing (and perhaps for filtering too). Will the hash be indexed, stored somehow, or calculated on the fly on the node?

> > 32 bits provides 4B slices of the ring!
>
> But, like IP addresses, having enough doesn't make them easy to divide.

True... I guess it depends on if those 32 bits will be used for anything other than a hash (or position) by the application level.

-Yonik
|
|
From: Yonik S. <yo...@ap...> - 2008-03-19 20:38:48
|
On Wed, Mar 19, 2008 at 4:03 PM, Ning Li <nin...@gm...> wrote:
> One user may have one document. Another may have a lot.
> Is 29 bits for username enough? Maybe. But is 3 bits for the
> documents of a user enough? That means a user's documents
> cannot span more than 8 nodes.

With that particular application split, a user's documents would span 1/8 of the ring (or 29 bits). In a system with 100 nodes, a user's email would span ~13 nodes, and it can easily change by changing the split.

-Yonik
|
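The arithmetic behind "1/8 of the ring" and "~13 nodes" can be spelled out. In the expression (username.hashCode() << 29) | (id.hashCode() >>> 3), shifting the username hash left by 29 leaves only its low 3 bits, now sitting in the top 3 bits of the position, while the id hash fills the remaining 29 bits. So the user picks one of 8 slices and the user's documents spread across that slice. The sketch below is a rough illustration: SplitSpan and its methods are hypothetical names, and it assumes nodes are spread evenly around the ring.

```java
/** Hypothetical sketch: how many nodes one user's documents touch for a given bit split. */
final class SplitSpan {

    /** Number of ring slices selectable by the user part; valid for shifts 1..31. */
    static int userSlices(int shift) {
        // A left shift by `shift` keeps only (32 - shift) bits of the username hash.
        return 1 << (32 - shift);
    }

    /** Rough node count per user slice, assuming nodes are evenly spaced on the ring. */
    static int nodesSpanned(int nodeCount, int shift) {
        return (int) Math.ceil((double) nodeCount / userSlices(shift));
    }

    public static void main(String[] args) {
        System.out.println(userSlices(29));        // slices with the split from the thread
        System.out.println(nodesSpanned(100, 29)); // nodes per user on a 100-node ring
        System.out.println(nodesSpanned(100, 26)); // a finer split narrows the span
    }
}
```

A slice that straddles a node boundary can touch one more node than this estimate, so the figure is approximate.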
|
From: Yonik S. <yo...@ap...> - 2008-03-18 19:57:03
|
On Tue, Mar 18, 2008 at 11:01 AM, Ning Li <nin...@gm...> wrote:
> On Mon, Mar 17, 2008 at 7:28 PM, Yonik Seeley <yo...@ap...> wrote:
> > Or did you mean represent the hash as a String?
>
> Yes. :)

Ah, OK... so like a variable length int. The ring management and passing of ranges with Strings would be easy enough, but depending on what needs to be done on a node, it may be less efficient. For example, to find a range of hashes (say to split off a piece for rebalancing) we would probably use either the FieldCache, or a column-stored field (upcoming) in Lucene, both of which would be more efficient as an int.

> Converting a string to a number while maintaining the string order,
> on the other hand, is difficult.

If one wants each user in an email system to span a maximum of 1/8th of the ring then the hash could be

hash = (username.hashCode() << 29) | (id.hashCode() >> 3)

Or if the number of emails per user is small,

hash = username.hashCode()

-Yonik
|
|
From: Ning L. <nin...@gm...> - 2008-03-18 23:32:53
|
On Tue, Mar 18, 2008 at 2:57 PM, Yonik Seeley <yo...@ap...> wrote:
> If one wants each user in an email system to span a maximum of 1/8th
> of the ring then the
> hash could be hash = (username.hashCode() << 29) | (id.hashCode() >> 3)

This could work.

> Or if the number of emails per user is small, hash = username.hashCode()

We cannot really assume that, right? :) Let's use long for now.

Ning
|
|
From: Yonik S. <yo...@ap...> - 2008-03-18 23:43:08
|
On Tue, Mar 18, 2008 at 7:32 PM, Ning Li <nin...@gm...> wrote:
> On Tue, Mar 18, 2008 at 2:57 PM, Yonik Seeley <yo...@ap...> wrote:
> > If one wants each user in an email system to span a maximum of 1/8th
> > of the ring then the
> > hash could be hash = (username.hashCode() << 29) | (id.hashCode() >> 3)
>
> This could work.

Of course, fixing my bug it would be

(username.hashCode() << 29) | (id.hashCode() >>> 3)

:-)

> > Or if the number of emails per user is small, hash = username.hashCode()
>
> We cannot really assume that, right? :)

Well, any assumptions like that are up to the specific application if they want to try their own partitioning. I think most just use the default hash.

> Let's use long for now.

I'm still not sure I see the value over an int hash, but I guess it's not a big deal as long as we don't have to index it or use the FieldCache for it. That leaves calculating it on the fly on the node when needed, or storing it in a quickly accessible manner (payload or upcoming column store).

-Yonik
|
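The bug fixed above comes down to Java's two right-shift operators. `>>` is an arithmetic shift that copies the sign bit into the vacated high bits, so a negative id hash would fill the top three bits with ones and overwrite the user part; `>>>` is a logical shift that fills them with zeros. The sketch below shows the difference on raw hash values. CompositePosition and its method names are hypothetical.

```java
/** Hypothetical sketch contrasting the corrected and the original composite hash. */
final class CompositePosition {

    /** Corrected form: the logical shift leaves the top 3 bits free for the user part. */
    static int combine(int userHash, int idHash) {
        return (userHash << 29) | (idHash >>> 3);
    }

    /** Original form: the arithmetic shift sign-extends negative id hashes. */
    static int combineBuggy(int userHash, int idHash) {
        return (userHash << 29) | (idHash >> 3);
    }

    /** Convenience overload using String.hashCode(), as in the thread. */
    static int combine(String username, String id) {
        return combine(username.hashCode(), id.hashCode());
    }

    public static void main(String[] args) {
        int userHash = 2;  // low three bits 010 select the user's slice
        int idHash = -8;   // any negative id hash triggers the problem
        // Print the top three bits of each result to see which slice was selected.
        System.out.println(combine(userHash, idHash) >>> 29);
        System.out.println(combineBuggy(userHash, idHash) >>> 29);
    }
}
```

With the arithmetic shift, every document whose id hash is negative lands in the last eighth of the ring no matter which user owns it, which defeats the purpose of the split.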
|
From: Doug C. <cu...@ap...> - 2008-03-19 21:39:10
|
Yonik Seeley wrote:
> On Wed, Mar 19, 2008 at 4:46 PM, Doug Cutting <cu...@ap...> wrote:
>> A Document is uniquely identified by a String externalId and a long
>> version. The position is not assumed to uniquely identify it. So I'm
>> not sure where the size of the position will be significant.
>
> I'm not exactly sure yet either... but it seems like a node does need
> to identify all documents within certain arbitrary ranges at some
> point for rebalancing (and perhaps for filtering too).
> Will the hash be indexed, stored somehow, or calculated on the fly on the node?

The naive way to implement this on Lucene is to make the position be an indexed field, then use a RangeFilter to constrain queries, if that's what you're asking. Mostly we hope that queries will span the entire range of an index & with no need for filtering. But sometimes the Filter will be needed.

When replaying the log to a neighbor node we'll also need to filter by position range. So we'll be touching these a lot, but I don't yet see a case where we'd, e.g., want to create a bit vector of occupied positions. They'll be pretty sparse for that, even if only 32 bit.

> True... I guess it depends on if those 32 bits will be used for
> anything other than a hash (or position) by the application level.

Well, we've talked of applications encoding the user in the high 29 bits and message in the low three. Frankly, I have a hard time seeing where 32 bits would be a problem here.

Typically what you'll want to do is have a primary field (e.g., user) that can limit the amount of the ring that must be queried, and trade that against the chances that a single value of that field will overwhelm that portion of the ring. If the master rebalances by load, this will be easier.

A single node should probably never index more than a few million items, so if you know that you might have, e.g., 100M items with a given primary field value, then you'd want to make sure that there are at least 100 distinct values within that. But I think 32 bits gives plenty of room for such things.

Hey, did I just switch sides again?

Doug
|
|
From: Yonik S. <yo...@ap...> - 2008-03-19 22:07:35
|
On Wed, Mar 19, 2008 at 5:38 PM, Doug Cutting <cu...@ap...> wrote:
> Yonik Seeley wrote:
> > I'm not exactly sure yet either... but it seems like a node does need
> > to identify all documents within certain arbitrary ranges at some
> > point for rebalancing (and perhaps for filtering too).
> > Will the hash be indexed, stored somehow, or calculated on the fly on the node?
>
> The naive way to implement this on Lucene is to make the position be an
> indexed field, then use a RangeFilter to constrain queries, if that's
> what you're asking.

Right. If it's indexed, it seems advantageous to use 32 bits rather than 64 (esp thinking about the .tii file size). Longer term it might make sense to use a column-stride field:
https://issues.apache.org/jira/browse/LUCENE-1231

> Mostly we hope that queries will span the entire
> range of an index & with no need for filtering. But sometimes the
> Filter will be needed.
>
> When replaying the log to a neighbor node we'll also need to filter by
> position range.

Or just log the position along with the Id and version.

> So we'll be touching these a lot, but I don't yet see a
> case where we'd, e.g., want to create a bit vector of occupied
> positions. They'll be pretty sparse for that, even if only 32 bit.

Sorry for the confusion, I never meant that. I just meant the ability to map from range to documents in that range on the node.

-Yonik
|
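To make "map from range to documents in that range" concrete, here is a small in-memory stand-in built on a sorted map. It is not how a Lucene-backed node would do it (the thread suggests an indexed field with a range filter, or a column-stride field); PositionIndex and its methods are hypothetical, and the range is assumed not to wrap around the ring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory index from ring position to the ids stored at that position. */
final class PositionIndex {
    private final NavigableMap<Integer, List<String>> byPosition = new TreeMap<>();

    /** Registers a document id under its ring position; positions need not be unique. */
    void add(int position, String externalId) {
        byPosition.computeIfAbsent(position, p -> new ArrayList<>()).add(externalId);
    }

    /** Returns ids with begin <= position < end, in position order; requires begin <= end. */
    List<String> idsInRange(int begin, int end) {
        List<String> ids = new ArrayList<>();
        for (List<String> bucket : byPosition.subMap(begin, true, end, false).values()) {
            ids.addAll(bucket);
        }
        return ids;
    }
}
```

A lookup like this is the kind of operation a node would need when splitting off part of its range for rebalancing or when replaying log entries for a neighbor's range.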
|
From: <ni...@us...> - 2008-03-21 15:36:02
|
Revision: 14
http://bailey.svn.sourceforge.net/bailey/?rev=14&view=rev
Author: ning_li
Date: 2008-03-21 08:36:06 -0700 (Fri, 21 Mar 2008)
Log Message:
-----------
Change search() in the Client-to-Host protocol to return an array of RangeResults. Make id in NodeID a long and change HostID to be a name and a port.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/ddb/ClientToHostProtocol.java
trunk/src/java/org/apache/bailey/ddb/Host.java
trunk/src/java/org/apache/bailey/ddb/HostID.java
trunk/src/java/org/apache/bailey/ddb/NodeID.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleClient.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleHost.java
trunk/src/test/org/apache/bailey/TestSimpleDb.java
Added Paths:
-----------
trunk/src/java/org/apache/bailey/RangeResults.java
|
|
From: <ni...@us...> - 2008-03-21 23:15:44
|
Revision: 15
http://bailey.svn.sourceforge.net/bailey/?rev=15&view=rev
Author: ning_li
Date: 2008-03-21 16:15:50 -0700 (Fri, 21 Mar 2008)
Log Message:
-----------
Change the type of begin/end in Range to integer. I forgot it in the last change. I still dream for string in the future. :) Also add getCoverage(Range) to the Ring API.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/ddb/Mapper.java
trunk/src/java/org/apache/bailey/ddb/Range.java
trunk/src/java/org/apache/bailey/ddb/Ring.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleClient.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleHost.java
trunk/src/test/org/apache/bailey/Generator.java
trunk/src/test/org/apache/bailey/TestSimpleDb.java
|
|
From: <ni...@us...> - 2008-03-27 17:07:16
|
Revision: 16
http://bailey.svn.sourceforge.net/bailey/?rev=16&view=rev
Author: ning_li
Date: 2008-03-27 09:57:55 -0700 (Thu, 27 Mar 2008)
Log Message:
-----------
Modify the ring to be a real ring. The ring was actually [Integer.MIN_VALUE, Integer.MAX_VALUE) before.
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/ddb/Mapper.java
trunk/src/java/org/apache/bailey/ddb/Range.java
trunk/src/java/org/apache/bailey/ddb/Ring.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleHost.java
trunk/src/java/org/apache/bailey/util/Pair.java
trunk/src/test/org/apache/bailey/TestSimpleDb.java
Added Paths:
-----------
trunk/src/test/org/apache/bailey/ddb/
trunk/src/test/org/apache/bailey/ddb/TestRange.java
trunk/src/test/org/apache/bailey/ddb/TestRing.java
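The difference between a "real ring" and the interval [Integer.MIN_VALUE, Integer.MAX_VALUE) named in the log message above is that ranges on a ring may wrap past the maximum back to the minimum. The sketch below shows one way a wrap-aware containment test could look. It is not the committed Range class: RingRange is a hypothetical name, and both the half-open convention and the treatment of begin == end as the full ring are assumptions.

```java
/** Hypothetical sketch of a half-open range [begin, end) on a ring of int positions. */
final class RingRange {
    final int begin; // inclusive
    final int end;   // exclusive

    RingRange(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    /** True if the position lies in the range, following the ring around if it wraps. */
    boolean contains(int position) {
        if (begin < end) {
            // Ordinary range: no wrap-around.
            return begin <= position && position < end;
        }
        // Wrapping range: [begin, MAX_VALUE] followed by [MIN_VALUE, end).
        // When begin == end this is always true, i.e. the whole ring (an assumption).
        return position >= begin || position < end;
    }
}
```

With only the non-wrapping case, no single range could cover both the top and the bottom of the position space, which is presumably what the earlier interval-based version could not express.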
|
|
From: <ni...@us...> - 2008-03-28 20:21:07
|
Revision: 17
http://bailey.svn.sourceforge.net/bailey/?rev=17&view=rev
Author: ning_li
Date: 2008-03-28 13:21:12 -0700 (Fri, 28 Mar 2008)
Log Message:
-----------
1 Add the protocol and the classes related to log propagation. Also add a simple implementation in the withlog package and a test case TestSimpleDbWithLog.
2 Add the RangedDatabase class which contains NodeStatus, Database and Log.
3 Add "getDocs" to the Database class to retrieve a number of documents. This will be used to improve performance during the log propagation. Q: Should Database be aware of Range to support filtered queries based on Range? Or do we make RangedDatabase add a clause to a query before passing it down to Database?
Modified Paths:
--------------
trunk/src/java/org/apache/bailey/Database.java
trunk/src/java/org/apache/bailey/ddb/ClientToHostProtocol.java
trunk/src/java/org/apache/bailey/ddb/Host.java
trunk/src/java/org/apache/bailey/ddb/HostToHostProtocol.java
trunk/src/java/org/apache/bailey/ddb/Mapper.java
trunk/src/java/org/apache/bailey/ddb/Range.java
trunk/src/java/org/apache/bailey/ddb/Ring.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleClient.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleHost.java
trunk/src/java/org/apache/bailey/ddb/simple/SimpleMaster.java
trunk/src/test/org/apache/bailey/TestHeapDb.java
trunk/src/test/org/apache/bailey/TestSimpleDb.java
Added Paths:
-----------
trunk/src/java/org/apache/bailey/ddb/HostProtocol.java
trunk/src/java/org/apache/bailey/ddb/IndexAction.java
trunk/src/java/org/apache/bailey/ddb/Log.java
trunk/src/java/org/apache/bailey/ddb/LogEntry.java
trunk/src/java/org/apache/bailey/ddb/RangedDatabase.java
trunk/src/java/org/apache/bailey/ddb/withlog/
trunk/src/java/org/apache/bailey/ddb/withlog/SimpleClient.java
trunk/src/java/org/apache/bailey/ddb/withlog/SimpleHost.java
trunk/src/java/org/apache/bailey/ddb/withlog/SimpleLog.java
trunk/src/java/org/apache/bailey/ddb/withlog/SimpleMaster.java
trunk/src/java/org/apache/bailey/util/HashUtil.java
trunk/src/test/org/apache/bailey/TestSimpleDbWithLog.java
Removed Paths:
-------------
trunk/src/java/org/apache/bailey/ddb/Logger.java
|