Re: [gentle-devel] Merge Replication (SQL Server 2000)

Hello Martin,

> ServerA has IDs from 1..500
> ServerB has IDs from 501..1000
> ServerC has IDs from 1001..1500

You should avoid those nifty tricks. I usually have a similar problem.
I'm not using replication, but I'm giving away unallocated IDs to the
outside world via a web interface. In HTML forms, it is nice to give the
client the ID the entry will have when it is created. This way you can
avoid two rows being created when the user is impatient and clicks the
submit button twice: the second submit results in a PK collision if,
and only if, the first one really created the record.
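
To make the double-submit idea concrete, here is a minimal sketch. It
is only an illustration: it uses Python with an in-memory SQLite
database rather than SQL Server, and the table name, the submit()
helper and the ID 1001 are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT)")

def submit(entry_id, title):
    """Insert the row; return False if this ID has already been used."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entry (id, title) VALUES (?, ?)",
                (entry_id, title),
            )
        return True
    except sqlite3.IntegrityError:
        # PK collision: the first submit really did create the record.
        return False

reserved_id = 1001  # hypothetical ID handed out together with the HTML form
first = submit(reserved_id, "hello")
second = submit(reserved_id, "hello")  # the impatient second click
row_count = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
```

The second call fails on the primary key, so only one row exists
afterwards.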

About your replication: I've read of four ways of handling the problem
of colliding PKs:

1) Partition the ID space (i.e. like your example)
2) Interleave the ID space (PK = ID * ServerCount + ServerNum)
3) Use really unique IDs (UUIDs)
4) Use two columns, one for the server (scope), and one for the ID
itself

Partitioning will lead to really *BIG* problems if one of the servers
overflows its partition, so you would have to monitor them constantly.
The trouble is that the server won't raise an integer overflow when it
runs past the end of its partition; it simply continues and creates
colliding IDs, which you will have to merge manually. Instead of
partitioning, you should at least do interleaving. However, this means
you have to know the number of servers in advance. With three servers,
interleaving will give the first server [1,4,7,10,13,...], the second
server [2,5,8,11,14,...], the third [3,6,9,12,15,...]. This way, each
server will have at least 715,827,882 IDs in a signed 32-bit int.
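
As a rough sketch of the interleaving formula (Python only for
illustration; interleaved_pk is a made-up name, and I'm assuming the
local ID counts from 0 and ServerNum runs from 1 to ServerCount):

```python
def interleaved_pk(local_id, server_num, server_count):
    """PK = ID * ServerCount + ServerNum."""
    if not 1 <= server_num <= server_count:
        raise ValueError("server_num must be in 1..server_count")
    return local_id * server_count + server_num

SERVER_COUNT = 3  # must be known in advance

first_server = [interleaved_pk(i, 1, SERVER_COUNT) for i in range(5)]
second_server = [interleaved_pk(i, 2, SERVER_COUNT) for i in range(5)]
third_server = [interleaved_pk(i, 3, SERVER_COUNT) for i in range(5)]

# IDs available per server in a signed 32-bit int
ids_per_server = (2**31 - 1) // SERVER_COUNT
```

On SQL Server itself you would probably not compute this by hand; an
identity column seeded with ServerNum and incremented by ServerCount
should give the same sequences.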

If you really want to partition the ID space, provide big enough
partitions for each server, like using 6 bits for the partition, and 25
bits for the ID. This will give you enough room for additional servers
(2^6 = 64), and enough IDs to avoid a server overflowing its
partition (2^25 = 33,554,432). It won't hurt to use bigints; this will
take away the task of regularly monitoring the servers for overflows.
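
The 6 + 25 bit layout could look like the sketch below. The pack() and
unpack() helpers are hypothetical names, not anything from Gentle or
SQL Server; the point is only that the overflow check is explicit
instead of silently running into the next partition.

```python
PARTITION_BITS = 6   # 2^6  = 64 servers
ID_BITS = 25         # 2^25 = 33,554,432 IDs per server

def pack(partition, local_id):
    """Combine partition number and local ID into one 31-bit key."""
    if not 0 <= partition < (1 << PARTITION_BITS):
        raise ValueError("partition out of range")
    if not 0 <= local_id < (1 << ID_BITS):
        # Fail loudly instead of spilling into the next partition.
        raise OverflowError("partition is full")
    return (partition << ID_BITS) | local_id

def unpack(pk):
    """Split a key back into (partition, local_id)."""
    return pk >> ID_BITS, pk & ((1 << ID_BITS) - 1)

largest_key = pack(63, (1 << ID_BITS) - 1)  # still fits a signed 32-bit int
```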

Using two columns means you have to add the second column not only to
PKs, but also to FKs, and change the indices accordingly. This method
is "clean", and a little faster than using UUIDs, because the scope is
simple to assign. Because both scope and ID are indexed, there will be
little performance hit from the additional PK column. The scope could
be a simple int which stores the IP of the server. Overhead is only 4
bytes per row, with the additional benefit of seeing exactly where
merged records originated.
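
A small sketch of the scope column, again in Python with SQLite just
for illustration. The table, the helper names and the IP addresses are
invented. One assumption worth flagging: since a 4-byte int column is
signed, I reinterpret the four IPv4 bytes as a signed value so that
addresses above 127.255.255.255 still fit.

```python
import ipaddress
import sqlite3

def scope_from_ip(ip):
    """Pack an IPv4 address into a signed 32-bit integer."""
    packed = ipaddress.IPv4Address(ip).packed
    return int.from_bytes(packed, "big", signed=True)

def ip_from_scope(scope):
    """Recover the dotted IPv4 address from the signed scope value."""
    return str(ipaddress.IPv4Address(scope.to_bytes(4, "big", signed=True)))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item ("
    " scope INTEGER NOT NULL,"
    " id INTEGER NOT NULL,"
    " name TEXT,"
    " PRIMARY KEY (scope, id))"
)

scope_a = scope_from_ip("10.0.0.1")      # hypothetical ServerA
scope_b = scope_from_ip("192.168.0.1")   # hypothetical ServerB

# Both servers used local ID 1; the merge does not collide.
with conn:
    conn.execute("INSERT INTO item VALUES (?, ?, ?)", (scope_a, 1, "from A"))
    conn.execute("INSERT INTO item VALUES (?, ?, ?)", (scope_b, 1, "from B"))

merged_rows = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
origin = ip_from_scope(
    conn.execute("SELECT scope FROM item WHERE name = 'from B'").fetchone()[0]
)
```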

I consistently use UUIDs, because they have additional benefits. You
can acquire an ID without an additional database roundtrip. I'm filling
the ID column in my Gentle BOs simply with Guid.NewGuid() and submit the
record. Gentle does not need to look up the assigned value from the
database. This makes it easier to build graphs of objects, because you
can build all the objects first and then submit them in one go. I'm
also using them in the HTML forms as described above. You should really
consider using them with replication, because you do not have to
maintain anything with them. The server or the business logic simply
creates them, without the risk of any collisions.
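
To illustrate the object-graph point, a sketch in Python, where
uuid.uuid4() plays the role of Guid.NewGuid(). This is not Gentle code;
the customer/invoice tables are invented and SQLite stands in for the
real database.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoice ("
    " id TEXT PRIMARY KEY,"
    " customer_id TEXT NOT NULL REFERENCES customer(id),"
    " total REAL)"
)

# Build the whole graph in memory; every key is known before any INSERT.
customer = {"id": str(uuid.uuid4()), "name": "Martin"}
invoices = [
    {"id": str(uuid.uuid4()), "customer_id": customer["id"], "total": total}
    for total in (10.0, 25.5)
]

# Submit everything in one transaction, with no lookups of assigned IDs.
with conn:
    conn.execute("INSERT INTO customer VALUES (:id, :name)", customer)
    conn.executemany(
        "INSERT INTO invoice VALUES (:id, :customer_id, :total)", invoices
    )

linked = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE customer_id = ?", (customer["id"],)
).fetchone()[0]
```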

Regards, Alex