File Date Author Commit
 src 2024-03-11 Haitao Song Haitao Song [c5ac69] update package
 tests 2024-03-02 Haitao Song Haitao Song [77ff00] comments update
 .gitignore 2024-03-02 Haitao Song Haitao Song [86486c] adding implementation
 DatedGuid.sln 2024-03-02 Haitao Song Haitao Song [8dcc77] changed to dated guid only
 README.MD 2024-03-11 Haitao Song Haitao Song [c5ac69] update package

Read Me

# Dated GUID project
## _Creating a GUID system with an encoded date_

This library provides an ID generation scheme for our future data sharding and hot/warm awareness of storage objects and database entries. It uses 16 bits to denote the day, kept within the GUID in XOR-encoded form.

 .NET generates version 4 GUIDs with the RFC 4122 variant (variant bits 10), referred to here as version 4.2.  
 See reference: https://datatracker.ietf.org/doc/html/rfc4122#section-6
 https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.2

 The version 4 algorithm (RFC 4122, Section 4.4) is as follows:
* Set the two most significant bits (bits 6 and 7) of the clock_seq_hi_and_reserved to zero and one, respectively.
* Set the four most significant bits (bits 12 through 15) of the time_hi_and_version field to the 4-bit version number from Section 4.1.3.
* Set all the other bits to randomly (or pseudo-randomly) chosen values.
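
The three steps above can be sketched in a few lines. This is an illustration only, written in Python rather than .NET, and `make_v4_guid` is a hypothetical helper name, not part of this library. It works on the RFC 4122 network-order bytes, where time_hi_and_version starts at byte 6 and clock_seq_hi_and_reserved is byte 8.

```python
import os
import uuid


def make_v4_guid() -> uuid.UUID:
    """Build a version 4 GUID by hand (hypothetical helper for illustration)."""
    # Start from 16 random bytes: "all the other bits" are random.
    raw = bytearray(os.urandom(16))
    # clock_seq_hi_and_reserved (byte 8): force the top two bits to 1 and 0.
    raw[8] = (raw[8] & 0x3F) | 0x80
    # time_hi_and_version (byte 6 in network order): force the top nibble to 0100.
    raw[6] = (raw[6] & 0x0F) | 0x40
    return uuid.UUID(bytes=bytes(raw))


g = make_v4_guid()
print(g, g.version, g.variant == uuid.RFC_4122)
```

In practice `uuid.uuid4()` in Python or `Guid.NewGuid()` in .NET does this for you; the sketch only makes the bit positions visible.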

## Endianness
.NET uses GUID v4, which results in the format xxxxxxxx-xxxx-[4xxx]-Yxxx-xxxxxxxxxxxx, where Y is one of 8, 9, a, or b, ensuring the top bits of that byte start with 10.
>  For example, take the GUID string:
>   "39 550f 39 - 31cc - 4b79 - 9233 - 1099678b8b21"
>  When represented in memory in little-endian order, the corresponding
>  bytes are:
>    39  0f55 39 - cc31 - 794b - 9233 - 1099678b8b21
>  Note the 794b: the version byte 4b from the spec ends up as the 8th byte, at index 7.
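
The byte swap in this example can be checked with a short sketch. It uses Python's standard `uuid` module for illustration; its `bytes_le` property applies the same little-endian layout to the first three fields that .NET's `Guid.ToByteArray()` produces.

```python
import uuid

g = uuid.UUID("39550f39-31cc-4b79-9233-1099678b8b21")

big = g.bytes        # RFC 4122 network order, same order as the string
little = g.bytes_le  # first three fields byte-swapped (little endian)

print(big.hex())              # 39550f3931cc4b7992331099678b8b21
print(little.hex())           # 390f5539cc31794b92331099678b8b21
print(hex(little[7]))         # 0x4b, the version byte at index 7
print(big[8:] == little[8:])  # True: the last 8 bytes are not swapped
```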

## Dated Guid scheme

### Version: Major = D, Minor = 3
We overwrite the version and variant signature bits, forcing them to 1101 (D) and 11 (3) instead of version 4's major = 0100 (4) and minor = 10 (2). This avoids collisions with existing .NET-generated GUIDs, which are GUID v4.
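
A minimal sketch of stamping and checking the D.3 signature, assuming the little-endian byte layout described in the Endianness section (version nibble in byte 7, variant bits in byte 8). It is written in Python for illustration; `stamp_d3` and `is_d3` are hypothetical names and not the library's API.

```python
import os
import uuid

VERSION_INDEX = 7  # little-endian layout: byte holding the version nibble
VARIANT_INDEX = 8  # byte holding the variant bits


def stamp_d3(raw_le: bytes) -> bytes:
    """Force major = 1101 (D) and minor = 11 (3) on a 16-byte GUID."""
    out = bytearray(raw_le)
    out[VERSION_INDEX] = (out[VERSION_INDEX] & 0x0F) | 0xD0
    out[VARIANT_INDEX] = (out[VARIANT_INDEX] & 0x3F) | 0xC0
    return bytes(out)


def is_d3(raw_le: bytes) -> bool:
    """Return True when the bytes carry the D.3 signature."""
    return raw_le[VERSION_INDEX] >> 4 == 0xD and raw_le[VARIANT_INDEX] >> 6 == 0b11


stamped = stamp_d3(os.urandom(16))
text = str(uuid.UUID(bytes_le=stamped))
print(text)  # third group starts with d; fourth group starts with c, d, e, or f
```

Because a v4 GUID always has 4 in the version nibble, `is_d3` returns False for anything produced by a standard v4 generator.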

### Encoding Date
Notice that the GUID's last 8 bytes (indices 8-15) have no endianness difference, so we use them to encode dates. 2 bytes (16 bits) at the tail of the GUID are recruited to support timestamps:
* bytes at index 14 and 11 together encode the second byte of the date via an XOR operation
* bytes at index 15 and 12 together encode the first byte of the date via an XOR operation
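
One way to read the two rules above is sketched below in Python. Several details are assumptions, since the README does not state them: the day is taken as a 16-bit count of days since a hypothetical epoch (`EPOCH`), the "first byte" is taken as the high byte, and the tail byte is stored as the random byte XORed with the date byte. The function names are hypothetical too.

```python
import os
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)  # hypothetical epoch; the README does not state one


def encode_day(raw: bytes, day: date) -> bytes:
    """Overwrite bytes 14 and 15 so that XOR with bytes 11 and 12 yields the day."""
    n = (day - EPOCH).days
    if not 0 <= n <= 0xFFFF:
        raise ValueError("day outside the 16-bit range")
    first, second = n >> 8, n & 0xFF  # assumption: first byte = high byte
    out = bytearray(raw)
    out[15] = out[12] ^ first
    out[14] = out[11] ^ second
    return bytes(out)


def decode_day(raw: bytes) -> date:
    """Recover the day by XORing the paired bytes back together."""
    first = raw[15] ^ raw[12]
    second = raw[14] ^ raw[11]
    return EPOCH + timedelta(days=(first << 8) | second)


guid_bytes = encode_day(os.urandom(16), date(2024, 3, 11))
print(decode_day(guid_bytes))  # 2024-03-11
```

Since bytes 11 and 12 stay random, the tail bytes look random as well; the date is only visible to a reader who knows which pairs to XOR.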

In the GUID v4 scheme, 122 bits are random and the four most significant bits of the version byte are '0100'. The little-endian byte order described above is what we see when we parse the corresponding GUID string into a GUID.
In our dated GUID scheme, we propose GUID version D.3 (major 1101, minor 11), following the same versioning layout. In addition, 16 of the remaining bits encode a date.

## Randomness and Entropy 
GUID v4 has 122 random bits, and so does our scheme overall, with the version set to D.3. Within a single date, however, the 16 date bits are fixed, which leaves 106 random bits.

 Within a specific day, entropy is 106 bits (2^106 possible values), reduced from 122 bits (2^122 possible values) by the 16-bit day. However, our collision scope is all GUIDs, so cross-day entropy is what our systems care about most. We do acknowledge that within a single date the chance of collision is higher, much like the entropy of a partitioned GUID v5 scheme.
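
To put rough numbers on this, the sketch below uses the standard birthday-bound approximation to compare 106 and 122 random bits. It is written in Python for illustration; the workload of 10^9 GUIDs is a hypothetical figure, not a measurement from this project.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


n = 10**9  # hypothetical number of GUIDs
within_day = collision_probability(n, 106)  # all generated on the same day
cross_day = collision_probability(n, 122)   # full 122-bit space

print(within_day, cross_day, within_day / cross_day)
```

For the same number of GUIDs, the within-day estimate is about 2^16 times the cross-day one, which matches the 16 bits given up to the date.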