The P2B format was created to significantly reduce the bandwidth needed to transfer PeerGuardian lists. It typically produces files about 50% smaller than the equivalent P2P text list. Because it is a binary format that cannot easily be modified without specialized software, it is recommended only for transfers where bandwidth is a concern.
This page describes the binary list formats used by PeerGuardian 2 (Windows), should you wish to develop an application that uses the same lists.
An eight-byte header exists at the start of every P2B list to identify the file and to tell a parser which version it is.
Type | Description |
---|---|
int32 | Always -1 (0xFFFFFFFF) |
char[3] | Magic Number. Always 'P2B' |
uint8 | Version Number. Can currently be 1, 2, or 3. |
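As an illustration, the header check could be sketched in Python as follows. The function name `parse_p2b_header` is hypothetical and not part of any PeerGuardian library; the field layout comes from the table above.

```python
import struct

# int32 marker, char[3] magic, uint8 version: 8 bytes with no padding.
HEADER = struct.Struct("!i3sB")

def parse_p2b_header(data: bytes) -> int:
    """Return the P2B version number, or raise ValueError (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for a P2B header")
    marker, magic, version = HEADER.unpack_from(data, 0)
    if marker != -1 or magic != b"P2B":
        raise ValueError("not a P2B file")
    return version
```

A caller would typically read the version first and then dispatch to a version-specific parser.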
P2Bv1 can be thought of as a direct binary mapping of the original P2P format. Directly after the header comes a series of IP ranges:
Type | Description |
---|---|
string | Range label, a zero-terminated C string encoded in ISO-8859-1. |
uint32 | Starting IP, in network byte order. |
uint32 | Ending IP, in network byte order. |
P2Bv2 is identical in format to version 1, except all strings are encoded in UTF-8 for better internationalization.
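A minimal Python sketch of reading P2Bv1 and P2Bv2 ranges, assuming the whole file is already in memory and the header has been validated. The function name is hypothetical; only the label encoding differs between the two versions.

```python
import struct

def parse_p2b_v1_v2(data: bytes, version: int):
    """Yield (label, start_ip, end_ip) tuples from a P2Bv1 or P2Bv2 list."""
    encoding = "iso-8859-1" if version == 1 else "utf-8"
    pos = 8  # skip the eight-byte header
    while pos < len(data):
        end = data.index(b"\x00", pos)  # find the label terminator
        label = data[pos:end].decode(encoding)
        # Two uint32 values in network byte order follow the label.
        start_ip, end_ip = struct.unpack_from("!II", data, end + 1)
        yield label, start_ip, end_ip
        pos = end + 1 + 8
```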
P2Bv3 was designed around the observation that many ranges share the same label. It stores each label once and has ranges refer to labels by index, which in many cases produces smaller lists than version 2 that also compress a little better.
Type | Description |
---|---|
uint32 | The number of labels that follow, in network byte order. |
n strings | n zero-terminated C strings which define the range labels. All strings are encoded in UTF-8. |
uint32 | The number of ranges that follow, in network byte order. |
n ranges | The IP ranges. |
Each range has the following layout:
Type | Description |
---|---|
uint32 | The index of the label associated with this range, in network byte order. |
uint32 | Starting IP, in network byte order. |
uint32 | Ending IP, in network byte order. |
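The P2Bv3 layout could be read along these lines. This is a sketch under the assumption that the data is fully in memory and well formed; the function name is hypothetical.

```python
import struct

def parse_p2b_v3(data: bytes):
    """Return a list of (label, start_ip, end_ip) tuples from a P2Bv3 list."""
    pos = 8  # skip the eight-byte header
    (label_count,) = struct.unpack_from("!I", data, pos)
    pos += 4
    labels = []
    for _ in range(label_count):
        end = data.index(b"\x00", pos)
        labels.append(data[pos:end].decode("utf-8"))
        pos = end + 1
    (range_count,) = struct.unpack_from("!I", data, pos)
    pos += 4
    ranges = []
    for _ in range(range_count):
        # Label index, starting IP, ending IP: three uint32 values.
        index, start_ip, end_ip = struct.unpack_from("!III", data, pos)
        ranges.append((labels[index], start_ip, end_ip))
        pos += 12
    return ranges
```

Because each range carries only a 4-byte index, two ranges with the same label cost 24 bytes plus a single copy of the label text.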
Warning: P2Bv4 development is not yet finished.
P2Bv4 is being designed to address the issue of IPv6. In addition to IPv6, it allows a variable number of fields for each range, metadata about the list, and support for downloading only changes instead of entire lists when updating. P2Bv4 should also be smaller due to support for CIDR notation and a dynamic range format. P2Bv4 also carries a built-in cryptographic signature for verification.
Type | Description |
---|---|
n chunks | n chunks. |
uint8 | Always 0, indicating the end of chunks. |
64 bytes | A SHA-512 hash of all the previous data in the file. |
uint32 | The size of the following digital signature, in network byte order. 0 is a valid size if no digital signature is associated with this file. |
n bytes | An ECDSA-521 signature of the previous hash. |
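The hash portion of this trailer could be checked roughly as follows. This sketch assumes that "all the previous data" means every byte before the hash, including the file header and the terminating 0, and that the caller already knows the offset of the hash. The function name is hypothetical, and signature verification is left out because it needs a cryptography library and key details the format description does not give.

```python
import hashlib
import struct

def check_p2b_v4_trailer(data: bytes, hash_pos: int):
    """Return (hash_ok, signature) for a P2Bv4 file held in memory.

    hash_pos is the offset of the 64-byte SHA-512 hash, i.e. the byte
    right after the 0 that terminates the chunk list.
    """
    stored = data[hash_pos:hash_pos + 64]
    # Assumption: the hash covers every byte that precedes it.
    computed = hashlib.sha512(data[:hash_pos]).digest()
    (sig_size,) = struct.unpack_from("!I", data, hash_pos + 64)
    signature = data[hash_pos + 68:hash_pos + 68 + sig_size]
    if len(signature) != sig_size:
        raise ValueError("truncated signature")
    return stored == computed, signature
```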
P2Bv4 is broken down into chunks. Applications can safely ignore chunks they don't recognize. A chunk type may appear more than once. Each chunk begins with the following header:
Type | Description |
---|---|
uint8 | The type of chunk this is. Currently this can be: 1. Metadata; 2. Update information; 3. Strings; 4. Diff IPs; 5. IP ranges. |
uint32 | The size of the chunk, including this header. |
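Since every chunk carries its own size, a reader can step over chunk types it does not understand. The sketch below assumes the size field is in network byte order like the other integers in the format (the table does not say so) and that chunks start right after the eight-byte file header; the function name is hypothetical.

```python
import struct

def read_p2b_v4_chunks(data: bytes, pos: int = 8):
    """Return (chunks, trailer_pos) where chunks is a list of (type, payload)."""
    chunks = []
    while True:
        chunk_type = data[pos]
        if chunk_type == 0:  # end-of-chunks marker
            return chunks, pos + 1
        # Assumed network byte order; the size includes the 5-byte chunk header.
        (size,) = struct.unpack_from("!I", data, pos + 1)
        if size < 5:
            raise ValueError("chunk size smaller than its header")
        chunks.append((chunk_type, data[pos + 5:pos + size]))
        pos += size
```

A consumer can then handle the chunk types it knows and skip the rest, for example `for kind, payload in chunks: ...`.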
A metadata chunk has the following layout:
Type | Description |
---|---|
n*2 strings | n*2 zero-terminated C strings, which are n key/value pairs that define metadata about the list. These can be the application which made it, the date it was made, etc. All strings are encoded in UTF-8. |
uint8 | Always 0, an empty string designating the end of the list metadata. |
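Reading the key/value pairs could look like the sketch below. It takes the chunk payload (the bytes after the chunk header) as input. The function name and the example keys `app` and `date` are hypothetical; the format description does not define any specific keys.

```python
def parse_p2b_v4_metadata(payload: bytes) -> dict:
    """Return the metadata key/value pairs from a metadata chunk payload."""
    strings = []
    pos = 0
    while True:
        end = payload.index(b"\x00", pos)
        if end == pos:  # empty string: end of the metadata list
            break
        strings.append(payload[pos:end].decode("utf-8"))
        pos = end + 1
    if len(strings) % 2:
        raise ValueError("metadata key without a value")
    # Even positions are keys, odd positions are their values.
    return dict(zip(strings[0::2], strings[1::2]))
```

Note that with this layout an empty value would be indistinguishable from the terminator, so this sketch treats any empty string as the end of the list.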
An update information chunk has the following layout:
Type | Description |
---|---|
uint16 | Minimum interval, in minutes, at which applications should auto-update the list. If 0, the applications should supply a reasonable value. |
uint64 | A UNIX timestamp of when this list was created. |
string | A URL to be used for pulling down updates. |
uint32 | A counter value for diffs. If 0, this list is not diffable. Any other value should be used when updating lists. Specifically, if the counter value is 10 and the URL is http://phoenixlabs.org/lists/p2p.7z, updaters should attempt to download diffs at p2p.11.7z, p2p.12.7z, p2p.13.7z, and so on, until they receive a 404. |
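The diff naming scheme can be sketched as a generator of candidate URLs. This assumes the counter is inserted just before the final file extension, which matches the p2p.7z example but is not spelled out for other file names; the function name is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def diff_urls(base_url: str, counter: int):
    """Yield candidate diff URLs for counter + 1, counter + 2, and so on."""
    if counter == 0:
        return  # a counter of 0 means the list is not diffable
    parts = urlsplit(base_url)
    stem, ext = posixpath.splitext(parts.path)
    n = counter + 1
    while True:
        # Assumption: the counter goes between the stem and the extension.
        yield urlunsplit(parts._replace(path=f"{stem}.{n}{ext}"))
        n += 1
```

An updater would request each URL in turn and stop at the first one that returns a 404.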
A strings chunk has the following layout:
Type | Description |
---|---|
n strings | n zero-terminated C strings which can be used for labels or any other range fields. All strings are encoded in UTF-8. |
uint8 | Always 0, an empty string designating the end of strings. |
Note that if there are multiple string chunks, they are cumulative for indexes.
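One way to honor cumulative indexing is to keep a single string table and append each strings chunk to it as it is read. The sketch below takes a chunk payload (the bytes after the chunk header); the function name is hypothetical.

```python
def append_strings_chunk(table: list, payload: bytes) -> None:
    """Append the strings of one strings chunk to a shared table."""
    pos = 0
    while True:
        end = payload.index(b"\x00", pos)
        if end == pos:  # empty string: end of this chunk's strings
            return
        table.append(payload[pos:end].decode("utf-8"))
        pos = end + 1
```

With chunks holding `A`, `B` and then `C`, the string `C` ends up at index 2 of the shared table rather than index 0.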
A diff IPs chunk has the following layout:
Type | Description |
---|---|
uint8 | The type of IP this holds. Currently this can be: IPv4 IPs or IPv6 IPs. |
uint32 | The number of IPs that follow, in network byte order. |
n IPs | IP addresses to remove from the previous version, in network byte order. Only the starting IP of ranges needs to be specified. |
The diff IPs chunk is only used in diffs. It specifies the starting IPs of ranges to remove from the previous version of the list.
An IP ranges chunk has the following layout:
Type | Description |
---|---|
n range descriptors | n descriptors specifying which fields each range contains and in what order they appear. |
uint8 | Always 0, a byte designating the end of the range metadata. |
uint32 | The number of ranges that follow, in network byte order. |
n ranges | The ranges. If this is a diff, this consists of the ranges added or changed since the last version. |
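Applying a diff then amounts to dropping the ranges named in the diff IPs chunk and merging in the ranges from the IP ranges chunk. The sketch below assumes removals are applied before additions and that ranges are keyed by their starting IP; neither detail is stated in the format description, and the function name is hypothetical.

```python
def apply_p2b_diff(ranges: dict, removed_starts, added_ranges) -> dict:
    """Update ranges in place; ranges maps start_ip -> (end_ip, label)."""
    # Assumption: removals come first, keyed by starting IP only.
    for start in removed_starts:
        ranges.pop(start, None)
    # Added or changed ranges overwrite any entry with the same starting IP.
    for start, end, label in added_ranges:
        ranges[start] = (end, label)
    return ranges
```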
Each range descriptor has the following layout:
Type | Description |
---|---|
uint8 | The type of value this range field holds. Can currently be: uint8; uint16, uint32, uint64, or uint128 (each in network byte order); a zero-terminated C string in UTF-8; or a blob (a uint32 length prefix in network byte order, followed by the blob data). |
uint32 | The size of the field in bytes, in network byte order. Should be 0 for variable-sized types like strings and blobs. |
string | A short zero-terminated ASCII string describing what this range field contains. Currently, only the following are recognized: label (if an integer is used, it is a string index; otherwise it must be a string); startaddr (the starting address of an IP range; must be a uint32 for IPv4 or a uint128 for IPv6); endaddr (the ending address of an IP range; must be a uint32 for IPv4 or a uint128 for IPv6; cannot be used with cidrbits); cidrbits (the CIDR bitmask length; must be an integer; cannot be used with endaddr). |
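When a range uses cidrbits instead of endaddr, a reader can derive the ending address itself. The sketch below assumes cidrbits is the usual CIDR prefix length and that startaddr is the first address of the block; the function name is hypothetical.

```python
def cidr_end_address(start: int, cidrbits: int, addr_bits: int = 32) -> int:
    """Return the last address of the block; use addr_bits=128 for IPv6."""
    if not 0 <= cidrbits <= addr_bits:
        raise ValueError("cidrbits out of range")
    # Set every host bit (the bits not covered by the prefix) to 1.
    host_mask = (1 << (addr_bits - cidrbits)) - 1
    return start | host_mask
```

For example, a startaddr of 192.168.0.0 (0xC0A80000) with cidrbits 16 gives an ending address of 192.168.255.255 (0xC0A8FFFF).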
A cross-platform C++ library for working with the P2B format is freely available here.
In the future, when bandwidth is cheaper and the information is deemed useful, you may wish to include range metadata (category, geographical region, description, etc.) within a P2B list. If you wish to do so, please coordinate with me (phrosty@gmail.com) to ensure it is done compatibly.
PeerGuardian Text Lists (P2P) Format
eMule Text Lists (DAT) Format
Based upon the old PhoenixLabs Wiki page