From: <arn...@us...> - 2007-11-10 18:26:14
Revision: 900
          http://dcplusplus.svn.sourceforge.net/dcplusplus/?rev=900&view=rev
Author:   arnetheduck
Date:     2007-11-10 10:26:11 -0800 (Sat, 10 Nov 2007)

Log Message:
-----------
Initial version of adc with tiger support moved out of base

Modified Paths:
--------------
    dcplusplus/trunk/ADC.txt

Modified: dcplusplus/trunk/ADC.txt
===================================================================
--- dcplusplus/trunk/ADC.txt	2007-11-09 23:37:00 UTC (rev 899)
+++ dcplusplus/trunk/ADC.txt	2007-11-10 18:26:11 UTC (rev 900)
@@ -15,10 +15,9 @@
 Advanced Direct Connect is the first neutral thing that springs to mind =).
 
 Many ideas for the protocol come from Jan Vidar Krey's DCTNG draft. Other
-contributors include Dustin Brody, Walter Doekes, Timmo Stange, Fredrik
-Ullner. Jon Hess
-contributed the original Direct Connect idea through the Neo-Modus Direct
-Connect client / hub.
+contributors include Dustin Brody, Walter Doekes, Timmo Stange, Fredrik
+Ullner. Jon Hess contributed the original Direct Connect idea through the
+Neo-Modus Direct Connect client / hub.
 
 The latest draft version of this document can be downloaded from $URL$.
 
@@ -51,7 +50,9 @@
   only include viewable characters that can be encoded by one byte in the UTF-8
   encoding (Unicode codepoints 33-127). ADC is case-sensitive, requiring upper
   case.
-
+* A session hash function is negotiated for each connection. The session hash
+  function may not be changed without a complete session renegotiation.
+
 === Message syntax
 ....................
 message ::= message_body? eol
@@ -71,7 +72,7 @@
 my_sid ::= encoded_sid
 encoded_sid ::= base32_character{4}
 my_cid ::= encoded_cid
-encoded_cid ::= base32_character{39}
+encoded_cid ::= base32_character+
 base32_character ::= simple_alpha | [2-7]
 feature_name ::= simple_alpha simple_alphanum{3}
 escaped_letter ::= [^ \#x0a] | escape 's' | escape 'n' | escape escape
@@ -102,6 +103,16 @@
 U | UDP message | Clients must use this message type when communicating directly over UDP.
 ___
 
+=== Session hash
+Certain commands require the use of a hash function. The hash function used is
+negotiated each time a new connection is established using the SUP mechanism.
+When a client first connects, it offers a set of hash functions as SUP
+features. The server picks one of the offered functions and communicates the
+choice to the client by placing it before any other hash features present in
+the first SUP from the server. Clients and hubs are required to support at
+least one hash function, used both for protocol purposes and file
+identification.
+
 === Client identification
 Each client is identified by three different IDs, Session ID (SID), Private ID
 (PID) and Client ID (CID).
@@ -120,14 +131,15 @@
 hash of the current time and primary network card MAC address if sufficient
 randomness cannot be generated. Hubs and clients may not disclose PIDs to other
 clients; doing so weakens the security of the ADC network. Clients
-should should keep the same PID between sessions and hubs. PIDs are 192 bits
-and encoded using a 39 byte base32 encoded string.
+should should keep the same PID between sessions and hubs. PID length follows
+the length of the hash algorithm used for the session.
 
 ==== Client ID
 Client IDs globally and publicly identify a unique client and underlie client
-to client communication. They are generated by hashing the 192 bit, unencoded
-PID with the Tiger hash algorithm. Hubs should register clients by CID. CIDs
-are 192 bits and encoded using a 39 byte base32 encoded string.
+to client communication. They are generated by hashing the (unencoded) PID
+with the session hash algorithm. Hubs should register clients by CID. CID
+length follows the length of the hash algorithm used for the session. Clients
+must be prepared to handle CIDs of varying lengths.
 
 == Files
 === File names and structure
@@ -138,16 +150,21 @@
 properly filter the filename for the target file system, as well as request
 filenames from other clients according to these rules. The special names '.'
 and '..' may not occur as a directory or filename; any file list received
-containing those must be ignored. Shared files are identified relative to the
-unnamed root '/' ("/dir/subdir/filename.ext"), while extensions can add named
-roots to this namespace, preferably using their SUP name. "TTH/<root-base32>",
-for example, locates a file in the share by TTH root rather than filename.
+containing those must be ignored.
+
+Shared files are identified relative to the unnamed root '/'
+("/dir/subdir/filename.ext"), while extensions can add named roots to this
+namespace. "TTH/<root-base32>" from the TIGR extension for example, locates a
+file in the share by TTH root rather than filename.
+
 Rootless filenames are special, for they may not appear in the file listing,
 and can be used to supply binary transfers of arbitrary data, but should not
 be used to avoid polluting the namespace by using a named root. All directory
 names must end with a '/'. Names in the unnamed root generally find little use
-identifying files, as the TTH root does so more effectively. Hence, commands
-that get files or file data may never use the unnamed root for selecting file.
+identifying files, as the hash of a file does so more effectively. Hence,
+commands that get files or file data may never use the unnamed root for
+selecting file. It is invalid for normal files to appear in the share without
+being identified by at least one hash value.
 
 The special, rootless filename "files.xml" specifies the full file listing,
 uncompressed, in XML using the UTF-8 encoding. Clients can then compress this
@@ -164,33 +181,19 @@
 "Base" attribute of "FileListing" specifies which directory a particular file
 list represents.
 
-=== Hashes
-ADC clients must share only files hashed using Merkle Hash trees, as defined
-by http://www.open-content.net/specs/draft-jchapweske-thex-02.html. The Tiger
-algorithm, as specified by http://www.cs.technion.ac.il/~biham/Reports/Tiger/
-functions as the hash algorithm. A base segment size of 1024 bytes must be
-used when generating the tree, but clients may then discard parts of the tree
-as long as at least 7 levels are kept or a block granularity of 64 KiB is
-achieved.
+=== File list
+files.xml is the list of files intended for browsing. It has the following
+general structure:
 
-Generally, the root of the tree serves to identify a file uniquely. Searches
-use it and it must be present in the file list. Further, the root of the file
-list must also be available, and discoverable via GFI. A client may also
-request the rest of the tree using the normal client-client transfer
-procedure. The root must be encoded using base32 encoding when converted to
-text.
-
-=== File list
-Files.xml is the list of files intended for browsing. It has the following general structure:
 ----
 <?xml version="1.0" encoding="utf-8" standalone="yes"?>
 <FileListing Version="1" CID="my-cid" Generator="DC++ 0.401" Base="/">
   <Directory Name="share">
     <Directory Name="DC++ Prerelease">
-      <File Name="DCPlusPlus.pdb" Size="17648640" TTH="xxxxxxxxx"/>
-      <File Name="DCPlusPlus.exe" Size="946176" TTH="xxxxxxxxx"/>
+      <File Name="DCPlusPlus.pdb" Size="17648640" ... />
+      <File Name="DCPlusPlus.exe" Size="946176" ... />
     </Directory>
-    <File Name="ADC.txt" Size="154112" TTH="xxxxxxxxx"/>
+    <File Name="ADC.txt" Size="154112" ... />
   </Directory>
   <!-- Only used by partial lists -->
   <Directory Name="share2" Incomplete="1"/>
@@ -211,8 +214,6 @@
 "Base" is used for partial file lists, but must be present even in the
 non-partial list.
 
-"TTH" is the base32-encoded TTH root of the file.
-
 "Incomplete" signals whether a directory in a partial file list contains
 unlisted items. "1" means the directory contains unlisted items, "0" that it
 does not. Incomplete="0" is the default and may thus be omitted.
@@ -382,7 +383,7 @@
 Code | Type | Description
 ___
 ID | base32 | The CID of the client. Mandatory for C-C connections.
-PD | base32 | The PID of the client. Hubs must check that the Tiger(PID) == CID and then discard the field before broadcasting it to other clients. Must not be sent in C-C connections.
+PD | base32 | The PID of the client. Hubs must check that the hash(PID) == CID and then discard the field before broadcasting it to other clients. Must not be sent in C-C connections.
 I4 | IPv4 | IPv4 address without port. A zero address (0.0.0.0) means that the server should replace it with the real IP of the client. Hubs must check that a specified address corresponds to what the client is connecting from to avoid DoS attacks, and only allow trusted clients to specify a different address. Clients should use the zero address when connecting, but may opt not to do so at the user's discretion. Any client that supports incoming TCPv4 connections must also add the feature TCP4 to their SU field.
 I6 | IPv6 | IPv6 address without port. A zero address (::) means that the server should replace it with the IP of the client. Any client that supports incoming TCPv6 connections must also add the feature TCP6 to their SU field.
 U4 | integer | Client UDP port. Any client that supports incoming UDPv4 packets must also add the feature UDP4 to their SU field.
@@ -485,8 +486,6 @@
 SI | Size, in bytes
 SL | Slots currently available
 TO | Token
-TR | Tiger tree hash root, encoded with base32.
-TD | Tiger tree depth, index of the highest level of tree data available, root-only = 0, first level (2 leaves) = 1, second level = 2, etc…
 ___
 
 ==== CTM
@@ -520,8 +519,7 @@
 
 States: VERIFY
 
-Get Password. The data parameter is at least 24 random bytes (base32 encoded),
-used to avoid replay attacks on the password.
+Get Password. The data parameter is at least 24 random bytes (base32 encoded).
 
 ==== PAS
 PAS password
@@ -531,7 +529,7 @@
 States: VERIFY
 
 Password. The password (utf-8 encoded bytes), followed by the random data
-(binary), passed through the Tiger hash algorithm (not TTH) then converted to
+(binary), passed through the session hash algorithm then converted to
 base32. When validated, this transitions the server into NORMAL state.
 
 ==== QUI
@@ -567,24 +565,13 @@
 0 as the first byte. <bytes> may be set to -1 to indicate that the sending
 client should fill it in with the number of bytes needed to complete the file
 from <start_pos>. <type> is a [a-zA-Z0-9]+ string that specifies the namespace
-for identifier and BASE requires that clients recognize the types "file",
-"tthl" and "list". Extensions may add to the identifier names as well as add
-new types.
+for identifier and BASE requires that clients recognize the types "file" and
+"list". Extensions may add to the identifier names as well as add new types.
 
 "file" transfers transfer the file data in binary, starting at <start_pos> and
-sending <bytes> bytes. Identifier must be a TTH root value from the "TTH/"
-root.
+sending <bytes> bytes. Identifier must come from the namespace of the current
+session hash.
 
-"tthl" transfers send the largest set of leaves available) as a binary
-stream of leaf data, right-to-left, with no spacing in between them.
-<start_pos> must be set to 0 and <bytes> to -1 when requesting the data.
-<bytes> must contain the total binary size of the leaf stream in SND, and by
-dividing this length by the individual hash length, the number of leaves, and
-thus the leaf level can be deducted. The received leaves can then be used to
-reconstruct the entire tree, and the resulting root must match the root of the
-file (this verifies the integrity of the tree itself). Identifier must be a
-TTH root value from the "TTH/" root.
-
 "list" transfers are used for partial file lists and have a directory as
 identifier. <start_pos> is always 0 and <bytes> contains the uncompressed
 length of the generated XML text in the corresponding SND. An optional flag
@@ -654,6 +641,56 @@
 ___
 == Standard Extensions
 <work-in-progress>
+
+=== TIGR - Tiger tree hash support
+This extension adds tiger tree hash support to the base protocol. It is
+intended to be used both for identifying files and for protocol purposes such
+as CID generation and password negotiation
+
+==== General
+
+==== TIGR for shared files
+TIGR supporting clients must share only files hashed using Merkle Hash trees,
+as defined by http://www.open-content.net/specs/draft-jchapweske-thex-02.html.
+The Tiger algorithm, as specified by
+http://www.cs.technion.ac.il/~biham/Reports/Tiger/ functions as the hash
+algorithm. A base segment size of 1024 bytes must be used when generating the
+tree, but clients may then discard parts of the tree as long as at least 7
+levels are kept or a block granularity of 64 KiB is achieved.
+
+Generally, the root of the tree (TTH) serves to identify a file uniquely.
+Searches use it and it must be present in the file list. Further, the root of
+the file list must also be available, and discoverable via GFI. A client may
+also request the rest of the tree using the normal client-client transfer
+procedure. The root must be encoded using base32 encoding when converted to
+text.
+
+In the file list, each file carries an additional attribute "TTH" containing
+the base32-encoded value of the tiger tree root.
+
+In the GET/GFI type, the full tree may be accessed using the "tthl" type.
+
+"tthl" transfers send the largest set of leaves available) as a binary
+stream of leaf data, right-to-left, with no spacing in between them.
+<start_pos> must be set to 0 and <bytes> to -1 when requesting the data.
+<bytes> must contain the total binary size of the leaf stream in SND, and by
+dividing this length by the individual hash length, the number of leaves, and
+thus the leaf level can be deducted. The received leaves can then be used to
+reconstruct the entire tree, and the resulting root must match the root of the
+file (this verifies the integrity of the tree itself). Identifier must be a
+TTH root value from the "TTH/" root.
+
+In the GET/GFI namespace, files are identified by
+"TTH/<base32-encoded tree root>".
+
+In searches and GFI, the following attributes are added:
+
+[separator="|",grid="all",frame="all"]
+``_
+TR | Tiger tree hash root, encoded with base32.
+TD | Tree depth, index of the highest level of tree data available, root-only = 0, first level (2 leaves) = 1, second level = 2, etc...
+___
+
 === BZIP – File list compressed with bzip2
 This extension adds a special file "files.xml.bz2" in the unnamed root of the
 share which contains "files.xml" compressed with bzip2 1.0.3+ (www.bzip.org).
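
For readers of this revision, here is a rough Python sketch of the session-hash rules the diff introduces: CID = hash(PID), the PAS reply, and deriving the leaf count from a "tthl" stream size. It is an illustration only, not code from DC++. All function names are hypothetical. Python's hashlib has no Tiger, so SHA-256 stands in for the session hash. Unpadded base32 is an assumption inferred from the removed "39 byte" wording (24 bytes encode to 39 characters only without padding). Rounding the tree depth up to the next power of two is likewise an assumption.

```python
import base64
import hashlib
from typing import Callable, Tuple

HashFn = Callable[[bytes], bytes]


def sha256_stand_in(data: bytes) -> bytes:
    """Stand-in session hash. A TIGR session would use Tiger (24-byte digests)."""
    return hashlib.sha256(data).digest()


def b32_encode(raw: bytes) -> str:
    """Base32 text with '=' padding stripped (assumed wire form)."""
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def b32_decode(text: str) -> bytes:
    """Inverse of b32_encode: restore padding to a multiple of 8 characters."""
    return base64.b32decode(text + "=" * (-len(text) % 8))


def cid_from_pid(pid: bytes, session_hash: HashFn) -> bytes:
    """CID is the session hash of the unencoded PID."""
    return session_hash(pid)


def hub_accepts_identity(pd_field: str, id_field: str, session_hash: HashFn) -> bool:
    """Hub-side check from the PD field description: hash(PID) == CID."""
    return cid_from_pid(b32_decode(pd_field), session_hash) == b32_decode(id_field)


def pas_response(password: str, gpa_data: str, session_hash: HashFn) -> str:
    """PAS value: hash(utf-8 password bytes + binary GPA data), base32 encoded."""
    digest = session_hash(password.encode("utf-8") + b32_decode(gpa_data))
    return b32_encode(digest)


def tthl_leaf_info(stream_bytes: int, hash_len: int = 24) -> Tuple[int, int]:
    """Return (leaf count, tree depth) for a 'tthl' stream of the given size.

    hash_len defaults to 24 bytes, the size of a 192-bit Tiger digest.
    Depth follows the TD convention: root-only = 0, 2 leaves = 1, and so on.
    """
    if stream_bytes <= 0 or stream_bytes % hash_len != 0:
        raise ValueError("stream size must be a positive multiple of the hash length")
    leaves = stream_bytes // hash_len
    return leaves, (leaves - 1).bit_length()
```

Passing the hash function as a parameter mirrors the point of the commit: nothing outside the TIGR extension should hard-code Tiger or a 39-character identifier length.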