From: Michael F. <fr...@gm...> - 2010-06-04 08:42:17
Great, didn't realise the hashes in the database were used for caching only.

Michael

On Fri, Jun 4, 2010 at 10:36 AM, Rob Vesse <rv...@vd...> wrote:
> Hi Michael
>
> All the hash codes for Nodes and Triples in dotNetRDF (and most other
> internal objects which need their own hash codes) are based upon using
> GetHashCode() on a string representation of the object in question. Yes,
> hash codes are not stable across .Net versions and platform architectures,
> but we do not require them to be, since the vast majority of the time these
> hash codes are only being used in memory for the purpose of storage and
> lookup in various hash code based data structures and algorithms. Hash
> codes just need to be fast and efficient to compute, which the .Net string
> class's GetHashCode() implementation is regardless of .Net
> version/architecture, since we want to compute the hash codes once and then
> store the value and just return it when needed.
>
> While we could potentially design and implement our own hash code algorithm
> for these things, there is no point reinventing the wheel, and the effort of
> finding/designing an algorithm which generated codes with sufficient
> uniqueness and efficiency would almost certainly far outweigh the potential
> benefit.
>
> With regards to their use for database identity, this is a pragmatic design
> decision which makes a trade-off between read/write speed and data
> instability. Hash codes are used as part of database identity only for our
> own SQL based stores, simply because it makes a significant difference in
> speed and most of the time you'll create and access your data on the same
> architecture, so hash code instability won't be an issue. Since it is a
> potential issue, the database code is all designed to take account of this
> hash code instability and work around it automatically and seamlessly.
> Actual database identity is based on numeric identifiers and the hash codes
> are only used as a means to speed up and cache conversions between Nodes and
> database IDs.
>
> Regards,
>
> Rob Vesse
>
> ________________________________
> From: Michael Friis <fr...@gm...>
> Sent: 03 June 2010 16:05
> To: dot...@li...
> Subject: [dotNetRDF-develop] Use of string.GetHashCode() for database
> identety
>
> As far as I can determine from the code in e.g. UriNode.cs, dotNetRDF
> uses string.GetHashCode() for database identity. This is bad design
> because the string hash codes are not stable across .Net versions nor
> across architectures:
> http://stackoverflow.com/questions/2099998/hash-quality-and-stability-of-string-gethashcode-in-net
>
> Regards
> Michael

--
http://friism.com
(+45) 27122799
Sapere aude
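
The caching role described in this thread can be sketched roughly as follows. This is a minimal illustration in Python, not dotNetRDF code: the names NodeIdCache, FakeStore and fetch_id are hypothetical, and the dictionary-backed store merely stands in for a SQL table that assigns numeric IDs. The point it tries to show is that the hash code only selects a bucket, so an unstable hash function changes how fast a lookup is, not which ID comes back.

```python
from typing import Callable, Dict, List, Tuple


class NodeIdCache:
    """In-memory cache from node strings to numeric database IDs.

    The hash code only picks a bucket. The node string is compared inside
    the bucket, and the numeric ID from the store is the actual identity.
    """

    def __init__(self, fetch_id: Callable[[str], int],
                 hash_fn: Callable[[str], int] = hash) -> None:
        self._fetch_id = fetch_id  # authoritative lookup (hypothetical)
        self._hash_fn = hash_fn    # may differ between processes/platforms
        self._buckets: Dict[int, List[Tuple[str, int]]] = {}
        self.misses = 0

    def get_id(self, node: str) -> int:
        bucket = self._buckets.setdefault(self._hash_fn(node), [])
        for cached_node, node_id in bucket:
            if cached_node == node:  # resolve hash collisions by value
                return node_id
        # Cache miss: ask the store, then remember the answer.
        self.misses += 1
        node_id = self._fetch_id(node)
        bucket.append((node, node_id))
        return node_id


class FakeStore:
    """Stand-in for the SQL store: assigns sequential numeric IDs."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def fetch_id(self, node: str) -> int:
        if node not in self._ids:
            self._ids[node] = len(self._ids) + 1
        return self._ids[node]
```

Two caches built over the same store with different hash functions (say, the built-in hash and len) return the same IDs for the same nodes. Under the assumptions above, that is why nothing breaks when the hash values change between runs or machines: the cache is simply rebuilt from the numeric IDs.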