Checksum library

2009-02-13
2013-05-01
  • Green Koopa
    2009-02-13

    I am searching for a library that calculates a variety of checksums.  My search has so far been surprisingly unsuccessful.  Code within this project has been the only option I have really been impressed with.  I know this is a somewhat out-of-place question from a n00b (to the open source community, not in-house programming), so feel free to ignore (or delete).

    Is there really no library out there that is:
    > Written in standard C or C++
    > Has a common interface to a variety of checksum algorithms
    --> A few standard cryptographic hashes such as MD5 and SHA
    --> Basic checksums such as a simple modulo sum, CRC, Fletcher, and Adler
    > Supports an Initialize, Step/Update, Finalize model
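
    For illustration, the Initialize, Step/Update, Finalize model in the last item could look like the following minimal C sketch for Adler-32.  The names adler32_ctx, adler32_init, adler32_update, and adler32_final are hypothetical and are not taken from this project or from any existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not an existing library's API. */
typedef struct {
    uint32_t a;  /* running sum of the bytes, starts at 1 */
    uint32_t b;  /* running sum of the successive a values, starts at 0 */
} adler32_ctx;

#define ADLER_MOD 65521u  /* largest prime below 2^16 */

/* Initialize: put the context into its starting state. */
void adler32_init(adler32_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->a = 1;
    ctx->b = 0;
}

/* Step/Update: fold another chunk of data into the running state. */
void adler32_update(adler32_ctx *ctx, const void *data, size_t len)
{
    const unsigned char *p = data;
    assert(ctx != NULL);
    for (size_t i = 0; i < len; i++) {
        ctx->a = (ctx->a + p[i]) % ADLER_MOD;
        ctx->b = (ctx->b + ctx->a) % ADLER_MOD;
    }
}

/* Finalize: combine the two sums into the 32-bit checksum. */
uint32_t adler32_final(const adler32_ctx *ctx)
{
    assert(ctx != NULL);
    return (ctx->b << 16) | ctx->a;
}
```

    Feeding "Wiki" and then "pedia" through adler32_update gives 0x11E60398, the same value as a single call with "Wikipedia", which is the point of the streaming model.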

    I understand that this project doesn't support all of these algorithms and is not intended to be a library.  It was its clean architecture/interface that caught my attention.

    So, groping in the dark, here are my questions:
    > What was your experience searching when you needed this functionality?
    > Was your checksum code adapted from somewhere, or developed yourself to fill this void?
    > Is 'library' the word I should be searching for?
    > Does GPL allow me to use your code for my own use or to create such a library?

    Thanks in advance,
    Chris

     
    • Tom Bramer
      2009-02-14

      > Is there really no library out there that is:
      > > Written in standard C or C++
      > > Has a common interface to a variety of checksum algorithms
      > --> A few standard cryptographic hashes such as MD5 and SHA
      > --> Basic checksums such as a simple modulo sum, CRC, Fletcher, and Adler
      > > Supports an Initialize, Step/Update, Finalize model

      Botan is a library that appears to fit these characteristics to a certain degree, at least in terms of being written in C++, having a standard interface for checksum/hash algorithms, and supporting many commonly used cryptographic hash functions.  It seems to be fairly comprehensive (which may or may not be desirable, depending on the situation).

      > I understand that this project doesn't support all of these algorithms and is not intended to be a
      > library. It was its clean architecture/interface that caught my attention.

      It's definitely not meant to be a standalone library, but a collection of plugins to be utilized by the programs that come with the FV++ releases.  All of the main components (fv.exe, fvc.exe, winfvc.exe, fvshell.dll) reference these modules at runtime to utilize or simply discover what algorithms are available to be used, which means that new algorithms can be supported without changing (or even recompiling) the source.
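
      As a rough illustration of runtime discovery (not the actual FV++ plugin interface), the sketch below keeps a table of algorithm descriptors that a caller can enumerate or search by name.  A static table stands in for loading plugin DLLs, and every name in it (csum_state, csum_algo, csum_count, csum_at, csum_find) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shared state; real algorithms would need their own contexts. */
typedef struct { uint32_t s1, s2; } csum_state;

/* One descriptor per algorithm: a name plus the three model functions. */
typedef struct {
    const char *name;
    void     (*init)(csum_state *st);
    void     (*update)(csum_state *st, const unsigned char *p, size_t n);
    uint32_t (*final)(const csum_state *st);
} csum_algo;

/* Simple modulo sum: all bytes added together, modulo 256. */
static void sum8_init(csum_state *st) { st->s1 = 0; st->s2 = 0; }
static void sum8_update(csum_state *st, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        st->s1 = (st->s1 + p[i]) & 0xFFu;
}
static uint32_t sum8_final(const csum_state *st) { return st->s1; }

/* Fletcher-16: two running sums, each modulo 255. */
static void fletcher16_init(csum_state *st) { st->s1 = 0; st->s2 = 0; }
static void fletcher16_update(csum_state *st, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        st->s1 = (st->s1 + p[i]) % 255u;
        st->s2 = (st->s2 + st->s1) % 255u;
    }
}
static uint32_t fletcher16_final(const csum_state *st)
{
    return (st->s2 << 8) | st->s1;
}

/* Adding an algorithm means adding a row; callers do not change. */
static const csum_algo registry[] = {
    { "sum8",       sum8_init,       sum8_update,       sum8_final },
    { "fletcher16", fletcher16_init, fletcher16_update, fletcher16_final },
};

/* Discovery: how many algorithms exist, and what each one is. */
size_t csum_count(void) { return sizeof registry / sizeof registry[0]; }

const csum_algo *csum_at(size_t i)
{
    return i < csum_count() ? &registry[i] : NULL;
}

/* Lookup by name; returns NULL when the algorithm is not available. */
const csum_algo *csum_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < csum_count(); i++)
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}
```

      A tool built this way only ever calls through the descriptor, so it never needs to know which algorithms were compiled in.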

      > Is 'library' the word I should be searching for?
      It depends on what one is trying to accomplish.  I could have written plugins that simply reference algorithms in an existing library (like Botan, Crypto++, or others), but doing so would have resulted in much larger plugin binaries or reliance on a large shared library for the plugins, neither of which matched my design goals.

      > > What was your experience searching when you needed this functionality?
      > > Was your checksum code adapted from somewhere, or developed yourself to fill this void?

      The code was written using the specifications of the algorithms and public domain reference implementations as a guide.  The algorithms are not heavily optimized, but for the most part the limiting factor is reading the data to be hashed: the input is files on disk, and disk I/O is slow relative to most of the implementations.
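
      To make the I/O-bound loop concrete, here is a minimal sketch (not FV++ code) that reads a stream in fixed-size chunks and feeds each chunk to a checksum.  The function name sum8_stream and the 32,768-byte chunk size are assumptions made for the example, and a simple modulo-256 byte sum stands in for a real hash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { CHUNK_SIZE = 32768 };  /* arbitrary chunk size chosen for this sketch */

/* Hypothetical helper: sums every byte of fp modulo 256.
   Returns 0 and stores the sum in *out on success, -1 on a read error.
   The static buffer keeps the stack small but makes this non-reentrant. */
int sum8_stream(FILE *fp, uint8_t *out)
{
    static unsigned char buf[CHUNK_SIZE];
    uint8_t sum = 0;
    size_t n;

    assert(fp != NULL && out != NULL);

    /* Each fread is where a disk-backed stream would spend most of its time. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++)
            sum = (uint8_t)(sum + buf[i]);
    }
    if (ferror(fp))
        return -1;

    *out = sum;
    return 0;
}
```

      The per-chunk work is a tight loop over memory, so for a cheap checksum the time spent inside fread would be expected to dominate on ordinary disks.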

      > > Does GPL allow me to use your code for my own use or to create such a library?
      Yes, it does, provided that the usage is in compliance with the GPL, which in effect means that any derivative works of a GPL'd program must be subsequently licensed under the GPL if redistributed.  There are many resources that explain the ins and outs of this particular license.

      There is a FAQ section on the GPL here, which should hopefully answer any questions you might have about it:
      http://www.gnu.org/licenses/gpl-faq.html

       
    • Green Koopa
      2009-02-14

      > > Is there really no library out there that is:
      > > > Written in standard C or C++
      > > > Has a common interface to a variety of checksum algorithms
      > > --> A few standard cryptographic hashes such as MD5 and SHA
      > > --> Basic checksums such as a simple modulo sum, CRC, Fletcher, and Adler
      > > > Supports an Initialize, Step/Update, Finalize model

      > Botan is a library that appears to fit these characteristics to a certain
      > degree, at least in terms of being written in C++, having a standard
      > interface for checksum/hash algorithms, and supporting many commonly used
      > cryptographic hash functions. It seems to be fairly comprehensive (which
      > may or may not be desirable, depending on the situation).

      I did briefly consider Crypto++ and others, but I found their
      comprehensiveness undesirable.  I had overlooked Botan, so thank you for that
      lead.  However, it also appears to focus on cryptography.  This makes them
      all heavy and complex for my needs.

      On one end I need a choice of thorough hashes, but I don't need them to be
      secure against a malicious attack (and certainly don't need encryption).  On
      the other end I do desire speed, but less robust checksums such as Adler will
      probably be sufficient in those cases.

      > It's definitely not meant to be a standalone library, but a collection of
      > plugins to be utilized by the programs that come with the FV++ releases.
      > All of the main components (fv.exe, fvc.exe, winfvc.exe, fvshell.dll)
      > reference these modules at runtime to utilize or simply discover what
      > algorithms are available to be used, which means that new algorithms can be
      > supported without changing (or even recompiling) the source.

      That separation and clear interface that serves your needs is exactly what
      makes it library-like.  It also tipped me off to the likelihood of clean code
      inside.  It saddens me how many great programs are impossible to expand into
      the future and are useless for reuse or to serve as an example for learning.

      I guess I'm just now realizing that you intended this separation to go both
      ways.  I initially saw the plug-in DLLs as a way to expand your collection of
      supported hash algorithms.  Now I see that it is also a way to expand your
      collection of tools that utilize these algorithms.  That is my need exactly!
      I may only need your DLLs to start.

      > There is a FAQ section on the GPL here, which should hopefully answer any
      > questions you might have about it:
      > http://www.gnu.org/licenses/gpl-faq.html

      Thank you!  I'll read this and then look for a broader comparative analysis
      of open source licenses.  I honestly doubt I will write anything of
      redistributable quality, but I am curious all the same.

      -Chris

       
      • Tom Bramer
        2009-02-15

        If you do use the modules, it is recommended that you use the latest SVN source for those, as an issue has been discovered with some of them (SHA, RMD, and WHIRLPOOL). 

        The issue does not affect the results when they are utilized in the same way that FileVerifier++ uses them (using a buffer size of 32,768 bytes), but may be an issue for anything that uses a buffer size that is not a multiple of 64 (or 128 for SHA-384, SHA-512, and WHIRLPOOL variants).
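
        To illustrate why the chunk size should not matter (this is a toy sketch, not the FV++ code and not the fix that was applied), a block-based update function can buffer any partial block until a full 64 bytes are available.  toy_compress is a made-up stand-in for a real compression function, the padding in toy_final is simplified, and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_SIZE = 64 };

typedef struct {
    uint32_t      state;            /* toy chaining value */
    unsigned char buf[BLOCK_SIZE];  /* bytes that do not yet form a full block */
    size_t        used;             /* number of valid bytes in buf */
} toy_ctx;

/* Toy mixing over exactly one block (NOT SHA, RIPEMD, or Whirlpool).
   The extra per-block step makes the result depend on block boundaries. */
static void toy_compress(toy_ctx *ctx, const unsigned char *block)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        ctx->state = (ctx->state ^ block[i]) * 16777619u;
    ctx->state ^= ctx->state >> 15;
}

void toy_init(toy_ctx *ctx)
{
    ctx->state = 2166136261u;
    ctx->used = 0;
}

void toy_update(toy_ctx *ctx, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(ctx->used < BLOCK_SIZE);
    if (len == 0)
        return;

    /* Top up a partially filled buffer before touching new blocks. */
    if (ctx->used > 0) {
        size_t room = BLOCK_SIZE - ctx->used;
        size_t take = len < room ? len : room;
        memcpy(ctx->buf + ctx->used, p, take);
        ctx->used += take;
        p += take;
        len -= take;
        if (ctx->used < BLOCK_SIZE)
            return;                 /* block still incomplete */
        toy_compress(ctx, ctx->buf);
        ctx->used = 0;
    }

    /* Consume whole blocks directly from the caller's data. */
    while (len >= BLOCK_SIZE) {
        toy_compress(ctx, p);
        p += BLOCK_SIZE;
        len -= BLOCK_SIZE;
    }

    /* Keep the tail for the next call. */
    if (len > 0)
        memcpy(ctx->buf, p, len);
    ctx->used = len;
}

uint32_t toy_final(toy_ctx *ctx)
{
    /* Simplified padding: a 0x80 marker followed by zeros.
       Real algorithms define their own padding and length encoding. */
    ctx->buf[ctx->used] = 0x80;
    memset(ctx->buf + ctx->used + 1, 0, BLOCK_SIZE - ctx->used - 1);
    toy_compress(ctx, ctx->buf);
    return ctx->state;
}
```

        With the buffering in place, hashing 200 bytes in one call and hashing the same bytes 7 at a time produce the same value.  An update function that ran the compression step on whatever partial chunk it was handed would only agree when every chunk was a multiple of the block size.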

         
        • Green Koopa
          2009-02-15

          > > There is a FAQ section on the GPL here, which should hopefully answer any
          > > questions you might have about it:
          > > http://www.gnu.org/licenses/gpl-faq.html
          >
          > Thank you! I'll read this and then look for a broader comparative analysis
          > of open source licenses. I honestly doubt I will write anything of
          > redistributable quality, but I am curious all the same.

          Copious license information, including definitions and comparative analysis,
          is well presented there.  Thank you.

          > If you do use the modules, it is recommended that you use the latest SVN
          > source for those, as an issue has been discovered with some of them (SHA,
          > RMD, and WHIRLPOOL). 
          >
          > The issue does not affect the results when they are utilized in the same
          > way that FileVerifier++ uses them (using a buffer size of 32,768 bytes),
          > but may be an issue for anything that uses a buffer size that is not a
          > multiple of 64 (or 128 for SHA-384, SHA-512, and WHIRLPOOL variants).

          Thanks for the heads up.

          -Chris