#38 Support for CBV files

Future Release
closed
None
5
2015-06-03
2014-02-07
No

Are you intending to support bundled ChessBase files (CBV)?

As usual, there do not seem to be any details of the ChessBase compression algorithm. I have some files here if you want them.

In fact, they may not be compressed at all but just concatenated cbh, cbg, etc. files; I can see plain text when I edit the file.

Discussion

  • Gregor Cramer

    Gregor Cramer - 2014-02-07
    • status: open --> closed
    • assigned_to: Gregor Cramer
     
  • Gregor Cramer

    Gregor Cramer - 2014-02-07

    I know, this would be a nice feature. Unfortunately, I think there is no way: I've already taken a look at this file, and I cannot see the structure. I did some calculations on the file sizes, and ChessBase must use a form of compression for some of the files, because the size of the cbv is much smaller than the sum of all the individual files. If you take a closer look you will see that the .ini file is partly compressed (at least in my examples). If someone actually manages to decode the format, I will integrate this.

    ChessBase is quite unfriendly to its customers. Someone who buys a database from ChessBase cannot use a tool of their own choice. Scidb goes the opposite way: anybody can decode .scv files. The format is open, its description is in the help pages, and it is very simple, although it uses zlib compression. So use .scv if you want to ship or archive databases; this format is able to bundle any database format.

     
  • antoyo

    antoyo - 2015-03-28

    I am working on this issue.
    You can see my current work here:
    https://github.com/antoyo/uncbv

    It works on some .cbv files, but not on all.
    I think that some files in bigger archives are compressed twice (with two algorithms), so I need to figure out the other compression algorithm.
    Any help will be appreciated.

     

    Last edit: antoyo 2015-03-28
  • Gregor Cramer

    Gregor Cramer - 2015-03-28
    • status: closed --> pending
     
  • Gregor Cramer

    Gregor Cramer - 2015-03-28

    Many thanks for your info; unzipping cbv archives is of significant interest. I'm a very experienced software developer, but figuring out compression algorithms, or unraveling decoded data, is outside my scope; most of the encoder stuff for the ChessBase format is based on the spadework of other developers.

    It works on some .cbv files, but not on all.
    

    It's a start.

    so I need to figure out the other compression algorithm
    

    Good luck! It would be very nice if all cbv archives could be unzipped.

    Any help will be appreciated.
    

    As far as I understand, the only crux is the second compression algorithm - the other issues are minor. As I emphasized above, I cannot help figure out the algorithm; that is beyond my talent - I am amazed at how you did that. Do you need any other help?

    Because of your work I've reopened this feature request.

    It should be underlined here that no reverse engineering is required in the other direction: Scidb provides the open format scv. Any application can unzip scv archives; it's a very simple format.

     
  • antoyo

    antoyo - 2015-04-02

    After getting help, I implemented the second compression algorithm.
    So now, my tool can decompress many more cbv archives.
    Could you please test it to see if there are any archives that it cannot decompress?
    It is able to decompress an archive containing the Mega Database 2014!
    There are a few small issues to fix (sub-directory support, performance, a memory issue), but it should not take much time.

     

    Last edit: antoyo 2015-04-02
  • Gregor Cramer

    Gregor Cramer - 2015-04-02

    I don't have many ChessBase databases. I've tried 5 archives, and all worked fine, I'm impressed.

    About the improvements: I will do this while adopting your code into Scidb, I'm a specialist in speed improvements. Choosing a destination sub-directory is a GUI task.

    I think that your work is ripe for an integration into the next release of Scidb. Scidb will give a message that the unzip of .cbv is still experimental and may fail. Of course I will try to ensure that a failed decompression will not overwrite existing files, as far as the detection of a failed decompression is possible. Furthermore I will create an additional entry in "Main Menu->About Scidb->Contributions". Do you agree?
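
One way to keep a failed decompression from touching existing files, sketched under assumptions: extract into a staging directory first and copy into the destination only after the whole archive has been decoded successfully. `extractSafely`, the `.partial` suffix, and the `extract` callback (standing in for the real decompressor) are hypothetical names, not Scidb or uncbv code.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>

namespace fs = std::filesystem;

// Hypothetical wrapper. `extract` stands in for the real decompressor: it
// writes every file below the directory it receives and returns false (or
// throws) when decoding fails.
bool extractSafely(const fs::path& dest,
                   const std::function<bool(const fs::path&)>& extract) {
    assert(!dest.empty());
    fs::path staging = dest;
    staging += ".partial";           // sibling of dest (assumes no trailing separator)
    fs::remove_all(staging);         // drop leftovers from an earlier failed run
    fs::create_directories(staging);

    bool ok = false;
    try {
        ok = extract(staging);
    } catch (...) {
        ok = false;                  // treat any exception as a failed extraction
    }

    if (ok) {
        // Only now touch the destination; existing files are replaced.
        fs::create_directories(dest);
        fs::copy(staging, dest,
                 fs::copy_options::recursive | fs::copy_options::overwrite_existing);
    }
    fs::remove_all(staging);
    return ok;
}
```

The sketch only guarantees that a failure reported by the decompressor leaves the destination unchanged; detecting the failure is still up to the decompressor, and an error during the final copy (for example a full disk) is not handled here.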

     
  • antoyo

    antoyo - 2015-04-02

    Yeah, I agree.
    But the subdirectories issue is not a GUI task:
    I mean that an archive may contain subdirectories but my tool did not support it.
    However, I've just fixed the issue so now there is only the memory issue and the performance to fix.
    I think these two issues come from the way I transform bytes into bits (BitArray::operator[]).
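
If the cost does come from expanding bytes into per-bit storage, one common alternative is to read bits directly from the byte buffer with shifts and masks. The following is a minimal sketch of that idea, not uncbv's actual code: `BitReader`, its method names, and the most-significant-bit-first order are assumptions, and the real .cbv bit layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads bits straight from a byte buffer without
// building a separate array of bits.
// Assumption: bits are consumed most-significant-bit first.
class BitReader {
public:
    explicit BitReader(const std::vector<std::uint8_t>& data) : data_(data) {}

    // Returns the next bit (0 or 1) and advances the position.
    unsigned readBit() {
        if (pos_ >= data_.size() * 8) throw std::out_of_range("no more bits");
        const std::uint8_t byte = data_[pos_ >> 3];               // pos_ / 8
        const unsigned shift = 7u - static_cast<unsigned>(pos_ & 7);
        ++pos_;
        return (byte >> shift) & 1u;
    }

    // Reads `count` bits (at most 32); the first bit read becomes the most
    // significant bit of the result.
    std::uint32_t readBits(unsigned count) {
        assert(count <= 32);
        std::uint32_t value = 0;
        for (unsigned i = 0; i < count; ++i) value = (value << 1) | readBit();
        return value;
    }

    std::size_t bitsLeft() const { return data_.size() * 8 - pos_; }

private:
    const std::vector<std::uint8_t>& data_;   // must outlive the reader
    std::size_t pos_ = 0;                     // position in bits
};
```

The reader keeps only a reference to the input and one bit position, so memory use stays at the size of the compressed data.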

    If you have any question concerning the .cbv file format (overall structure or compression algorithms), feel free to ask.

    By the way, there is the possibility to encrypt a .cbv archive so the work is not completely done.
    I have an idea of what is done so I'll look to see if I'm right.

     

    Last edit: antoyo 2015-04-02
  • Gregor Cramer

    Gregor Cramer - 2015-04-03
    But the subdirectories issue is not a GUI task:
    I mean that an archive may contain subdirectories
    but my tool did not support it.
    

    OK, now it's clear; I understand what the sub-directories mean, and handling this is no problem.

    However, I've just fixed the issue so now there is only
    the memory issue and the performance to fix.
    

    Fine. About the performance: I think that the Huffman decoder from the JPEG library should also work here, this decoder is very fast.

    By the way, there is the possibility to encrypt a .cbv archive
    so the work is not completely done.
    I have an idea of what is done so I'll look to see if I'm right.
    

    It's of course fine if the unzip algorithm is complete, including decryption, but I think it's not a must. The decryption may be easy to do, or it may not. Looking into the binary cb11.exe, I see that the Crypto API of Windows is used (my knowledge of data encoding/decoding is sufficient for that much), but I don't know whether a Linux library exists for the decryption.

    By the way: do you know anything about .cbone, the all-in-one database format of ChessBase?

     
  • antoyo

    antoyo - 2015-04-03

    I discovered the .cbone file format a few days ago so I don't really know it.
    There's also the .cbcloud file format which can be read by ChessBase Reader 2013.

    Just to make it clear: I've already fixed the issue about sub-directories so you won't have anything to do about it (actually, we only need to create a directory if there is a backslash in the filename).
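
The rule in parentheses can be sketched as follows, assuming entry names use a backslash as the only separator. `prepareEntryPath` and `destRoot` are hypothetical names, and the rejection of `..` components is an added precaution that the thread does not mention.

```cpp
#include <cassert>
#include <filesystem>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: maps an archive entry name such as "sub\\games.cbh"
// to a path below destRoot and creates the parent directories if needed.
fs::path prepareEntryPath(const fs::path& destRoot, const std::string& entryName) {
    fs::path relative;
    std::string component;
    for (char c : entryName + '\\') {      // trailing separator flushes the last component
        if (c != '\\') {
            component += c;
            continue;
        }
        if (component == "..")             // added precaution, not from the thread
            throw std::invalid_argument("entry name escapes the destination");
        if (!component.empty()) relative /= component;
        component.clear();
    }
    assert(!relative.empty());             // an entry needs at least a file name
    const fs::path target = destRoot / relative;
    if (target.has_parent_path()) fs::create_directories(target.parent_path());
    return target;                         // the caller writes the file here
}
```

Only backslashes are handled; a real implementation would also need to validate names that contain forward slashes or drive prefixes.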

    I hope you'll be able to use the Huffman decoder from the JPEG library, but it may be hard because the Huffman tree is encoded at the start of the data and may not be in the same format.
    By the way, there is the libjpeg-turbo library which may be faster.
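
For orientation, here is a minimal sketch of what a Huffman decoder does once the tree is known: walk the tree bit by bit and emit a symbol at each leaf. How .cbv serializes the tree at the start of the data is not described in this thread and is not shown. `Node`, `addCode`, `decode`, and the most-significant-bit-first order are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node: leaves carry a symbol, inner nodes have children.
struct Node {
    int symbol = -1;                       // >= 0 only on leaves
    std::unique_ptr<Node> child[2];
};

// Adds one code, written as a string of '0'/'1' characters, to the tree.
void addCode(Node& root, const std::string& bits, int symbol) {
    Node* node = &root;
    for (char b : bits) {
        assert(b == '0' || b == '1');
        auto& next = node->child[b - '0'];
        if (!next) next = std::make_unique<Node>();
        node = next.get();
    }
    node->symbol = symbol;
}

// Decodes up to `count` symbols from `data`, reading bits MSB first
// (an assumption; the real format may use another order).
std::vector<int> decode(const Node& root, const std::vector<std::uint8_t>& data,
                        std::size_t count) {
    std::vector<int> out;
    const Node* node = &root;
    for (std::size_t pos = 0; pos < data.size() * 8 && out.size() < count; ++pos) {
        const unsigned bit = (data[pos / 8] >> (7 - pos % 8)) & 1u;
        node = node->child[bit].get();
        if (!node) break;                  // bit pattern is not part of the code
        if (node->symbol >= 0) {           // reached a leaf: emit and restart
            out.push_back(node->symbol);
            node = &root;
        }
    }
    return out;
}
```

Fast decoders, including the one in libjpeg, typically replace this bit-by-bit walk with lookup tables, which is where most of the speed comes from; reusing such a decoder would still require converting the .cbv tree into the tables it expects.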

     
  • antoyo

    antoyo - 2015-04-03

    I found that .cbz archives are encrypted using the DES algorithm:
    I updated my tool to support that.
    However, it does not work when the password is more than 8 characters.
    My tool only works when the password length is 8 or below.
    Moreover, I think we need a better way to check whether the password is wrong (for now, I only check whether the first byte is 0x08).
    Do you have any ideas?

    By the way, I used the mcrypt library to do the decryption in my tool.

     
  • antoyo

    antoyo - 2015-04-03

    I found that when the password has a length above 8, a hash algorithm is used to create a key of length 8.
    I implemented this algorithm in my tool.
    I think that the reverse engineering of the .cbv and .cbz archives is done.
    Have I missed something?

     
  • Gregor Cramer

    Gregor Cramer - 2015-04-13

    Antoni has developed an unzip tool: https://github.com/antoyo/uncbv. This has been adopted into Scidb. In Scidb it will even be possible to import cbv or cbz archives without extracting them to external files. It will also be possible to repack cbv/cbz archives into the scv format, but not vice versa.

     
  • Gregor Cramer

    Gregor Cramer - 2015-04-13
    • status: pending --> closed
     
  • antoyo

    antoyo - 2015-06-03

    Sorry for the delay.
    Have you put this feature in the SVN?
    If yes, where is it?
    I cannot see it in revision 1070.

     
    • Gregor Cramer

      Gregor Cramer - 2015-06-04

      Have you put this feature in the SVN?
      If yes, where is it?
      I cannot see it in revision 1070.

      No, I've integrated this feature in a branch. I've been working for more than a
      year on a new branch; it contains some reworks and some big improvements. The
      CBV feature will be released with the new branch, but I don't know when. One
      of the goals for this branch is the integration of C/CIF; the C/CIF
      library is still in progress, but most of the work is already done. Another goal
      of the new branch is a new position search algorithm, much faster than the old
      one. The effort for this new algorithm is very, very high, because it is based
      on a kind of knowledge database.

      The effort to integrate this feature into the old branch is too high, because
      the implementation is based on some new concepts that exist only in the new
      branch. In return, the CBV support will have some nice features, for example
      opening a .cbv file without extraction.

      If you are interested in how it is implemented, I can send you the relevant
      sources of the new branch.

       
