How are pages stored in GDBM format?

Help
Heiner
2006-08-10
2012-10-11
  • Heiner

    Heiner - 2006-08-10

    I can no longer access my phpWiki pages using the
    wiki itself, but still have the GDBM files used
    to store them.

    I tried to read them using the code below, but
    only get the page names with some additional
    bytes, which probably are pointers to the actual
    data. Could somebody shed some light on this?

    I'd like to have the program print the complete
    text of all pages to standard output.

    This is what I have now (compile with
    "gcc -o readpages readpages.c -lgdbm"):


    /* readpages.c */

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <gdbm.h>

    int main(int argc, char *argv[])
    {
        int i;
        GDBM_FILE dbf;
        datum key, next;

        if (argc < 2) {
            fprintf(stderr, "%s: usage: %s filename\n",
                    argv[0], argv[0]);
            return 1;
        }

        for (i = 1; i < argc; i++) {
            /* Open read-only; block size and mode only matter when creating a file. */
            if (!(dbf = gdbm_open(argv[i], 0, GDBM_READER, 0, 0))) {
                fprintf(stderr, "gdbm_open: %s gdbm_errno=%d, errno=%d\n",
                        gdbm_strerror(gdbm_errno), gdbm_errno, errno);
                return 1;
            }

            /* Print every key. gdbm returns malloc'd keys, so free each one
               after fetching its successor. */
            for (key = gdbm_firstkey(dbf); key.dptr; key = next) {
                printf("%d bytes: <%.*s>\n", key.dsize, key.dsize, key.dptr);
                next = gdbm_nextkey(dbf, key);
                free(key.dptr);
            }

            gdbm_close(dbf);
        }

        return 0;
    }

     
    • Reini Urban

      Reini Urban - 2006-08-12

      You can use "dumpgdbm" to dump the data also :)
      This comes with gdbm.

      Please see the PHP source code in lib/WikiDB/backend/dbaBase.php:

      Tables:

      • page:
        • Index: pagename
        • Values: latestversion . ':' . flags . ':' serialized hash of page meta data
        • Currently flags = 1 if latest version has empty content.
      • version:
        • Index: version:pagename
        • Value: serialized hash of revision meta data, including:
          • quasi-meta-data %content
      • links:
        • index: 'o' . pagename
        • value: serialized list of pages (names) which pagename links to.
        • index: 'i' . pagename
        • value: serialized list of pages which link to pagename

      Each table uses a unique key prefix to separate the
      page (p), version (v), and links (l) entries.
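
      For anyone staying in C, the key layout above could be sketched roughly as follows. This is only a sketch under assumptions: the helper names parse_page_value and make_version_key are made up for illustration, and it assumes the stored version key is simply the prefix "v" followed by the version:pagename index from the table.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse one decimal number terminated by ':' from the range [p, end).
 * On success stores the number in *out and the position after ':' in *next. */
int parse_number(const char *p, const char *end, long *out, const char **next)
{
    const char *start = p;
    long n = 0;

    while (p < end && *p >= '0' && *p <= '9')
        n = n * 10 + (*p++ - '0');
    if (p == start || p == end || *p != ':')
        return -1;
    *out = n;
    *next = p + 1;
    return 0;
}

/* Read the leading "latestversion:flags:" of a page-table value.
 * The value is length-delimited because gdbm data is not NUL-terminated.
 * Returns 0 on success, -1 if the value does not have that shape. */
int parse_page_value(const char *val, size_t len, long *latest, long *flags)
{
    const char *end = val + len;
    const char *p = val;

    if (parse_number(p, end, latest, &p) != 0)
        return -1;
    if (parse_number(p, end, flags, &p) != 0)
        return -1;
    return 0;
}

/* Build the assumed version-table key: 'v' . version . ':' . pagename.
 * Returns the key length, or -1 if buf is too small. */
int make_version_key(char *buf, size_t bufsize, long version, const char *pagename)
{
    int n = snprintf(buf, bufsize, "v%ld:%s", version, pagename);

    return (n < 0 || (size_t)n >= bufsize) ? -1 : n;
}
```

      The idea is to look up "p" plus the page name first, take latestversion from the start of that value, and then build the version key from it. Whether the keys in a given database really look like this is easy to check against the key listing printed by the program in the first post.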

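      So for the page HomePage, the page entry is stored under the key "pHomePage". The version entry (the text you need) is stored under "v" followed by the version:pagename index from the table above, e.g. "v1:HomePage" for version 1.
      The value is PHP-serialized, so you have to unserialize it.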
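
      If only the page text is needed, a full unserializer may not be necessary. The sketch below is a shortcut, not a real PHP unserialize: it assumes %content is stored as a serialized string of the form s:LENGTH:"...", searches for that array key, and returns the bytes that follow. The function names find_bytes and find_content are hypothetical. It returns NULL when %content is missing or not a string, and it can be fooled if the same marker happens to occur inside another field.

```c
#include <stddef.h>
#include <string.h>

/* Search a length-delimited buffer (which may contain NUL bytes) for needle. */
const char *find_bytes(const char *hay, size_t hay_len,
                       const char *needle, size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || hay_len < needle_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Locate the %content string inside a PHP-serialized array.
 * Returns a pointer into data and stores the byte length in *out_len,
 * or returns NULL if no string-valued %content entry is found. */
const char *find_content(const char *data, size_t len, size_t *out_len)
{
    /* Serialized array key "%content" followed by the start of a string value. */
    static const char marker[] = "s:8:\"%content\";s:";
    const size_t marker_len = sizeof marker - 1;
    const char *end = data + len;
    const char *p = find_bytes(data, len, marker, marker_len);
    const char *digits;
    size_t n = 0;

    if (p == NULL)
        return NULL;
    p += marker_len;

    /* Read the declared byte length of the string. */
    digits = p;
    while (p < end && *p >= '0' && *p <= '9')
        n = n * 10 + (size_t)(*p++ - '0');
    if (p == digits)
        return NULL;

    /* The length is followed by :" and then exactly n bytes of content. */
    if (end - p < 2 || p[0] != ':' || p[1] != '"')
        return NULL;
    p += 2;
    if ((size_t)(end - p) < n)
        return NULL;

    *out_len = n;
    return p;
}
```

      The input would be the value read for a version key with gdbm_fetch(dbf, key); the dptr it returns is malloc'd and has to be freed by the caller. The result can then be written out with fwrite(content, 1, length, stdout).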

      This is best done with PHP, ideally by using the existing library.

       
