From: Erik H. <ehe...@gm...> - 2014-04-08 16:56:45
Hi,

Welcome! These questions actually belong on arc...@li..., which I have CCed.

I don’t think any production servers use the BDBCollection; we all use CDX. And no, I don’t think it would make sense to replace BerkeleyDB with PostgreSQL. It makes the most sense to manage your CDX indexing outside wayback.

It’s hard to know what the issue is with your CDX without knowing what your configuration looks like. Are you sure that you have a CDXCollection set up correctly?

best,
Erik

At Tue, 8 Apr 2014 01:56:31 -0700 (PDT),
lo all wrote:
>
> hey openwayback-community ;)
>
> I successfully installed openwayback and now I am trying to configure it to
> fit my needs.
> At the moment I have some doubts about which implementation fits "best".
>
> For my current installation I have tested the BDBCollection, which uses
> Berkeley DB behind the scenes and has an autoindex function.
> When I feed openwayback a new *.warc.gz, I want to be able to check whether
> the warc has been indexed or not. The problem here is the exclusive lock on
> the BDBCollection held by Wayback, which makes it hard to query the
> collection from outside of wayback.
> Is it possible, and does it make sense, to replace the Berkeley DB with e.g.
> a PostgreSQL database? If yes, is it worth the "hassle" to keep the
> autoindex feature?
>
> I am also thinking about using a CDX resource index. For testing I created
> an index of a few warcs using CDX-Writer (https://github.com/rajbot/CDX-Writer).
> However, I am not able to query my websites using this index (resource not
> in archive).
> Here is the command I used:
> python cdx_writer.py --format 'N b a m s k r V g' /tmp/my-new-warc.warc.gz > /tmp/mynew_index.cdx
> Is there anything I am missing?
>
> I'm not sure if my questions are placed correctly in the dev group, but I
> can't find any other wayback community to ask those questions.
> Is there any IRC channel for discussions? I'm probably not the only beginner
> with questions, and right now I have more questions than answers :p

-- 
Sent from my free software system <http://fsf.org/>.
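
Since the original question was how to check, from outside wayback, whether a given warc has been indexed, here is a minimal sketch of doing that against a flat CDX file. It assumes the file starts with a header line such as " CDX N b a m s k r V g" (the first character being the field delimiter) and that the "g" column holds the warc file name; the function names are hypothetical and are not part of wayback or CDX-Writer. File names are compared by basename only, which is an assumption about how the index was written.

```python
import os


def parse_cdx_header(header):
    """Split a CDX header line into (delimiter, list of field letters)."""
    header = header.rstrip("\n")
    if not header:
        raise ValueError("empty CDX header")
    delimiter = header[0]  # assumption: first character is the delimiter
    parts = header.split(delimiter)
    # parts[0] is the empty string before the leading delimiter
    if len(parts) < 3 or parts[1] != "CDX":
        raise ValueError("not a CDX header: %r" % header)
    return delimiter, parts[2:]


def indexed_filenames(lines):
    """Return the set of warc basenames found in the 'g' column."""
    it = iter(lines)
    delimiter, fields = parse_cdx_header(next(it))
    col = fields.index("g")  # raises ValueError if there is no 'g' field
    names = set()
    for line in it:
        row = line.rstrip("\n").split(delimiter)
        if len(row) == len(fields):  # skip malformed lines
            names.add(os.path.basename(row[col]))
    return names


def is_indexed(lines, warc_path):
    """True if the basename of warc_path appears in the CDX lines."""
    return os.path.basename(warc_path) in indexed_filenames(lines)
```

With the paths from the quoted command, usage could look like `is_indexed(open("/tmp/mynew_index.cdx"), "/tmp/my-new-warc.warc.gz")`. This only tells you that the warc appears in the index file; it does not tell you whether wayback is configured to read that index.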