From: Alfredo S. <se...@gm...> - 2016-07-07 07:47:50
Hi Bryan,

Thank you very much for the really clear explanation. We are doing some research/testing for multiple backing stores well in advance, in order to be ready in case it becomes possible and to be able to correctly evaluate possible strategies, drawbacks, etc. So I will follow any updates on this.

At the moment the single journal file with multiple contexts works fine, and I've also tested the CompactJournalUtility, which works OK too. So I'll also test the ExportKB utility as an alternative to an export process based on SPARQL. Thanks for the suggestion.

Thank you,
Alfredo

PS: sorry for the wrong subject in the email; I did something wrong with the mailing list addresses and didn't notice the mistake.

2016-07-06 16:41 GMT+02:00 Bryan Thompson <br...@bl...>:

> There is a facility to take a snapshot of a journal file. This exists in
> the core platform and is automated in the enterprise platform, which also
> supports transaction logs, resync, etc.
>
> Each namespace is stored in the same journal file, so anything that
> operates at the journal level handles all namespaces.
>
> There is an ExportKB utility. It is not integrated into the REST API,
> but it could be. This would provide a means to dump a namespace.
>
> Having multiple backing stores (a multi-RWStore) is not trivial. We have
> taken some steps to prepare for this, such as including a store-file unique
> identifier in the HA replication messages, but actually doing this would be
> a significant undertaking: lots of tests, more complex conditions around
> atomic commit and rollback, etc.
>
> Thanks,
> Bryan
>
> On Wednesday, July 6, 2016, Alfredo Serafini <se...@gm...> wrote:
>
>> Hi,
>>
>> I'm testing Blazegraph and exploring some possible configuration options
>> for the journal file.
>>
>> For a project in which I'm involved, we will have an increasing amount of
>> data, so we are looking in advance for a robust backup strategy.
>>
>> Ideally we'd like to test different strategies:
>>
>> - Backing up a single dataset/namespace: this seems to be possible by
>> using dumps over the specific endpoint exposed for every namespace, while
>> still using the same journal file.
>> On the other hand, I wonder if there is a way to avoid using SERVICE
>> statements when we have to query across different contexts stored on the
>> same instance (avoiding materializations and thus a lot of duplicated
>> data)?
>> - Backing up the journal file itself: if possible we'd like to have it
>> physically split per dataset, but I didn't find any reference to such a
>> feature.
>> Moreover: if this feature is not available, do you think it would be
>> possible to hack the classes that currently handle the file a bit,
>> creating an intermediate class which could transparently handle multiple
>> files at the same time? I imagine something like a "MultipleRWStore", so
>> to speak.
>>
>> Sorry if the questions seem weird: any suggestions or criticism are very
>> welcome.
>>
>> Thank you in advance (and apologies in case this was not the right
>> address to post a request for help).
>>
>> Alfredo
>>
>
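[Editor's note: the per-namespace dump discussed in the thread works because Blazegraph exposes each namespace at its own SPARQL endpoint (`/namespace/<name>/sparql`), so a plain CONSTRUCT over that endpoint returns only that namespace's data. A minimal sketch of building such an export request follows; the base URL/port and the helper name are assumptions for illustration, not part of the thread.]

```python
# Sketch: per-namespace export via a SPARQL CONSTRUCT against the
# namespace-specific endpoint. The /namespace/<name>/sparql layout is
# Blazegraph's standard REST pattern; base URL and helper name are assumed.
from urllib.parse import urlencode, quote

BASE = "http://localhost:9999/blazegraph"  # adjust for your deployment

def export_request(namespace: str, graph_format: str = "text/turtle"):
    """Build the URL and headers for dumping every triple in one namespace."""
    endpoint = f"{BASE}/namespace/{quote(namespace)}/sparql"
    # CONSTRUCT over the namespace's own endpoint dumps only that namespace.
    params = {"query": "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }"}
    headers = {"Accept": graph_format}  # serialization for the dump
    return endpoint + "?" + urlencode(params), headers

url, headers = export_request("kb")
print(url)
```

The returned URL can then be fetched with any HTTP client (or `curl`) and the response streamed to a file as the backup of that single namespace.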