[getdata-devel] Implicit index vector for MPLEX data streams?
Scientific Database Format
From: Graeme S. <gsm...@th...> - 2018-10-05 00:45:37
Hi,

I am a long-time casual user of libgetdata/dirfile. It has been quite a while since I looked at either the library or the standard, and I am happy to see them both growing and maturing. Congratulations and thanks.

I have successfully shifted older code from a very large number of separate RAW files to a single MPLEX file. The results scale better (fewer gd_putdata calls, less stress on the VM, more striped I/O). All in all, it's been a pleasure to fix a long-standing problem with my code.

However, I now have to carry around an <index> vector for my MPLEX stream that is just a modulo-2^n counter. (That is, my MPLEX'd data stream is always ordered predictably, but I still need to tell libgetdata that.) This index counter takes up a significant amount of disk space, and I would prefer not to carry it around. Is it possible to synthesize this counter using some dirfile magic?

Here's a longer explanation in case it's not clear. My older format file looked like this:

    foo_c1 RAW INT32 1
    foo_c2 RAW INT32 1
    [...]
    foo_c256 RAW INT32 1

After multiplexing these data streams into a single file, I can use

    foo_mplex RAW INT32 256
    /HIDDEN foo_mplex
    channel RAW UINT8 256
    /HIDDEN channel
    foo_c1 MPLEX foo_mplex channel 0 256
    foo_c2 MPLEX foo_mplex channel 1 256
    [...]
    foo_c256 MPLEX foo_mplex channel 255 256

The only wrinkle is the "channel" file: even stored as UINT8, it adds one byte for every four-byte INT32 sample and so wastes a large fraction of the overall disk space.

I'm sure this is not a unique requirement, and I thought I'd ask whether I was missing something in the standards or approaching things from the wrong perspective.

thanks,
Graeme
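
To make the redundancy concrete, here is a minimal sketch in plain Python. It is not libgetdata code and does not reproduce the MPLEX field's actual semantics. It only shows that, when the stream is strictly ordered, the "channel" vector is a modulo-256 counter and that picking samples by position gives the same result as picking them by index value. The function names and the sample data are hypothetical.

```python
N_CHANNELS = 256  # number of channels interleaved in foo_mplex (from the format above)

def channel_counter(n_samples, n_channels=N_CHANNELS):
    """Build the explicit index vector: 0, 1, ..., n_channels - 1, 0, 1, ..."""
    return [i % n_channels for i in range(n_samples)]

def demux_by_index(data, index, channel):
    """Keep the samples whose stored index value equals `channel`."""
    return [d for d, c in zip(data, index) if c == channel]

def demux_by_stride(data, channel, n_channels=N_CHANNELS):
    """Keep the same samples using position alone, with no index vector."""
    return data[channel::n_channels]

# Hypothetical data: 4 complete frames of 256 interleaved samples.
data = list(range(1000, 1000 + 4 * N_CHANNELS))
index = channel_counter(len(data))

# Both selections agree for every channel when the ordering is predictable.
agree = all(
    demux_by_index(data, index, ch) == demux_by_stride(data, ch)
    for ch in range(N_CHANNELS)
)

# Disk cost of the index: 1 byte (UINT8) stored beside each 4-byte INT32 sample.
index_fraction = 1 / (4 + 1)
```

With a one-byte index stored next to each four-byte sample, the index accounts for one fifth of the combined size of the two files, which is the overhead the question is trying to avoid.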