getdata-devel Mailing List for GetData (Page 5)
Scientific Database Format
Brought to you by:
ketiltrout
From: D. V. W. <ge...@ke...> - 2012-07-12 00:59:06
|
On Tue, Jul 10, 2012 at 07:44:33PM +0200, Christian Trippe wrote:
> On Monday, 9 July 2012, 12:22:36, D. V. Wiebe wrote:
> > On Sat, Jul 07, 2012 at 08:06:40AM +0200, Christian Trippe wrote:
> >
> > Did you fix something? The Fortran tests seem to be passing now in the
> > log you linked above:
>
> No, I did not change anything. An automatic rebuild was triggered in the
> meantime, but I don't know why that should have changed the build result.
>
> But as I can no longer reproduce the error, let's forget it.
>
> Christian

One memory audit later, I think I've found the bug. I'll try to get a new version out soon with this fix and a few other fixes people have found. Thanks for the heads-up,
-dvw
--
D. V. Wiebe
ge...@ke...
http://getdata.sourceforge.net/ |
From: Christian T. <ct...@op...> - 2012-07-10 17:47:44
|
On Monday, 9 July 2012, 12:22:36, D. V. Wiebe wrote:
> On Sat, Jul 07, 2012 at 08:06:40AM +0200, Christian Trippe wrote:
>
> Did you fix something? The Fortran tests seem to be passing now in the
> log you linked above:

No, I did not change anything. An automatic rebuild was triggered in the meantime, but I don't know why that should have changed the build result.

But as I can no longer reproduce the error, let's forget it.

Christian |
From: D. V. W. <ge...@ke...> - 2012-07-09 19:22:44
|
On Sat, Jul 07, 2012 at 08:06:40AM +0200, Christian Trippe wrote:
> Hi,
>
> I wanted to build getdata 0.8.0 on openSUSE. However 2 of 3 tests for the
> Fortran bindings fail on x86 and x86_64.
>
> ====
> [...]
> FAIL: big_test
> [...]
> FAIL: big_test95
> ====================================================
> 2 of 3 tests failed
> Please report to get...@li...
> ====================================================
> ===
>
> This is with gcc46 and gcc47. Full build logs (for gcc47 on x86_64) can be
> found at
> https://build.opensuse.org/package/live_build_log?arch=x86_64&package=getdata&project=home%3Achristiantrippe%3Abranches%3AKDE%3ADistro%3AFactory&repository=openSUSE_Factory
>
> Please tell me if I can provide any further useful information.
>
> Regards
> Christian

Did you fix something? The Fortran tests seem to be passing now in the log you linked above:

PASS: gdcopn
PASS: big_test
PASS: big_test95
==================
All 3 tests passed
==================

-dvw
--
D. V. Wiebe
ge...@ke...
http://getdata.sourceforge.net/ |
From: Christian T. <ct...@op...> - 2012-07-07 06:09:46
|
Hi,

I wanted to build getdata 0.8.0 on openSUSE. However, 2 of 3 tests for the Fortran bindings fail on x86 and x86_64.

====
make check-TESTS
make[5]: Entering directory `/home/abuild/rpmbuild/BUILD/getdata-0.8.0/bindings/f77/test'
PASS: gdcopn
s( 6)[232] = "EEEEEEEEEEEEEEEEEE ", expected "test3 test4"
ne = 1
FAIL: big_test
s(6)[232] = "EEEEEEEEEEEEEEEEEE ", expected "test3 test4"
ne = 1
FAIL: big_test95
====================================================
2 of 3 tests failed
Please report to get...@li...
====================================================
===

This is with gcc46 and gcc47. Full build logs (for gcc47 on x86_64) can be found at
https://build.opensuse.org/package/live_build_log?arch=x86_64&package=getdata&project=home%3Achristiantrippe%3Abranches%3AKDE%3ADistro%3AFactory&repository=openSUSE_Factory

Please tell me if I can provide any further useful information.

Regards
Christian |
From: D. V. W. <ge...@ke...> - 2012-07-05 01:29:34
|
GetData 0.8.0 has been released. It may be downloaded from your local SourceForge mirror:

http://sourceforge.net/projects/getdata/files/getdata/0.8.0/

GetData 0.8.0 introduces Dirfile Standards Version 9. This new Standards Version adds the MPLEX and WINDOW field types, field name aliases, the ability to modify field names provided in an /INCLUDE, and support for zzip-compressed data.

Additionally, GetData 0.8.0 introduces the ability to do automatic sequential reads or writes, write support for gzip-compressed Dirfiles, and various other improvements and bug fixes. This release also adds Perl to the list of supported bindings.

Full release notes are provided at the bottom of the release page indicated above.

Cheers,
-dvw
--
D. V. Wiebe
ge...@ke...
http://getdata.sourceforge.net/ |
From: Ben L. <egr...@gm...> - 2012-05-06 05:51:05
|
Thanks Barth for the sample code and Mr Wiebe for the very helpful explanation. Since posting this request I've not had time to experiment with this code. As soon as I do, I'll post back my results.

Regards, Ben

On 1/05/2012 6:57 AM, D. V. Wiebe wrote:
> On Thu, Apr 26, 2012 at 10:52:12AM -0400, Barth Netterfield wrote:
>> Hi Ben,
>>
>> I typically write my dirfiles directly, without using getdata. This
>> is probably 'bad', but... here is a very simple piece of code which
>> does that. I use it to test real time kst operation...
>
> For the record, I consider your solution the Right Thing to do to create
> simple dirfiles. Using GetData to make a dirfile is fine if you need
> to do something fancy, or are using GetData for other stuff anyway, but
> it's overkill in the simple case.
>
> For a simple dirfile the basic procedure is:
>
> 1) Create a directory. The name of this directory will be used as the name
>    of the dirfile in kst.
>
> 2) Create a text file called "format" in that directory.
>
> 3) Write a line in the "format" file for each field in the dirfile, of the
>    form:
>
>      <name> RAW <type> <spf>
>
>    where
>
>    * <name> is the name of the field. kst will expect to find the data for
>      this field in a file called <name> in the dirfile directory.
>
>    * <type> is the type of data stored in the file; it should be one of
>      the words:
>
>        UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64, FLOAT32,
>        FLOAT64, COMPLEX64, COMPLEX128
>
>      Hopefully these are self-explanatory. A C "double" is a 64-bit float,
>      so "FLOAT64". If you don't have strong opinions about the data type,
>      use that as a default.
>
>      (Barth actually uses obsolete, single-character versions of these type
>      names. For "FLOAT32" he uses "f"; for "UINT16" he uses "u"; for
>      "FLOAT64" he uses "d".)
>
>    * <spf> is the number of samples per frame. If all your data fields
>      have the same number of data points, this should be 1 for every field.
>      It gets more complicated if different fields have different sample
>      rates.
>
>    (Note: Barth's example has other things that aren't RAW fields. Just
>    ignore those for now.)
>
> 4) Optionally, choose a "reference field"; it can be any of the fields
>    you defined above; let's call it <ref>. Write a line in the "format"
>    file:
>
>      /REFERENCE <ref>
>
>    A dirfile always has a "reference field", even if you omit this
>    step. If you don't choose a reference field explicitly here, the
>    first field defined will end up being used.
>
>    (Barth's example doesn't do this, so the first field he defines,
>    "scount", ends up being used as the reference field.)
>
> 5) Close the "format" file (to ensure it gets saved to disk).
>
> 6) Now it's time for writing data. For each field, create a binary
>    file in the dirfile directory. The names of these files have to be
>    the same as the names of the corresponding fields. (As a result, a
>    little care is needed when picking field names.)
>
> 7) Write (binary) data to the files. Make sure the binary type of the
>    data is the same as the type code you chose above when creating the
>    format file. (i.e., if you used the type "FLOAT64", make sure you
>    write doubles to the binary file.)
>
>    You can write as much data to a file at a time as you like (you
>    don't have to write sample-by-sample). *BUT* there's one very
>    important caveat you need to adhere to if you don't want to confuse
>    kst. (And I've put some asterisks around this to make it look even
>    more important):
>
>    ***The reference field (step 4 above) may NEVER have more data
>    frames in it than ANY OTHER FIELD.***
>
>    Typically, what this means is: write the reference field last. If
>    you're writing one frame at a time to each field, make sure you
>    write a frame to every other file before writing to the reference
>    field.
>
>    (This restriction is why Barth explicitly writes the "scount" field
>    last.)
>
> That's it, really. Hopefully it makes some sort of sense along with
> Barth's code.
>
> -dvw |
From: D. V. W. <ge...@ke...> - 2012-04-30 21:19:12
|
On Thu, Apr 26, 2012 at 10:52:12AM -0400, Barth Netterfield wrote:
> Hi Ben,
>
> I typically write my dirfiles directly, without using getdata. This
> is probably 'bad', but... here is a very simple piece of code which
> does that. I use it to test real time kst operation...

For the record, I consider your solution the Right Thing to do to create simple dirfiles. Using GetData to make a dirfile is fine if you need to do something fancy, or are using GetData for other stuff anyway, but it's overkill in the simple case.

For a simple dirfile the basic procedure is:

1) Create a directory. The name of this directory will be used as the name
   of the dirfile in kst.

2) Create a text file called "format" in that directory.

3) Write a line in the "format" file for each field in the dirfile, of the
   form:

     <name> RAW <type> <spf>

   where

   * <name> is the name of the field. kst will expect to find the data for
     this field in a file called <name> in the dirfile directory.

   * <type> is the type of data stored in the file; it should be one of
     the words:

       UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64, FLOAT32,
       FLOAT64, COMPLEX64, COMPLEX128

     Hopefully these are self-explanatory. A C "double" is a 64-bit float,
     so "FLOAT64". If you don't have strong opinions about the data type,
     use that as a default.

     (Barth actually uses obsolete, single-character versions of these type
     names. For "FLOAT32" he uses "f"; for "UINT16" he uses "u"; for
     "FLOAT64" he uses "d".)

   * <spf> is the number of samples per frame. If all your data fields
     have the same number of data points, this should be 1 for every field.
     It gets more complicated if different fields have different sample
     rates.

   (Note: Barth's example has other things that aren't RAW fields. Just
   ignore those for now.)

4) Optionally, choose a "reference field"; it can be any of the fields
   you defined above; let's call it <ref>. Write a line in the "format"
   file:

     /REFERENCE <ref>

   A dirfile always has a "reference field", even if you omit this
   step. If you don't choose a reference field explicitly here, the
   first field defined will end up being used.

   (Barth's example doesn't do this, so the first field he defines,
   "scount", ends up being used as the reference field.)

5) Close the "format" file (to ensure it gets saved to disk).

6) Now it's time for writing data. For each field, create a binary
   file in the dirfile directory. The names of these files have to be
   the same as the names of the corresponding fields. (As a result, a
   little care is needed when picking field names.)

7) Write (binary) data to the files. Make sure the binary type of the
   data is the same as the type code you chose above when creating the
   format file. (i.e., if you used the type "FLOAT64", make sure you
   write doubles to the binary file.)

   You can write as much data to a file at a time as you like (you
   don't have to write sample-by-sample). *BUT* there's one very
   important caveat you need to adhere to if you don't want to confuse
   kst. (And I've put some asterisks around this to make it look even
   more important):

   ***The reference field (step 4 above) may NEVER have more data
   frames in it than ANY OTHER FIELD.***

   Typically, what this means is: write the reference field last. If
   you're writing one frame at a time to each field, make sure you
   write a frame to every other file before writing to the reference
   field.

   (This restriction is why Barth explicitly writes the "scount" field
   last.)

That's it, really. Hopefully it makes some sort of sense along with Barth's code.

-dvw
--
D. V. Wiebe
ge...@ke...
http://getdata.sourceforge.net/ |
From: Barth N. <net...@as...> - 2012-04-26 14:52:19
|
Hi Ben,

I typically write my dirfiles directly, without using getdata. This is probably 'bad', but... here is a very simple piece of code which does that. I use it to test real-time kst operation. It should be very easy to modify to do whatever you want with a simple subset of a dirfile. It doesn't use the getdata library at all.

$ gcc -o dms dirfile_maker_simple.c -lm
$ ./dms

dms will create a dirfile of interesting data which grows every 200 ms, and a link to it, called dm.lnk. You can access the file either by its long name (e.g., 1335451802.dm) or by the link (dm.lnk). kst2 can read this file, for example:

kst2 ./dm.lnk -n 50 -y cos -y COS -y E0

Have fun.
cbn

On Sat, Apr 21, 2012 at 8:35 AM, Ben Lewis <egr...@gm...> wrote:
> Hi getdata mailing list,
>
> I use National Instruments CompactDAQ hardware for data acquisition.
> National Instruments provide C/C++ drivers for this hardware in a package
> called NI-DAQmx.
>
> I would like to be able to view live data collected from my NI hardware in
> kst. As far as I understand, the best format for this is the dirfile
> standard.
>
> I have looked around the getdata website but I cannot find enough
> information to implement getdata in my own code. I'm new to C/C++, so what I
> need is an example to follow that I can then modify to suit my application.
>
> Attached is a piece of code showing how I currently write to a plain text
> file, which can then be opened with kst for live data plotting.
>
> Can anybody show me an example of how to implement getdata in this code? I
> assume it will replace the fprintf statements after the heading "DAQmx Read
> Code" (line 172)?
>
> Regards, Ben
>
> ------------------------------------------------------------------------------
> For Developers, A Lot Can Happen In A Second.
> Boundary is the first to Know...and Tell You.
> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
> http://p.sf.net/sfu/Boundary-d2dvs2
> _______________________________________________
> getdata-devel mailing list
> get...@li...
> https://lists.sourceforge.net/lists/listinfo/getdata-devel

--
C. Barth Netterfield
University of Toronto
416-845-0946 |
From: Ben L. <egr...@gm...> - 2012-04-21 12:35:39
|
#include <stdio.h>
#include <NIDAQmx.h>
#include <time.h>

#define DAQmxErrChk(functionCall) if( DAQmxFailed(error=(functionCall)) ) goto Error; else

int32 CVICALLBACK EveryNCallback(TaskHandle taskHandle, int32 everyNsamplesEventType, uInt32 nSamples, void *callbackData);
int32 CVICALLBACK DoneCallback(TaskHandle taskHandle, int32 status, void *callbackData);

FILE *datafile;

int main(void)
{
    int32 error=0;
    TaskHandle taskHandle=0;
    char errBuff[2048]={'\0'};

    // Map names to physical channels
    const char physicalChannel1[]="cDAQ1Mod1/ai0";
    const char nameToAssignToChannel1[]="temp1";
    const char physicalChannel2[]="cDAQ1Mod2/ai0";
    const char nameToAssignToChannel2[]="temp2";
    const char physicalChannel3[]="cDAQ1Mod3/ai0";
    const char nameToAssignToChannel3[]="temp3";
    const char physicalChannel4[]="cDAQ1Mod4/ai0";
    const char nameToAssignToChannel4[]="prox1";
    const char physicalChannel5[]="cDAQ1Mod5/ai0";
    const char nameToAssignToChannel5[]="distance1";
    const char physicalChannel6[]="cDAQ1Mod5/ai1";
    const char nameToAssignToChannel6[]="distance2";

    // DAQmxCreateAIThrmcplChan
    float64 minVal=0;
    float64 maxVal=300;
    uInt32 units=DAQmx_Val_DegC;
        // DAQmx_Val_DegC     degrees Celsius
        // DAQmx_Val_DegF     degrees Fahrenheit
        // DAQmx_Val_Kelvins  kelvins
        // DAQmx_Val_DegR     degrees Rankine
    uInt32 thermocoupleType=DAQmx_Val_K_Type_TC;
        // DAQmx_Val_J_Type_TC, DAQmx_Val_K_Type_TC, DAQmx_Val_N_Type_TC,
        // DAQmx_Val_R_Type_TC, DAQmx_Val_S_Type_TC, DAQmx_Val_T_Type_TC,
        // DAQmx_Val_B_Type_TC, DAQmx_Val_E_Type_TC
    uInt32 cjcSource=DAQmx_Val_BuiltIn;
        // DAQmx_Val_BuiltIn   Use a cold-junction compensation channel built into the terminal block.
        // DAQmx_Val_ConstVal  You must specify the cold-junction temperature.
        // DAQmx_Val_Chan      Use a channel for cold-junction compensation.
    float64 cjcVal=25.0;
    const char cjcChannel[]="";  // e.g. "cDAQ1Mod1/ai0"

    // AutoZeroMode
    int32 setAutoZeroMode=DAQmx_Val_EverySample;
        // DAQmx_Val_None, DAQmx_Val_Once, DAQmx_Val_EverySample

    // DAQmxCfgSampClkTiming
    float64 rate=100;
    int32 activeEdge=DAQmx_Val_Rising;
        // DAQmx_Val_Rising   Acquire or generate samples on the rising edges of the Sample Clock.
        // DAQmx_Val_Falling  Acquire or generate samples on the falling edges of the Sample Clock.
    int32 sampleMode=DAQmx_Val_ContSamps;
        // DAQmx_Val_FiniteSamps         Acquire or generate a finite number of samples.
        // DAQmx_Val_ContSamps           Acquire or generate samples until you stop the task.
        // DAQmx_Val_HWTimedSinglePoint  Acquire or generate samples continuously using hardware
        //                               timing without a buffer. Hardware-timed single point sample
        //                               mode is supported only for the sample clock and change
        //                               detection timing types.
    uInt64 sampsPerChan=100;
        // The number of samples to acquire or generate for each channel in the task if
        // sampleMode is DAQmx_Val_FiniteSamps. If sampleMode is DAQmx_Val_ContSamps,
        // NI-DAQmx uses this value to determine the buffer size.

    // DAQmxRegisterEveryNSamplesEvent
    int32 everyNsamplesEventType=DAQmx_Val_Acquired_Into_Buffer;
        // DAQmx_Val_Acquired_Into_Buffer     This event type is only supported for input tasks.
        //                                    Events occur when the specified number of samples are
        //                                    acquired into the buffer from the device.
        // DAQmx_Val_Transferred_From_Buffer  This event type is only supported for output tasks.
        //                                    Events occur when the specified number of samples are
        //                                    transferred from the buffer to the device.
    uInt32 nSamples=50;  // The number of samples after which each event should occur.

    /*********************************************/
    // DAQmx Configure Code
    /*********************************************/
    DAQmxErrChk (DAQmxCreateTask("",&taskHandle));
    DAQmxErrChk (DAQmxCreateAIThrmcplChan(taskHandle,physicalChannel1,nameToAssignToChannel1,minVal,maxVal,units,thermocoupleType,cjcSource,cjcVal,cjcChannel));
    DAQmxErrChk (DAQmxCreateAIThrmcplChan(taskHandle,physicalChannel2,nameToAssignToChannel2,minVal,maxVal,units,thermocoupleType,cjcSource,cjcVal,cjcChannel));
    DAQmxErrChk (DAQmxCreateAIThrmcplChan(taskHandle,physicalChannel3,nameToAssignToChannel3,minVal,maxVal,units,thermocoupleType,cjcSource,cjcVal,cjcChannel));
    DAQmxErrChk (DAQmxSetAIAutoZeroMode(taskHandle,physicalChannel1,setAutoZeroMode));
    DAQmxErrChk (DAQmxSetAIAutoZeroMode(taskHandle,physicalChannel2,setAutoZeroMode));
    DAQmxErrChk (DAQmxSetAIAutoZeroMode(taskHandle,physicalChannel3,setAutoZeroMode));
    DAQmxErrChk (DAQmxCreateAIVoltageChan(taskHandle,physicalChannel4,nameToAssignToChannel4,DAQmx_Val_Cfg_Default,-60.0,60.0,DAQmx_Val_Volts,NULL));
    DAQmxErrChk (DAQmxCreateLinScale("AnalogProxScale",2500.0,-10.0,DAQmx_Val_Amps,"mm"));
    DAQmxErrChk (DAQmxCreateAICurrentChan(taskHandle,physicalChannel5,nameToAssignToChannel5,DAQmx_Val_RSE,0.0,0.02,DAQmx_Val_FromCustomScale,DAQmx_Val_Default,249.0,"AnalogProxScale"));
    DAQmxErrChk (DAQmxCreateAICurrentChan(taskHandle,physicalChannel6,nameToAssignToChannel6,DAQmx_Val_RSE,0.0,0.02,DAQmx_Val_FromCustomScale,DAQmx_Val_Default,249.0,"AnalogProxScale"));
    DAQmxErrChk (DAQmxCfgSampClkTiming(taskHandle,NULL,rate,activeEdge,sampleMode,sampsPerChan));
    DAQmxErrChk (DAQmxRegisterEveryNSamplesEvent(taskHandle,everyNsamplesEventType,nSamples,0,EveryNCallback,NULL));
    DAQmxErrChk (DAQmxRegisterDoneEvent(taskHandle,0,DoneCallback,NULL));

    /*********************************************/
    // DAQmx Start Code
    /*********************************************/

    /* Sample Clock Rate */
    float64 sampClkRate;
    DAQmxErrChk (DAQmxGetSampClkRate(taskHandle, &sampClkRate));
    float64 dt;
    if (sampClkRate>0) dt=1/sampClkRate; else dt=0;

    /* Start Time */
    time_t rawtime;
    struct tm * timeinfo;
    char buffer [80];
    time ( &rawtime );
    timeinfo = localtime ( &rawtime );
    strftime (buffer,80,"%d/%m/%Y %X",timeinfo);

    printf("channel names:\t\t\t\t\t\n");
    printf("%s\t%s\t%s\t%s\t%s\t%s\n",nameToAssignToChannel1,nameToAssignToChannel2,nameToAssignToChannel3,nameToAssignToChannel4,nameToAssignToChannel5,nameToAssignToChannel6);
    printf("start times:\t\t\t\t\t\n");
    printf("%s\t%s\t%s\t%s\t%s\t%s\n",buffer,buffer,buffer,buffer,buffer,buffer);
    printf("dt:\t\t\t\t\t\n");
    printf("%0.6f\t\t\t\t\t\n",dt);
    printf("data:\t\t\t\t\t\n");

    datafile = fopen("testdata.txt","w");
    fprintf(datafile,"channel names:\t\t\t\t\t\n");
    fprintf(datafile,"%s\t%s\t%s\t%s\t%s\t%s\n",nameToAssignToChannel1,nameToAssignToChannel2,nameToAssignToChannel3,nameToAssignToChannel4,nameToAssignToChannel5,nameToAssignToChannel6);
    fprintf(datafile,"start times:\t\t\t\t\t\n");
    fprintf(datafile,"%s\t%s\t%s\t%s\t%s\t%s\n",buffer,buffer,buffer,buffer,buffer,buffer);
    fprintf(datafile,"dt:\t\t\t\t\t\n");
    fprintf(datafile,"%0.6f\t\t\t\t\t\n",dt);
    fprintf(datafile,"data:\t\t\t\t\t\n");
    fclose (datafile);

    DAQmxErrChk (DAQmxStartTask(taskHandle));

    getchar();

Error:
    if( DAQmxFailed(error) )
        DAQmxGetExtendedErrorInfo(errBuff,2048);
    if( taskHandle!=0 ) {
        /*********************************************/
        // DAQmx Stop Code
        /*********************************************/
        DAQmxStopTask(taskHandle);
        DAQmxClearTask(taskHandle);
    }
    if( DAQmxFailed(error) )
        printf("DAQmx Error: %s\n",errBuff);
    printf("End of program, press Enter key to quit\n");
    getchar();
    return 0;
}

int32 CVICALLBACK EveryNCallback(TaskHandle taskHandle, int32 everyNsamplesEventType, uInt32 nSamples, void *callbackData)
{
    int32 error=0;
    char errBuff[2048]={'\0'};
    static int totalRead=0;
    int32 read=0;
    uInt32 chansToRead;
    DAQmxErrChk (DAQmxGetTaskNumChans(taskHandle, &chansToRead));
    uInt32 arraySizeInSamps = chansToRead*nSamples;
    float64 *data = new float64[arraySizeInSamps];

    /*********************************************/
    // DAQmx Read Code
    /*********************************************/
    DAQmxErrChk (DAQmxReadAnalogF64(taskHandle,-1,10.0,DAQmx_Val_GroupByScanNumber,data,arraySizeInSamps,&read,NULL));

    uInt32 i;
    uInt32 j;
    if( read>0 ) {
        datafile = fopen("testdata.txt","a");
        for (i = 0 ; i < chansToRead*read ; i+=chansToRead ) {
            for (j = 0 ; j < chansToRead ; j++ ) {
                if (j<chansToRead-1) {
                    printf("% 5.5E\t", data[i+j]);
                    fprintf(datafile, "% 5.5E\t", data[i+j]);
                }
                else {
                    printf("% 5.5E\n", data[i+j]);
                    fprintf(datafile,"% 5.5E\n", data[i+j]);
                }
            }
        }
        printf("Acquired %d samples. Total %d\r\n",read,totalRead+=read);
        fclose (datafile);
        fflush(stdout);
    }

Error:
    if( DAQmxFailed(error) ) {
        DAQmxGetExtendedErrorInfo(errBuff,2048);
        /*********************************************/
        // DAQmx Stop Code
        /*********************************************/
        DAQmxStopTask(taskHandle);
        DAQmxClearTask(taskHandle);
        printf("DAQmx Error: %s\n",errBuff);
    }
    delete [] data;
    data = NULL;
    return 0;
}

int32 CVICALLBACK DoneCallback(TaskHandle taskHandle, int32 status, void *callbackData)
{
    int32 error=0;
    char errBuff[2048]={'\0'};
    // Check to see if an error stopped the task.
    DAQmxErrChk (status);
Error:
    if( DAQmxFailed(error) ) {
        DAQmxGetExtendedErrorInfo(errBuff,2048);
        DAQmxClearTask(taskHandle);
        printf("DAQmx Error: %s\n",errBuff);
    }
    return 0;
}
 |
From: Mike N. <no...@ci...> - 2011-12-02 19:38:18
|
On Fri, Dec 2, 2011 at 2:02 PM, Ted Kisner <tsk...@gm...> wrote: > Hi Mike, > On Dec 2, 2011, at 10:47 AM, Mike Nolta wrote: > > The problem wasn't the number of files per directory (~1500), but the > total number of files (~200 million). > > wow. This is off-topic slightly, but how often do you "restart" your > sampling (creating a new dirfile)? Even if you did it once per hour for > 2000 channels and acquired data for a year you would only have ~18 million > files. About every 15 minutes. Plus we had 3 cameras, and observed off-and-on for 4 years. > If I was in that situation I would probably archive most data (tar to tape / > long term storage) unless I was actually working on it, and then stage data > to disk in sections. For reference, a typical project account at NERSC has > an inode (# of files) quota of about 4 million. Planck has a 50 million > inode quota. These are CMB timestreams, and we're still busy making maps (maximum likelihood), so we can't really do partial stages. Anyway, zipping up the dirfiles completely solved the problem, reducing the number of inodes to ~140k. -Mike > I guess everyone comes up with their own preferred "system" for dealing with > large data volumes… > cheers, > -Ted > > > |
From: Ted K. <tsk...@gm...> - 2011-12-02 19:03:08
|
Hi Mike, On Dec 2, 2011, at 10:47 AM, Mike Nolta wrote: > The problem wasn't the number of files per directory (~1500), but the > total number of files (~200 million). wow. This is off-topic slightly, but how often do you "restart" your sampling (creating a new dirfile)? Even if you did it once per hour for 2000 channels and acquired data for a year you would only have ~18 million files. If I was in that situation I would probably archive most data (tar to tape / long term storage) unless I was actually working on it, and then stage data to disk in sections. For reference, a typical project account at NERSC has an inode (# of files) quota of about 4 million. Planck has a 50 million inode quota. I guess everyone comes up with their own preferred "system" for dealing with large data volumes… cheers, -Ted |
From: Mike N. <no...@ci...> - 2011-12-02 18:48:44
|
On Fri, Dec 2, 2011 at 12:12 PM, Michael Milligan <mmi...@as...> wrote: > It is pretty easy to construct use cases where dirfile yields poor I/O > performance for *writing*, but once all the descriptors are open > *reading* hasn't generally been an issue, even with rather large > numbers of channels in a flat directory. And at any rate, GPFS is > supposed to have good support for large numbers of files per > directory, so I'm guessing the issue of relatively small files is the > underlying problem. Especially if they compress well, putting them in > a single file could let OS prefetch caching preload entire RAW fields. > > It sounds like this zzip file solution must be a read-only encoding, > but maybe Adam will correct me on that. Yes, read only. -Mike > > ...Milligan > > On Fri, Dec 02, 2011 at 08:07:23AM -0800, Ted Kisner wrote: >> Would it not be simpler to split the channels into sub-dirfiles to >> reduce the number of files in a single directory? I am only just >> now starting to do tests on complicated dirfile setups at NERSC (on >> both GPFS and Lustre filesystems). So I don't yet have a sense of >> which configurations cause poor I/O performance... >> >> -Ted >> >> On Dec 1, 2011, at 6:16 PM, Mike Nolta wrote: >> >> > On Thu, Dec 1, 2011 at 5:07 PM, D. V. Wiebe <ge...@ke...> wrote: >> >> On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: >> >>> Dear getdata team, >> >>> >> >>> I'm going to be using getdata on SciNet, the big cluster here at the U of >> >>> T, for SPIDER sims. >> >>> >> >>> One thing the cluster doesn't do well is parallel reads from many files in >> >>> the same directory. Mike Nolta, however, came up with a solution for >> >>> reading dirfiles a couple of years ago---zip up the whole dirfile in a >> >>> single archive, and then read directly from that using the zzip library. >> >>> This amazingly solves the problem. He wrote some I/O wrappers for an >> >>> ancient version of getdata ((C) 2002 C. 
Barth Netterfield (!)) and >> >>> called it "zirfile". The ACT project (which uses dirfiles) has been >> >>> successfully using zirfile on SciNet to do its heavy-duty map-making. >> >> >> >> Hey Adam, >> >> >> >> Can you give me more information. I don't really understand the SciNet >> >> problem. Is it an issue of running out of file descriptors? Or, is it >> >> something more subtle? >> >> >> > >> > The problem was due to the filesystem. Scinet forced us to drastically >> > reduce the number of files we were using, as GPFS performs poorly when >> > carrying lots of small files. >> > >> > -Mike >> > >> >> If you're using zirfiles because you can get away with a single file >> >> descriptor, don't you loose a lot of the advantage of a dirfile? (Since >> >> you're seeking back and forth through the same file.) Wouldn't a >> >> different file format like, say, HDF, work better? >> >> >> >> How would automatic encoding detection work? How would GetData >> >> determine the name of the zip file? >> >> >> >> The encoding API has changed somewhat from 0.7 to trunk. How would your >> >> encoding like a descriptor to the containing directory and a string field >> >> name? Or a descriptor to the zip fie and the field name? >> >> >> >> It really seems that what you really want is a way to abstract getdata's >> >> file I/O. I'm not really comfortable makling zzlib a GetData >> >> prerequisite, but perhaps I could come up with a way for the caller to >> >> reimplement GetData's I/O functions... >> >> >> >> -dvw >> >> -- >> >> D. V. Wiebe >> >> ge...@ke... >> >> http://getdata.sourceforge.net/ >> >> >> > >> > ------------------------------------------------------------------------------ >> > All the data continuously generated in your IT infrastructure >> > contains a definitive record of customers, application performance, >> > security threats, fraudulent activity, and more. Splunk takes this >> > data and makes sense of it. IT sense. And common sense. 
>> > http://p.sf.net/sfu/splunk-novd2d
>> > _______________________________________________
>> > getdata-devel mailing list
>> > get...@li...
>> > https://lists.sourceforge.net/lists/listinfo/getdata-devel |
From: Mike N. <no...@ci...> - 2011-12-02 18:47:52
|
On Fri, Dec 2, 2011 at 11:07 AM, Ted Kisner <tsk...@gm...> wrote: > Would it not be simpler to split the channels into sub-dirfiles to reduce the number of files in a single directory? I am only just now starting to do tests on complicated dirfile setups at NERSC (on both GPFS and Lustre filesystems). So I don't yet have a sense of which configurations cause poor I/O performance... > The problem wasn't the number of files per directory (~1500), but the total number of files (~200 million). -Mike > -Ted > > On Dec 1, 2011, at 6:16 PM, Mike Nolta wrote: > >> On Thu, Dec 1, 2011 at 5:07 PM, D. V. Wiebe <ge...@ke...> wrote: >>> On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: >>>> Dear getdata team, >>>> >>>> I'm going to be using getdata on SciNet, the big cluster here at the U of >>>> T, for SPIDER sims. >>>> >>>> One thing the cluster doesn't do well is parallel reads from many files in >>>> the same directory. Mike Nolta, however, came up with a solution for >>>> reading dirfiles a couple of years ago---zip up the whole dirfile in a >>>> single archive, and then read directly from that using the zzip library. >>>> This amazingly solves the problem. He wrote some I/O wrappers for an >>>> ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and >>>> called it "zirfile". The ACT project (which uses dirfiles) has been >>>> successfully using zirfile on SciNet to do its heavy-duty map-making. >>> >>> Hey Adam, >>> >>> Can you give me more information. I don't really understand the SciNet >>> problem. Is it an issue of running out of file descriptors? Or, is it >>> something more subtle? >>> >> >> The problem was due to the filesystem. Scinet forced us to drastically >> reduce the number of files we were using, as GPFS performs poorly when >> carrying lots of small files. >> >> -Mike >> >>> If you're using zirfiles because you can get away with a single file >>> descriptor, don't you loose a lot of the advantage of a dirfile? 
(Since >>> you're seeking back and forth through the same file.) Wouldn't a >>> different file format like, say, HDF, work better? >>> >>> How would automatic encoding detection work? How would GetData >>> determine the name of the zip file? >>> >>> The encoding API has changed somewhat from 0.7 to trunk. How would your >>> encoding like a descriptor to the containing directory and a string field >>> name? Or a descriptor to the zip fie and the field name? >>> >>> It really seems that what you really want is a way to abstract getdata's >>> file I/O. I'm not really comfortable makling zzlib a GetData >>> prerequisite, but perhaps I could come up with a way for the caller to >>> reimplement GetData's I/O functions... >>> >>> -dvw >>> -- >>> D. V. Wiebe >>> ge...@ke... >>> http://getdata.sourceforge.net/ >>> >> >> ------------------------------------------------------------------------------ >> All the data continuously generated in your IT infrastructure >> contains a definitive record of customers, application performance, >> security threats, fraudulent activity, and more. Splunk takes this >> data and makes sense of it. IT sense. And common sense. >> http://p.sf.net/sfu/splunk-novd2d >> _______________________________________________ >> getdata-devel mailing list >> get...@li... >> https://lists.sourceforge.net/lists/listinfo/getdata-devel > > |
From: Michael M. <mmi...@as...> - 2011-12-02 17:12:48
|
It is pretty easy to construct use cases where dirfile yields poor I/O performance for *writing*, but once all the descriptors are open *reading* hasn't generally been an issue, even with rather large numbers of channels in a flat directory. And at any rate, GPFS is supposed to have good support for large numbers of files per directory, so I'm guessing the issue of relatively small files is the underlying problem. Especially if they compress well, putting them in a single file could let OS prefetch caching preload entire RAW fields. It sounds like this zzip file solution must be a read-only encoding, but maybe Adam will correct me on that. ...Milligan On Fri, Dec 02, 2011 at 08:07:23AM -0800, Ted Kisner wrote: > Would it not be simpler to split the channels into sub-dirfiles to > reduce the number of files in a single directory? I am only just > now starting to do tests on complicated dirfile setups at NERSC (on > both GPFS and Lustre filesystems). So I don't yet have a sense of > which configurations cause poor I/O performance... > > -Ted > > On Dec 1, 2011, at 6:16 PM, Mike Nolta wrote: > > > On Thu, Dec 1, 2011 at 5:07 PM, D. V. Wiebe <ge...@ke...> wrote: > >> On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: > >>> Dear getdata team, > >>> > >>> I'm going to be using getdata on SciNet, the big cluster here at the U of > >>> T, for SPIDER sims. > >>> > >>> One thing the cluster doesn't do well is parallel reads from many files in > >>> the same directory. Mike Nolta, however, came up with a solution for > >>> reading dirfiles a couple of years ago---zip up the whole dirfile in a > >>> single archive, and then read directly from that using the zzip library. > >>> This amazingly solves the problem. He wrote some I/O wrappers for an > >>> ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and > >>> called it "zirfile". 
The ACT project (which uses dirfiles) has been > >>> successfully using zirfile on SciNet to do its heavy-duty map-making. > >> > >> Hey Adam, > >> > >> Can you give me more information. I don't really understand the SciNet > >> problem. Is it an issue of running out of file descriptors? Or, is it > >> something more subtle? > >> > > > > The problem was due to the filesystem. Scinet forced us to drastically > > reduce the number of files we were using, as GPFS performs poorly when > > carrying lots of small files. > > > > -Mike > > > >> If you're using zirfiles because you can get away with a single file > >> descriptor, don't you loose a lot of the advantage of a dirfile? (Since > >> you're seeking back and forth through the same file.) Wouldn't a > >> different file format like, say, HDF, work better? > >> > >> How would automatic encoding detection work? How would GetData > >> determine the name of the zip file? > >> > >> The encoding API has changed somewhat from 0.7 to trunk. How would your > >> encoding like a descriptor to the containing directory and a string field > >> name? Or a descriptor to the zip fie and the field name? > >> > >> It really seems that what you really want is a way to abstract getdata's > >> file I/O. I'm not really comfortable makling zzlib a GetData > >> prerequisite, but perhaps I could come up with a way for the caller to > >> reimplement GetData's I/O functions... > >> > >> -dvw > >> -- > >> D. V. Wiebe > >> ge...@ke... > >> http://getdata.sourceforge.net/ > >> > > > > ------------------------------------------------------------------------------ > > All the data continuously generated in your IT infrastructure > > contains a definitive record of customers, application performance, > > security threats, fraudulent activity, and more. Splunk takes this > > data and makes sense of it. IT sense. And common sense. 
> > http://p.sf.net/sfu/splunk-novd2d > > _______________________________________________ > > getdata-devel mailing list > > get...@li... > > https://lists.sourceforge.net/lists/listinfo/getdata-devel > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure > contains a definitive record of customers, application performance, > security threats, fraudulent activity, and more. Splunk takes this > data and makes sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-novd2d > _______________________________________________ > getdata-devel mailing list > get...@li... > https://lists.sourceforge.net/lists/listinfo/getdata-devel -- |
From: Ted K. <tsk...@gm...> - 2011-12-02 16:07:31
|
Would it not be simpler to split the channels into sub-dirfiles to reduce the number of files in a single directory? I am only just now starting to do tests on complicated dirfile setups at NERSC (on both GPFS and Lustre filesystems). So I don't yet have a sense of which configurations cause poor I/O performance... -Ted On Dec 1, 2011, at 6:16 PM, Mike Nolta wrote: > On Thu, Dec 1, 2011 at 5:07 PM, D. V. Wiebe <ge...@ke...> wrote: >> On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: >>> Dear getdata team, >>> >>> I'm going to be using getdata on SciNet, the big cluster here at the U of >>> T, for SPIDER sims. >>> >>> One thing the cluster doesn't do well is parallel reads from many files in >>> the same directory. Mike Nolta, however, came up with a solution for >>> reading dirfiles a couple of years ago---zip up the whole dirfile in a >>> single archive, and then read directly from that using the zzip library. >>> This amazingly solves the problem. He wrote some I/O wrappers for an >>> ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and >>> called it "zirfile". The ACT project (which uses dirfiles) has been >>> successfully using zirfile on SciNet to do its heavy-duty map-making. >> >> Hey Adam, >> >> Can you give me more information. I don't really understand the SciNet >> problem. Is it an issue of running out of file descriptors? Or, is it >> something more subtle? >> > > The problem was due to the filesystem. Scinet forced us to drastically > reduce the number of files we were using, as GPFS performs poorly when > carrying lots of small files. > > -Mike > >> If you're using zirfiles because you can get away with a single file >> descriptor, don't you loose a lot of the advantage of a dirfile? (Since >> you're seeking back and forth through the same file.) Wouldn't a >> different file format like, say, HDF, work better? >> >> How would automatic encoding detection work? How would GetData >> determine the name of the zip file? 
>> >> The encoding API has changed somewhat from 0.7 to trunk. How would your >> encoding like a descriptor to the containing directory and a string field >> name? Or a descriptor to the zip fie and the field name? >> >> It really seems that what you really want is a way to abstract getdata's >> file I/O. I'm not really comfortable makling zzlib a GetData >> prerequisite, but perhaps I could come up with a way for the caller to >> reimplement GetData's I/O functions... >> >> -dvw >> -- >> D. V. Wiebe >> ge...@ke... >> http://getdata.sourceforge.net/ >> > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure > contains a definitive record of customers, application performance, > security threats, fraudulent activity, and more. Splunk takes this > data and makes sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-novd2d > _______________________________________________ > getdata-devel mailing list > get...@li... > https://lists.sourceforge.net/lists/listinfo/getdata-devel |
From: D. V. W. <ge...@ke...> - 2011-12-02 14:21:55
|
On Thu, Dec 01, 2011 at 09:16:59PM -0500, Mike Nolta wrote: > The problem was due to the filesystem. Scinet forced us to drastically > reduce the number of files we were using, as GPFS performs poorly when > carrying lots of small files. > > -Mike So do you prefer opening the zip file once, and then seeking around from one field to the next, or is it better to open the same zip file multiple times, which gives you multiple independent I/O pointers into the zip? -dvw -- D. V. Wiebe ge...@ke... http://getdata.sourceforge.net/ |
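[Editor's note: the two access patterns being weighed here can be sketched with Python's standard zipfile module; this stands in for the zzip library and is not GetData code.]

```python
import io
import zipfile

# A small in-memory archive standing in for raw.zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("field1", b"AAAA")
    zf.writestr("field2", b"BBBB")

# Option 1: a single I/O pointer -- read one member at a time,
# seeking back and forth between fields.
with zipfile.ZipFile(buf) as zf:
    a = zf.read("field1")
    b = zf.read("field2")

# Option 2: several member streams open at once over the same
# archive, each with its own independent read position.
with zipfile.ZipFile(buf) as zf:
    with zf.open("field1") as s1, zf.open("field2") as s2:
        first, second = s1.read(2), s2.read(2)  # interleaved reads do
        first += s1.read(2)                     # not disturb each
        second += s2.read(2)                    # other's position
```

Either way the OS sees one underlying file; the difference is only in how many read positions are maintained into it.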
From: Mike N. <no...@ci...> - 2011-12-02 03:19:10
|
On Thu, Dec 1, 2011 at 5:07 PM, D. V. Wiebe <ge...@ke...> wrote: > On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: >> Dear getdata team, >> >> I'm going to be using getdata on SciNet, the big cluster here at the U of >> T, for SPIDER sims. >> >> One thing the cluster doesn't do well is parallel reads from many files in >> the same directory. Mike Nolta, however, came up with a solution for >> reading dirfiles a couple of years ago---zip up the whole dirfile in a >> single archive, and then read directly from that using the zzip library. >> This amazingly solves the problem. He wrote some I/O wrappers for an >> ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and >> called it "zirfile". The ACT project (which uses dirfiles) has been >> successfully using zirfile on SciNet to do its heavy-duty map-making. > > Hey Adam, > > Can you give me more information. I don't really understand the SciNet > problem. Is it an issue of running out of file descriptors? Or, is it > something more subtle? > The problem was due to the filesystem. Scinet forced us to drastically reduce the number of files we were using, as GPFS performs poorly when carrying lots of small files. -Mike > If you're using zirfiles because you can get away with a single file > descriptor, don't you loose a lot of the advantage of a dirfile? (Since > you're seeking back and forth through the same file.) Wouldn't a > different file format like, say, HDF, work better? > > How would automatic encoding detection work? How would GetData > determine the name of the zip file? > > The encoding API has changed somewhat from 0.7 to trunk. How would your > encoding like a descriptor to the containing directory and a string field > name? Or a descriptor to the zip fie and the field name? > > It really seems that what you really want is a way to abstract getdata's > file I/O. 
I'm not really comfortable makling zzlib a GetData > prerequisite, but perhaps I could come up with a way for the caller to > reimplement GetData's I/O functions... > > -dvw > -- > D. V. Wiebe > ge...@ke... > http://getdata.sourceforge.net/ > |
From: D. V. W. <ge...@ke...> - 2011-12-01 22:07:55
|
On Thu, Nov 24, 2011 at 07:49:40PM -0500, Adam D. Hincks, S.J. wrote: > Dear getdata team, > > I'm going to be using getdata on SciNet, the big cluster here at the U of > T, for SPIDER sims. > > One thing the cluster doesn't do well is parallel reads from many files in > the same directory. Mike Nolta, however, came up with a solution for > reading dirfiles a couple of years ago---zip up the whole dirfile in a > single archive, and then read directly from that using the zzip library. > This amazingly solves the problem. He wrote some I/O wrappers for an > ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and > called it "zirfile". The ACT project (which uses dirfiles) has been > successfully using zirfile on SciNet to do its heavy-duty map-making. Hey Adam, Can you give me more information? I don't really understand the SciNet problem. Is it an issue of running out of file descriptors? Or, is it something more subtle? If you're using zirfiles because you can get away with a single file descriptor, don't you lose a lot of the advantage of a dirfile? (Since you're seeking back and forth through the same file.) Wouldn't a different file format like, say, HDF, work better? How would automatic encoding detection work? How would GetData determine the name of the zip file? The encoding API has changed somewhat from 0.7 to trunk. How would your encoding like a descriptor to the containing directory and a string field name? Or a descriptor to the zip file and the field name? It really seems that what you want is a way to abstract getdata's file I/O. I'm not really comfortable making zzip a GetData prerequisite, but perhaps I could come up with a way for the caller to reimplement GetData's I/O functions... -dvw -- D. V. Wiebe ge...@ke... http://getdata.sourceforge.net/ |
From: Adam D. H. S.J. <ada...@ut...> - 2011-11-25 00:50:08
|
Dear getdata team, I'm going to be using getdata on SciNet, the big cluster here at the U of T, for SPIDER sims. One thing the cluster doesn't do well is parallel reads from many files in the same directory. Mike Nolta, however, came up with a solution for reading dirfiles a couple of years ago---zip up the whole dirfile in a single archive, and then read directly from that using the zzip library. This amazingly solves the problem. He wrote some I/O wrappers for an ancient version of getdata ((C) 2002 C. Barth Netterfield (!)) and called it "zirfile". The ACT project (which uses dirfiles) has been successfully using zirfile on SciNet to do its heavy-duty map-making. I think it would be better to use the getdata library than to start up a new project with zirfile, since zirfile is far from compatible with the current getdata standards. So, I've been working on a new zzip encoding for getdata. The reason for zzip is that, as far as I'm aware, the current gzip and bzip2 encodings read from individually zipped files, whereas zzip has nice wrapper functions to read a file from within a zip archive. I've got zzip encoding working, but it's kludgy, because in getdata, I/O on the format file does not use encodings---it simply uses the stdio.h functions. Therefore, the format file can't exist in the zip archive. So the dirfile looks like this: dirfile/format dirfile/raw.zip The file raw.zip is an archive of all the raw fields: dirfile/raw.zip[/file1] dirfile/raw.zip[/file2] etc. The zzip functions treat raw.zip like a directory, so one simply does: fd = zzip_open("dirfile/raw/file1", O_RDONLY), and then one can happily proceed. But it is currently awkward to figure out how to add the "raw/" part to the filebase of the field: it's a complete kludge in my current version. One solution might be to add another field like "char *prefix" to the encoding_t struct. But this seems lame.
Cleaner would be to have: dirfile.zip With all the files in its archive: dirfile.zip[/format] dirfile.zip[/file1] dirfile.zip[/file2] etc. But this would require bringing I/O on the format file under the ambit of encodings. Any ideas about how to proceed? I currently have a zzip.c with the I/O wrappers and have modified encoding.c, some configure scripts and a few other places to make it all work. Best, Adam. |
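[Editor's note: the archive-as-directory trick zzip_open() performs can be sketched with Python's stdlib zipfile module. The helper name below is made up for illustration; this is neither zzip nor GetData code.]

```python
import os
import zipfile

def zzip_style_open(path):
    """Open `path` for reading. If some prefix of the path exists as a
    zip archive rather than a directory (e.g. dirfile/raw.zip standing
    in for dirfile/raw/), read the remainder out of the archive.
    A sketch of the zzip_open() behaviour described above."""
    if os.path.exists(path):
        return open(path, "rb")
    head, tail = os.path.split(path)
    members = [tail]
    # Walk up the path looking for a component that exists as ".zip".
    while head not in ("", os.sep) and not os.path.exists(head + ".zip"):
        head, tail = os.path.split(head)
        members.insert(0, tail)
    if not os.path.exists(head + ".zip"):
        raise FileNotFoundError(path)
    archive = zipfile.ZipFile(head + ".zip")
    return archive.open("/".join(members))
```

With a layout of dirfile/format plus dirfile/raw.zip, a request for "dirfile/raw/file1" falls through to the archive member "file1", which is exactly the resolution the encoding layer has to do before it can prefix "raw/" onto the filebase.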
From: D. V. W. <ge...@ke...> - 2011-11-09 01:53:06
|
On Tue, Nov 08, 2011 at 06:35:36PM -0500, Steve Benton wrote: > On 11-11-08 05:13 PM, D. V. Wiebe wrote: > > Cool. I'll take a look when I have a moment. I'm slowly re-writing > > defile in my spare time to be a getdata-backed anything -> dirfile > > converter (for small values of anything). There's a dirfile -> dirfile > > mode in there but, as the re-written defile is vapourware at this point, > > this does seem useful. > Cool. Is the idea that you're adding getdata support for useful > framefile formats? I suppose you could call it that. I typically think of defile as something that takes row-oriented stuff and transposes it into a column-oriented dirfile. Essentially my plan is to write a frame-data-to-dirfile writer back-end with a utilitarian front-end to control it, plus a simple API which you can use to attach various input readers (the middle-end?) to it. And then write a reader for the MCE framefiles and another one for CHIME. Is the current MCP frame format documented? I could do that one too, if it hasn't changed too much. I've also got a dirfile reader in the works, mostly as a proof-of-concept thing (since it doesn't require any other libraries), which could potentially be able to do the same thing your dirfile2subset does. But the idea is to make something that can take any random formatted data in, if someone's willing to write some software to glue their acquisition library to this defile API and spit a dirfile out the other end. Subject to the dirfile constraints (synchronicity probably being the biggest one). > For when/if you get around to looking at dirfile2subset, I've attached a > somewhat corrected version. I'm considering this mostly "done" for now, > in the sense that it does everything I want it to. If it were to be > included in a getdata release, I would eventually need to gussy it up > some (especially making usage() correct). Excellent. > I also have one more outstanding problem.
With gzip encoding enabled on > the output dirfile, I get "Operation not supported by current encoding > scheme" after gd_add() for a RAW field. This seems like a perfectly sane > operation to me. Am I missing something? That's "works-as-documented". The gzip encoding in GetData-0.7 doesn't contain write support. GetData refuses to add a RAW field to a dirfile which it is unable to write to (an attempt to point out problems as early as possible). So gd_add() and friends create a zero-length TOD on disk whenever a RAW field is added. Granted it's zero bytes long, but that still counts as a write, so when writing to a gzipped dirfile GetData will return GD_E_UNSUPPORTED (i.e. there's no code in GetData to write a gzipped zero-length file). If you're impatient, I've added gzip write support to trunk. Try it out, if you'd like. It's still very alpha, though, so don't use it on data you like (and don't expect the API to be stable). (Gzip writing in GetData can get glacially slow: each non-contiguous write requires uncompressing and then recompressing the entire RAW vector. Which is basically why it hasn't existed up until now.) > From an even-more-vapourware perspective, I found myself wondering if > utilities like dirfile2ascii and dirfile2subset would be better replaced > by simpler utilities that could be combined in interesting ways. Maybe > something like getdata field-aware versions of ls, cat, cp, rm, mv, etc. You want to be able to do something like: $ cp_dirfile field1 field2 field3 /new_dirfile and have it make a new dirfile with those fields in it? Isn't that what you just made? You could probably get such a paradigm to work; certainly for simple dirfiles, such as ones we make from flight data, which are primarily RAW fields, but I've found that thinking of dirfiles as simply "directory + format + TODs" tends to break down when things get complicated. Cheers, -dvw -- D. V. Wiebe ge...@ke... http://getdata.sourceforge.net/ |
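[Editor's note: why non-contiguous gzip writes are glacial is easy to demonstrate with Python's gzip module; this is an illustrative sketch of the general problem, not GetData's encoding code.]

```python
import gzip
import os
import struct
import tempfile

def patch_gzipped_sample(path, index, value, fmt="<i"):
    """Overwrite one sample in a gzipped vector of fixed-size samples.
    gzip streams have no random-access writes, so the whole vector is
    decompressed, patched in memory, and recompressed: O(file size)
    work for a single-sample write."""
    size = struct.calcsize(fmt)
    with gzip.open(path, "rb") as f:
        data = bytearray(f.read())            # uncompress everything
    data[index * size:(index + 1) * size] = struct.pack(fmt, value)
    with gzip.open(path, "wb") as f:
        f.write(bytes(data))                  # recompress everything

# A 5-sample little-endian int32 "RAW field", with sample 3 rewritten.
path = os.path.join(tempfile.mkdtemp(), "field.raw.gz")
with gzip.open(path, "wb") as f:
    f.write(struct.pack("<5i", 0, 1, 2, 3, 4))
patch_gzipped_sample(path, 3, 99)
with gzip.open(path, "rb") as f:
    samples = struct.unpack("<5i", f.read())
```

Sequential appends can stream through the compressor, but any write behind the end of the vector pays the full decompress/recompress cost shown here.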
From: Steve B. <sb...@ph...> - 2011-11-08 23:35:43
|
On 11-11-08 05:13 PM, D. V. Wiebe wrote: > Cool. I'll take a look when I have a moment. I'm slowly re-writing > defile in my spare time to be a getdata-backed anything -> dirfile > converter (for small values of anything). There's a dirfile -> dirfile > mode in there but, as the re-written defile is vapourware at this point, > this does seem useful. Cool. Is the idea that you're adding getdata support for useful framefile formats? For when/if you get around to looking at dirfile2subset, I've attached a somewhat corrected version. I'm considering this mostly "done" for now, in the sense that it does everything I want it to. If it were to be included in a getdata release, I would eventually need to gussy it up some (especially making usage() correct). I also have one more outstanding problem. With gzip encoding enabled on the output dirfile, I get "Operation not supported by current encoding scheme" after gd_add() for a RAW field. This seems like a perfectly sane operation to me. Am I missing something? From an even-more-vapourware perspective, I found myself wondering if utilities like dirfile2ascii and dirfile2subset would be better replaced by simpler utilities that could be combined in interesting ways. Maybe something like getdata field-aware versions of ls, cat, cp, rm, mv, etc. >> Behaviour problems: >> - gd_entry() on metafields sets gd_entry_t.field to full "parent/meta" >> name, but when using gd_madd() entry.field should be just "meta". > Yeah, I know. One of them probably should be changed. But I can't > decide which one. I should probably just let gd_madd accept the / and > then either ignore it or else ignore parent if it's present. On the > other hand, it's fairly trivial to strip the parent off of the name in > gd_entry, too. Hrm... > > (Internally, getdata stores the field name as "parent/meta" to make field > look-up easy. However, the field ingestor doesn't like it that way. 
> Although, maybe with the new-fangled "Barth-style" metafield > specification, it might be more forgiving now.) > > It's probably best to change gd_madd, since changing gd_entry breaks the > principle that the field name provided by gd_entry can be used as a > field code. I agree that changing gd_madd to accept either format is best. |
From: D. V. W. <ge...@ke...> - 2011-11-08 22:13:44
|
On Sun, Nov 06, 2011 at 05:45:57PM -0500, Steve Benton wrote: > Greetings GetData, > > I have a new utility program dirfile2subset, loosely based on > dirfile2ascii. It creates a dirfile containing a subset of the > fields/frames from the original. It's not yet fully debugged or > complete, but seems suitable for sharing. Please comment, if you > want---especially on command-line options. I am also not particularly > attached to the name. Cool. I'll take a look when I have a moment. I'm slowly re-writing defile in my spare time to be a getdata-backed anything -> dirfile converter (for small values of anything). There's a dirfile -> dirfile mode in there but, as the re-written defile is vapourware at this point, this does seem useful. > I also have comments on a few things I noticed in the writing of this. > > Corrections for the online API docs: > - the gd_entry_t element in_fields is missing its '_' > - under gd_mfield_list(), the number of elements is from gd_nmfields(), > not gd_nmframes() > - gd_add() does not take a fragment index Thanks. Fixed. > Behaviour problems: > - gd_entry() on metafields sets gd_entry_t.field to full "parent/meta" > name, but when using gd_madd() entry.field should be just "meta". Yeah, I know. One of them probably should be changed. But I can't decide which one. I should probably just let gd_madd accept the / and then either ignore it or else ignore parent if it's present. On the other hand, it's fairly trivial to strip the parent off of the name in gd_entry, too. Hrm... (Internally, getdata stores the field name as "parent/meta" to make field look-up easy. However, the field ingestor doesn't like it that way. Although, maybe with the new-fangled "Barth-style" metafield specification, it might be more forgiving now.) It's probably best to change gd_madd, since changing gd_entry breaks the principle that the field name provided by gd_entry can be used as a field code. Cheers, -dvw -- D. V. Wiebe ge...@ke... 
http://getdata.sourceforge.net/ |
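[Editor's note: the "accept either format" behaviour proposed here for gd_madd() amounts to something like the following Python sketch of the rule; it is not actual GetData code.]

```python
def normalize_meta_name(field, parent):
    """Accept a metafield name either bare ("meta") or fully qualified
    ("parent/meta"), as proposed for gd_madd(). Returns the bare name;
    a qualified name whose parent part disagrees with the stated
    parent is an error."""
    if "/" in field:
        given_parent, _, meta = field.partition("/")
        if given_parent != parent:
            raise ValueError(
                "metafield %r does not belong to parent %r" % (field, parent))
        return meta
    return field
```

Under this rule an entry returned by gd_entry() (whose field member reads "parent/meta") could be fed straight back into gd_madd() without first stripping the parent.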
From: D. V. W. <ge...@ke...> - 2011-11-08 21:21:33
|
On Mon, Nov 07, 2011 at 11:24:18AM -0500, Steve Benton wrote: > On 11-11-06 05:45 PM, Steve Benton wrote: > > Behaviour problems: > > - gd_entry_t.scalar_ind does not seem to be respected by gd_add(). I > > tried a LINCOM2 with four CARRAY parameters and in the output dirfile > > all had index <0> > Attached is a patch for getdata-0.7.3/src/add.c that resolves this > problem for me. Applied to trunk. Thanks. -dvw -- D. V. Wiebe ge...@ke... http://getdata.sourceforge.net/ |
From: Steve B. <sb...@ph...> - 2011-11-07 16:24:25
|
On 11-11-06 05:45 PM, Steve Benton wrote: > Behaviour problems: > - gd_entry_t.scalar_ind does not seem to be respected by gd_add(). I > tried a LINCOM2 with four CARRAY parameters and in the output dirfile > all had index <0> Attached is a patch for getdata-0.7.3/src/add.c that resolves this problem for me. |
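[Editor's note: for context, the behaviour the patch restores. Each scalar parameter of an entry carries a scalar_ind; a non-negative index selects a CARRAY element, written "name<index>" in the format file, while (assuming the getdata convention) an index of -1 denotes a plain CONST. A Python sketch of that mapping, not the library's serializer:]

```python
def format_scalar(name, scalar_ind):
    """Render a scalar parameter for the format file: a CARRAY element
    is written "name<ind>"; a plain CONST (scalar_ind < 0) is written
    bare. The bug fixed by the patch was emitting <0> for every index."""
    if scalar_ind < 0:
        return name
    return "%s<%d>" % (name, scalar_ind)

def parse_scalar(code):
    """Inverse of format_scalar: split "name<ind>" into (name, ind),
    returning -1 for an un-indexed field code."""
    if code.endswith(">") and "<" in code:
        name, _, ind = code[:-1].partition("<")
        return name, int(ind)
    return code, -1
```

So a LINCOM2 whose four parameters point at CARRAY elements 0..3 should round-trip through gd_add() as cal<0>, cal<1>, cal<2>, cal<3> rather than all collapsing to <0>.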
From: Steve B. <sb...@ph...> - 2011-11-06 23:05:53
|
Greetings GetData, I have a new utility program dirfile2subset, loosely based on dirfile2ascii. It creates a dirfile containing a subset of the fields/frames from the original. It's not yet fully debugged or complete, but seems suitable for sharing. Please comment, if you want---especially on command-line options. I am also not particularly attached to the name. I also have comments on a few things I noticed in the writing of this. Corrections for the online API docs: - the gd_entry_t element in_fields is missing its '_' - under gd_mfield_list(), the number of elements is from gd_nmfields(), not gd_nmframes() - gd_add() does not take a fragment index Behaviour problems: - gd_entry() on metafields sets gd_entry_t.field to full "parent/meta" name, but when using gd_madd() entry.field should be just "meta". - gd_entry_t.scalar_ind does not seem to be respected by gd_add(). I tried a LINCOM2 with four CARRAY parameters and in the output dirfile all had index <0> Cheers, -Steve |