Re: [Scalablecr-discuss] User configuration file format
From: Maksym P. <mpl...@os...> - 2016-01-26 07:59:36
Thank you for the response.

On 01/26/2016 12:54 AM, Mohror, Kathryn wrote:
> Hi Maksym,
>
>> I decided to try out SCR. I compiled it and installed it as specified in the manual.
>> Now I am trying to specify checkpoint descriptors in the user configuration file.
>
> Glad to hear you're trying out SCR!
>
>> It turns out that the documentation describes a different format from what the
>> example at https://github.com/hpc/scr/blob/master/scr.user.conf shows.
>>
>> For example, the file doc/scr_users_manual.pdf does not contain the keyword
>> CKPTDESC at all.
>>
>> Could you tell me what the correct format is?
>
> It looks like you have uncovered a bug in the example scr.user.conf file. Please use the keyword CKPT, as you found in the user's guide, for those lines instead of CKPTDESC.
>

CKPTDESC does not seem to be the only dead keyword. CACHEDESC from the example scr.user.conf is used neither in the source nor in the documentation.

>> I tried to use the format that the documentation specifies, but I get an error
>> suggesting that I probably don't have enough nodes:
>>
>> SCR v1.1.8 WARNING: rank 10 on taurusi6325: Failed to find partner
>> processes for redundancy descriptor 0, disabling checkpoint, too few nodes?
>> @ scr_reddesc.c:169
>>
>> I definitely do, because I specify SET_SIZE=1 and create a job with 4 nodes.
>
> Yes, this error is related to the SCR_SET_SIZE parameter. Try setting it to 8 and see if it works better. I believe the reason you get that message is that the set size needs to be greater than 1 for a redundancy scheme to work.
>
> Let me know if that helps! If not we can work some more on it.
>

Changing the set size and removing GROUP=WORLD helped. Thank you.

> Kathryn
>>
>> I attach all the configuration files for completeness.
>>
>> --
>> Regards,
>> Maksym Planeta

--
Regards,
Maksym Planeta