Re: [Scalablecr-discuss] User configuration file format
From: Mohror, K. <mo...@ll...> - 2016-01-25 23:54:39
Hi Maksym,

> I decided to try out SCR. I compiled it and installed as specified in the manual.
> Now I try to specify checkpoint descriptors in the user configuration file.

Glad to hear you're trying out SCR!

> It turns out that the documentation describes other format, from what an
> example at https://github.com/hpc/scr/blob/master/scr.user.conf shows.
>
> For example, file doc/scr_users_manual.pdf does not contain keyword
> CKPTDESC whatsoever.
>
> Could you tell me what is the correct format?

It looks like you have uncovered a bug in the example scr.user.conf file. In those lines, please use the keyword CKPT, as you found in the user's guide, instead of CKPTDESC.

> I tried to use the one which documentation specifies, but I get an error which
> tells, that I probably don't have enough nodes:
>
> SCR v1.1.8 WARNING: rank 10 on taurusi6325: Failed to find partner
> processes for redundancy descriptor 0, disabling checkpoint, too few nodes?
> @ scr_reddesc.c:169
>
> I definitely do, because I specify SET_SIZE=1 and create a job with 4 nodes.

Yes, this error is related to the SCR_SET_SIZE parameter. Try setting it to 8 and see if it works better. I believe you get that message because the set size needs to be greater than 1 for a redundancy scheme to work.

Let me know if that helps! If not, we can work some more on it.

Kathryn

> I attach all the configuration files for completeness.
>
> --
> Regards,
> Maksym Planeta
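
For reference, a minimal sketch of what a corrected descriptor line in the user configuration file might look like, using the CKPT keyword and a set size of 8 as suggested in the reply. Only the CKPT keyword and the SET_SIZE value come from this thread. The remaining fields (SCR_COPY_TYPE, INTERVAL, GROUP, STORE, TYPE) and their values are assumptions modelled on the style of the user's manual, so check them against doc/scr_users_manual.pdf for your SCR version.

```
# Assumed: tell SCR to read checkpoint descriptors from this file
SCR_COPY_TYPE=FILE

# CKPT (not CKPTDESC) introduces a checkpoint descriptor.
# SET_SIZE must be greater than 1 for a redundancy scheme to find partners.
# INTERVAL, GROUP, STORE and TYPE values here are illustrative only.
CKPT=0 INTERVAL=1 GROUP=NODE STORE=/tmp TYPE=XOR SET_SIZE=8
```

If the warning from scr_reddesc.c persists with a set size above 1, the remaining descriptor fields are the next thing to compare against the manual.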