From: Andrey I. <ign...@im...> - 2023-09-21 12:03:28
Thank you for your replies.

> I will regret loss of
>> date info and will maybe look at changes that save a datestamp in the
>> repository when the revision is checked in so that the date displayed
>> can be the date of the latest checkin and keyed to the revision number
>> rather than reflecting when the version was built. I can have a go at
>> retrieving this from the "$Id:" stuff in version.h so it will in fact
>> reflect the last checking that updated version.. The script
>> scripts/commit.sh does that so people who use it are looked after!

For cases where timestamps need to be retained, the reproducible-builds
project developed a specification,
https://reproducible-builds.org/specs/source-date-epoch/, which introduces
the SOURCE_DATE_EPOCH environment variable. SOURCE_DATE_EPOCH is meant to
be used in the source code in place of calls to the 'current time'
functions. There are examples of its use in Makefiles and C/C++ code at
https://reproducible-builds.org/docs/source-date-epoch/. The value of
SOURCE_DATE_EPOCH should be set to the last modification time of the
source, incorporating any packaging-specific modifications, so it is meant
to be supplied by the packaging system. It is also possible to use other
timestamps, provided all of them are before the value of
SOURCE_DATE_EPOCH. According to the project, the major distributions and
some tools respect this specification.

>> BUILDING Reduce should not care about randomness, but I incline fairly
>> strongly to a view that when run the random() function should (by
>> default) behave differently each time. The user is given an option to
>> fix the seed at startup if they need that. For chasing a particular bug
>> repeatable behaviour can be vital - for proper testing and performance
>> measurements if things that pretend to be random are not then bugs
>> remain undiscovered and performance engineering can be based on the very
>> particular circunstances of one non-random sequence.
>> If the build scripts all specify "-r 1" that will make the BUILDS
>> deterministic and anybody who needs deterministic tests can use the
>> same flag. Will that suffice?

If the build process does not make any calls to random(), this flag is not
needed in the build scripts. But it may be kept (or introduced?) there
temporarily until we make Reduce reproducible.

> Then address space
> randominzation during a build would very obviously be able to alter it.
> If these days the ordering of addresses allocated by the underlying OS
> during build are unstable then when I sort on addresses while building
> various symbol tables etc who knows what may happen. reduce.img is
> (these days) a serialized version of the heap. In serialize.cpp you will
> see that references to compiled functions are processed using a CRC so I
> can mention them using an integer handle. If addresses of all functions
> in the executable are the same from build to build that will be
> repeatable, but if not it will be a challange. Certainly release and
> debug builds will put functions at different locations. Will address
> space randomization and other "helpful" modern features intrude? Your
> help may be useful!!!

Thank you for mentioning address space randomization; I didn't know about
it. But as I mentioned, I observe exactly the same executable files
reduce, bootstrapreduce, and csl from build to build. Does that mean that
if any address randomization is present, it all happens at run time, not
at build time?

>> If one builds two copies of anything in different directories then there
>> are loads of divergencies. I do not know if people keen on
>> reproducability want it to extend that far, so until that is explained
>> (and other things sorted!) I will not address that issue. Certainly at
>> present information about source paths and build locations tends to get

The problem with build paths is nowadays considered solved by simply
building in predictable paths.
It can be just a fixed path, or a path that contains a hash of all inputs
used to build a package, as in GNU Guix and NixOS.

> Well image files go through a compression process so changes in content
> - even if small - may lead to a different file-size.

That is interesting. I think that switching compression off would be an
important step towards clarifying this. Is it hard to switch off?

> And this is
>> presumably EITHER a case where something is still sensitive to the clock
>> or or
>> sensitive to memory layout that can change with Linux memory address
>> randomization.

Sometimes I have observed that two separate builds performed on different
days are exactly the same, including all the images. I think this allows
us to exclude the "clock" possibility.

Best regards,
Andrey Ignatenko
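
To make the SOURCE_DATE_EPOCH suggestion above a bit more concrete, here is
a minimal C++ sketch of how a build-date string could be derived from that
variable instead of from the clock. The function names (parse_epoch,
build_timestamp, format_build_date) are hypothetical and are not taken from
the Reduce sources; the only thing assumed from the specification is that
SOURCE_DATE_EPOCH holds a decimal count of seconds since the Unix epoch.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <ctime>
#include <optional>
#include <string>

// Hypothetical helper: parse a SOURCE_DATE_EPOCH-style string, i.e. a
// non-negative decimal number of seconds since the Unix epoch.
// Returns std::nullopt if the text is missing or malformed.
std::optional<std::time_t> parse_epoch(const char *text) {
    if (text == nullptr ||
        !std::isdigit(static_cast<unsigned char>(*text))) {
        return std::nullopt;  // unset, empty, signed, or not a number
    }
    errno = 0;
    char *end = nullptr;
    unsigned long long value = std::strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0') {
        return std::nullopt;  // out of range or trailing garbage
    }
    return static_cast<std::time_t>(value);
}

// Hypothetical helper: the timestamp to embed in a build. Uses
// SOURCE_DATE_EPOCH when it is set and valid, and only otherwise
// falls back to the current time.
std::time_t build_timestamp() {
    if (auto epoch = parse_epoch(std::getenv("SOURCE_DATE_EPOCH"))) {
        return *epoch;
    }
    return std::time(nullptr);
}

// Format in UTC so the result does not depend on the builder's time zone.
std::string format_build_date(std::time_t when) {
    std::tm *utc = std::gmtime(&when);
    if (utc == nullptr) {
        return "unknown";
    }
    char buffer[32];
    if (std::strftime(buffer, sizeof buffer,
                      "%Y-%m-%d %H:%M:%S UTC", utc) == 0) {
        return "unknown";
    }
    return buffer;
}
```

With something like this, code that currently prints the time of the build
could call format_build_date(build_timestamp()) instead, and a packaging
system that exports SOURCE_DATE_EPOCH would get the same string on every
rebuild.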