Recently, a number of projects joined efforts to make the builds of their software reproducible. One of the goals of this effort is getting rid of timestamps in generated files, such as the documentation and man pages.
For this purpose, a specification of the SOURCE_DATE_EPOCH environment variable was developed, which various build systems can set and various projects can read.
The attached patch (initially submitted by Chris Lamb to Debian #831779) adds support for that environment variable to the docutils date directive.
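As a rough illustration of what reading the variable involves (this is not the attached patch; the function name `format_build_date` and its parameters are made up for this sketch), a tool could do something like the following:

```python
import os
import time


def format_build_date(fmt="%Y-%m-%d", environ=None):
    """Return a date string, honouring SOURCE_DATE_EPOCH if it is set.

    Hypothetical helper for illustration only, not Docutils code.
    """
    environ = os.environ if environ is None else environ
    source_date_epoch = environ.get("SOURCE_DATE_EPOCH")
    if source_date_epoch:
        # Reproducible case: a fixed timestamp, interpreted as UTC.
        return time.strftime(fmt, time.gmtime(int(source_date_epoch)))
    # Default case: the current time in the local time zone.
    return time.strftime(fmt, time.localtime())


print(format_build_date(environ={"SOURCE_DATE_EPOCH": "0"}))  # 1970-01-01
```

Passing the environment as an argument is only a convenience for testing; a real tool would normally read `os.environ` directly.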
Applied in revision 7953.
Thanks for the patch.
Thanks for applying it so quickly!
The patch has an unwanted side effect: previously, the local time was used for the date, but the patch replaces it with gmtime.
A possible fix is
Thinking about it, it is not in line with Docutils policy to have output depend on
an environment variable being set without an "opt-in" setting.
Changed in revision 7955.
What should the expected behaviour be when the variable is expected but not set?
Currently, the user gets a warning and using the date fails. If this is OK, please close the bug; if not, specify the desired behaviour.
Thanks for the gmtime fix (and sorry that I did not notice it).
I don't like the introduction of the setting, because what we want in Debian is that docutils respects SOURCE_DATE_EPOCH if it's set, and uses the current time otherwise (this is what most other tools do, e.g. Sphinx). After your change, it looks like we will have to carry a patch.
However, if you remove the warning (and silently fall back to the current time), that will be a bit better for us (we will have to patch just the default value of the setting).
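The behaviours under discussion (opt-in with a warning, opt-in with a silent fallback, and ignoring the variable entirely) can be contrasted in a small sketch. The names `directive_date`, `opt_in` and `on_missing` are hypothetical and do not correspond to any actual Docutils setting; returning `None` merely stands in for "using the date fails".

```python
import time
import warnings


def directive_date(environ, opt_in=True, on_missing="warn", fmt="%Y-%m-%d"):
    """Hypothetical helper; not the Docutils implementation."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if opt_in and value:
        # Variable requested and present: fixed timestamp, read as UTC.
        return time.strftime(fmt, time.gmtime(int(value)))
    if opt_in and on_missing == "warn":
        # Variable requested but missing: warn and produce no date.
        warnings.warn("SOURCE_DATE_EPOCH is expected but not set")
        return None
    # Setting disabled, or silent fallback: current local time.
    return time.strftime(fmt, time.localtime())


print(directive_date({"SOURCE_DATE_EPOCH": "86400"}))  # 1970-01-02
print(directive_date({}, on_missing="fallback"))       # today's local date
```

With `on_missing="fallback"` as the default, a distribution would only need to flip the default of `opt_in` to get the "respect it if set, otherwise use the current time" behaviour.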
If it is fine that a missing SOURCE_DATE_EPOCH variable can be silently ignored, I'll remove the warning.
But there are more generic problems:
(for the average user) behaviour depending on an environment variable
to correct just one package¹ seems overkill.
¹ https://tests.reproducible-builds.org/debian/issues/unstable/timestamps_in_manpages_generated_by_rst2man_issue.html
the current patch does not even fix the only occurrence of the problem!!
the man-page of git-hub generated with rst2man does not contain a "date" directive
this can be turned off with the "datestamp" configuration setting
making the datestamp dependent on SOURCE_DATE_EPOCH would be feasible
(after consensus on the devel list).
making the "date" directive dependent on SOURCE_DATE_EPOCH would be a change in
the specification of the reStructuredText language. This is similar to
the case of \date and \time TeX primitives.
TeXLive uses SOURCE_DATE_EPOCH_TEX_PRIMITIVES to control this behaviour
(could you point me to the original discussion leading to this decision?)
For consistency, Docutils might need to use this second variable
(now renamed to FORCE_SOURCE... ), too.
Hi Günter,
Today you reminded me about your proposal to disable the timestamps in docutils.conf. I want to ask you a question about it. You say:
Are you sure you are correct here? To me, it looks like it does use a date directive on line 16.
And setting “datestamp:” in /etc/docutils.conf like you suggested does not seem to remove the datestamp from the generated manpage.
Also, do you know a test case where setting that option would change something? For me neither rst2html nor rst2man puts the datestamp into the footer unless I explicitly ask them to do so.
(Upd: Günter replied to this message on the mailing list.)
Last edit: Dmitry Shachnev 2016-12-11
Thanks for the comments, I will try to reply inline:
The tags on that page are filled manually. If there's only one package tagged it doesn't mean there aren't more such packages.
Indeed you are right.
Yes, I think it should use the same algorithm as the date directive, whatever it is.
Please keep in mind that a set SOURCE_DATE_EPOCH variable will be a very rare case. Most users of docutils will never have it set, and those who have it set probably know what to expect (these users are usually build machines, not humans).
I think it was http://tug.org/pipermail/tex-k/2016-May/002692.html.
Please read the “Documentation update” section of this blog post and the patch linked from there. In short, we would still prefer if SOURCE_DATE_EPOCH was obeyed unconditionally.
Thanks for the comments. Discussion below.
As the current patch for the reStructuredText "date" directive
* does not solve the problem,
* has not a single real use case,
* is a hack with (rare) side-effects,
* introduces a possibly useless configuration option,
* still lacks documentation,
I am reverting it for the time being (leaving markers).
I agree that reproducible builds are a "good thing", but I don't agree that
tying a behaviour change to an environment variable is the only possible option.
The simplest solution from the POV of software packagers is not necessarily the
best solution from a general view.
My recommendation is to use
in a config file (e.g. /etc/docutils.conf). This would solve the existing issue(s) without the need for hacks and could also replace the current (nonfunctional!) Debian patch.
If any package really uses the "date" directive in its documentation,
this is better fixed upstream there. In contrast to TeX, there are no
known existing cases of "nonreproducible" documentation due to the use
of the "date" directive.
but even this is not required if the build system disables the
datestamp in the Docutils configuration. Please raise this issue on
docutils-devel if you still think otherwise.
...
Yes, in theory and if all goes well, this will not hurt. However, if
for some reason (accident, sabotage, ransomware, ...) this variable is
set for a non-developer, solving the mysterious behaviour is even
harder when it is a rare case.
After all, Docutils is not a development tool but a generic document
generator that may also be used by dentists, lawyers or linguists.
(Would you also ask the LyX document processor to change the
"date-insert" function to listen to SOURCE_DATE_EPOCH because it may
be used to generate software documentation from a script?)
Last edit: Günter Milde 2016-10-20
Hello Günter,
In fact we are already using SOURCE_DATE_EPOCH to make hundreds of packages reproducible, most notably through GCC, which applied our patch to support this environment variable and has thereby made over 400 packages reproducible.
We have in fact asked dozens of packages to adopt this, and most have done so. See here for a list:
https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal#Reading_the_variable
In response to your point about "localtime" being used, you may mean two things:
The previous code did not do (1) either. But if you want (2), remember that for binary distributions like Debian (which want to reproduce their builds), readers might be in many, many different time zones. So using the local time of the build machine is not really useful. Also, if someone sets SOURCE_DATE_EPOCH, their primary concern is reproducible builds, and local time zones are not important.
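To make the time zone point concrete, here is a small sketch (the helper `date_in_zone` and the chosen offsets are purely illustrative) showing that the same SOURCE_DATE_EPOCH value yields different calendar dates depending on the zone it is rendered in, which is why a fixed zone such as UTC is needed for reproducible output:

```python
from datetime import datetime, timedelta, timezone

# Example timestamp: 2001-09-09 01:46:40 UTC.
epoch = 1_000_000_000


def date_in_zone(timestamp, utc_offset_hours):
    """Render a Unix timestamp as a date in a fixed-offset time zone."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(timestamp, tz=zone).strftime("%Y-%m-%d")


print(date_in_zone(epoch, 0))   # 2001-09-09 (UTC)
print(date_in_zone(epoch, -5))  # 2001-09-08 (a build machine at UTC-5)
print(date_in_zone(epoch, 9))   # 2001-09-09 (a build machine at UTC+9)
```

Two build machines on different sides of the date line would thus embed different dates if each used its own local time.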
Ideally you could write some JavaScript to dynamically convert the embedded date/time into any reader's local time zone, if you really wanted to use time zones. But doxygen decided that this was not worth the complexity, and decided to just use UTC for SOURCE_DATE_EPOCH. For example see discussion here:
https://github.com/doxygen/doxygen/pull/477
The reason we chose this environment variable is to be standard across many tools, as a cost-reduction mechanism. If each tool decided to implement its own mechanism to achieve this, it would not really reduce the cost. For more discussion see here:
https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal#A.22We.27ll_add_a_command-line_flag_instead.22
If you agree that reproducible builds are "a good thing" then wouldn't you also agree that it's a good thing to reduce the cost to achieve this? Users of docutils shouldn't have to specifically "opt into" reproducible builds, that's why we want to set it centrally as an environment variable in Debian's build machines.
You asked us to provide "real use cases" on when SOURCE_DATE_EPOCH would be used. Can you provide "real use cases" on when SOURCE_DATE_EPOCH would be set accidentally?
All users can read the documentation. If you are worried your users may trip up, you may simply document the fact that you are supporting SOURCE_DATE_EPOCH. For example see GCC environment variables:
https://gcc.gnu.org/onlinedocs/gcc/Environment-Variables.html
In fact we have done this several times, see previous list linked. In the area of documentation: TeX, doxygen, sphinx - they have all adopted our patches. SOURCE_DATE_EPOCH being a standard environment variable helps this effort a lot, which would not be the case if each tool came up with its own mechanism.
Last edit: Ximin Luo 2016-10-17
Dear Ximin Luo,
This is about the proposed and overhastily applied patch for Docutils. This specific patch was buggy: it missed the cause of the non-reproducible builds!
I assume that SOURCE_DATE_EPOCH is now established as a tool for reproducible builds.
My objection was about another bug in the attached patch that changed the output of the "date" directive when SOURCE_DATE_EPOCH is not set (see Dmitry's reply #bd29).
...
I would like to reduce the "overall cost" -- not just for binary package builders but for all Docutils users. And I prefer a safe and good solution over a cheap and fast one.
...
This is why I wrote
Reverting the buggy patch is no decision against the proposal. The status is still "pending", not "rejected" nor "accepted".
Hi Günter,
I am once again sorry that the initial patch was buggy. However, you fixed it yourself in r7954, didn’t you?
You reverted the patch after you fixed it, so it was not buggy at the time of reverting. Or did I miss anything?
Are there any other remaining issues with the patch that I can help to fix?
The original patch had 2 issues:
a) side-effects (changed the output of the "date" directive when SOURCE_DATE_EPOCH is not set) This was fixed before reverting the patch.
b) applying the cure at the wrong place: changing the date directive does not make any Debian package build reproducibly. For this, a fix to the auto-added timestamps would be required (see my comment from 2016-07-27).
The patch would need a complete overhaul.
Also, it requires a change in the specification of reStructuredText which in turn requires approval from all Docutils developers. Ask for it on docutils-devel.
In the meantime, Debian packages depending on Docutils can be made reproducible by a simple patch to the python-docutils Debian package:
turn off timestamps in /etc/docutils.conf (see my comments above).
Last edit: Günter Milde 2016-11-29
By now, SOURCE_DATE_EPOCH seems to be established as a means for reproducible builds, so anyone confronted with unexpected behaviour due to an accident, a prank, or a malicious actor can probably find the cause with a simple internet search.
Hence I withdraw my objection and suggest that Docutils 1.0 take this environment variable into account.
In addition to the patch from Debian, we will need a patch for the documentation and a test case.