From: Faré <fa...@gm...> - 2009-10-06 19:31:49
|
I remember Christophe worked towards getting some determinism in FASLs, which was a great achievement.

Unhappily, when I compile the same file twice, even an empty file, I get a different FASL in which a few bytes change, which suggests a timestamp of some kind (using 1.0.31.0.debian). Does anyone know what is causing this? Can it be removed? This doesn't seem to happen with the cfasl portion.

Also, what special variables may I bind to maximize the chances that compiling the same Lisp file twice will yield the same FASL, even though there may have been slight changes in the environment? *gensym-counter*, whatever is used by gentemp, and the seed of RANDOM come to mind. Is there anything else? And is there a documented way to turn an object (say, an integer, through sxhash) into a valid random-state (other than iterating N times with a large N from the initial state)?

Rationale: I'm adding content-driven computation to XCVB, and this would make it possible to avoid needless computations when the meaning of a Lisp file hasn't changed.

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
On-line, adj.: The idea that a human being should always be accessible to a computer. |
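For readers following along, the kind of binding asked about above might be sketched as follows. This is only an illustration, not an answer from the SBCL maintainers: `call-with-pinned-compile-state` and its `seed` parameter are hypothetical names, `sb-ext:seed-random-state` is an SBCL extension found in later releases (it may well be absent from 1.0.31), and the counter used by gentemp is implementation-internal, so it is deliberately left alone here.

```lisp
;; Sketch only: pin the standard specials mentioned in the message
;; around an arbitrary thunk, for example a call to COMPILE-FILE.
(defun call-with-pinned-compile-state (thunk &key (seed 0))
  (let ((*gensym-counter* 0)
        (*random-state*
          ;; Assumption: SB-EXT:SEED-RANDOM-STATE accepts an integer seed.
          ;; Elsewhere we merely copy the current state, which is
          ;; repeatable within one image but not across images.
          #+sbcl (sb-ext:seed-random-state seed)
          #-sbcl (make-random-state nil)))
    (funcall thunk)))

;; Usage (the path is a placeholder):
;; (call-with-pinned-compile-state
;;   (lambda () (compile-file "foo.lisp")))
```

Binding rather than assigning keeps the caller's own gensym counter and random state untouched once the thunk returns.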
From: Faré <fa...@gm...> - 2009-10-08 05:51:25
|
2009/10/6 Faré <fa...@gm...>:
> I remember Christophe worked towards getting some determinism in
> FASLs, which was a great achievement.
> Unhappily, when I compile the same file twice, even an empty file, I
> get a different FASL with some binary number change, which suggests a
> timestamp of some kind (using 1.0.31.0.debian). Does anyone know what
> is causing this? Can it be removed? This doesn't seem to happen with
> the cfasl portion.
>
> Also, what special variables may I bind to maximize the chances that
> compiling a same Lisp file twice will yield the same FASL, even though
> there may have been slight changes in the environment?
> *gensym-couter*, whatever is used by gentemp, and the seed of RANDOM
> come to mind. Is there anything else? And is there a documented way to
> turn an object (say, an integer, through sxhash) into a valid
> random-state (other than iterating N times with a large N from the
> initial state)?
>
> Rationale: I'm adding content-driven computation to XCVB, and this
> would allow to avoid needless computations when the meaning of a Lisp
> file hasn't changed.
>

Considering that compiling with debug 0 seems to eliminate the tag, and seeing a start-time field in source-info, I suspect that said source-info is the source of the non-determinism I am observing. Is this start-time field ever useful to anyone? Can we just get rid of it?

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
If this country is worth saving, it's worth saving at a profit. -- H. L. Hunt |
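The experiment described above (compile twice, compare the results) can be reproduced with a small byte-level comparison. The sketch below uses only standard Common Lisp; `file-octets` and `first-fasl-difference` are hypothetical helper names, and the paths in the usage comment are placeholders.

```lisp
(defun file-octets (path)
  "Return the contents of PATH as a vector of octets."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((octets (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence octets in)
      octets)))

(defun first-fasl-difference (fasl-a fasl-b)
  "Return the index of the first differing octet, or NIL if the files match."
  (mismatch (file-octets fasl-a) (file-octets fasl-b)))

;; Usage: compile the same source twice, a little apart in time,
;; then look for the first byte that differs.
;; (compile-file "foo.lisp" :output-file "/tmp/foo-1.fasl")
;; (sleep 2)
;; (compile-file "foo.lisp" :output-file "/tmp/foo-2.fasl")
;; (first-fasl-difference "/tmp/foo-1.fasl" "/tmp/foo-2.fasl")
```

A non-NIL result only says where the files first diverge; whether that offset falls inside the debug info has to be checked separately.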
From: Tobias C. R. <tc...@fr...> - 2009-10-08 07:13:06
|
Faré <fa...@gm...> writes:

> 2009/10/6 Faré <fa...@gm...>:
>> I remember Christophe worked towards getting some determinism in
>> FASLs, which was a great achievement.
>> Unhappily, when I compile the same file twice, even an empty file, I
>> get a different FASL with some binary number change, which suggests a
>> timestamp of some kind (using 1.0.31.0.debian). Does anyone know what
>> is causing this? Can it be removed? This doesn't seem to happen with
>> the cfasl portion.
>>
>> Also, what special variables may I bind to maximize the chances that
>> compiling a same Lisp file twice will yield the same FASL, even though
>> there may have been slight changes in the environment?
>> *gensym-couter*, whatever is used by gentemp, and the seed of RANDOM
>> come to mind. Is there anything else? And is there a documented way to
>> turn an object (say, an integer, through sxhash) into a valid
>> random-state (other than iterating N times with a large N from the
>> initial state)?
>>
>> Rationale: I'm adding content-driven computation to XCVB, and this
>> would allow to avoid needless computations when the meaning of a Lisp
>> file hasn't changed.
>>
> Considering that compiling with debug 0 seems to eliminate the tag,
> and seeing a start-time field in source-info, I suspect that said
> source-info is the source of the non-determinism I am observing. Is
> this start-time field ever useful to anyone? Can we just get rid of
> it?

It's used by SBCL's swank backend for determining the right source context even after a file was interactively modified since the last compilation.

-T. |
From: Faré <fa...@gm...> - 2009-10-08 14:07:02
|
2009/10/8 Tobias C. Rittweiler <tc...@fr...>:
>> 2009/10/6 Faré <fa...@gm...>:
>> Considering that compiling with debug 0 seems to eliminate the tag,
>> and seeing a start-time field in source-info, I suspect that said
>> source-info is the source of the non-determinism I am observing. Is
>> this start-time field ever useful to anyone? Can we just get rid of
>> it?
>
> It's used by SBCL's swank backend for determining the right source
> context even after a file was interactively modified since the last
> compilation.
>

How exactly does the time tag help you? Are you just comparing for equality? And what do you do in case of inequality? Could you do with the sxhash of (cons source optimization-settings) instead?

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
It has been my observation that most people get ahead during the time that others waste. -- Henry Ford |
From: Tobias C. R. <tc...@fr...> - 2009-10-08 14:49:57
|
Faré <fa...@gm...> writes:

> How exactly does the time tag help you? Are you just comparing for equality?
> And what do you do in case of inequality? Could you do with the sxhash
> of (cons source optimization-settings) instead?

It compares with the file-write-date of a buffer's file.

-T. |
From: Faré <fa...@gm...> - 2009-10-08 15:43:58
|
2009/10/8 Tobias C. Rittweiler <tc...@fr...>:
> Faré <fa...@gm...> writes:
>
>> How exactly does the time tag help you? Are you just comparing for equality?
>> And what do you do in case of inequality? Could you do with the sxhash
>> of (cons source optimization-settings) instead?
>
> It compares with the file-write-date of a buffer's file.
>

What if I were using a side-channel to give you the tthsum of the fasl loaded and of the lisp file that was used to compile said fasl? You could instead check whether the package and variable xcvb-master:*loaded-grains* exist, then extract from that variable the tthsum of the source code used. I could add whichever API you like in xcvb-master so that you don't have to maintain details you're not interested in (such as tthsum, etc.).

The issue I'm trying to solve here is that I'd like to give innocuous source modifications (whitespace, comments, lexical name change and other trivial refactorings, removal of an unneeded dependency, etc.) as much chance as possible to lead to identical fasls. That would help detect that those changes were indeed semantics-preserving, and make it possible to avoid triggering a long chain of recompilations, to prune the unneeded dependencies, etc. This would be utterly defeated by any requirement to store date, exact source text, overly precise locations, etc., in the fasl itself -- though the same information could be stored in a different file, or otherwise provided to you.

So I'd like for SBCL to provide a reliable way to either
* disable saving of pathnames and dates in the FASL, or
* store them in a side-file instead, or
* strip a fasl of its debug info so the code can be compared.

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
To do evil a human being must first of all believe that what he's doing is good. -- Alexander Solzhenitsyn |
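To make the side-channel idea above concrete, here is a rough sketch of an out-of-band table that records a digest of the source and of the fasl for each loaded grain. Everything in it is hypothetical: `*grain-digests*`, `record-grain` and `source-changed-p` are illustrative names and not the xcvb-master API, and a 32-bit FNV-1a checksum stands in for tthsum purely to keep the example self-contained.

```lisp
(defun fnv1a-32 (octets)
  "Stand-in digest (32-bit FNV-1a); a real tool would use tthsum or similar."
  (let ((hash #x811C9DC5))                      ; FNV offset basis
    (loop for octet across octets
          do (setf hash (logand #xFFFFFFFF
                                (* #x01000193   ; FNV prime
                                   (logxor hash octet)))))
    hash))

(defvar *grain-digests* (make-hash-table :test 'equal)
  "Hypothetical side table: grain name -> (source-digest . fasl-digest).")

(defun record-grain (name source-octets fasl-octets)
  "Remember the digests of the source and fasl used for grain NAME."
  (setf (gethash name *grain-digests*)
        (cons (fnv1a-32 source-octets) (fnv1a-32 fasl-octets))))

(defun source-changed-p (name source-octets)
  "True if NAME was recorded with a different source digest."
  (let ((entry (gethash name *grain-digests*)))
    (and entry (/= (car entry) (fnv1a-32 source-octets)))))
```

Because the digests live outside the fasl, two different sources that compile to the same code can still produce byte-identical fasls, which is the property the message is after.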
From: James Y K. <fo...@fu...> - 2009-10-14 22:05:32
|
On Oct 8, 2009, at 10:48 AM, Tobias C. Rittweiler wrote:
> It compares with the file-write-date of a buffer's file.

Hm, would it make sense to just use the timestamp of the fasl file itself? That is: compare the timestamp of the fasl file with the timestamp of the associated lisp file. No extra information need be included within the fasl itself.

That's basically what GDB does in order to print its "Warning: source file is more recent than executable." messages. Seems to work well enough there -- it simply checks the timestamp of the .so or executable vs the timestamp of the source file.

James |
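The check suggested above needs nothing beyond standard `file-write-date`. A minimal sketch, with `source-newer-than-fasl-p` as a hypothetical name (this is not what swank actually does):

```lisp
(defun source-newer-than-fasl-p (source fasl)
  "True if SOURCE was written after FASL, judging only by file dates."
  (let ((source-date (file-write-date source))
        (fasl-date (file-write-date fasl)))
    ;; FILE-WRITE-DATE may return NIL when the date cannot be determined;
    ;; in that case we cannot claim the source is newer.
    (and source-date fasl-date (> source-date fasl-date))))

;; Usage (paths are placeholders):
;; (when (source-newer-than-fasl-p "foo.lisp" "foo.fasl")
;;   (warn "Source file is more recent than fasl."))
```

Universal times have one-second resolution, so an edit made within the same second as the compilation would go unnoticed by this test.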
From: James Y K. <fo...@fu...> - 2009-10-08 17:15:00
|
On Oct 8, 2009, at 10:48 AM, Tobias C. Rittweiler wrote:

> Faré <fa...@gm...> writes:
>
>> How exactly does the time tag help you? Are you just comparing for
>> equality?
>> And what do you do in case of inequality? Could you do with the
>> sxhash
>> of (cons source optimization-settings) instead?
>
> It compares with the file-write-date of a buffer's file.

Perhaps a simple fix would be to have sbcl write an md5sum of the source file instead of a timestamp of the source file? |
From: Faré <fa...@gm...> - 2009-10-08 17:54:19
|
2009/10/8 James Y Knight <fo...@fu...>:
>
> On Oct 8, 2009, at 10:48 AM, Tobias C. Rittweiler wrote:
>
>> Faré <fa...@gm...> writes:
>>
>>> How exactly does the time tag help you? Are you just comparing for
>>> equality?
>>> And what do you do in case of inequality? Could you do with the
>>> sxhash
>>> of (cons source optimization-settings) instead?
>>
>> It compares with the file-write-date of a buffer's file.
>
> Perhaps a simple fix would be to have sbcl write an md5sum of the
> source file instead of a timestamp of the source file?
>

That would be better than a date in that at least the result would be deterministic, but it would make the most trivial change in a comment cause the fasl to always be different, even if the generated code is identical.

I'd much rather be able to strip the fasl of any of this file or time information, and/or carry it out-of-band.

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
A real person has two reasons for doing anything ... a good reason and the real reason. |
From: Faré <fa...@gm...> - 2009-10-09 12:51:35
|
Would it satisfy SBCL and SLIME maintainers if I
* added an option to COMPILE-FILE so that file and date info would not be saved in the file, and
* added an option to LOAD so that you can add that information back into the debug info when you load the fasl?

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
Power tends to corrupt and absolute power corrupts absolutely. That unalterable rule applies both to God and man. -- John Emerich Edward Dalberg-Acton (Lord Acton) in a letter to Bishop Mandell Creighton, April 5, 1887

2009/10/8 Faré <fa...@gm...>:
> 2009/10/8 James Y Knight <fo...@fu...>:
>>
>> On Oct 8, 2009, at 10:48 AM, Tobias C. Rittweiler wrote:
>>
>>> Faré <fa...@gm...> writes:
>>>
>>>> How exactly does the time tag help you? Are you just comparing for
>>>> equality?
>>>> And what do you do in case of inequality? Could you do with the
>>>> sxhash
>>>> of (cons source optimization-settings) instead?
>>>
>>> It compares with the file-write-date of a buffer's file.
>>
>> Perhaps a simple fix would be to have sbcl write an md5sum of the
>> source file instead of a timestamp of the source file?
>>
> Would be better than a date in that at least the result would be
> deterministic, but it would make the most trivial change in a comment
> cause the fasl to always be different, even if the generated code is
> identical.
>
> I'd much rather be able to strip the fasl from any of this file or
> time information, and/or carry it out-of-band.
>
> [ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
> A real person has two reasons for doing anything ... a good reason and
> the real reason.
> |
From: Daniel H. <dhe...@te...> - 2009-10-10 04:16:09
|
On Thu, 8 Oct 2009, Faré wrote:
> 2009/10/8 James Y Knight <fo...@fu...>:
>>
>> On Oct 8, 2009, at 10:48 AM, Tobias C. Rittweiler wrote:
>>
>>> Faré <fa...@gm...> writes:
>>>
>>>> How exactly does the time tag help you? Are you just comparing for
>>>> equality? And what do you do in case of inequality? Could you do with
>>>> the sxhash of (cons source optimization-settings) instead?
>>>
>>> It compares with the file-write-date of a buffer's file.
>>
>> Perhaps a simple fix would be to have sbcl write an md5sum of the
>> source file instead of a timestamp of the source file?
>>
> Would be better than a date in that at least the result would be
> deterministic, but it would make the most trivial change in a comment
> cause the fasl to always be different, even if the generated code is
> identical.

I understand your desire to do this, but this is an unusually strict requirement. For example, GCC's output fails this test when debug is enabled, unless GNU objcopy is used to enable debuglink.

To refine James' idea, would it be hard to calculate a hash of the sexprs in a source file?

Later,
Daniel |
From: Attila L. <att...@gm...> - 2009-10-10 06:27:54
|
> I understand your desire to do this, but this is an unusually strict
> requirement. For example, GCC's output fails this test when debug is
> enabled, unless GNU objcopy is used to enable debuglink.

well, using the gnu toolchain as the basis of a comparison is...

> To refine James' idea, would it be hard to calculate a hash of the sexprs in
> a source file?

some people use reader macros that read into CLOS instances that are macroexpanded away (like our lib, cl-quasi-quote). those sexps are not guaranteed to be as simple as you think.

-- attila |
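For plain list-and-atom sources, the "hash of the sexprs" idea discussed above can be sketched with the standard reader and `sxhash`. The names `read-all-forms` and `source-forms-hash` are hypothetical, and the caveat just raised applies in full: `sxhash` is only required to agree for objects that are `equal`, so forms containing CLOS instances produced by reader macros would not hash by content. It is also a weak hash that implementations may compute from only part of a large tree.

```lisp
(defun read-all-forms (stream)
  "Read every top-level form from STREAM with the current readtable."
  (let ((*read-eval* nil)          ; refuse #. while merely hashing
        (eof (list nil)))          ; fresh object usable as an EOF marker
    (loop for form = (read stream nil eof)
          until (eq form eof)
          collect form)))

(defun source-forms-hash (source-text)
  "Hash the forms read from SOURCE-TEXT, ignoring whitespace and comments."
  (with-input-from-string (in source-text)
    (sxhash (read-all-forms in))))

;; Usage: both calls read the same list of forms, so the hashes agree.
;; (source-forms-hash "(defun f (x) x)")
;; (source-forms-hash (format nil "(defun f (x) ; identity~%  x)"))
```

Note that reading interns symbols in the current package, so the result depends on `*package*` and the readtable in effect, and renaming a lexical variable still changes the hash.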
From: Faré <fa...@gm...> - 2009-10-10 14:30:54
|
2009/10/10 Attila Lendvai <att...@gm...>:
>> I understand your desire to do this, but this is an unusually strict
>> requirement. For example, GCC's output fails this test when debug is
>> enabled, unless GNU objcopy is used to enable debuglink.
>
> well, using the gnu toolchain as the basis of a comparison is...
>
>> To refine James' idea, would it be hard to calculate a hash of the sexprs in
>> a source file?
>
>
> some people use reader macros that read into CLOS instances that are
> macroexpanded away (like our lib, cl-quasi-quote). those sexps are not
> guaranteed to be as simple as you think.
>

Moreover, including the hash of the source in the object defeats part of the purpose, which is to be able to have different sources with the same object file from which we can conclude that they are semantically equivalent (e.g. whitespace change, comments, trivial refactoring, lexical name change, etc.).

Ideally, you should be able to separate or strip the debug info.

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
If the human mind were simple enough to understand, we'd be too simple to understand it. -- Pat Bahn |
From: Nathan F. <fr...@gm...> - 2009-10-10 20:00:24
|
On Sat, Oct 10, 2009 at 2:27 AM, Attila Lendvai <att...@gm...> wrote:
>> I understand your desire to do this, but this is an unusually strict
>> requirement. For example, GCC's output fails this test when debug is
>> enabled, unless GNU objcopy is used to enable debuglink.
>
> well, using the gnu toolchain as the basis of a comparison is...

...is actually a really good one, because the GNU toolchain does this sort of thing: e.g. -O2 -g produces the same code as -O2, compilers built from the same sources for different hosts and the same targets produce identical object code, etc.

To speak to the original example, I can make trivial changes in comments and have the md5sums of the generated object files be identical with the system GCC on my system. If you are going to make non-trivial changes in comments--changes that change source lines, for instance--then yes, you are going to have the debug information change. But the non-debug bits will stay the same.

-Nathan |