From: David L. <lic...@fh...> - 2004-03-12 17:31:22
Attachments:
java-finalization.diff
ftest.lisp
|
Hi,

attached is an attempt to implement Java-style finalization. Any review by someone who knows the GC would be nice!

Currently, SBCL uses a table of weak references to keep track of objects scheduled for finalization. After GC, the references are inspected to see whether they have been broken. If so, the recorded function is called.

The disadvantage of this approach is that the finalizer doesn't have access to the original object. The workaround trick is, of course, to use closures that reference not the object itself (which would otherwise never be garbage collected), but other data related to the object. In the case of a stream, the closure would reference just the file descriptor instead of the stream itself.

Unfortunately, it is not possible to implement or simulate Java-style finalization this way. Consider two objects which reference each other. If the first object's finalizer closure references the other object, and that object's finalizer references the first object, neither of them will ever become garbage.

To solve this problem, I have tried to add "semi-weak" references (for lack of a better name). Such references should not break; instead they should simply record the fact that their target value _would_ have been garbage collected. The patch uses such references for finalization instead of ordinary weak pointers.

To avoid changing too much, I've used weak pointer objects to implement both kinds of references. Previously a weak pointer had brokenness NIL, then T. Now the brokenness can also be a fixnum, which starts at zero and is incremented every time the object would have been collected.

The problem is that scan_weak_pointers() now calls scavenge() for such references, and scavenge_newspace_generation() needs to be called twice if that happened... Apparently.

The patch survives an SBCL rebuild and some very brief testing...
Thanks for any help,
David

--
o To emphasize the highly professional nature of Nmap, all instances of "fucked up" in error message text has been changed to "b0rked".
  -- http://www.insecure.org/stf/Nmap-3.50-Release.html
|
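The two kinds of brokenness described in the message above can be illustrated with a small model in plain Common Lisp. This is only a sketch of the bookkeeping, not the patch and not SBCL's weak-pointer implementation: MODEL-REF, MAKE-WEAK-REF, MAKE-SEMI-WEAK-REF, SEMI-WEAK-P and NOTE-UNREACHABLE are hypothetical names, and NOTE-UNREACHABLE merely stands in for whatever the collector does when it finds the target otherwise unreachable.

```lisp
;;; Toy model only: this is NOT SBCL's weak-pointer code, and every name
;;; defined here is hypothetical.

(defstruct (model-ref (:constructor %make-model-ref))
  value        ; the referenced object
  brokenness)  ; NIL or T for ordinary weak refs, a fixnum for semi-weak refs

(defun make-weak-ref (object)
  "Ordinary weak reference: brokenness starts at NIL."
  (%make-model-ref :value object :brokenness nil))

(defun make-semi-weak-ref (object)
  "Semi-weak reference: brokenness starts at the fixnum 0."
  (%make-model-ref :value object :brokenness 0))

(defun semi-weak-p (ref)
  (integerp (model-ref-brokenness ref)))

(defun note-unreachable (ref)
  "Model the collector finding REF's target otherwise unreachable."
  (if (semi-weak-p ref)
      ;; Semi-weak: keep the target alive, just count the event.
      (incf (model-ref-brokenness ref))
      ;; Ordinary weak: drop the target and mark the reference broken.
      (setf (model-ref-value ref) nil
            (model-ref-brokenness ref) t))
  ref)
```

In this model an ordinary reference loses its value on the first call to NOTE-UNREACHABLE, while a semi-weak reference keeps its value and only counts how often the target would have been collected, which is what would let a finalizer still receive the original object.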
From: Rudi S. <ru...@co...> - 2004-03-12 19:49:08
Attachments:
PGP.sig
|
On 12 Mar 2004, at 18:12, David Lichteblau wrote:

> attached is an attempt to implement Java-style finalization.
> Any review by someone who knows the GC would be nice!

I don't know anything about the GC, but I'm curious ... how is the following case handled:

(defvar *saver* nil)

(defun make-saver ()
  (let ((already-saved nil))
    (lambda (object)
      (unless already-saved
        (push object *saver*)
        (setf already-saved t)))))

(finalize (make-instance 'blah) (make-saver))

(gc :full t) ;; finalizer called, does object become live again?

(setf *saver* nil)

(gc :full t) ;; finalizer called a second time?

The essence of my question is: isn't this trading one source of bugs (don't close over objects in the finalizer function) for another (be prepared for finalizer functions to be called multiple times, and/or be prepared for "finalized" objects to continue existing)? IIRC there's a lot of hair in the Java specs because of this, and it seems a kludge that can't be solved "correctly". I always appreciated the simplicity of the sbcl solution, where you can rely on the fact that the finalizer will be called exactly one time.

Cheers,
Rudi
|
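If a finalizer did have to be "prepared to be called multiple times", one defensive option would be to wrap it so that only the first call has any effect. The following is a minimal sketch in plain Common Lisp. MAKE-ONCE-ONLY is a hypothetical helper, not part of SBCL or of the patch, and the flag is not protected against concurrent calls.

```lisp
;;; Hypothetical helper: make any function idempotent by remembering
;;; whether it has already run.  Not thread-safe.

(defun make-once-only (function)
  "Return a closure that calls FUNCTION on its first invocation only."
  (let ((called nil))
    (lambda (&rest args)
      (unless called
        ;; Set the flag before calling, so a re-entrant call is ignored too.
        (setf called t)
        (apply function args)))))
```

A finalizer wrapped this way does nothing on a second invocation, at the price of one extra closure per finalized object. It does nothing about the other half of the concern, namely "finalized" objects that continue to exist.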
From: David L. <da...@li...> - 2004-03-12 22:03:01
|
Quoting Rudi Schlatte (ru...@co...):
> [...] I'm curious ... how is the
> following case handled:
>
> (defvar *saver* nil)
>
> (defun make-saver ()
>   (let ((already-saved nil))
>     (lambda (object)
>       (unless already-saved
>         (push object *saver*)
>         (setf already-saved t)))))
>
> (finalize (make-instance 'blah) (make-saver))
>
> (gc :full t) ;; finalizer called, does object become live again?

Using Java terminology, it is reachable and finalized now.

> (setf *saver* nil)
>
> (gc :full t) ;; finalizer called a second time?

No. It is unreachable and finalized, so it is discarded.

[...]
> I always appreciated the simplicity of the sbcl solution, where you
> can rely on the fact that the finalizer
> will be called exactly one time.

Note that I do not propose to change the implementation of SB-EXT:FINALIZE. Instead, a second function FINALIZE-WITH-OBJECT is provided for those who want or need the more complex behaviour.

But you are absolutely right, this is a hairy issue. For now I would be happy if I knew that the semi-weak references work correctly. Reviewing the new finalization implementation is really only the second step.

To facilitate the comparison with Java, let me make a simplifying assumption for now, namely that our objects are scheduled for finalization using FINALIZE-WITH-OBJECT once during initialization and are never scheduled for finalization again. This assumption always holds in Java, which schedules objects for finalization implicitly.

The state machine for Java's finalization is explained here:
http://java.sun.com/docs/books/jls/second_edition/html/execution.doc.html#44760

Objects move from the unfinalized to the finalizable to the finalized state. At the second transition their finalizer (if any) is called. Even if they are reachable again, the finalizer will not run again.

Consider the following cases to define these three states:

The object has been referenced from *object-pending-finalization*...

  (1) no, never.
      (1a) it is reachable
           => it is UNFINALIZED
      (1b) it is unreachable (has been garbage collected)
           => the object can be considered `FINALIZED'
  (2) at least once, but only indirectly
      => the object is `FINALIZED'
  (3) yes. Its semi-weak reference has brokenness:
      (3a) 0
           => it is UNFINALIZED
      (3b) >0, and GC is still running
           => it is FINALIZABLE
      (3c) >0, and finalize-corpses is currently running
           => it is FINALIZED
  (4) not any longer
      => it is FINALIZED

Objects start at state (1a) if they are never scheduled for finalization using FINALIZE-WITH-OBJECT, or at state (3a) otherwise.

Objects starting at state (1a) either become unreachable and `finalized' directly or are finalizer-reachable at some point (2). In the second case, they are to be considered finalizable in theory, but there is no finalizer to invoke, so we may as well label them finalized directly.

Objects starting at state (3a) become finalizable during GC when the function scan_weak_pointers() increments their brokenness. This is state (3b). Immediately after GC, the function FINALIZE-CORPSES runs. We consider the objects to be finalized from this point on. The object's finalizer runs, and in the same step the object is removed from the *o-p-f* table.

So that is my theory anyway; please correct me if I'm wrong or oversimplifying anything. That is precisely the review I am looking for. :)

(...Now if the user really calls FINALIZE-WITH-OBJECT multiple times for the same object, things might indeed go wrong. But note that, for example, FINALIZE must not be called from a finalizer anyway, even in the old implementation...)

David

--
o To emphasize the highly professional nature of Nmap, all instances of "fucked up" in error message text has been changed to "b0rked".
  -- http://www.insecure.org/stf/Nmap-3.50-Release.html
|
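The case list (1) to (4) in the message above can be written down as a small classification function, which makes it easier to check that every case maps to exactly one of the three states. This is a sketch in plain Common Lisp under stated assumptions: FINALIZATION-STATE and its keyword arguments are hypothetical names invented for this illustration, and the function only restates the table; it does not inspect the real *object-pending-finalization* table or the GC.

```lisp
;;; Hypothetical restatement of cases (1)-(4); nothing here talks to the GC.
;;; TABLE-STATUS says how the object relates to the pending-finalization table:
;;;   :NEVER      case (1), never referenced from the table
;;;   :INDIRECTLY case (2), referenced at least once, but only indirectly
;;;   :DIRECTLY   case (3), currently referenced by a semi-weak reference
;;;   :REMOVED    case (4), not referenced any longer

(defun finalization-state (table-status &key reachable (brokenness 0) phase)
  "Return :UNFINALIZED, :FINALIZABLE or :FINALIZED for the described case."
  (ecase table-status
    (:never                                       ; (1a) / (1b)
     (if reachable :unfinalized :finalized))
    (:indirectly :finalized)                      ; (2)
    (:directly
     (cond ((zerop brokenness) :unfinalized)      ; (3a)
           ((eq phase :gc) :finalizable)          ; (3b)
           ((eq phase :finalize-corpses) :finalized) ; (3c)
           (t (error "Brokenness > 0 outside GC and FINALIZE-CORPSES ~
                      is not covered by the case list."))))
    (:removed :finalized)))                       ; (4)
```

For example, (finalization-state :directly :brokenness 1 :phase :gc) returns :FINALIZABLE. In the scenario from the quoted question, the object is in case (4) once its finalizer has run, so it counts as :FINALIZED whether or not *saver* still references it.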