|
From: Stas B. <sta...@gm...> - 2026-03-11 16:44:50
|
Solved in a different way. Thanks for the idea.

On Wed, Mar 11, 2026 at 6:41 PM John Mallery <jc...@cs...> wrote:
>
> SBCL version: 2.6.1 (also affects current git HEAD)
> Platform: all
>
> DESCRIPTION
>
> Every call to OPEN registers an auto-close finalizer on the fd-stream
> via FINALIZE (which calls SB-LOCKLESS:SO-INSERT to insert into the
> finalizer hash table). This allocates a closure and a hash table
> node. When OPEN is called from WITH-OPEN-FILE, the finalizer is
> redundant because the UNWIND-PROTECT in WITH-OPEN-FILE guarantees
> CLOSE is called on both normal and abnormal exit.
>
> Profiling with sb-sprof in :alloc mode shows SB-LOCKLESS:SO-INSERT
> accounting for ~31% of OPEN's total allocation. In a web server
> that opens/closes files thousands of times per second, this is a
> significant source of GC pressure.
>
> HOW TO REPRODUCE
>
>   (require :sb-sprof)
>
>   (let ((path (namestring *load-truename*)))  ; any existing file
>     (dotimes (i 50) (close (open path)))      ; warm buffer pool
>
>     ;; Profile allocation in OPEN
>     (sb-sprof:start-profiling :mode :alloc :sample-interval 1)
>     (dotimes (i 5000) (close (open path)))
>     (sb-sprof:stop-profiling)
>     (sb-sprof:report :type :flat :max 10))
>
>   ;; Output shows SB-LOCKLESS:SO-INSERT at ~31% of samples
>
> FIX
>
> Two files changed: src/code/fd-stream.lisp and src/code/macros.lisp.
>
> 1. Add a dynamic variable *SUPPRESS-STREAM-AUTO-CLOSE* that gates
>    the finalizer registration in MAKE-FD-STREAM:
>
> --- a/src/code/fd-stream.lisp
> +++ b/src/code/fd-stream.lisp
> @@ -2475,6 +2475,12 @@
>  ;;;
>  ;;; NAME is used to identify the stream when printed.
>  ;;;
> +;;; When true, MAKE-FD-STREAM skips the auto-close finalizer registration.
> +;;; Bound by WITH-OPEN-FILE, where the UNWIND-PROTECT guarantees CLOSE,
> +;;; making the finalizer redundant. Eliminates ~31% of OPEN allocation
> +;;; (finalizer closure + SO-INSERT hash table node).
> +(defvar *suppress-stream-auto-close* nil)
> +
>  ;;; If SERVE-EVENTS is true, SERVE-EVENT machinery is used to
>  ;;; handle blocking IO on the stream.
>  (defun make-fd-stream (fd
> @@ -2544,7 +2550,7 @@
>      (set-fd-stream-routines stream element-type ...)
> -    (when auto-close
> +    (when (and auto-close (not *suppress-stream-auto-close*))
>        (finalize stream
>                  (lambda ()
>                    (sb-unix:unix-close fd)
>
> 2. Modify WITH-OPEN-FILE to bind *SUPPRESS-STREAM-AUTO-CLOSE* to T.
>    The stream variable is initialized to NIL and set inside the
>    UNWIND-PROTECT scope so the suppression is active during OPEN:
>
> --- a/src/code/macros.lisp
> +++ b/src/code/macros.lisp
> @@ -1437,12 +1437,18 @@
>    (multiple-value-bind (forms decls) (parse-body body nil)
>      (let ((abortp (gensym)))
> -      `(let ((,stream (open ,filespec ,@options))
> -             (,abortp t))
> +      ;; Suppress the auto-close finalizer -- UNWIND-PROTECT guarantees
> +      ;; CLOSE, making the finalizer redundant.
> +      `(let ((,stream)
> +             (,abortp t)
> +             (sb-impl::*suppress-stream-auto-close* t))
>           ,@decls
>           (unwind-protect
>               (multiple-value-prog1
> -                 (progn ,@forms)
> +                 (progn
> +                   (setf ,stream (open ,filespec ,@options))
> +                   ,@forms)
>                 (setq ,abortp nil))
>             (when ,stream
>               (close ,stream :abort ,abortp)))))))
>
> SAFETY
>
> The auto-close finalizer is ONLY suppressed within WITH-OPEN-FILE,
> where the UNWIND-PROTECT guarantees CLOSE. Bare OPEN calls outside
> WITH-OPEN-FILE still register the finalizer as a safety net.
>
> If WITH-OPEN-FILE's OPEN succeeds but the thread is destroyed before
> UNWIND-PROTECT runs (e.g., SB-THREAD:TERMINATE-THREAD), the fd would
> leak. However, this is already the case for any resource acquired
> inside an UNWIND-PROTECT -- thread termination can prevent cleanup
> forms from running regardless of finalizers.
>
> IMPACT
>
> Any I/O-intensive application that opens files via WITH-OPEN-FILE
> benefits. Web servers, compilers, and build systems are typical
> examples. The fix eliminates ~31% of OPEN's allocation with zero
> behavioral change for correct programs.
>
>
> _______________________________________________
> Sbcl-bugs mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs |
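The per-call allocation of OPEN discussed in the report can also be estimated without the statistical profiler. The following is a minimal sketch, assuming SBCL (it relies on SB-EXT:GET-BYTES-CONSED) and a caller-supplied path to an existing file. The function name BYTES-CONSED-PER-OPEN is made up for this illustration and is not part of SBCL. It measures total bytes consed per OPEN/CLOSE pair, not the finalizer's share of it.

```lisp
;;; Hypothetical helper for illustration only; not part of SBCL.
(defun bytes-consed-per-open (path &key (warmup 50) (iterations 5000))
  "Return the average number of bytes consed by one (CLOSE (OPEN PATH))."
  ;; Warm up first so one-time setup costs are not counted.
  (dotimes (i warmup) (close (open path)))
  (let ((before (sb-ext:get-bytes-consed)))
    (dotimes (i iterations) (close (open path)))
    (/ (float (- (sb-ext:get-bytes-consed) before)) iterations)))
```

Running it before and after a change to OPEN gives a rough before/after comparison; other threads that allocate during the measurement will inflate the number.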
|
From: John M. <jc...@cs...> - 2026-03-11 16:00:25
|
yep, try working with 256MB from the early 1980s on MIT lisp machines.

> On Mar 11, 2026, at 16:58, Stas Boukarev <sta...@gm...> wrote:
>
> Huge RAM? In this economy?
>
> On Wed, Mar 11, 2026 at 6:57 PM John Mallery <jc...@cs...> wrote:
>>
>> What do you make of this in the era huge ram?
>>
>>> On Mar 11, 2026, at 16:51, Stas Boukarev <sta...@gm...> wrote:
>>>
>>> Larger caches are not free. And your 8KB math is only for the vector
>>> holding strings, not the strings themselves.
>>>
>>> On Wed, Mar 11, 2026 at 6:50 PM John C Mallery <jcm...@gm...> wrote:
>>>>
>>>> Well, this is not an argument to avoid making the format cache larger.
>>>>
>>>>> On Mar 11, 2026, at 16:47, Stas Boukarev <sta...@gm...> wrote:
>>>>>
>>>>> But if you are after performance why are you calling format with
>>>>> unknown strings. Maybe use FORMATTER?
>>>>>
>>>>> On Wed, Mar 11, 2026 at 6:45 PM John C Mallery <jcm...@gm...> wrote:
>>>>>>
>>>>>> I don’t know about SBCL, but we have a fast-format function which does exactly that. We don’t handle, however, complex directives. In any case, this bug in SBCL is fixed by using a large hash table to cache the string.
>>>>>>
>>>>>>> On Mar 11, 2026, at 16:27, Stas Boukarev <sta...@gm...> wrote:
>>>>>>>
>>>>>>> But why are these format strings not processed at compile time?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 11, 2026 at 6:22 PM John C Mallery <jcm...@gm...> wrote:
>>>>>>>>
>>>>>>>> SBCL version: 2.6.1 (also affects current git HEAD)
>>>>>>>> Platform: all
>>>>>>>>
>>>>>>>> DESCRIPTION
>>>>>>>>
>>>>>>>> The FORMAT control string tokenizer cache (tokenize-control-string)
>>>>>>>> uses :hash-bits 7, giving a 128-entry 2-way set-associative cache.
>>>>>>>> In large applications with many distinct format strings, the cache
>>>>>>>> fills completely and thrashes -- every collision re-tokenizes the
>>>>>>>> control string, allocating fresh tokenized representations.
>>>>>>>>
>>>>>>>> In CL-HTTP (a web server with ~200 distinct format strings in its
>>>>>>>> codebase), the cache is 128/128 full. Profiling under load shows
>>>>>>>> tokenize-control-string accounting for 7.0% of all allocation during
>>>>>>>> HTTP request serving (~28K req/s).
>>>>>>>>
>>>>>>>> HOW TO REPRODUCE
>>>>>>>>
>>>>>>>>   ;; Create enough distinct format strings to fill the 128-entry cache
>>>>>>>>   (let ((strings (loop for i below 200
>>>>>>>>                        collect (format nil "test-~D: ~~A ~~D" i))))
>>>>>>>>     ;; Warm the cache
>>>>>>>>     (dolist (s strings) (format nil s "x" 1))
>>>>>>>>     ;; Measure -- cache is now thrashing
>>>>>>>>     (let ((before (sb-ext:get-bytes-consed)))
>>>>>>>>       (dotimes (j 100)
>>>>>>>>         (dolist (s strings)
>>>>>>>>           (format nil s "x" 1)))
>>>>>>>>       (format t "~,1F bytes/format call~%"
>>>>>>>>               (/ (float (- (sb-ext:get-bytes-consed) before))
>>>>>>>>                  (* 100.0 (length strings))))))
>>>>>>>>
>>>>>>>>   ;; With :hash-bits 7 (128 entries): ~200+ bytes/call (re-tokenizing)
>>>>>>>>   ;; With :hash-bits 10 (1024 entries): ~0 bytes/call (cache hits)
>>>>>>>>
>>>>>>>> FIX
>>>>>>>>
>>>>>>>> 1-line change in src/code/format.lisp. Increase :hash-bits from 7
>>>>>>>> to 10 (1024 entries instead of 128):
>>>>>>>>
>>>>>>>> --- a/src/code/format.lisp
>>>>>>>> +++ b/src/code/format.lisp
>>>>>>>> @@ -23,7 +23,7 @@
>>>>>>>>  #-sb-xc-host
>>>>>>>>  (defun-cached (tokenize-control-string
>>>>>>>>                 :memoizer memoize
>>>>>>>> -               :hash-bits 7
>>>>>>>> +               :hash-bits 10
>>>>>>>>                 :hash-function (lambda (string)
>>>>>>>>                                  (ash (get-lisp-obj-address string)
>>>>>>>>                                       #.(- sb-vm:n-lowtag-bits))))
>>>>>>>>
>>>>>>>> COST
>>>>>>>>
>>>>>>>> The cache vector grows from 128 to 1024 entries (simple-vector).
>>>>>>>> On 64-bit, this is ~8KB of memory -- negligible for any application
>>>>>>>> large enough to have 128+ distinct format strings. The cache is
>>>>>>>> allocated lazily on first use, so small programs pay nothing.
>>>>>>>>
>>>>>>>> The defun-cached assert already allows hash-bits up to 12:
>>>>>>>>   (assert (typep hash-bits '(integer 5 12)))
>>>>>>>>
>>>>>>>> IMPACT
>>>>>>>>
>>>>>>>> Any application with more than ~64 active format strings (the
>>>>>>>> effective capacity of a 128-entry 2-way cache) will see thrashing.
>>>>>>>> Web servers, GUI applications, and compilers commonly exceed this.
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Sbcl-bugs mailing list
>>>>>>>> Sbc...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs
>>>>>>
>>>>
>> |
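The thrashing described in the quoted report can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it models a direct-mapped cache (simpler than the 2-way cache the report describes), uses small integers in place of string addresses, and the function name COUNT-MISSES is invented here. It is not SBCL's DEFUN-CACHED code and says nothing about real allocation figures.

```lisp
;;; Toy model only: a direct-mapped cache indexed by the low HASH-BITS
;;; bits of an integer key.  This is not SBCL's DEFUN-CACHED code.
(defun count-misses (hash-bits n-keys rounds)
  "Cycle through keys 0..N-KEYS-1 for ROUNDS passes after one warm-up
pass and return the number of cache misses in the measured passes."
  (let* ((size (ash 1 hash-bits))
         (slots (make-array size :initial-element nil))
         (misses 0))
    (flet ((lookup (key)
             ;; Return T on a hit; on a miss, overwrite the slot and return NIL.
             (let ((index (logand key (1- size))))
               (cond ((eql (aref slots index) key) t)
                     (t (setf (aref slots index) key) nil)))))
      (dotimes (key n-keys) (lookup key))   ; warm-up pass, not counted
      (dotimes (r rounds)
        (dotimes (key n-keys)
          (unless (lookup key) (incf misses)))))
    misses))

(count-misses 7 200 100)    ; => 14400, i.e. 144 misses per pass in this model
(count-misses 10 200 100)   ; => 0, every key has its own slot
```

In this model, 72 of the 128 slots are shared by two keys that evict each other on every pass, so the miss count never settles; with 1024 slots nothing collides.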
|
From: Stas B. <sta...@gm...> - 2026-03-11 15:58:26
|
Huge RAM? In this economy?

On Wed, Mar 11, 2026 at 6:57 PM John Mallery <jc...@cs...> wrote:
>
> What do you make of this in the era huge ram? |
|
From: John M. <jc...@cs...> - 2026-03-11 15:57:33
|
What do you make of this in the era huge ram?

> On Mar 11, 2026, at 16:51, Stas Boukarev <sta...@gm...> wrote:
>
> Larger caches are not free. And your 8KB math is only for the vector
> holding strings, not the strings themselves. |
|
From: Stas B. <sta...@gm...> - 2026-03-11 15:57:12
|
Applied. Thanks.

On Wed, Mar 11, 2026 at 6:41 PM John Mallery <jc...@cs...> wrote:
>
> SBCL version: 2.6.1 (also affects current git HEAD)
> Platform: Darwin arm64 (Apple Silicon), likely all platforms
>
> DESCRIPTION
>
> The one-class and two-class discriminating function generators for
> slot-accessor makunbound generic functions are missing RETURN-FROM
> ACCESS in the fast-path success case. Every other accessor type
> (:reader, :writer, :boundp) wraps its result in (RETURN-FROM ACCESS
> ...) to skip the fallthrough to the miss function. The :makunbound
> case omits this, causing every slot-makunbound call to fall through
> to the cache miss function even when the class matches.
>
> This results in ~607 bytes of allocation per slot-makunbound call
> (miss function rebuilds the cache, allocates wrapper lists, etc.).
> In CL-HTTP under load, slot-makunbound is called once per header per
> request during header-set cleanup, causing 24.3% of total allocation
> (8.8% direct + 15.5% cache miss overhead).
>
> HOW TO REPRODUCE
>
>   (defclass foo () ((x :initform 42)))
>   (defmethod slot-makunbound-using-class
>       ((class standard-class) (obj foo)
>        (slotd standard-effective-slot-definition))
>     (call-next-method))
>
>   (let ((obj (make-instance 'foo)))
>     ;; Warm the dispatch cache
>     (dotimes (i 20) (slot-makunbound obj 'x) (setf (slot-value obj 'x) 42))
>     ;; Measure
>     (let ((before (sb-ext:get-bytes-consed)))
>       (dotimes (i 10000)
>         (slot-makunbound obj 'x)
>         (setf (slot-value obj 'x) 42))
>       (format t "~,1F bytes/makunbound~%"
>               (/ (float (- (sb-ext:get-bytes-consed) before)) 10000.0))))
>
>   ;; Prints ~607 bytes/makunbound (should be ~0)
>
> FIX
>
> 3-line change in src/pcl/dlisp.lisp. Add RETURN-FROM ACCESS around
> the :makunbound case, matching :reader, :writer, and :boundp:
>
> --- a/src/pcl/dlisp.lisp
> +++ b/src/pcl/dlisp.lisp
> @@ -246,8 +246,9 @@
>              `((let ((value ,read-form))
>                  (return-from access (not (unbound-marker-p value))))))
>             (:makunbound
> -            `((progn (setf ,read-form +slot-unbound+)
> -                     ,instance)))
> +            `((return-from access
> +                (progn (setf ,read-form +slot-unbound+)
> +                       ,instance))))
>             (:writer
>              `((return-from access (setf ,read-form ,(car arglist)))))))
>         (funcall miss-fn ,@arglist))))))
>
> IMPACT
>
> Any application that calls slot-makunbound frequently (CLOS-heavy web
> servers, object pools, etc.) pays ~607 bytes per call. The PCL cache
> miss from the fallthrough also causes quadratic behavior as the cache
> grows and rehashes repeatedly for entries that should have been hits.
>
>
> _______________________________________________
> Sbcl-bugs mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs |
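The fallthrough described in the report follows a general BLOCK / RETURN-FROM pattern that can be shown in isolation. The sketch below is a stand-alone illustration, not PCL code: the hash table stands in for the dispatch cache, and the names *MISS-COUNT*, MISS, ACCESS-BUGGY and ACCESS-FIXED are invented for the example.

```lisp
;;; Stand-alone illustration of the fast-path fallthrough; not PCL code.
(defvar *miss-count* 0
  "Number of times the slow path has run.")

(defun miss (key)
  "Stand-in for an expensive cache-miss function."
  (incf *miss-count*)
  (list :recomputed key))

(defun access-buggy (key cache)
  (block access
    (let ((hit (gethash key cache)))
      (when hit
        hit))                          ; value is discarded, control falls through
    (miss key)))                       ; so the slow path runs even on a hit

(defun access-fixed (key cache)
  (block access
    (let ((hit (gethash key cache)))
      (when hit
        (return-from access hit)))     ; leave the block on the fast path
    (miss key)))                       ; reached only on a real miss
```

With a key that is present in the table, ACCESS-BUGGY still increments *MISS-COUNT* on every call, while ACCESS-FIXED returns the cached value and leaves the counter alone.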
|
From: Stas B. <sta...@gm...> - 2026-03-11 15:52:07
|
Larger caches are not free. And your 8KB math is only for the vector
holding strings, not the strings themselves.

On Wed, Mar 11, 2026 at 6:50 PM John C Mallery <jcm...@gm...> wrote:
>
> Well, this is not an argument to avoid making the format cache larger. |
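The size figures traded in this exchange are simple arithmetic, sketched below. The assumptions are loud ones: one 8-byte word per cache slot on a 64-bit build, no vector header, and nothing for the cached objects themselves, which is exactly the part the reply above points out is missing. CACHE-VECTOR-BYTES is a name invented for this illustration.

```lisp
;;; Back-of-the-envelope only: counts one machine word per slot and
;;; ignores the vector header and every object the slots point to.
(defun cache-vector-bytes (hash-bits &key (word-bytes 8))
  "Approximate size in bytes of a vector with 2^HASH-BITS one-word slots."
  (* (ash 1 hash-bits) word-bytes))

(cache-vector-bytes 7)    ; => 1024   (128 slots)
(cache-vector-bytes 10)   ; => 8192   (1024 slots, the ~8KB in the report)
(cache-vector-bytes 12)   ; => 32768  (4096 slots, the largest size the assert allows)
```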
|
From: John C M. <jcm...@gm...> - 2026-03-11 15:50:25
|
Well, this is not an argument to avoid making the format cache larger.

> On Mar 11, 2026, at 16:47, Stas Boukarev <sta...@gm...> wrote:
>
> But if you are after performance why are you calling format with
> unknown strings. Maybe use FORMATTER? |
|
From: Stas B. <sta...@gm...> - 2026-03-11 15:48:12
|
But if you are after performance why are you calling format with
unknown strings. Maybe use FORMATTER?

On Wed, Mar 11, 2026 at 6:45 PM John C Mallery <jcm...@gm...> wrote:
>
> I don’t know about SBCL, but we have a fast-format function which does exactly that. We don’t handle, however, complex directives. In any case, this bug in SBCL is fixed by using a large hash table to cache the string. |
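For readers unfamiliar with the FORMATTER suggestion: it is a standard Common Lisp macro that turns a literal control string into a function when the code is macroexpanded, and FORMAT accepts that function in place of a string. A minimal sketch follows; the names *REPORT-CONTROL* and REPORT-LINE and the control string are made up for the example, and whether this fits a given application depends on its control strings being known when the code is compiled.

```lisp
;;; The control string is processed once, when FORMATTER is expanded,
;;; rather than being looked up or tokenized on each call.
(defparameter *report-control* (formatter "test-~D: ~A ~D")
  "Illustrative pre-processed control; the string itself is arbitrary.")

(defun report-line (index name count)
  "Format one line using the pre-processed control function."
  (format nil *report-control* index name count))

(report-line 7 "x" 1)   ; => "test-7: x 1"
```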
|
From: John C M. <jcm...@gm...> - 2026-03-11 15:45:52
|
I don’t know about SBCL, but we have a fast-format function which does exactly that. We don’t handle, however, complex directives. In any case, this bug in SBCL is fixed by using a large hash table to cache the string.
> On Mar 11, 2026, at 16:27, Stas Boukarev <sta...@gm...> wrote:
>
> But why are these format strings not processed at compile time?
|
|
From: John M. <jc...@cs...> - 2026-03-11 15:40:22
|
SBCL version: 2.6.1 (also affects current git HEAD)
Platform: Darwin arm64 (Apple Silicon), likely all platforms
DESCRIPTION
The one-class and two-class discriminating function generators for
slot-accessor makunbound generic functions are missing RETURN-FROM
ACCESS in the fast-path success case. Every other accessor type
(:reader, :writer, :boundp) wraps its result in (RETURN-FROM ACCESS
...) to skip the fallthrough to the miss function. The :makunbound
case omits this, causing every slot-makunbound call to fall through
to the cache miss function even when the class matches.
This results in ~607 bytes of allocation per slot-makunbound call
(miss function rebuilds the cache, allocates wrapper lists, etc.).
In CL-HTTP under load, slot-makunbound is called once per header per
request during header-set cleanup, causing 24.3% of total allocation
(8.8% direct + 15.5% cache miss overhead).
HOW TO REPRODUCE
(defclass foo () ((x :initform 42)))
(defmethod sb-mop:slot-makunbound-using-class
    ((class standard-class) (obj foo)
     (slotd sb-mop:standard-effective-slot-definition))
  (call-next-method))
(let ((obj (make-instance 'foo)))
  ;; Warm the dispatch cache
  (dotimes (i 20) (slot-makunbound obj 'x) (setf (slot-value obj 'x) 42))
  ;; Measure
  (let ((before (sb-ext:get-bytes-consed)))
    (dotimes (i 10000)
      (slot-makunbound obj 'x)
      (setf (slot-value obj 'x) 42))
    (format t "~,1F bytes/makunbound~%"
            (/ (float (- (sb-ext:get-bytes-consed) before)) 10000.0))))
;; Prints ~607 bytes/makunbound (should be ~0)
FIX
3-line change in src/pcl/dlisp.lisp. Add RETURN-FROM ACCESS around
the :makunbound case, matching :reader, :writer, and :boundp:
--- a/src/pcl/dlisp.lisp
+++ b/src/pcl/dlisp.lisp
@@ -246,8 +246,9 @@
`((let ((value ,read-form))
(return-from access (not (unbound-marker-p value))))))
(:makunbound
- `((progn (setf ,read-form +slot-unbound+)
- ,instance)))
+ `((return-from access
+ (progn (setf ,read-form +slot-unbound+)
+ ,instance))))
(:writer
`((return-from access (setf ,read-form ,(car arglist)))))))
(funcall miss-fn ,@arglist))))))
IMPACT
Any application that calls slot-makunbound frequently (CLOS-heavy web
servers, object pools, etc.) pays ~607 bytes per call. The PCL cache
miss from the fallthrough also causes quadratic behavior as the cache
grows and rehashes repeatedly for entries that should have been hits.
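The control-flow problem described in this report can be illustrated outside PCL. The following is only a toy sketch, not the code in src/pcl/dlisp.lisp: FAST-PATH-BUGGY, FAST-PATH-FIXED, MISS and *MISS-COUNT* are hypothetical names, the instance is modeled as a one-element list, and the miss function is reduced to a counter.

```lisp
;; Toy model of a discriminating function: a fast path inside a named
;; block, followed by a fallthrough call to a simulated miss function.
(defvar *miss-count* 0)

(defun miss (instance)
  ;; Stand-in for the expensive cache-miss path.
  (incf *miss-count*)
  instance)

(defun fast-path-buggy (instance class-matches-p)
  (block access
    (when class-matches-p
      ;; The result is computed but discarded: control leaves the WHEN ...
      (progn (setf (car instance) :unbound)
             instance))
    ;; ... and reaches the miss function even when the class matched.
    (miss instance)))

(defun fast-path-fixed (instance class-matches-p)
  (block access
    (when class-matches-p
      ;; RETURN-FROM exits the block, so the miss function is skipped.
      (return-from access
        (progn (setf (car instance) :unbound)
               instance)))
    (miss instance)))
```

Both functions return the instance with its slot marked unbound, which is why the missing exit is easy to overlook: only the side effects of the miss path (here the counter, in PCL the allocation) differ.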
|
|
From: John M. <jc...@cs...> - 2026-03-11 15:40:10
|
SBCL version: 2.6.1 (also affects current git HEAD)
Platform: all
DESCRIPTION
Every call to OPEN registers an auto-close finalizer on the fd-stream
via FINALIZE (which calls SB-LOCKLESS:SO-INSERT to insert into the
finalizer hash table). This allocates a closure and a hash table
node. When OPEN is called from WITH-OPEN-FILE, the finalizer is
redundant because the UNWIND-PROTECT in WITH-OPEN-FILE guarantees
CLOSE is called on both normal and abnormal exit.
Profiling with sb-sprof in :alloc mode shows SB-LOCKLESS:SO-INSERT
accounting for ~31% of OPEN's total allocation. In a web server
that opens/closes files thousands of times per second, this is a
significant source of GC pressure.
HOW TO REPRODUCE
(require :sb-sprof)
(let ((path (namestring *load-truename*))) ; any existing file
  (dotimes (i 50) (close (open path))) ; warm buffer pool
  ;; Profile allocation in OPEN
  (sb-sprof:start-profiling :mode :alloc :sample-interval 1)
  (dotimes (i 5000) (close (open path)))
  (sb-sprof:stop-profiling)
  (sb-sprof:report :type :flat :max 10))
;; Output shows SB-LOCKLESS:SO-INSERT at ~31% of samples
FIX
Two files changed: src/code/fd-stream.lisp and src/code/macros.lisp.
1. Add a dynamic variable *SUPPRESS-STREAM-AUTO-CLOSE* that gates
the finalizer registration in MAKE-FD-STREAM:
--- a/src/code/fd-stream.lisp
+++ b/src/code/fd-stream.lisp
@@ -2475,6 +2475,12 @@
;;;
;;; NAME is used to identify the stream when printed.
;;;
+;;; When true, MAKE-FD-STREAM skips the auto-close finalizer registration.
+;;; Bound by WITH-OPEN-FILE, where the UNWIND-PROTECT guarantees CLOSE,
+;;; making the finalizer redundant. Eliminates ~31% of OPEN allocation
+;;; (finalizer closure + SO-INSERT hash table node).
+(defvar *suppress-stream-auto-close* nil)
+
;;; If SERVE-EVENTS is true, SERVE-EVENT machinery is used to
;;; handle blocking IO on the stream.
(defun make-fd-stream (fd
@@ -2544,7 +2550,7 @@
(set-fd-stream-routines stream element-type ...)
- (when auto-close
+ (when (and auto-close (not *suppress-stream-auto-close*))
(finalize stream
(lambda ()
(sb-unix:unix-close fd)
2. Modify WITH-OPEN-FILE to bind *SUPPRESS-STREAM-AUTO-CLOSE* to T.
The stream variable is initialized to NIL and set inside the
UNWIND-PROTECT scope so the suppression is active during OPEN:
--- a/src/code/macros.lisp
+++ b/src/code/macros.lisp
@@ -1437,12 +1437,18 @@
(multiple-value-bind (forms decls) (parse-body body nil)
(let ((abortp (gensym)))
- `(let ((,stream (open ,filespec ,@options))
- (,abortp t))
+ ;; Suppress the auto-close finalizer -- UNWIND-PROTECT guarantees
+ ;; CLOSE, making the finalizer redundant.
+ `(let ((,stream)
+ (,abortp t)
+ (sb-impl::*suppress-stream-auto-close* t))
,@decls
(unwind-protect
(multiple-value-prog1
- (progn ,@forms)
+ (progn
+ (setf ,stream (open ,filespec ,@options))
+ ,@forms)
(setq ,abortp nil))
(when ,stream
(close ,stream :abort ,abortp)))))))
SAFETY
The auto-close finalizer is ONLY suppressed within WITH-OPEN-FILE,
where the UNWIND-PROTECT guarantees CLOSE. Bare OPEN calls outside
WITH-OPEN-FILE still register the finalizer as a safety net.
If WITH-OPEN-FILE's OPEN succeeds but the thread is destroyed before
UNWIND-PROTECT runs (e.g., SB-THREAD:TERMINATE-THREAD), the fd would
leak. However, this is already the case for any resource acquired
inside an UNWIND-PROTECT -- thread termination can prevent cleanup
forms from running regardless of finalizers.
IMPACT
Any I/O-intensive application that opens files via WITH-OPEN-FILE
benefits. Web servers, compilers, and build systems are typical
examples. The fix eliminates ~31% of OPEN's allocation with zero
behavioral change for correct programs.
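The gating mechanism proposed above can be modeled in portable Common Lisp without touching SBCL internals. This is a toy sketch under stated assumptions: TOY-OPEN, TOY-CLOSE, WITH-TOY-OPEN, *SUPPRESS-AUTO-CLOSE* and *FINALIZERS-REGISTERED* are hypothetical names, streams are modeled as lists, and finalizer registration is reduced to a counter.

```lisp
(defvar *suppress-auto-close* nil)  ; hypothetical analogue of the proposed variable
(defvar *finalizers-registered* 0)  ; counts simulated finalizer registrations

(defun toy-open (name)
  ;; Register a "finalizer" unless the caller promised to close the stream.
  (unless *suppress-auto-close*
    (incf *finalizers-registered*))
  (list :open name))

(defun toy-close (stream)
  (setf (first stream) :closed)
  stream)

(defmacro with-toy-open ((var name) &body body)
  ;; Same shape as the patched macro: the variable starts as NIL and is
  ;; assigned inside UNWIND-PROTECT, with the suppression bound around it.
  `(let ((,var nil)
         (*suppress-auto-close* t))
     (unwind-protect
          (progn (setf ,var (toy-open ,name))
                 ,@body)
       (when ,var (toy-close ,var)))))
```

In this toy model the special binding covers the whole body, so a bare TOY-OPEN called from inside WITH-TOY-OPEN also skips registration. Whether the same holds for the patch as posted, and whether it matters, would need checking against the real macro.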
|
|
From: Stas B. <sta...@gm...> - 2026-03-11 15:27:26
|
But why are these format strings not processed at compile time?
|
|
From: John C M. <jcm...@gm...> - 2026-03-11 15:19:17
|
SBCL version: 2.6.1 (also affects current git HEAD)
Platform: all
DESCRIPTION
The FORMAT control string tokenizer cache (tokenize-control-string)
uses :hash-bits 7, giving a 128-entry 2-way set-associative cache.
In large applications with many distinct format strings, the cache
fills completely and thrashes -- every collision re-tokenizes the
control string, allocating fresh tokenized representations.
In CL-HTTP (a web server with ~200 distinct format strings in its
codebase), the cache is 128/128 full. Profiling under load shows
tokenize-control-string accounting for 7.0% of all allocation during
HTTP request serving (~28K req/s).
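The thrashing pattern can be reproduced with a generic 2-way set-associative cache. This is a minimal sketch, not SBCL's defun-cached implementation: TOY-CACHE, CACHED-CALL and RUN-WORKLOAD are hypothetical names, keys are small integers hashed by MOD, and replacement is least-recently-used within a set, all of which are assumptions made for illustration.

```lisp
(defstruct (toy-cache (:constructor %make-toy-cache))
  sets         ; vector of 2-element vectors, each way holds (key . value) or NIL
  (misses 0))  ; number of times COMPUTE had to be called

(defun make-toy-cache (n-sets)
  (let ((v (make-array n-sets)))
    (dotimes (i n-sets)
      (setf (aref v i) (vector nil nil)))
    (%make-toy-cache :sets v)))

(defun cached-call (cache key compute)
  (let* ((sets (toy-cache-sets cache))
         (ways (aref sets (mod key (length sets))))
         (way0 (aref ways 0))
         (way1 (aref ways 1)))
    (cond ((and way0 (eql (car way0) key))
           (cdr way0))
          ((and way1 (eql (car way1) key))
           ;; Hit in the second way: promote it to most recently used.
           (rotatef (aref ways 0) (aref ways 1))
           (cdr way1))
          (t
           ;; Miss: recompute and evict the least recently used way.
           (incf (toy-cache-misses cache))
           (let ((value (funcall compute key)))
             (setf (aref ways 1) way0
                   (aref ways 0) (cons key value))
             value)))))

(defun run-workload (n-keys passes)
  ;; Cycle through N-KEYS distinct keys PASSES times on a 4-set (8-entry)
  ;; cache and return the number of misses.
  (let ((cache (make-toy-cache 4)))
    (dotimes (p passes)
      (dotimes (k n-keys)
        (cached-call cache k #'identity)))
    (toy-cache-misses cache)))
```

With 8 keys the working set fits and (run-workload 8 10) misses only on the first pass, 8 times in total. With 24 keys every set sees 6 keys in rotation and (run-workload 24 10) misses on all 240 calls.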
HOW TO REPRODUCE
;; Create enough distinct format strings to fill the 128-entry cache
(let ((strings (loop for i below 200
                     collect (format nil "test-~D: ~~A ~~D" i))))
  ;; Warm the cache
  (dolist (s strings) (format nil s "x" 1))
  ;; Measure -- cache is now thrashing
  (let ((before (sb-ext:get-bytes-consed)))
    (dotimes (j 100)
      (dolist (s strings)
        (format nil s "x" 1)))
    (format t "~,1F bytes/format call~%"
            (/ (float (- (sb-ext:get-bytes-consed) before))
               (* 100.0 (length strings))))))
;; With :hash-bits 7 (128 entries): ~200+ bytes/call (re-tokenizing)
;; With :hash-bits 10 (1024 entries): ~0 bytes/call (cache hits)
FIX
1-line change in src/code/format.lisp. Increase :hash-bits from 7
to 10 (1024 entries instead of 128):
--- a/src/code/format.lisp
+++ b/src/code/format.lisp
@@ -23,7 +23,7 @@
#-sb-xc-host
(defun-cached (tokenize-control-string
:memoizer memoize
- :hash-bits 7
+ :hash-bits 10
:hash-function (lambda (string)
(ash (get-lisp-obj-address string)
#.(- sb-vm:n-lowtag-bits))))
COST
The cache vector grows from 128 to 1024 entries (simple-vector).
On 64-bit, this is ~8KB of memory -- negligible for any application
large enough to have 128+ distinct format strings. The cache is
allocated lazily on first use, so small programs pay nothing.
The defun-cached assert already allows hash-bits up to 12:
(assert (typep hash-bits '(integer 5 12)))
IMPACT
Any application with more than ~64 active format strings (the
effective capacity of a 128-entry 2-way cache) will see thrashing.
Web servers, GUI applications, and compilers commonly exceed this.
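Independent of the cache size, a control string that is a literal in the source can be handed to the standard FORMATTER macro, which yields a function that FORMAT accepts in place of the string. A small sketch follows. DESCRIBE-ITEM and DESCRIBE-ITEM-DYNAMIC are hypothetical names, and that FORMATTER avoids the run-time tokenizer in SBCL is an assumption here, not something measured.

```lisp
;; FORMATTER takes a literal control string and returns a function;
;; FORMAT accepts that function wherever a control string is allowed.
(defun describe-item (name count)
  (format nil (formatter "~A: ~D") name count))

;; A control string built or chosen at run time cannot be given to
;; FORMATTER, so it still goes through run-time control-string processing.
(defun describe-item-dynamic (control name count)
  (format nil control name count))
```

Both calls produce the same text, for example (describe-item "x" 1) returns "x: 1", so switching the literal cases to FORMATTER would not change output.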
|
|
From: Stas B. <sta...@gm...> - 2026-02-28 20:30:44
|
Applied both. Thanks.
On Sat, Feb 28, 2026 at 11:20 PM Robert Brown <rob...@gm...> wrote:
>
> Compiling sbcl version 2.6.2 failed for me because the recipe for makefile target embedcore-sbcl runs sbcl in a way that loaded my personal .sbclrc file. The attached patch adds flags --no-userinit and --no-sysinit.
>
> _______________________________________________
> Sbcl-bugs mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs
|
|
From: Robert B. <rob...@gm...> - 2026-02-28 20:18:09
|
Compiling sbcl version 2.6.2 failed for me because the recipe for makefile target embedcore-sbcl runs sbcl in a way that loaded my personal .sbclrc file. The attached patch adds flags --no-userinit and --no-sysinit. |
|
From: Robert B. <rob...@gm...> - 2026-02-28 20:17:03
|
While compiling 2.6.2, I noticed a C compiler warning. The attached patch gets rid of it. |
|
From: Stas B. <sta...@gm...> - 2026-02-24 15:11:54
|
Applied. Thanks
|
|
From: Andreas S. <sc...@su...> - 2026-02-24 13:39:20
|
---
src/compiler/riscv/sap.lisp | 3 +++
tests/compare-and-swap.impure.lisp | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/compiler/riscv/sap.lisp b/src/compiler/riscv/sap.lisp
index 480324cc6..7758ebaff 100644
--- a/src/compiler/riscv/sap.lisp
+++ b/src/compiler/riscv/sap.lisp
@@ -203,6 +203,9 @@
(inst add addr sap offset)
LOOP
(inst ,load result addr :aq)
+ #+64-bit ,@(when (and (not signed) (eq size :word))
+ `((inst slli result result 32)
+ (inst srli result result 32)))
(inst bne result oldval EXIT)
(inst ,store temp newval addr :aq :rl)
(inst bne temp zero-tn LOOP)
diff --git a/tests/compare-and-swap.impure.lisp b/tests/compare-and-swap.impure.lisp
index c6af87ae8..3b8d59566 100644
--- a/tests/compare-and-swap.impure.lisp
+++ b/tests/compare-and-swap.impure.lisp
@@ -628,7 +628,7 @@
(format t "Double-width compare-and-swap NOT TESTED~%")))
(test-util:with-test (:name :cas-sap-ref-smoke-test
- :fails-on (or :riscv :loongarch64) ; unsigned-32-bit gets the wrong answer
+ :fails-on :loongarch64 ; unsigned-32-bit gets the wrong answer
:skipped-on (not :sb-thread))
(let ((data (make-array 1 :element-type 'sb-vm:word)))
(sb-sys:with-pinned-objects (data)
--
2.53.0
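One reading of the slli/srli pair in this patch is that the 32-bit load leaves a sign-extended value in the 64-bit register, and shifting left then logically right by 32 turns that into a zero-extended one so that it compares equal to an unsigned old value. The sketch below models this with integers. It is an illustration under that assumption, not the VOP itself, and SIGN-EXTEND-32, AS-REGISTER, SLLI, SRLI and ZERO-EXTEND-VIA-SHIFTS are hypothetical helper names.

```lisp
(defun sign-extend-32 (x)
  ;; Interpret the low 32 bits of X as a signed 32-bit value.
  (let ((low (ldb (byte 32 0) x)))
    (if (logbitp 31 low)
        (- low (expt 2 32))
        low)))

(defun as-register (x)
  ;; Contents of a 64-bit register holding X, as an unsigned integer.
  (ldb (byte 64 0) x))

(defun slli (reg n)
  ;; Shift left, discarding bits that leave the 64-bit register.
  (ldb (byte 64 0) (ash reg n)))

(defun srli (reg n)
  ;; Logical shift right; REG is a non-negative register image.
  (ash reg (- n)))

(defun zero-extend-via-shifts (reg)
  (srli (slli reg 32) 32))
```

For a stored word #xFFFFFFFF the sign-extended register image is #xFFFFFFFFFFFFFFFF, which differs from an unsigned old value of #xFFFFFFFF. After the two shifts the register holds #xFFFFFFFF and the comparison can succeed.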
--
Andreas Schwab, SUSE Labs, sc...@su...
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
|
|
From: Stas B. <sta...@gm...> - 2026-02-24 06:04:43
|
Applied. Thanks
On Wed, Feb 11, 2026 at 6:55 PM Andreas Schwab <sc...@su...> wrote:
>
>
> ---
> src/compiler/riscv/parms.lisp | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/riscv/parms.lisp b/src/compiler/riscv/parms.lisp
> index e05b33301..a41685d62 100644
> --- a/src/compiler/riscv/parms.lisp
> +++ b/src/compiler/riscv/parms.lisp
> @@ -51,7 +51,7 @@
> (defconstant-eqx float-rounding-mode (byte 3 5) #'equalp)
> (defconstant-eqx float-sticky-bits (byte 5 0) #'equalp)
> ;;;; RISC-V has no explicit floating point traps.
> -(defconstant-eqx float-traps-byte (byte 5 0) #'equalp)
> +(defconstant-eqx float-traps-byte (byte 0 0) #'equalp)
> (defconstant-eqx float-exceptions-byte (byte 5 0) #'equalp)
> (defconstant float-fast-bit (ash 1 24)) ;; Flush-to-zero mode
>
> --
> 2.52.0
>
>
> --
> Andreas Schwab, SUSE Labs, sc...@su...
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."
>
>
> _______________________________________________
> Sbcl-bugs mailing list
> Sbc...@li...
> https://lists.sourceforge.net/lists/listinfo/sbcl-bugs
|
|
From: Andreas S. <sc...@su...> - 2026-02-16 10:11:59
|
On Feb 13 2026, Stas Boukarev wrote:
> This is what I get on the current risc-v sbcl:
> (scale-float 0.5d0 1025)
> =>
> #.DOUBLE-FLOAT-POSITIVE-INFINITY
Of course, this is due to commit 673a3d8ab. Use the 2.6.1 release, or try this:
* (sb-int:get-floating-point-modes)
(:TRAPS (:INEXACT) :ROUNDING-MODE :NEAREST :CURRENT-EXCEPTIONS (:INEXACT) :ACCRUED-EXCEPTIONS (:INEXACT) :FAST-MODE NIL)
* (expt 2.0 -1025)
0.0
* (sb-int:get-floating-point-modes)
(:TRAPS (:UNDERFLOW :INEXACT) :ROUNDING-MODE :NEAREST :CURRENT-EXCEPTIONS (:UNDERFLOW :INEXACT) :ACCRUED-EXCEPTIONS (:UNDERFLOW :INEXACT) :FAST-MODE NIL)
--
Andreas Schwab, SUSE Labs, sc...@su...
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
|
|
From: Stas B. <sta...@gm...> - 2026-02-13 15:18:48
|
I might have pressed the wrong button.
---------- Forwarded message ----------
From: "dur...@la..." <dur...@la...>
To: sbc...@li...
Cc:
Bcc:
Date: Thu, 12 Feb 2026 20:59:48 -0800
Subject: SBCL 2.6.1 SIGILL during make-target-2 warm init on riscv64/linux/musl
Building SBCL 2.6.1 from source on native riscv64 hardware using ECL 24.5.10 as the cross-compilation host results in a SIGILL (signal 4) crash during the make-target-2 warm init phase. The cross-compilation (make-host-1) and genesis complete successfully, and the freshly built SBCL boots and begins warm compilation, but crashes after compiling src/code/room.lisp.
Build progression
make-host-1 (cross-compilation via ECL): Completes successfully.
make-host-2 / genesis: Completes successfully. Cold core is generated. Genesis passes are consistent.
make-target-2 (warm init): The freshly built SBCL boots from the cold core and begins compiling.
During warm init, three missing foreign symbols are reported at startup:
Missing required foreign symbol 'fun_end_breakpoint_trap'
Missing required foreign symbol 'fun_end_breakpoint_end'
Missing required foreign symbol 'fun_end_breakpoint_guts'
Despite these warnings, cold init proceeds and warm compilation begins. The build successfully compiles many files (including the disassembler) but crashes immediately after compiling src/code/room.lisp.
Crash output
; compiling file "src/code/room.lisp" (written 26 JAN 2026 09:10:11 PM):
; wrote /builds/.../obj/from-self/src/code/room.fasl
; compilation finished in 0:00:01.999
CORRUPTION WARNING in SBCL pid 2678 tid 2678:
Signal 4 received (PC: 0x518e6f14)
Exiting.
Error opening /dev/tty: No such device or address
ldb> Welcome to LDB, a low-level debugger for the Lisp runtime environment.
Command exited with non-zero status 1
Signal 4 is SIGILL. The crash PC (0x518e6f14) falls within the dynamic space (base 0x4F000000), indicating the illegal instruction is in compiled Lisp code rather than the C runtime.
The full build log is attached, and any suggestions on how to proceed are sincerely appreciated!
--
Sincerely,
Will Sinatra
Alpine · Lambdacreate · Krei · Tenejo
|
|
From: Stas B. <sta...@gm...> - 2026-02-12 22:39:30
|
This is what I get on the current risc-v sbcl:
(scale-float 0.5d0 1025)
=>
#.DOUBLE-FLOAT-POSITIVE-INFINITY
On Thu, Feb 12, 2026 at 7:23 PM Andreas Schwab <sc...@su...> wrote:
>
> On Feb 12 2026, Stas Boukarev wrote:
>
> > What's signalling this error if there are no traps?
>
> Because float-traps-byte is incorrectly overlayed over
> float-exceptions-byte.
>
> > Why is it -INEXACT instead of -OVERFLOW?
>
> Because it is an inexact operation.
>
> --
> Andreas Schwab, SUSE Labs, sc...@su...
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."
|
|
From: Andreas S. <sc...@su...> - 2026-02-12 16:23:36
|
On Feb 12 2026, Stas Boukarev wrote:
> What's signalling this error if there are no traps?
Because float-traps-byte is incorrectly overlayed over float-exceptions-byte.
> Why is it -INEXACT instead of -OVERFLOW?
Because it is an inexact operation.
--
Andreas Schwab, SUSE Labs, sc...@su...
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
|
|
From: Stas B. <sta...@gm...> - 2026-02-12 16:06:43
|
What's signalling this error if there are no traps? Why is it -INEXACT
instead of -OVERFLOW?
On Thu, Feb 12, 2026 at 12:13 PM Andreas Schwab <sc...@su...> wrote:
>
> ; compiling file "/home/abuild/rpmbuild/BUILD/maxima-5.49.0-build/maxima-5.49.0/src/float-properties.lisp" (written 18 DEC 2025 07:13:51 AM):
>
> debugger invoked on a FLOATING-POINT-INEXACT in thread
> #<THREAD tid=8808 "main thread" RUNNING {4F97F4A3}>:
> arithmetic error FLOATING-POINT-INEXACT signalled
> Operation was (SCALE-FLOAT 0.5d0 1025).
>
> Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
>
> restarts (invokable by number or by possibly-abbreviated name):
> 0: [ABORT] Exit debugger, returning to top level.
>
> (SB-KERNEL::SCALE-DOUBLE-FLOAT-MAYBE-OVERFLOW 0.5d0 1025)
> 0]
> ; compilation aborted after 0:00:00.217
> ;
> ; compilation unit aborted
>
> --
> Andreas Schwab, SUSE Labs, sc...@su...
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."
|
|
From: Andreas S. <sc...@su...> - 2026-02-12 09:13:15
|
; compiling file "/home/abuild/rpmbuild/BUILD/maxima-5.49.0-build/maxima-5.49.0/src/float-properties.lisp" (written 18 DEC 2025 07:13:51 AM):
debugger invoked on a FLOATING-POINT-INEXACT in thread
#<THREAD tid=8808 "main thread" RUNNING {4F97F4A3}>:
arithmetic error FLOATING-POINT-INEXACT signalled
Operation was (SCALE-FLOAT 0.5d0 1025).
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [ABORT] Exit debugger, returning to top level.
(SB-KERNEL::SCALE-DOUBLE-FLOAT-MAYBE-OVERFLOW 0.5d0 1025)
0]
; compilation aborted after 0:00:00.217
;
; compilation unit aborted
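The parms.lisp patch quoted elsewhere in this thread changes float-traps-byte from (byte 5 0) to (byte 0 0), while float-exceptions-byte stays at (byte 5 0). The sketch below shows what that overlap means when both fields are read from the same status word. The variable and function names are hypothetical, and placing the inexact flag in bit 0 follows the RISC-V fflags layout rather than anything stated in the thread.

```lisp
(defparameter *float-exceptions-byte* (byte 5 0)) ; accrued exception flags
(defparameter *old-float-traps-byte* (byte 5 0))  ; before the patch: same bits
(defparameter *new-float-traps-byte* (byte 0 0))  ; after the patch: empty field

(defun field (byte-spec status-word)
  ;; Extract a bit field from the status word.
  (ldb byte-spec status-word))

(defun apparently-trapping-p (traps-byte status-word)
  ;; An exception looks "trapped" when its flag is set in both fields.
  (logtest (field traps-byte status-word)
           (field *float-exceptions-byte* status-word)))
```

With the inexact flag set (status word #b00001), the old definition reads the flag back as an enabled trap, so (apparently-trapping-p *old-float-traps-byte* #b00001) is true. The zero-width field always reads as 0, so the same call with *new-float-traps-byte* is false.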
--
Andreas Schwab, SUSE Labs, sc...@su...
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
|