From: Faheem M. <fa...@fa...> - 2012-05-30 23:59:33
|
Hi, I posted the following at http://stackoverflow.com/questions/10813398/optimizing-simple-common-lisp-gibbs-sampler-program I reproduce the question below. The cl-rmath is a wrapper for R's math library, and it located at https://github.com/tpapp/cl-rmath. As the question says, I'm looking for optimization hints. This is just an exercise; I took it from Darren Wilkinson's blog, and it looked simple enough to be something I could learn from. First, I should mention that trying this code without any attempt at optimization or type declaration, I got slightly over 1 minute runtime. With the added declarations etc, it is now about 52/54 seconds; not a dramatic improvement. The C code runs about 25 seconds, so this is not too bad, but I'd like to see if I can get some improvement. BTW, by way of comparison, Darren's Python version runs on my machine at around 9 1/2 minutes. Per Rainer's feedback, I tried wrapping the sqrt value in a double-float d (the double-float (sqrt (+ (* 2 x) 2))) However, I still get the following note, but I'm not sure what the compiler is complaining about now. If I understand this correctly, the issue is that the argument to sqrt might be negative, and hence the result might be complex. However, I've said that the sqrt expression is double-float. The only other thing I can think of, is to tell the compiler that (+ (* 2 X) 2) is always positive, but there doesn't seem to be a way to do that. ; file: /home/faheem/lisp/gibbs.lisp ; in: DEFUN GIBBS ; (SQRT (+ (* 2 X) 2)) ; ; note: unable to ; optimize ; due to type uncertainty: ; The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)) ; &OPTIONAL), not a (VALUES FLOAT &REST T). Furthermore, I'm also not sure what to do about the note: doing float to pointer coercion (cost 13) notes. This is wrt the x and y assignments, I think. From browsing on the net, it seems that this is another case where the compiler is not sure about type, but I tried wrapping the rhs of each assignment in a (the fixnum), but it doesn't make any difference. I realise that fixing these compiler notes might not make any difference to the runtime of the program, but I just wanted to get some practice learning to understand how the compiler thinks. Rainer also suggested That I try running TIME and DISASSEMBLE. The time shows a lot of consing - I get Evaluation took: 62.979 seconds of real time 59.695730 seconds of total run time (59.267704 user, 0.428026 system) [ Run times consist of 1.568 seconds GC time, and 58.128 seconds non-GC time. ] 94.79% CPU 126,548,820,533 processor cycles 2 page faults 5,902,262,600 bytes consed I'm not clear whether the consing accounts for a significant fraction of the runtime. If it does, then any hints how I can reduce it? Also, how much bearing does consing/garbage collection have on memory usage? According to top, the usage for the lisp executable seemed to stay fairly steady around 30M, though the virtual memory was high. at 524m. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 25805 faheem 20 0 524m 30m 3468 R 99 0.8 0:29.61 ./gibbs I assume all the consing is coming from the creation of those temporary values in the x and y assignments. The disassemble output is quite large, but means little to me. Lastly, I'm Ok with SBCL-specific optimization techniques. Thanks in advance. Please CC me on any reply. Regards, Faheem ############################################################################################################################################################################ As an exercise, I rewrote the example program in the blog post [Gibbs sampler in various languages (revisited)](http://darrenjw.wordpress.com/2011/07/16/gibbs-sampler-in-various-languages-revisited/) by Darren Wilkinson. The code appears below. This code runs on my (5 year old) machine in around 53 seconds, using SBCL, creating a core imsage using buildapp, and then running it with time ./gibbs > gibbs.dat Since this was how the timings were calculated for the other languages in the post, I thought I would do something comparable The C code in the post runs in around 25 seconds. I'd like to try and speed up the Lisp code if possible. ############################## gibbs.lisp ############################## (eval-when (:compile-toplevel :load-toplevel :execute) (require :cl-rmath) (setf *read-default-float-format* 'double-float)) (defun gibbs (N thin) (declare (fixnum N thin)) (declare (optimize (speed 3) (safety 1))) (let ((x 0.0) (y 0.0)) (declare (double-float x y)) (print "Iter x y") (dotimes (i N) (dotimes (j thin) (declare (fixnum i j)) (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4)))) (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) 2)))))) (format t "~a ~a ~a~%" i x y)))) (defun main (argv) (declare (ignore argv)) (gibbs 50000 1000)) Then I built the executable `gibbs` by calling `sh gibbs.sh` with `gibbs.sh` as ################## gibbs.sh ################## buildapp --output gibbs --asdf-tree /usr/share/common-lisp/source/ --asdf-tree /usr/local/share/common-lisp/source/ --load-system cl-rmath --load gibbs.lisp --entry main I get 6 compiler notes when compiling with SBCL 1.56, which reproduce below. I'm not sure what to do about them, but would be grateful for any hints. ; compiling file "/home/faheem/lisp/gibbs.lisp" (written 30 MAY 2012 02:00:55 PM): ; file: /home/faheem/lisp/gibbs.lisp ; in: DEFUN GIBBS ; (SQRT (+ (* 2 X) 2)) ; ; note: unable to ; optimize ; due to type uncertainty: ; The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)) ; &OPTIONAL), not a (VALUES FLOAT &REST T). ; (/ 1.0d0 (SQRT (+ (* 2 X) 2))) ; ; note: unable to ; optimize ; due to type uncertainty: ; The second argument is a (OR (DOUBLE-FLOAT 0.0) ; (COMPLEX DOUBLE-FLOAT)), not a (COMPLEX ; DOUBLE-FLOAT). ; ; note: forced to do static-fun Two-arg-/ (cost 53) ; unable to do inline float arithmetic (cost 12) because: ; The second argument is a (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)), not a DOUBLE-FLOAT. ; The result is a (VALUES (OR (COMPLEX DOUBLE-FLOAT) (DOUBLE-FLOAT 0.0)) ; &OPTIONAL), not a (VALUES DOUBLE-FLOAT &REST T). ; (CL-RMATH:RGAMMA 3.0d0 (/ 1.0d0 (+ (* Y Y) 4))) ; ; note: doing float to pointer coercion (cost 13) ; (SQRT (+ (* 2 X) 2)) ; ; note: doing float to pointer coercion (cost 13) ; (CL-RMATH:RNORM (/ 1.0d0 (+ X 1.0d0)) (/ 1.0d0 (SQRT (+ (* 2 X) 2)))) ; ; note: doing float to pointer coercion (cost 13) ; ; compilation unit finished ; printed 6 notes ; /home/faheem/lisp/gibbs.fasl written ; compilation finished in 0:00:00.073 |
From: <cri...@cs...> - 2012-05-31 07:37:10
|
Hi Fareen, the (the double-float ...) tells the compiler that you insist that the result is a double-float; though you may lie to the compiler if x assumes non-appropriate values for this declaration. I see in the Stackoverflow thread that you want to declrare x as beaing a d-f >= 0.0. If this is the case than you can try: (declare (type (double-float 0.0 *) x)) Then you should also be able to remove the (the double-float ...) since the compiler should be able to infer that the result can only be a double-float. Furthermore you will get automatically a type/range-check for the correct values of x. Hope this helps. Best regards, Stephan > Hi, > > I posted the following at > http://stackoverflow.com/questions/10813398/optimizing-simple-common-lisp-gibbs-sampler-program > I reproduce the question below. > > The cl-rmath is a wrapper for R's math library, and it located at > https://github.com/tpapp/cl-rmath. > > As the question says, I'm looking for optimization hints. This is just > an exercise; I took it from Darren Wilkinson's blog, and it looked > simple enough to be something I could learn from. > > First, I should mention that trying this code without any attempt at > optimization or type declaration, I got slightly over 1 minute > runtime. With the added declarations etc, it is now about 52/54 > seconds; not a dramatic improvement. The C code runs about 25 seconds, > so this is not too bad, but I'd like to see if I can get some > improvement. BTW, by way of comparison, Darren's Python version runs > on my machine at around 9 1/2 minutes. > > Per Rainer's feedback, I tried wrapping the sqrt value in a double-float d > > (the double-float (sqrt (+ (* 2 x) 2))) > > However, I still get the following note, but I'm not sure what the > compiler is complaining about now. If I understand this correctly, the > issue is that the argument to sqrt might be negative, and hence the > result might be complex. However, I've said that the sqrt expression > is double-float. The only other thing I can think of, is to tell the > compiler that (+ (* 2 X) 2) is always positive, but there doesn't seem > to be a way to do that. > > ; file: /home/faheem/lisp/gibbs.lisp > ; in: DEFUN GIBBS > ; (SQRT (+ (* 2 X) 2)) > ; > ; note: unable to > ; optimize > ; due to type uncertainty: > ; The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX DOUBLE-FLOAT)) > ; &OPTIONAL), not a (VALUES FLOAT &REST T). > > Furthermore, I'm also not sure what to do about the > > note: doing float to pointer coercion (cost 13) > > notes. This is wrt the x and y assignments, I think. From browsing on > the net, it seems that this is another case where the compiler is not > sure about type, but I tried wrapping the rhs of each assignment in a > (the fixnum), but it doesn't make any difference. > > I realise that fixing these compiler notes might not make any > difference to the runtime of the program, but I just wanted to get > some practice learning to understand how the compiler thinks. > > Rainer also suggested That I try running TIME and DISASSEMBLE. The > time shows a lot of consing - I get > > Evaluation took: > 62.979 seconds of real time > 59.695730 seconds of total run time (59.267704 user, 0.428026 system) > [ Run times consist of 1.568 seconds GC time, and 58.128 seconds > non-GC time. ] > 94.79% CPU > 126,548,820,533 processor cycles > 2 page faults > 5,902,262,600 bytes consed > > I'm not clear whether the consing accounts for a significant fraction > of the runtime. If it does, then any hints how I can reduce it? Also, > how much bearing does consing/garbage collection have on memory usage? > According to top, the usage for the lisp executable seemed to stay > fairly steady around 30M, though the virtual memory was high. at 524m. > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 25805 faheem 20 0 524m 30m 3468 R 99 0.8 0:29.61 ./gibbs > > I assume all the consing is coming from the creation of those > temporary values in the x and y assignments. > > The disassemble output is quite large, but means little to me. > > Lastly, I'm Ok with SBCL-specific optimization techniques. Thanks in > advance. Please CC me on any reply. > > Regards, Faheem > > ############################################################################################################################################################################ > > As an exercise, I rewrote the example program in the blog post [Gibbs > sampler in various languages > (revisited)](http://darrenjw.wordpress.com/2011/07/16/gibbs-sampler-in-various-languages-revisited/) > by Darren Wilkinson. > > The code appears below. This code runs on my (5 year old) machine in > around 53 seconds, using SBCL, creating a core imsage using buildapp, > and then running it with > > time ./gibbs > gibbs.dat > > Since this was how the timings were calculated for the other languages > in the post, I thought I would do something comparable The C code in > the post runs in around 25 seconds. I'd like to try and speed up the > Lisp code if possible. > > ############################## > gibbs.lisp > ############################## > (eval-when (:compile-toplevel :load-toplevel :execute) > (require :cl-rmath) (setf *read-default-float-format* > 'double-float)) > > (defun gibbs (N thin) > (declare (fixnum N thin)) > (declare (optimize (speed 3) (safety 1))) > (let ((x 0.0) (y 0.0)) > (declare (double-float x y)) > (print "Iter x y") > (dotimes (i N) > (dotimes (j thin) > (declare (fixnum i j)) > (setf x (cl-rmath::rgamma 3.0 (/ 1.0 (+ (* y y) 4)))) > (setf y (cl-rmath::rnorm (/ 1.0 (+ x 1.0)) (/ 1.0 (sqrt (+ (* 2 x) > 2)))))) > (format t "~a ~a ~a~%" i x y)))) > > (defun main (argv) > (declare (ignore argv)) > (gibbs 50000 1000)) > > Then I built the executable `gibbs` by calling `sh gibbs.sh` with > `gibbs.sh` as > > ################## > gibbs.sh > ################## > buildapp --output gibbs --asdf-tree /usr/share/common-lisp/source/ > --asdf-tree /usr/local/share/common-lisp/source/ --load-system > cl-rmath --load gibbs.lisp --entry main > > I get 6 compiler notes when compiling with SBCL 1.56, which reproduce > below. I'm not sure what to do about them, but would be grateful for > any hints. > > ; compiling file "/home/faheem/lisp/gibbs.lisp" (written 30 MAY 2012 > 02:00:55 PM): > > ; file: /home/faheem/lisp/gibbs.lisp > ; in: DEFUN GIBBS > ; (SQRT (+ (* 2 X) 2)) > ; > ; note: unable to > ; optimize > ; due to type uncertainty: > ; The result is a (VALUES (OR (DOUBLE-FLOAT 0.0) (COMPLEX > DOUBLE-FLOAT)) > ; &OPTIONAL), not a (VALUES FLOAT &REST T). > > ; (/ 1.0d0 (SQRT (+ (* 2 X) 2))) > ; > ; note: unable to > ; optimize > ; due to type uncertainty: > ; The second argument is a (OR (DOUBLE-FLOAT 0.0) > ; (COMPLEX DOUBLE-FLOAT)), not a > (COMPLEX > ; > DOUBLE-FLOAT). > ; > ; note: forced to do static-fun Two-arg-/ (cost 53) > ; unable to do inline float arithmetic (cost 12) because: > ; The second argument is a (OR (DOUBLE-FLOAT 0.0) (COMPLEX > DOUBLE-FLOAT)), not a DOUBLE-FLOAT. > ; The result is a (VALUES (OR (COMPLEX DOUBLE-FLOAT) > (DOUBLE-FLOAT 0.0)) > ; &OPTIONAL), not a (VALUES > DOUBLE-FLOAT &REST T). > > ; (CL-RMATH:RGAMMA 3.0d0 (/ 1.0d0 (+ (* Y Y) 4))) > ; > ; note: doing float to pointer coercion (cost 13) > > ; (SQRT (+ (* 2 X) 2)) > ; > ; note: doing float to pointer coercion (cost 13) > > ; (CL-RMATH:RNORM (/ 1.0d0 (+ X 1.0d0)) (/ 1.0d0 (SQRT (+ (* 2 X) > 2)))) > ; > ; note: doing float to pointer coercion (cost 13) > ; > ; compilation unit finished > ; printed 6 notes > > ; /home/faheem/lisp/gibbs.fasl written > ; compilation finished in 0:00:00.073 > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Sbcl-help mailing list > Sbc...@li... > https://lists.sourceforge.net/lists/listinfo/sbcl-help > |
From: Pierpaolo B. <olo...@gm...> - 2012-05-31 08:08:24
|
On Thu, May 31, 2012 at 1:59 AM, Faheem Mitha <fa...@fa...> wrote: > (defun gibbs (N thin) > (declare (fixnum N thin)) > (declare (optimize (speed 3) (safety 1))) Using (safety 0) does make any difference? |
From: Faheem M. <fa...@fa...> - 2012-05-31 08:31:53
|
Hi Stephan, Thanks for your reply. On Thu, 31 May 2012, cri...@cs... wrote: > Hi Fareen, > > the (the double-float ...) tells the compiler that you insist that the > result is a double-float; though you may lie to the compiler if x assumes > non-appropriate values for this declaration. > > I see in the Stackoverflow thread that you want to declrare x as beaing a > d-f >= 0.0. If this is the case than you can try: > > (declare (type (double-float 0.0 *) x)) If this declaration says that x is a double float which is greater than or equal to 0, then that seems like a good way to resolve the problem. Can you give me a reference for this syntax? Thanks. > Then you should also be able to remove the (the double-float ...) since > the compiler should be able to infer that the result can only be a > double-float. Furthermore you will get automatically a type/range-check > for the correct values of x. Oh, the compiler will automatically put in a runtime check that x >= 0? Interesting. > Hope this helps. Yes, thanks. > Best regards, > Stephan Regards, Faheem. |
From: Faheem M. <fa...@fa...> - 2012-05-31 08:33:49
|
On Thu, 31 May 2012, Pierpaolo Bernardi wrote: > On Thu, May 31, 2012 at 1:59 AM, Faheem Mitha <fa...@fa...> wrote: > >> (defun gibbs (N thin) >> (declare (fixnum N thin)) >> (declare (optimize (speed 3) (safety 1))) > > Using (safety 0) does make any difference? I tried it, but it does not have a noticeable effect. Also, I've seen people recommending against setting safety to 0. Regards, Faheem |
From: Pierpaolo B. <olo...@gm...> - 2012-05-31 08:43:12
|
On Thu, May 31, 2012 at 10:33 AM, Faheem Mitha <fa...@fa...> wrote: > > > On Thu, 31 May 2012, Pierpaolo Bernardi wrote: > >> On Thu, May 31, 2012 at 1:59 AM, Faheem Mitha <fa...@fa...> wrote: >> >>> (defun gibbs (N thin) >>> (declare (fixnum N thin)) >>> (declare (optimize (speed 3) (safety 1))) >> >> Using (safety 0) does make any difference? > > I tried it, but it does not have a noticeable effect. Also, I've seen people > recommending against setting safety to 0. In general, they are correct. But if you are trying to squeeze the last drop of performance from a little benchmark and comparing the speed against C then it's it's allowed. 8^) |
From: Faheem M. <fa...@fa...> - 2012-05-31 08:45:51
|
On Thu, 31 May 2012, Faheem Mitha wrote: > > Hi Stephan, > > Thanks for your reply. > > On Thu, 31 May 2012, cri...@cs... wrote: > >> Hi Fareen, >> >> the (the double-float ...) tells the compiler that you insist that the >> result is a double-float; though you may lie to the compiler if x assumes >> non-appropriate values for this declaration. >> >> I see in the Stackoverflow thread that you want to declrare x as beaing a >> d-f >= 0.0. If this is the case than you can try: >> >> (declare (type (double-float 0.0 *) x)) Incidentally, the compiler accepts (declare ((double-float 0.0 *) x)) which suggests that the TYPE is redundant. Haven't found the syntax for (double-float 0.0 *), though. This change does seem to make the program run slightly faster. It is now consistently below 52 sec. Faheem |
From: Pierpaolo B. <olo...@gm...> - 2012-05-31 09:09:17
|
On Thu, May 31, 2012 at 10:45 AM, Faheem Mitha <fa...@fa...> wrote: > Incidentally, the compiler accepts > > (declare ((double-float 0.0 *) x)) > > which suggests that the TYPE is redundant. See "Notes" in: http://www.lispworks.com/documentation/HyperSpec/Body/d_type.htm > Haven't found the syntax for (double-float 0.0 *), though. http://www.lispworks.com/documentation/HyperSpec/Body/t_short_.htm#double-float |
From: Faheem M. <fa...@fa...> - 2012-06-04 21:12:07
|
Hi, I got some feedback, and decided to try reworking this exercise using an array. I ran into some issues, at least one of which is quite worrying. The code is below, and here are the issue, in descending order of importance. 1) I'm having GC problems. If I run this code a number of times, like Here is the output for (room) for a newly started REPL. CL-USER> (room) Dynamic space usage is: 92,358,952 bytes. Read-only space usage is: 3,512 bytes. Static space usage is: 2,224 bytes. Control stack usage is: 5,928 bytes. Binding stack usage is: 536 bytes. Control and binding stack usage is for the current thread only. Garbage collection is currently enabled. Breakdown for dynamic space: 21,981,408 bytes for 565,633 instance objects. 15,453,048 bytes for 1,931,631 cons objects. 12,000,912 bytes for 16,217 code objects. 11,585,944 bytes for 210,891 simple-vector objects. 8,400,184 bytes for 61,316 simple-character-string objects. 5,903,840 bytes for 42,632 simple-array-unsigned-byte-32 objects. 17,717,104 bytes for 1,177,704 other objects. 93,042,440 bytes for 4,006,024 dynamic objects (space total.) Now I run this a bunch of times. (time (gibbs 10000 1000)) The dynamic space usage keeps climbing, and eventually I get a heap error. At that point (room) gives CL-USER> (room) Dynamic space usage is: 1,010,419,640 bytes. Read-only space usage is: 3,512 bytes. Static space usage is: 2,224 bytes. Control stack usage is: 5,928 bytes. Binding stack usage is: 536 bytes. Control and binding stack usage is for the current thread only. Garbage collection is currently enabled. Breakdown for dynamic space: 960,000,104 bytes for 13 simple-array-double-float objects. 50,673,608 bytes for 1,673,714 other objects. 1,010,673,712 bytes for 1,673,727 dynamic objects (space total.) ; No value Yikes. After I waited a bit, the dynamic space usage did go down a bit, but only a little. Breakdown for dynamic space: 800,000,088 bytes for 11 simple-array-double-float objects. I don't understand this. These were dynamically created objects. Shouldn't they be destroyed when the function exited? Did I make some mistake in writing the code? Tangential question: is there some way I can inspect these objects which are occupying the dynamic space? Tangent question 2. What are all these millions of objects? 93,042,440 bytes for 4,006,024 dynamic objects (space total.)?? 2) This code emits a bunch of compiler warnings. Let me just focus on one. ; note: forced to do GENERIC-* (cost 30) ; unable to do inline fixnum arithmetic (cost 4) because: ; The result is a (VALUES (MOD 288230374541099011) &OPTIONAL), not a (VALUES ; FIXNUM ; &REST ; T). ; unable to do inline (signed-byte 32) arithmetic (cost 5) because: ; The result is a (VALUES (MOD 288230374541099011) &OPTIONAL), not a (VALUES ; (SIGNED-BYTE ; 32) ; &REST ; T). ; etc. ; I've seen the term MOD appear in this context frequently. I've searched, but have not been able to find any explanation of this. Would someone be kind enough to explain MOD and the magic number 288230374541099011, and also if possible a suggestion for getting rid of the warning? By guess is that this reflects some kind of integer overflow issue, but I'm fuzzy on the details. Would some kind of range restriction be a solution? The following are more minor issues. 3) If I pass the size of the array, as a function argument, the compiler won't let me declare (declare (type (simple-array (double-float 0.0 *) (somefunctionargument)) xvec)) using the function argument, though if I use a literal value like 10000000 as in the code above, it works. The error is not very helpful - it says bad dimension. Is this because for this declaration, the array length needs to be a compile time constant? 4) The compiler is Ok with (declare ((integer 0 536870911) i j)) (positive fixnum, basically) but not with (declare ((integer 0 most-positive-fixnum) i j)) though, if I understand this correctly, most-positive-fixnum is just the integer 536870911. Also, I get CL-USER> (type-of most-positive-fixnum) (INTEGER 0 536870911) CL-USER> (type-of 2) (INTEGER 0 536870911) So SBCL thinks most-positive-fixnum is of the same type as 2. Please CC me on any reply. Thanks. Regards, Faheem #################################################################### (eval-when (:compile-toplevel :load-toplevel :execute) (require :cl-rmath) (setf *read-default-float-format* 'double-float)) (declaim (inline cl-rmath:rnorm cl-rmath:rgamma)) (cffi:defcfun ("rgamma" rgamma) :double (arg0 :double) (arg1 :double)) (cffi:defcfun ("rnorm" rnorm) :double (arg0 :double) (arg1 :double)) (defun gibbs (N thin) (declare (fixnum N thin)) (declare (optimize (speed 3) (safety 1))) (declare (optimize debug)) (let ((x 0.0) (y 0.0) (xvec (make-array 10000000 :element-type '(double-float 0.0 *) :adjustable nil :fill-pointer nil :displaced-to nil)) (yvec (make-array 10000000 :element-type 'double-float :adjustable nil :fill-pointer nil :displaced-to nil))) (declare (double-float x y)) (declare (type (simple-array (double-float 0.0 *) (10000000)) xvec)) (declare (type (simple-array double-float (10000000)) yvec)) (dotimes (i N) (dotimes (j thin) (declare (fixnum i j)) (declare ((integer 0 536870911) i j)) ;;(declare ((integer 0 most-positive-fixnum) i j)) (symbol-macrolet ((xval (aref xvec (+ (the fixnum (* i thin)) j))) (yval (aref yvec (+ (the fixnum (* i thin)) j)))) (setf xval (cl-rmath::rgamma 3.0 (/ 1 (+ (* yval yval) 4)))) (setf yval (cl-rmath::rnorm (/ 1 (+ xval 1)) (/ 1 (sqrt (+ (* 2 xval) 2))))) (setf x xval) (setf y yval))) (format t "~a ~a ~a~%" i x y)))) (defun main (argv) (declare (ignore argv)) ;;(gibbs 50000 1000 10000000)) ;;(gibbs 10000 1000 10000000)) (gibbs 10000 1000)) |
From: Piotr C. <pio...@po...> - 2012-06-05 10:11:39
|
W dniu 2012-06-04 23:11, Faheem Mitha pisze: > Hi, > > I got some feedback, and decided to try reworking this exercise using > an array. I ran into some issues, at least one of which is quite > worrying. The code is below, and here are the issue, in descending > order of importance. > > 1) I'm having GC problems. If I run this code a number of times, like > > Here is the output for (room) for a newly started REPL. > > CL-USER> (room) > Dynamic space usage is: 92,358,952 bytes. > (...) > > Now I run this a bunch of times. > > (time (gibbs 10000 1000)) > > The dynamic space usage keeps climbing, and eventually I get a heap > error. At that point (room) gives > > CL-USER> (room) > Dynamic space usage is: 1,010,419,640 bytes. > (...) > > Yikes. After I waited a bit, the dynamic space usage did go down a bit, > but only a little. > > Breakdown for dynamic space: > 800,000,088 bytes for 11 simple-array-double-float objects. > > I don't understand this. These were dynamically created > objects. Shouldn't they be destroyed when the function exited? Did I > make some mistake in writing the code? Garbage collector in SBCL is „conservative”, from http://www.sbcl.org/manual/History-and-Implementation-of-SBCL.html : „On the x86 SBCL – like the x86 port of CMUCL – uses a /conservative/ GC. This means that it doesn't maintain a strict separation between tagged and untagged data, instead treating some untagged data (e.g. raw floating point numbers) as possibly-tagged data and so not collecting any Lisp objects that they point to. This has some negative consequences for average time efficiency (though possibly no worse than the negative consequences of trying to implement an exact GC on a processor architecture as register-poor as the X86) and also has potentially unlimited consequences for worst-case memory efficiency. In practice, conservative garbage collectors work reasonably well, not getting anywhere near the worst case. But they can occasionally cause odd patterns of memory usage. ” If it is problem You can try another implementation with „precise” collector, as for example CCL... -- pozdrawiam Piotr Chamera |
From: Stephan F. <cri...@cs...> - 2012-06-04 22:09:56
|
Hi Faheem, I'm going to answer just one of your questions, because it is way past my bedtime and the rest require some more thinking. > 4) The compiler is Ok with > > (declare ((integer 0 536870911) i j)) > > (positive fixnum, basically) but not with > > (declare ((integer 0 most-positive-fixnum) i j)) > > though, if I understand this correctly, most-positive-fixnum is just > the integer 536870911. > If you declare using such constants you need to use the #. reader macro. So the following should work: (declare ((integer 0 #.most-positive-fixnum) i j)) see http://www.lispworks.com/documentation/HyperSpec/Body/02_dhf.htm Kind regards, Stephan |
From: Faheem M. <fa...@fa...> - 2012-06-05 06:56:51
|
On Tue, 5 Jun 2012, Stephan Frank wrote: > Hi Faheem, > > I'm going to answer just one of your questions, because it is way past > my bedtime and the rest require some more thinking. > >> 4) The compiler is Ok with >> >> (declare ((integer 0 536870911) i j)) >> >> (positive fixnum, basically) but not with >> >> (declare ((integer 0 most-positive-fixnum) i j)) >> >> though, if I understand this correctly, most-positive-fixnum is just >> the integer 536870911. > If you declare using such constants you need to use the #. reader macro. > So the following should work: > > (declare ((integer 0 #.most-positive-fixnum) i j)) > > see http://www.lispworks.com/documentation/HyperSpec/Body/02_dhf.htm Hi Stephan, Thanks, that works. Since 'most-positive-fixnum' is an an object not an integer, even though 'type-of' provided no clue to this, was there some way I could have known this in advance? Unlike C and Python, in Lisp it seems a symbolic constant cannot just be substituted in place of a literal numerical value. Regards, faheem |
From: Stephan F. <cri...@cs...> - 2012-06-04 22:18:58
|
oh yeah and there are way too many declares in your code. The type analyzer should be able to figure out most of your declareration. E.g. with just declaring N thin as (declare ((integer 0 #.most-positive-fixnum) N thin)) you should be able to remove any declares concerning i and j since their values are depended on N and thin. Furthermore, I think you try to reduce overall run-time. Have you already profiled your code to ensure that the FFI are not the main contributing factor? You're not going to optimize away this overhead with your type annotations.... Stephan |
From: Faheem M. <fa...@fa...> - 2012-06-04 22:50:03
|
On Mon, 4 Jun 2012, Pierpaolo Bernardi wrote: > On Mon, Jun 4, 2012 at 11:11 PM, Faheem Mitha <fa...@fa...> wrote: >> 2) This code emits a bunch of compiler warnings. Let me just focus on >> one. >> >> ; note: forced to do GENERIC-* (cost 30) >> ; unable to do inline fixnum arithmetic (cost 4) because: >> ; The result is a (VALUES (MOD 288230374541099011) &OPTIONAL), not a (VALUES >> ; FIXNUM >> ; &REST >> ; T). >> ; unable to do inline (signed-byte 32) arithmetic (cost 5) because: >> ; The result is a (VALUES (MOD 288230374541099011) &OPTIONAL), not a (VALUES >> ; (SIGNED-BYTE >> ; 32) >> ; &REST >> ; T). >> ; etc. >> ; >> >> I've seen the term MOD appear in this context frequently. I've >> searched, but have not been able to find any explanation of >> this. Would someone be kind enough to explain MOD and the magic number >> 288230374541099011, and also if possible a suggestion for getting rid >> of the warning? By guess is that this reflects some kind of integer >> overflow issue, but I'm fuzzy on the details. Would some kind of range >> restriction be a solution? > This one is easy: > - go to http://www.lispworks.com/documentation/HyperSpec/Front/index.htm > - click on "Symbol Index", then "M", then "MOD", then "Type Specifier" Hi Pierpaolo, Thanks for the reply and the pointer. I missed it somehow. You didn't copy the list, which I presume was unintentional, so I'm ccing back to it. I've got into the bad habit of googling for error messages, when using other, more disorderly languages. I've come to realise that with Common Lisp, all roads lead to the Hyperspec. So I'll check there first next time I see a term I don't recognize. So, per http://www.lispworks.com/documentation/HyperSpec/Body/t_mod.htm "This denotes the set of non-negative integers less than n." So, this is saying, this product is too big for a fixnum. Really not the best notation. To anyone's mind, this suggests modulus. As regards that magic value, my guess would have been the square of the largest possible fixnum (most-positive-fixnum), which is 536870911, but in fact it is (most-positive-fixnum * ( most-positive-fixnum - 1)) + 1 = 536870911 * 536870910 + 1 = 288230374541099011L Not sure why. Hmm. I guess I could try restricting the range of i and thin, and seeing what happens. Regards, Faheem |
From: Faheem M. <fa...@fa...> - 2012-06-05 09:40:17
|
On Tue, 5 Jun 2012, Stephan Frank wrote: > oh yeah and there are way too many declares in your code. The type > analyzer should be able to figure out most of your declareration. E.g. > with just declaring N thin as > > (declare ((integer 0 #.most-positive-fixnum) N thin)) > > you should be able to remove any declares concerning i and j since their > values are depended on N and thin. Good point. Done. Any other tips? > Furthermore, I think you try to reduce overall run-time. Have you > already profiled your code to ensure that the FFI are not the main > contributing factor? You're not going to optimize away this overhead > with your type annotations.... Also good point. Actually, I recently wrote in another post (on comp.lang.lisp) """SBCL also has a statistical profiler. I didn't try that, since I assumed this wouldn't give me much information for such a small simple function, but it might at least tell me how much time is spent running the C functions.""" I guess I should try profiling next. Is the SBCL statistical profiler the best way to go, in your opinion? Regards, Faheem |
From: Stas B. <sta...@gm...> - 2012-06-05 09:51:40
|
Faheem Mitha <fa...@fa...> writes: > On Tue, 5 Jun 2012, Stephan Frank wrote: > >> oh yeah and there are way too many declares in your code. The type >> analyzer should be able to figure out most of your declareration. E.g. >> with just declaring N thin as >> >> (declare ((integer 0 #.most-positive-fixnum) N thin)) >> >> you should be able to remove any declares concerning i and j since their >> values are depended on N and thin. > > Good point. Done. Any other tips? This declaration can be written as (declare (type (and (integer 0) fixnum) N)) instead. -- With best regards, Stas. |
From: Faheem M. <fa...@fa...> - 2012-06-05 10:48:46
|
On Tue, 5 Jun 2012, Faheem Mitha wrote: > Thanks for the reply. I suppose experimenting with other CL > implementations are a possibility, but I'd rather stick with SBCL if > possible. Is it possible to tell SBCL that I want to deallocate this > memory, please? More generally, do I have any alternative besides > restarting the REPL? I found (gc :full t), which however, doesn't change space usage either. Maybe there is some way of telling the objects that you want them garbage collected at the time of creation? Regards, Faheem CL-USER> (room) Dynamic space usage is: 603,951,752 bytes. Read-only space usage is: 3,512 bytes. Static space usage is: 2,224 bytes. Control stack usage is: 5,928 bytes. Binding stack usage is: 536 bytes. Control and binding stack usage is for the current thread only. Garbage collection is currently enabled. Breakdown for dynamic space: 560,000,064 bytes for 8 simple-array-double-float objects. 44,184,776 bytes for 1,499,100 other objects. 604,184,840 bytes for 1,499,108 dynamic objects (space total.) ; No value CL-USER> (gc :full t) NIL CL-USER> (room) Dynamic space usage is: 603,956,048 bytes. Read-only space usage is: 3,512 bytes. Static space usage is: 2,224 bytes. Control stack usage is: 5,928 bytes. Binding stack usage is: 536 bytes. Control and binding stack usage is for the current thread only. Garbage collection is currently enabled. Breakdown for dynamic space: 560,000,064 bytes for 8 simple-array-double-float objects. 44,180,872 bytes for 1,500,023 other objects. 604,180,936 bytes for 1,500,031 dynamic objects (space total.) |