Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-02-15 13:03 ```OK, so it seems I've been idle for the past few weeks. Would you like to try the file? It's attached. -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2005-12-31 10:09 ```Here's what I wrote in wsolve: [nr,nc] = size(A); if nc > nr, error("underdetermined system"); end ## system solution: A x = y => x = inv(A) y ## QR decomposition has good numerical properties: ## AP = QR, with P'P = Q'Q = I, and R upper triangular ## so ## inv(A) y = P inv(R) inv(Q) y = P inv(R) Q' y = P (R \ (Q' y)) ## Note that b is usually a vector and Q is matrix, so it will ## be faster to compute (y' Q)' than (Q' y). [Q,R,p] = qr(A,0); x = R\(y'*Q)'; x(p) = x; s.R = R; s.R(:,p) = R; s.df = nr-nc; s.normr = norm(y - A*x); Here's what I wrote in polyconf: ## Confidence intervals for linear system are given by: ## x' p +/- sqrt( Finv(1-a,1,df) var(x' p) ) ## where for confidence intervals, ## var(x' p) = sigma^2 (x' inv(A'A) x) ## and for prediction intervals, ## var(x' p) = sigma^2 (1 + x' inv(A'A) x) ## ## Rather than A'A we have R from the QR decomposition of A, but ## R'R equals A'A. Note that R is not upper triangular since we ## have already multiplied it by the permutation matrix, but it ## is invertible. Rather than forming the product R'R which is ## ill-conditioned, we can rewrite x' inv(A'A) x as the equivalent ## x' inv(R) inv(R') x = t t', for t = x' inv(R) ## Since x is a vector, t t' is the inner product sumsq(t). ## Note that LAPACK allows us to do this simultaneously for many ## different x using sqrt(sumsq(X/R,2)), with each x on a different row. ## ## Note: sqrt(F(1-a;1,df)) = T(1-a/2;df) function [y,dy] = confidence(A,p,S,alpha,typestr) if nargin < 4, alpha = []; end if nargin < 5, typestr = 'ci'; end y = A*p(:); switch typestr, case 'ci', pred = 0; default_alpha=erfc(1/sqrt(2)); case 'pi', pred = 1; default_alpha=0.05; otherwise, error("use 'ci' or 'pi' for interval type"); end if isempty(alpha), alpha = default_alpha; end s = t_inv(1-alpha/2,S.df)*S.normr/sqrt(S.df); dy = s*sqrt(pred+sumsq(A/S.R,2)); Can you use these two functions to do what you need? - Paul On Dec 31, 2005, at 3:06 AM, William Poetra Yoga Hadisoeseno wrote: > In the Matlab documentation from http://www.mathworks.com (for regress), > it mentions QR factorization (using QR factorization instead of > computing an inverse matrix). Can anyone explain to me about this? ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2005-12-31 10:37 ```You should just be able to use errorbar. Assuming you don't want to because you don't like how it looks, can you set the symbol size larger instead of drawing circles yourself? __gnuplot_set__('pointsize 2'); plot(rand(5,1),'o;;'); I would suggest plotting using: plot(cx,cy,'o;;',lx,ly,'-;;') where cx,cy are the points and lx,ly are matrices with three columns and two rows for each r defining the end points of the error bars and the caps respectively. If you decide to draw the circles yourself, change this to: plot(cx,cy,'o;;',lx,ly,'-;;') where cx,cy contain one column per circle. No for loops needed, and a small number of separate matrices being plotted should improve your performance a lot. - Paul On Dec 31, 2005, at 3:06 AM, William Poetra Yoga Hadisoeseno wrote: > For rcoplot: > 1. The round circles in the middle are ugly > 2. It's a bit slow, but that's the fastest I can get. I've tried > looping the plot command (and using hold on/off), but that's slower. ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2005-12-31 15:12 ```On 12/31/05, Paul Kienzle wrote: > You should just be able to use errorbar. > > Assuming you don't want to because you don't like how it looks, can you > set the symbol size larger instead of drawing circles yourself? > > __gnuplot_set__('pointsize 2'); plot(rand(5,1),'o;;'); > > I would suggest plotting using: > > plot(cx,cy,'o;;',lx,ly,'-;;') > > where cx,cy are the points and lx,ly are matrices with three columns > and two rows for each r defining the end points of the error bars and > the caps respectively. > > If you decide to draw the circles yourself, change this to: > > plot(cx,cy,'o;;',lx,ly,'-;;') > > where cx,cy contain one column per circle. > > No for loops needed, and a small number of separate matrices being > plotted should improve your performance a lot. > OK, I was at first a bit confused because your code generates '+' style points on the graph; but then I think it's a bug in octave. Well, it seems I would have to look at the plotting commands... Thanks anyway, I'll see what I can use ;) -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-01-01 09:20 Attachments: regress.m ```On 12/31/05, William Poetra Yoga Hadisoeseno wrot= e: > On 12/31/05, Paul Kienzle wrote: > > Here's what I wrote in wsolve: > > > ... > > > > Here's what I wrote in polyconf: > > > > Thanks, I'll take a look ;) > OK, today I finally finished the function (I think). I used Paul's notes at the end of wsolve.m. Take a look at the attachment ;) Paul, do you want to have it in octave-forge, or should I submit it for inclusion in Octave instead? -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2006-01-01 17:05 ```On Jan 1, 2006, at 4:20 AM, William Poetra Yoga Hadisoeseno wrote: > OK, today I finally finished the function (I think). I used Paul's > notes at the end of wsolve.m. Take a look at the attachment ;) You are not using the economy QR decomposition. This can lead to very large matrices for over determined systems. Also, you should generally avoid computing the full inverse as part of solving a system. It is more expensive and I believe less stable than working directly off the QR decomposition that you have already computed. Any reason you did not use my wsolve code as it stands? Similarly, do you have a reason for not using the code I sent for computing confidence intervals, which also avoids direct computation of the inverse? If there is some reason to prefer your code I would like to know what it is. - Paul ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-01-03 08:40 ```On 1/2/06, Paul Kienzle wrote: > > On Jan 1, 2006, at 4:20 AM, William Poetra Yoga Hadisoeseno wrote: > > > OK, today I finally finished the function (I think). I used Paul's > > notes at the end of wsolve.m. Take a look at the attachment ;) > > You are not using the economy QR decomposition. This can lead to > very large matrices for over determined systems. > > Also, you should generally avoid computing the full inverse as > part of solving a system. It is more expensive and I believe > less stable than working directly off the QR decomposition that > you have already computed. > > Any reason you did not use my wsolve code as it stands? > > Similarly, do you have a reason for not using the code I sent > for computing confidence intervals, which also avoids direct > computation of the inverse? > > If there is some reason to prefer your code I would like > to know what it is. > No reason, it's just that I didn't study my linear algebra course well (never went to class, for example :( ), and I haven't retaken it yet, so that part of Mathematics is my "blurry spot" (not quite a blind spot...). As such, I don't really understand your code well; for my dataset, my code works, so for the time being I haven't used your code yet. I'm having my exams now, so I wouldn't have too much time for that. But I'll look at it again (so don't include it first). Alternatively, if you would like to modify the function, I don't mind :) (although it wouldn't be very educative for me). -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-02-15 13:03 Attachments: regress.m ```OK, so it seems I've been idle for the past few weeks. Would you like to try the file? It's attached. -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2006-02-19 23:39 ```On Feb 15, 2006, at 8:03 AM, William Poetra Yoga Hadisoeseno wrote: > OK, so it seems I've been idle for the past few weeks. Would you like > to try the file? It's attached. > > -- > William Poetra Yoga Hadisoeseno > Okay, I added a test of the Longley data set from the NIST Statistical Reference Database. The performance is pretty bad (norm of answer-certified value is about 10^-7), but that's a bit better than the competition. Someone with more R experience than me will have to see if it performs any better. Also, if you can convert bint to standard error values that would be convenient. - Paul ---- % Longley data from the NIST Statistical Reference Dataset Z = [ 60323 83.0 234289 2356 1590 107608 1947 61122 88.5 259426 2325 1456 108632 1948 60171 88.2 258054 3682 1616 109773 1949 61187 89.5 284599 3351 1650 110929 1950 63221 96.2 328975 2099 3099 112075 1951 63639 98.1 346999 1932 3594 113270 1952 64989 99.0 365385 1870 3547 115094 1953 63761 100.0 363112 3578 3350 116219 1954 66019 101.2 397469 2904 3048 117388 1955 67857 104.6 419180 2822 2857 118734 1956 68169 108.4 442769 2936 2798 120445 1957 66513 110.8 444546 4681 2637 121950 1958 68655 112.6 482704 3813 2552 123366 1959 69564 114.2 502601 3931 2514 125368 1960 69331 115.7 518173 4806 2572 127852 1961 70551 116.9 554894 4007 2827 130081 1962 ]; % Results certified by NIST using 100 digit arithmetic % b and standard error in b V = [ -3482258.63459582 890420.383607373 15.0618722713733 84.9149257747669 -0.358191792925910E-01 0.334910077722432E-01 -2.02022980381683 0.488399681651699 -1.03322686717359 0.214274163161675 -0.511041056535807E-01 0.226073200069370 1829.15146461355 455.478499142212 ]; Rsq = 0.995479004577296; F = 330.285339234588; y = Z(:,1); X = [ones(rows(Z),1), Z(:,2:end)]; [b, bint, r, rint, stats] = regress (y, X, 0.05); assert(b, V(:,1), 6e-7) assert(stats(1), Rsq, 7e-14) assert(stats(2), F, 5e-9) ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-02-21 13:42 ```On 2/20/06, Paul Kienzle wrote: > > Okay, I added a test of the Longley data set from the NIST Statistical > Reference Database. > > The performance is pretty bad (norm of answer-certified value is about > 10^-7), but that's a bit better than the competition. > > Someone with more R experience than me will have to see if it performs > any better. > > Also, if you can convert bint to standard error values that would be > convenient. > > - Paul > > ---- > > % Longley data from the NIST Statistical Reference Dataset > Z =3D [ 60323 83.0 234289 2356 1590 107608 1947 > 61122 88.5 259426 2325 1456 108632 1948 > 60171 88.2 258054 3682 1616 109773 1949 > 61187 89.5 284599 3351 1650 110929 1950 > 63221 96.2 328975 2099 3099 112075 1951 > 63639 98.1 346999 1932 3594 113270 1952 > 64989 99.0 365385 1870 3547 115094 1953 > 63761 100.0 363112 3578 3350 116219 1954 > 66019 101.2 397469 2904 3048 117388 1955 > 67857 104.6 419180 2822 2857 118734 1956 > 68169 108.4 442769 2936 2798 120445 1957 > 66513 110.8 444546 4681 2637 121950 1958 > 68655 112.6 482704 3813 2552 123366 1959 > 69564 114.2 502601 3931 2514 125368 1960 > 69331 115.7 518173 4806 2572 127852 1961 > 70551 116.9 554894 4007 2827 130081 1962 ]; > % Results certified by NIST using 100 digit arithmetic > % b and standard error in b > V =3D [ -3482258.63459582 890420.383607373 > 15.0618722713733 84.9149257747669 > -0.358191792925910E-01 0.334910077722432E-01 > -2.02022980381683 0.488399681651699 > -1.03322686717359 0.214274163161675 > -0.511041056535807E-01 0.226073200069370 > 1829.15146461355 455.478499142212 ]; > Rsq =3D 0.995479004577296; > F =3D 330.285339234588; > y =3D Z(:,1); X =3D [ones(rows(Z),1), Z(:,2:end)]; > [b, bint, r, rint, stats] =3D regress (y, X, 0.05); > assert(b, V(:,1), 6e-7) > assert(stats(1), Rsq, 7e-14) > assert(stats(2), F, 5e-9) > > Yes, you can convert bint to the standard error values with (0.05 is alpha, 9 is the degree of freedom): ((bint(:, 1) - bint(:, 2)) / 2) / tinv (0.05 / 2, 9) but the accuracy will be affected. (It's not so accurate anyway...) -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-03-08 01:51 Attachments: regress.m ```On 2/21/06, William Poetra Yoga Hadisoeseno wrote= : > On 2/20/06, Paul Kienzle wrote: > > > > Okay, I added a test of the Longley data set from the NIST Statistical > > Reference Database. > > > > The performance is pretty bad (norm of answer-certified value is about > > 10^-7), but that's a bit better than the competition. > > > > Someone with more R experience than me will have to see if it performs > > any better. > > > > Also, if you can convert bint to standard error values that would be > > convenient. > > > > - Paul > > > > ---- > > > > % Longley data from the NIST Statistical Reference Dataset > > Z =3D [ 60323 83.0 234289 2356 1590 107608 1947 > > 61122 88.5 259426 2325 1456 108632 1948 > > 60171 88.2 258054 3682 1616 109773 1949 > > 61187 89.5 284599 3351 1650 110929 1950 > > 63221 96.2 328975 2099 3099 112075 1951 > > 63639 98.1 346999 1932 3594 113270 1952 > > 64989 99.0 365385 1870 3547 115094 1953 > > 63761 100.0 363112 3578 3350 116219 1954 > > 66019 101.2 397469 2904 3048 117388 1955 > > 67857 104.6 419180 2822 2857 118734 1956 > > 68169 108.4 442769 2936 2798 120445 1957 > > 66513 110.8 444546 4681 2637 121950 1958 > > 68655 112.6 482704 3813 2552 123366 1959 > > 69564 114.2 502601 3931 2514 125368 1960 > > 69331 115.7 518173 4806 2572 127852 1961 > > 70551 116.9 554894 4007 2827 130081 1962 ]; > > % Results certified by NIST using 100 digit arithmetic > > % b and standard error in b > > V =3D [ -3482258.63459582 890420.383607373 > > 15.0618722713733 84.9149257747669 > > -0.358191792925910E-01 0.334910077722432E-01 > > -2.02022980381683 0.488399681651699 > > -1.03322686717359 0.214274163161675 > > -0.511041056535807E-01 0.226073200069370 > > 1829.15146461355 455.478499142212 ]; > > Rsq =3D 0.995479004577296; > > F =3D 330.285339234588; > > y =3D Z(:,1); X =3D [ones(rows(Z),1), Z(:,2:end)]; > > [b, bint, r, rint, stats] =3D regress (y, X, 0.05); > > assert(b, V(:,1), 6e-7) > > assert(stats(1), Rsq, 7e-14) > > assert(stats(2), F, 5e-9) > > > > > > Yes, you can convert bint to the standard error values with (0.05 is > alpha, 9 is the degree of freedom): > ((bint(:, 1) - bint(:, 2)) / 2) / tinv (0.05 / 2, 9) > but the accuracy will be affected. (It's not so accurate anyway...) > I still don't know how to improve the accuracy... I've added an assertion for the standard error values, but I've got to set the tolerance to 7.6e-3 (that's not good). Also, I think one decimal point for the tolerance would be good, what do you think? By the way, in the octave-forge version (in the repository), the functions tinv and fcdf are replaced by t_inv and f_cdf. Why? -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2006-03-08 12:37 ```On Mar 7, 2006, at 8:51 PM, William Poetra Yoga Hadisoeseno wrote: > On 2/21/06, William Poetra Yoga Hadisoeseno > wrote: >> On 2/20/06, Paul Kienzle wrote: >>> >>> Okay, I added a test of the Longley data set from the NIST >>> Statistical >>> Reference Database. >>> >>> The performance is pretty bad (norm of answer-certified value is >>> about >>> 10^-7), but that's a bit better than the competition. >>> >>> Someone with more R experience than me will have to see if it >>> performs >>> any better. >>> >>> Also, if you can convert bint to standard error values that would be >>> convenient. >>> >>> - Paul >>> >>> ---- >>> >>> % Longley data from the NIST Statistical Reference Dataset >>> Z = [ 60323 83.0 234289 2356 1590 107608 1947 >>> 61122 88.5 259426 2325 1456 108632 1948 >>> 60171 88.2 258054 3682 1616 109773 1949 >>> 61187 89.5 284599 3351 1650 110929 1950 >>> 63221 96.2 328975 2099 3099 112075 1951 >>> 63639 98.1 346999 1932 3594 113270 1952 >>> 64989 99.0 365385 1870 3547 115094 1953 >>> 63761 100.0 363112 3578 3350 116219 1954 >>> 66019 101.2 397469 2904 3048 117388 1955 >>> 67857 104.6 419180 2822 2857 118734 1956 >>> 68169 108.4 442769 2936 2798 120445 1957 >>> 66513 110.8 444546 4681 2637 121950 1958 >>> 68655 112.6 482704 3813 2552 123366 1959 >>> 69564 114.2 502601 3931 2514 125368 1960 >>> 69331 115.7 518173 4806 2572 127852 1961 >>> 70551 116.9 554894 4007 2827 130081 1962 ]; >>> % Results certified by NIST using 100 digit arithmetic >>> % b and standard error in b >>> V = [ -3482258.63459582 890420.383607373 >>> 15.0618722713733 84.9149257747669 >>> -0.358191792925910E-01 0.334910077722432E-01 >>> -2.02022980381683 0.488399681651699 >>> -1.03322686717359 0.214274163161675 >>> -0.511041056535807E-01 0.226073200069370 >>> 1829.15146461355 455.478499142212 ]; >>> Rsq = 0.995479004577296; >>> F = 330.285339234588; >>> y = Z(:,1); X = [ones(rows(Z),1), Z(:,2:end)]; >>> [b, bint, r, rint, stats] = regress (y, X, 0.05); >>> assert(b, V(:,1), 6e-7) >>> assert(stats(1), Rsq, 7e-14) >>> assert(stats(2), F, 5e-9) >>> >>> >> >> Yes, you can convert bint to the standard error values with (0.05 is >> alpha, 9 is the degree of freedom): >> ((bint(:, 1) - bint(:, 2)) / 2) / tinv (0.05 / 2, 9) >> but the accuracy will be affected. (It's not so accurate anyway...) >> > > I still don't know how to improve the accuracy... > I've added an assertion for the standard error values, but I've got to > set the tolerance to 7.6e-3 (that's not good). Also, I think one > decimal point for the tolerance would be good, what do you think? I generally use only one digit for the tolerance. > By the way, in the octave-forge version (in the repository), the > functions tinv and fcdf are replaced by t_inv and f_cdf. Why? Because I tinv and fcdf were not available in my version of Octave. These can change once we abandon support for 2.1.x - Paul ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-03-09 06:49 ```On 3/8/06, Paul Kienzle wrote: > > I generally use only one digit for the tolerance. > OK, then use 1 digit :) The standard error values differ by as much as 10e-3 (as can be seen from the assert), and it doesn't improve if I embed the test in the main part of the script (that is, just using sqrt(v*c) as the standard error values). > Because I tinv and fcdf were not available in my version of Octave. > These can change once we abandon support for 2.1.x > > OK, looking forward to it :) -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2006-03-09 09:41 ```On Mar 9, 2006, at 1:49 AM, William Poetra Yoga Hadisoeseno wrote: > On 3/8/06, Paul Kienzle wrote: >> >> I generally use only one digit for the tolerance. >> > > OK, then use 1 digit :) The standard error values differ by as much as > 10e-3 (as can be seen from the assert), and it doesn't improve if I > embed the test in the main part of the script (that is, just using > sqrt(v*c) as the standard error values). Yet again I had to loosen the criteria on the tests in order to pass the tests on OS X PPC architecture. The dataset is known to be hard, so I'm not surprised by the weak tolerances. I'm curious if e.g., R can compute better results. - Paul ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: William Poetra Yoga Hadisoeseno - 2006-03-09 11:41 ```On 3/9/06, Paul Kienzle wrote: > > Yet again I had to loosen the criteria on the tests in order to pass > the tests on OS X PPC architecture. > > The dataset is known to be hard, so I'm not surprised by the weak > tolerances. I'm curious if e.g., R can compute better results. > Well, OK, so can we say regress.m is quite finished by now? And maybe I can start on rcoplot. I didn't study the code on the rcoplot manual, so is it legal for me to write that function anyway? (without referring to the code). -- William Poetra Yoga Hadisoeseno ```

 Re: [OctDev] regress and rcoplot (partially written, please help) From: Paul Kienzle - 2006-03-10 01:37 ```It is best if you can find an independent explanation of the algorithm which you can cite in the documentation. It makes the documentation better, and it makes the argument that it is an independent explanation easier to support. E.g., http://www.itl.nist.gov/div898/handbook/pri/section2/pri245.htm - Paul On Mar 9, 2006, at 6:41 AM, William Poetra Yoga Hadisoeseno wrote: > On 3/9/06, Paul Kienzle wrote: >> >> Yet again I had to loosen the criteria on the tests in order to pass >> the tests on OS X PPC architecture. >> >> The dataset is known to be hard, so I'm not surprised by the weak >> tolerances. I'm curious if e.g., R can compute better results. >> > > Well, OK, so can we say regress.m is quite finished by now? And maybe > I can start on rcoplot. I didn't study the code on the rcoplot manual, > so is it legal for me to write that function anyway? (without > referring to the code). > > -- > William Poetra Yoga Hadisoeseno > ```