From: Anders P. <and...@op...> - 2004-04-22 07:55:26
On 2004-04-21, at 18.21, Inger, Matthew wrote:
> Actually 2 things
> 1. I've noticed that the .solve method returns slightly different
> results than our algorithm
I tried this yesterday, and there were a few bugs in the BigMatrix
implementation of the QR decomposition - I fixed those yesterday.
Please try this again with the latest version from CVS.
I'm surprised you only got a slightly different solution ;-)
> 2. Using "solve" instead produces no speed gain, as I'm caching the
> majority of the matrix computations:
Since you're caching almost everything, and in the end only work with
2-by-2 matrices, I assume you may be close to optimum performance. In
general, with larger matrices, your way is not the better one!
> This is the formula we're using:
>
> B = (Xt * X)i * (Xt * Y)
This is one way of solving X*B = Y (I prefer to write A*X=B) when there
are more equations than variables. It gives you the least squares
solution.
Another way, almost the same, is:
X = A.multiplyLeft(A.transpose()).solve(B.multiplyLeft(A.transpose()))
Internally the solve method delegates to the matrix' LU decomposition.
And a third way, that I recommend, is:
X = A.solve(B)
Internally the solve method now delegates to the matrix' QR
decomposition. This will give you the least squares solution directly,
and in most cases will be much faster than what you are doing!
(Provided that you are reusing A).
If/when you tried this you did not get correct answers because the code
had bugs.
You can even try this with JamaMatrix....
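For the simple-regression case discussed in this thread (two unknowns, so Xt * X is 2-by-2), the normal-equations formula B = (Xt * X)i * (Xt * Y) can be sketched in plain Java with doubles. This is an illustration of the math only - it uses no ojAlgo or Jama types, and the class and method names are my own:

```java
// Least-squares fit of y = b0 + b1*x via the normal equations,
// B = (Xt*X)^-1 * (Xt*Y), specialised to the 2-unknown case.
// Plain doubles for illustration; the thread uses BigMatrix/JamaMatrix.
public class NormalEquations {

    /** Returns {b0, b1} for the best-fit line through (x[i], y[i]). */
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        // For a design matrix A with columns [1, x], At*A is 2-by-2:
        // [ n       sum(x)   ]
        // [ sum(x)  sum(x*x) ]
        double sx = 0, sxx = 0, sy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sxx += x[i] * x[i];
            sy += y[i];
            sxy += x[i] * y[i];
        }
        // Invert the 2-by-2 and multiply by At*B = [sum(y), sum(x*y)].
        double det = n * sxx - sx * sx;
        double b0 = (sxx * sy - sx * sxy) / det;
        double b1 = (n * sxy - sx * sy) / det;
        return new double[] { b0, b1 };
    }

    public static void main(String[] args) {
        // Points on y = 2 + 3x are recovered exactly.
        double[] x = { 1, 2, 3, 4 };
        double[] y = { 5, 8, 11, 14 };
        double[] b = fit(x, y);
        System.out.println("b0=" + b[0] + " b1=" + b[1]); // b0=2.0 b1=3.0
    }
}
```

As Anders notes above, forming Xt * X explicitly squares the condition number, which is why the QR-based A.solve(B) is preferable for larger systems.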
> (t is transpose and i is inverse).
>
> Since the X value is the same, when we construct the
> cached X matrix, we also construct Xt and (Xt * X)i
> and cache those as well.
>
> Then i only have to do two matrix calculations, one for
> the (Xt * Y) and then again to multiply that by the inverse
> of the square I stored earlier.
>
> Most, if not all values are non-zero.
Nothing to do then...
> At this point, i'm just chalking it up to BigDecimal construction
> being time consuming (it takes ~ 25% of the total execution time), and
> then the "add" method on BigDecimal takes another 25% of the time.
/Anders
> -----Original Message-----
> From: Anders Peterson [mailto:and...@op...]
> Sent: Wednesday, April 21, 2004 11:31 AM
> To: Inger, Matthew
> Cc: 'oja...@li...'
> Subject: Re: [ojAlgo-user] Questions
>
>
> There are a couple of things you can try...
>
> 1) Basically what you want to do is solve a linear equation system with
> far more rows than columns: Ax=B
>
> Is your current strategy to calculate A.transpose() and multiply
> both sides of the equation with it?
>
> There is another way. Simply call A.solve(B)
>
> Not sure I tested this, but it should produce the same result! Provided
> you use the same A for each of the 10000 iterations the heavy
> "processing" is cached for you (A internally stores its QR
> decomposition).
>
> 2) Perhaps you can reduce the number of BigDecimal calls. When you're
> doing matrix multiplication; do the matrices involved have a lot of
> zeros? If they do, then implementing multiplication code in one or more
> of the MatrixStore subclasses may speed things up.
>
>
> Have to leave the office now - live in a different time zone,
> /Anders
>
> On 2004-04-21, at 17.06, Inger, Matthew wrote:
>
>> Yes, that kind of pattern for factories is very commonly
>> used. You could still have the implementation you supplied
>> which is a generic Reflection based strategy, but when you're
>> dealing with thousands, or millions of records, every little
>> bit counts.
>>
>> 1. I'm processing 10,000 records
>> 2. The X values are the same for every iteration
>>
>> Using this, i created a Solver class which caches the X matrix,
>> along with its transpose, and a couple of other computed matrices
>> (such as Xt * X) which do not involve the Y values. And i'm also
>> caching the Unit matrix.
>>
>> So each iteration through the loop, only the Y values have to be set,
>> and i've minimized the number of matrix operations required to compute
>> the coefficients.
>>
>> After running this, i trimmed run time from 8s to 5s, and noticed that
>> 75% of the time is spent in the BigDecimal class, with about 1/3 of
>> that spent in the constructor, and 1/3 in the "timesTenToThe" method.
>> So there's really nothing further i can do to speed it up.
>>
>>
>> -----Original Message-----
>> From: Anders Peterson [mailto:and...@op...]
>> Sent: Wednesday, April 21, 2004 11:01 AM
>> To: Anders Peterson
>> Cc: Inger, Matthew; 'oja...@li...'
>> Subject: Re: [ojAlgo-user] Questions
>>
>>
>> This is done! It's in CVS.
>>
>> Please let me know if it is faster.
>>
>> /Anders
>>
>> On 2004-04-21, at 11.41, Anders Peterson wrote:
>>
>>> In my previous mail it should read "It will speed things up..." ;-)
>>>
>>> On 2004-04-20, at 14.57, Inger, Matthew wrote:
>>>
>>>> The basic slowdown is that the constructors are not cached.
>>>> Finding the constructor each time is what makes that factory
>>>> slow.
>>>
>>> I understand! Do you know of any reason why MatrixFactory should not
>>> cache the constructors?
>>>
>>>> If MatrixFactory were an interface, one could just call
>>>> the appropriate constructors directly without reflection.
>>>>
>>>> public interface MatrixFactory {
>>>>     public BasicMatrix create(double values[][]);
>>>> }
>>>>
>>>> public class BigMatrixFactory implements MatrixFactory {
>>>>     public BasicMatrix create(double values[][]) {
>>>>         return new BigMatrix(values);
>>>>     }
>>>> }
>>>>
>>>> You could still have the original version, but it would
>>>> be named something different, and implement this new interface.
>>>
>>> Ok, I see what you mean. Is that a commonly used design pattern for
>>> factories?
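A middle ground between the interface proposal above and the existing reflective MatrixFactory is to keep reflection but cache the Constructor lookup, so getConstructor() runs once rather than once per matrix. A sketch with stand-in class names (these are not the actual ojAlgo types):

```java
import java.lang.reflect.Constructor;

// Sketch of a reflective factory that caches the Constructor object.
// "Matrix" and "ArrayMatrix" are stand-ins for BasicMatrix and its
// implementations, so the example is self-contained.
public class CachedFactory {
    public interface Matrix { int rows(); }

    public static class ArrayMatrix implements Matrix {
        private final double[][] values;
        public ArrayMatrix(double[][] values) { this.values = values; }
        public int rows() { return values.length; }
    }

    private final Constructor<? extends Matrix> ctor; // looked up once

    public CachedFactory(Class<? extends Matrix> type) throws NoSuchMethodException {
        this.ctor = type.getConstructor(double[][].class);
    }

    public Matrix create(double[][] values) throws Exception {
        return ctor.newInstance((Object) values); // no lookup per call
    }

    public static void main(String[] args) throws Exception {
        CachedFactory factory = new CachedFactory(ArrayMatrix.class);
        Matrix m = factory.create(new double[][] { { 1, 2 }, { 3, 4 } });
        System.out.println(m.rows()); // 2
    }
}
```

This keeps the one-line-to-switch-implementations property of the reflective factory while removing the per-call lookup that dominated the 2s-vs-8s difference reported earlier in the thread.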
>>>
>>>> -----Original Message-----
>>>> From: Anders Peterson [mailto:and...@op...]
>>>> Sent: Tuesday, April 20, 2004 4:40 AM
>>>> To: Inger, Matthew
>>>> Cc: 'oja...@li...'
>>>> Subject: Re: [ojAlgo-user] Questions
>>>>
>>>>
>>>> On 2004-04-19, at 22.39, Inger, Matthew wrote:
>>>>
>>>>> FYI: Using MatrixFactory is also extremely slow.
>>>>> If i directly instantiate JamaMatrix for 100,000
>>>>> regressions, it takes ~ 2 seconds or so. If i use
>>>>> MatrixFactory to instantiate, it goes up to 8 seconds.
>>>>
>>>> MatrixFactory uses reflection to call the appropriate constructor.
>>>> Using it centralises matrix creation and allows you to switch
>>>> implementation by changing one line of code - the call to the
>>>> MatrixFactory constructor. (Make sure you re-use the factory
>>>> instances.)
>>>>
>>>> This will naturally take longer than calling the constructor
>>>> directly,
>>>> but I am surprised it is that much slower.
>>>>
>>>> If this is a problem; call the constructors directly. Perhaps you
>>>> should implement your own static factory method.
>>>>
>>>>> I'm a little curious why MatrixFactory is not an interface,
>>>>> with implementations per different type of Matrix.
>>>>
>>>> Interfaces can't specify constructors.
>>>>
>>>> /Anders
>>>>
>>>>> -----Original Message-----
>>>>> From: Inger, Matthew
>>>>> Sent: Monday, April 19, 2004 3:04 PM
>>>>> To: 'Anders Peterson'; Inger, Matthew
>>>>> Subject: RE: [ojAlgo-user] Questions
>>>>>
>>>>>
>>>>> There are a small number of variables. In most cases, we
>>>>> have a single variable equation, and up to 24 points with
>>>>> which to fit the linear regression.
>>>>>
>>>>> Then, we do that 10,000 times.
>>>>>
>>>>> Basically, each row in the database contains time elapsed
>>>>> data (typically one value per month). We use the # of the
>>>>> month as the x values, and the data as the y value, and try
>>>>> to fit a line to it to determine short term trends.
>>>>>
>>>>> So for each row, we calculate 1 set of coefficients (b0 and b1,
>>>>> aka intercept & slope), and make 1 more prediction.
>>>>>
>>>>> However, there are potentially millions of rows in the database.
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Anders Peterson [mailto:and...@op...]
>>>>> Sent: Monday, April 19, 2004 2:32 PM
>>>>> To: Inger, Matthew
>>>>> Subject: Re: [ojAlgo-user] Questions
>>>>>
>>>>>
>>>>> This is just a naive first attempt...
>>>>>
>>>>> On 2004-04-19, at 20.06, Inger, Matthew wrote:
>>>>>
>>>>>> PS: As far as solving the equations, i basically have two
>>>>>> matrices. X and Y. Any kind of code snippet would help here.
>>>>>> Keep in mind, equations are of the form
>>>>>>
>>>>>> y = b0 + b1*X
>>>>>
>>>>> 1*b0 + x*b1 = y
>>>>>
>>>>> A*X = B
>>>>>
>>>>> where
>>>>>
>>>>> A:
>>>>> 1 x1
>>>>> 1 x2
>>>>> 1 x3
>>>>> ... ...
>>>>>
>>>>> B:
>>>>> y1
>>>>> y2
>>>>> y3
>>>>> ...
>>>>>
>>>>> X:
>>>>> b0
>>>>> b1
>>>>>
>>>>> X = A.solve(B);
>>>>>
>>>>> Are you doing this 10 000 times with a relatively small number of
>>>>> x- and y-values or do you have 10 000 x- and y-values for each
>>>>> calculation?
>>>>>
>>>>> /Anders
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Anders Peterson [mailto:and...@op...]
>>>>>> Sent: Monday, April 19, 2004 2:02 PM
>>>>>> To: Inger, Matthew
>>>>>> Cc: 'oja...@li...'
>>>>>> Subject: Re: [ojAlgo-user] Questions
>>>>>>
>>>>>>
>>>>>> On 2004-04-19, at 18.33, Inger, Matthew wrote:
>>>>>>
>>>>>>> 1. Have any performance tests been run with BigMatrix?
>>>>>>
>>>>>> Not really...
>>>>>>
>>>>>>> I'm using BigMatrix to do a LinearRegression algorithm. I had
>>>>>>> previously been using Jama directly, but our FD department noticed
>>>>>>> a difference between the results given by java and the LINEST
>>>>>>> function in Excel. I have determined that the difference is
>>>>>>> DEFINITELY due to rounding limitations in the java double
>>>>>>> primitive type.
>>>>>>>
>>>>>>> These issues go away when i use the ojAlgo library,
>>>>>>
>>>>>> That's great news to me - that is the reason I created ojAlgo!
>>>>>>
>>>>>>> however, the time required to do the calculations increases to
>>>>>>> about 12-13x what it was using regular double calculations.
>>>>>>>
>>>>>>> Doing 10,000 linear regressions with Jama took about .781s
>>>>>>> (according to junit), and with BigMatrix, took 10.345s
>>>>>>
>>>>>> I wasn't aware the difference was that big. ;-)
>>>>>>
>>>>>>> Any idea why such a huge disparity?
>>>>>>
>>>>>> Try altering the scale. A larger scale means better results, but it
>>>>>> takes longer. You can change a matrix' scale whenever you want -
>>>>>> the new scale will be used from then on. Elements are not rounded
>>>>>> unless you call enforceScale().
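The scale/rounding behaviour described here tracks how plain java.math.BigDecimal works, which BigMatrix presumably builds on: nothing is rounded until a scale is enforced, and division must be told where to stop. A minimal stdlib illustration (not ojAlgo code):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// How scale behaves on plain BigDecimal: rounding happens only when
// explicitly requested, much like the enforceScale() described above.
public class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.23456789");
        // Nothing is rounded until a scale is enforced:
        System.out.println(a.setScale(4, RoundingMode.HALF_UP)); // 1.2346

        // Division is where scale matters most - a non-terminating result
        // must be told where to stop (here, 24 decimal digits):
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal(3), 24, RoundingMode.HALF_UP);
        System.out.println(third.scale()); // 24
    }
}
```

A larger scale therefore trades longer digit strings (and slower arithmetic) for smaller rounding error, which is exactly the speed/accuracy trade-off discussed in this thread.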
>>>>>>
>>>>>> Internally BigMatrix uses various decompositions for some of the
>>>>>> more complex calculations. These decompositions inherit a scale
>>>>>> from the parent matrix. It is however not the same - it's bigger.
>>>>>> I believe the formula is 2 * (3 + max(9, matrixScale)) and this
>>>>>> formula is evaluated when the decomposition is created. After the
>>>>>> decomposition is created its scale won't change even if you change
>>>>>> the matrix' scale. The scale of a decomposition is never smaller
>>>>>> than 24 (according to the above formula).
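The scale rule described above can be written out directly. This is reconstructed from the description in this mail, not copied from the ojAlgo source:

```java
// Decomposition scale as described: 2 * (3 + max(9, matrixScale)),
// evaluated once when the decomposition is created.
public class DecompositionScale {
    public static int decompositionScale(int matrixScale) {
        return 2 * (3 + Math.max(9, matrixScale));
    }

    public static void main(String[] args) {
        // Any matrix scale up to 9 maps to 2 * (3 + 9) = 24, the minimum.
        System.out.println(decompositionScale(0));  // 24
        System.out.println(decompositionScale(9));  // 24
        System.out.println(decompositionScale(12)); // 30
    }
}
```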
>>>>>>
>>>>>> I'm not happy with this, and have been meaning to change it.
>>>>>> Decompositions should have the same scale as their parent matrix,
>>>>>> and when it is changed, be updated "everywhere".
>>>>>>
>>>>>> Would you prefer that?
>>>>>>
>>>>>>> 2. Is there a way to use the solvers in ojAlgo to do the linear
>>>>>>> regression? Or is that not something that is provided?
>>>>>>
>>>>>> How are you currently doing it?
>>>>>>
>>>>>> I imagine the best way would be to simply build an over-determined
>>>>>> set of linear equations - Ax=B - and call A.solve(B). Internally
>>>>>> BigMatrix would then use the QR decomposition to give a least
>>>>>> squares estimation of x. That would be my first attempt.
>>>>>>
>>>>>> You could use the QP solver to do the same thing, but I don't see
>>>>>> how that would simplify or improve anything.
>>>>>>
>>>>>> What version of ojAlgo are you using - v1, v2 or CVS?
>>>>>>
>>>>>> I recommend working from CVS. Unfortunately I have moved things
>>>>>> around a bit between versions...
>>>>>>
>>>>>> I don't have access to a profiling tool, but I know they can be
>>>>>> very useful in finding bottlenecks. If you can identify a problem,
>>>>>> and/or suggest a specific improvement, I'll be happy to help you
>>>>>> implement it.
>>>>>>
>>>>>> /Anders
>>>>>>
>>>>>>> ----------------------
>>>>>>> Matthew Inger
>>>>>>> Design Architect
>>>>>>> Synygy, Inc
>>>>>>> 610-664-7433 x7770
>>>>>>>
>>>>>>> -------------------------------------------------------
>>>>>>> This SF.Net email is sponsored by: IBM Linux Tutorials
>>>>>>> Free Linux tutorial presented by Daniel Robbins, President and CEO
>>>>>>> of GenToo technologies. Learn everything from fundamentals to
>>>>>>> system administration.
>>>>>>> http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
>>>>>>> _______________________________________________
>>>>>>> ojAlgo-user mailing list
>>>>>>> ojA...@li...
>>>>>>> https://lists.sourceforge.net/lists/listinfo/ojalgo-user
>>>>