From: Anders P. <and...@op...> - 2004-04-21 15:31:05
There are a couple of things you can try...

1) Basically what you want to do is solve a linear equation system with far more rows than columns: Ax=B. Is your current strategy to calculate A.transpose() and multiply both sides of the equation by it? There is another way. Simply call A.solve(B). Not sure I tested this, but it should produce the same result! Provided you use the same A for each of the 10000 iterations, the heavy "processing" is cached for you (A internally stores its QR decomposition).

2) Perhaps you can reduce the number of BigDecimal calls. When you're doing matrix multiplication, do the matrices involved have a lot of zeros? If they do, then implementing multiplication code in one or more of the MatrixStore subclasses may speed things up.

Have to leave the office now - live in a different time zone,

/Anders

On 2004-04-21, at 17.06, Inger, Matthew wrote:

> Yes, that kind of pattern for factories is very commonly
> used. You could still have the implementation you supplied,
> which is a generic reflection-based strategy, but when you're
> dealing with thousands, or millions, of records, every little
> bit counts.
>
> 1. I'm processing 10,000 records
> 2. The X values are the same for every iteration
>
> Using this, I created a Solver class which caches the X matrix,
> along with its transpose, and a couple of other computed matrices
> (such as Xt * X) which do not involve the Y values. And I'm also
> caching the unit matrix.
>
> So each iteration through the loop, only the Y values have to be set,
> and I've minimized the number of matrix operations required to compute
> the coefficients.
>
> After running this, I trimmed run time from 8s to 5s, and noticed that
> 75% of the time is spent in the BigDecimal class, with about 1/3 of that
> spent in the constructor, and 1/3 in the "timesTenToThe" method. So
> there's really nothing further I can do to speed it up.
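For context, the regression being timed here can be written as a small self-contained sketch. This uses the closed-form least-squares solution with plain doubles rather than ojAlgo's BigMatrix, and caches the X-only sums across iterations the same way Matthew's Solver caches the X-derived matrices. The class name LinearFit is mine, not part of either library.

```java
// Closed-form least-squares fit for y = b0 + b1*x, mirroring the
// caching pattern described above: everything that depends only on
// the X values is computed once and reused for every row.
public class LinearFit {

    private final double[] x;
    private final int n;
    private final double sumX, sumXX;   // cached: depend only on X

    public LinearFit(double[] x) {
        this.x = x;
        this.n = x.length;
        double sx = 0.0, sxx = 0.0;
        for (double xi : x) {
            sx += xi;
            sxx += xi * xi;
        }
        this.sumX = sx;
        this.sumXX = sxx;
    }

    /** Returns {b0, b1} minimising the sum of squared residuals. */
    public double[] fit(double[] y) {
        double sumY = 0.0, sumXY = 0.0;
        for (int i = 0; i < n; i++) {
            sumY += y[i];
            sumXY += x[i] * y[i];
        }
        double b1 = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double b0 = (sumY - b1 * sumX) / n;
        return new double[] { b0, b1 };
    }

    public static void main(String[] args) {
        // months 1..4, values on the line y = 1 + 2x
        LinearFit fit = new LinearFit(new double[] { 1, 2, 3, 4 });
        double[] b = fit.fit(new double[] { 3, 5, 7, 9 });
        System.out.println("b0=" + b[0] + " b1=" + b[1]);
    }
}
```

Only the Y-dependent sums are recomputed per row, which is the same idea as caching Xt * X and calling A.solve(B) with a stored QR decomposition.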
>
> -----Original Message-----
> From: Anders Peterson [mailto:and...@op...]
> Sent: Wednesday, April 21, 2004 11:01 AM
> To: Anders Peterson
> Cc: Inger, Matthew; 'oja...@li...'
> Subject: Re: [ojAlgo-user] Questions
>
>
> This is done! It's in CVS.
>
> Please let me know if it is faster.
>
> /Anders
>
> On 2004-04-21, at 11.41, Anders Peterson wrote:
>
>> In my previous mail it should read "It will speed things up..." ;-)
>>
>> On 2004-04-20, at 14.57, Inger, Matthew wrote:
>>
>>> The basic slowdown is that the constructors are not cached.
>>> Finding the constructor each time is what makes that factory
>>> slow.
>>
>> I understand! Do you know of any reason why MatrixFactory should not
>> cache the constructors?
>>
>>> If MatrixFactory was an interface, one could just call
>>> the appropriate constructors directly without reflection.
>>>
>>> public interface MatrixFactory {
>>>     public BasicMatrix create(double[][] values);
>>> }
>>>
>>> public class BigMatrixFactory implements MatrixFactory {
>>>     public BasicMatrix create(double[][] values) {
>>>         return new BigMatrix(values);
>>>     }
>>> }
>>>
>>> You could still have the original version, but it would
>>> be named something different, and implement this new interface.
>>
>> OK, I see what you mean. Is that a commonly used design pattern for
>> factories?
>>
>>> -----Original Message-----
>>> From: Anders Peterson [mailto:and...@op...]
>>> Sent: Tuesday, April 20, 2004 4:40 AM
>>> To: Inger, Matthew
>>> Cc: 'oja...@li...'
>>> Subject: Re: [ojAlgo-user] Questions
>>>
>>>
>>> On 2004-04-19, at 22.39, Inger, Matthew wrote:
>>>
>>>> FYI: Using MatrixFactory is also extremely slow.
>>>> If I directly instantiate JamaMatrix for 100,000
>>>> regressions, it takes ~2 seconds or so. If I use
>>>> MatrixFactory to instantiate, it goes up to 8 seconds.
>>>
>>> MatrixFactory uses reflection to call the appropriate constructor.
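The constructor caching Matthew asks about could look roughly like this. It is a sketch, not the actual ojAlgo MatrixFactory: the reflective Constructor lookup is done once per class and stored in a map, so repeated create() calls skip the expensive search. DemoMatrix is a hypothetical stand-in for a matrix class with a double[][] constructor.

```java
import java.lang.reflect.Constructor;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a reflection-based factory that caches constructors.
// The getConstructor() search runs once per class; after that,
// create() only pays for newInstance().
public class CachingFactory {

    private static final Map<Class<?>, Constructor<?>> CACHE =
            new ConcurrentHashMap<>();

    public static <T> T create(Class<T> type, double[][] values) {
        try {
            Constructor<?> ctor = CACHE.get(type);
            if (ctor == null) {
                // expensive reflective lookup - done once per class
                ctor = type.getConstructor(double[][].class);
                CACHE.put(type, ctor);
            }
            return type.cast(ctor.newInstance((Object) values));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "No public double[][] constructor on " + type, e);
        }
    }

    /** Hypothetical stand-in for a matrix class (not ojAlgo's). */
    public static class DemoMatrix {
        final double[][] data;
        public DemoMatrix(double[][] data) { this.data = data; }
    }

    public static void main(String[] args) {
        DemoMatrix m = create(DemoMatrix.class, new double[][] { { 1, 2 }, { 3, 4 } });
        System.out.println(m.data[1][1]);
    }
}
```

This keeps the one-line-to-switch-implementations property of a reflective factory while removing the per-call lookup cost.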
>>> Using it centralises matrix creation and allows you to switch
>>> implementation by changing one line of code - the call to the
>>> MatrixFactory constructor. (Make sure you re-use the factory
>>> instances.)
>>>
>>> This will naturally take longer than calling the constructor directly,
>>> but I am surprised it is that much slower.
>>>
>>> If this is a problem, call the constructors directly. Perhaps you
>>> should implement your own static factory method.
>>>
>>>> I'm a little curious why MatrixFactory is not an interface,
>>>> with implementations per different type of Matrix.
>>>
>>> Interfaces can't specify constructors.
>>>
>>> /Anders
>>>
>>>> -----Original Message-----
>>>> From: Inger, Matthew
>>>> Sent: Monday, April 19, 2004 3:04 PM
>>>> To: 'Anders Peterson'; Inger, Matthew
>>>> Subject: RE: [ojAlgo-user] Questions
>>>>
>>>>
>>>> There are a small number of variables. In most cases, we
>>>> have a single-variable equation, and up to 24 points with
>>>> which to fit the linear regression.
>>>>
>>>> Then, we do that 10,000 times.
>>>>
>>>> Basically, each row in the database contains time-elapsed
>>>> data (typically one value per month). We use the # of the
>>>> month as the x values, and the data as the y value, and try
>>>> to fit a line to it to determine short-term trends.
>>>>
>>>> So for each row, we calculate 1 set of coefficients (b0 and b1,
>>>> aka intercept & slope), and make 1 or more predictions.
>>>>
>>>> However, there are potentially millions of rows in the database.
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Anders Peterson [mailto:and...@op...]
>>>> Sent: Monday, April 19, 2004 2:32 PM
>>>> To: Inger, Matthew
>>>> Subject: Re: [ojAlgo-user] Questions
>>>>
>>>>
>>>> This is just a naive first attempt...
>>>>
>>>> On 2004-04-19, at 20.06, Inger, Matthew wrote:
>>>>
>>>>> PS: As far as solving the equations, I basically have two
>>>>> matrices, X and Y. Any kind of code snippet would help here.
>>>>> Keep in mind, equations are of the form
>>>>>
>>>>> y = b0 + b1*x
>>>>
>>>> 1*b0 + x*b1 = y
>>>>
>>>> A*X = B
>>>>
>>>> where
>>>>
>>>> A:
>>>> 1 x1
>>>> 1 x2
>>>> 1 x3
>>>> ... ...
>>>>
>>>> B:
>>>> y1
>>>> y2
>>>> y3
>>>> ...
>>>>
>>>> X:
>>>> b0
>>>> b1
>>>>
>>>> X = A.solve(B);
>>>>
>>>> Are you doing this 10 000 times with a relatively small number of x-
>>>> and y-values, or do you have 10 000 x- and y-values for each
>>>> calculation?
>>>>
>>>> /Anders
>>>>
>>>>> -----Original Message-----
>>>>> From: Anders Peterson [mailto:and...@op...]
>>>>> Sent: Monday, April 19, 2004 2:02 PM
>>>>> To: Inger, Matthew
>>>>> Cc: 'oja...@li...'
>>>>> Subject: Re: [ojAlgo-user] Questions
>>>>>
>>>>>
>>>>> On 2004-04-19, at 18.33, Inger, Matthew wrote:
>>>>>
>>>>>> 1. Have any performance tests been run with BigMatrix?
>>>>>
>>>>> Not really...
>>>>>
>>>>>> I'm using BigMatrix to do a LinearRegression algorithm. I had
>>>>>> previously been using Jama directly, but our FD department
>>>>>> noticed a difference between the results given by Java and the
>>>>>> LINEST function in Excel. I have determined that the difference
>>>>>> is DEFINITELY due to rounding limitations in the Java double
>>>>>> primitive type.
>>>>>>
>>>>>> These issues go away when I use the ojAlgo library,
>>>>>
>>>>> That's great news to me - that is the reason I created ojAlgo!
>>>>>
>>>>>> however, the time required to do the calculations increases to
>>>>>> about 12-13x what it was using regular double calculations.
>>>>>>
>>>>>> Doing 10,000 linear regressions with Jama took about .781s
>>>>>> (according to JUnit), and with BigMatrix, took 10.345s.
>>>>>
>>>>> I wasn't aware the difference was that big. ;-)
>>>>>
>>>>>> Any idea why such a huge disparity?
>>>>>
>>>>> Try altering the scale. A larger scale means better results, but it
>>>>> takes longer.
>>>>> You can change a matrix's scale whenever you want - the
>>>>> new scale will be used from then on. Elements are not rounded
>>>>> unless you call enforceScale().
>>>>>
>>>>> Internally BigMatrix uses various decompositions for some of the
>>>>> more complex calculations. These decompositions inherit a scale
>>>>> from the parent matrix. It is however not the same - it's bigger.
>>>>> I believe the formula is 2 * (3 + max(9, matrixScale)), and this
>>>>> formula is evaluated when the decomposition is created. After the
>>>>> decomposition is created, its scale won't change even if you change
>>>>> the matrix's scale. The scale of a decomposition is never smaller
>>>>> than 24 (according to the above formula).
>>>>>
>>>>> I'm not happy with this, and have been meaning to change it.
>>>>> Decompositions should have the same scale as their parent matrix,
>>>>> and when it is changed, be updated "everywhere".
>>>>>
>>>>> Would you prefer that?
>>>>>
>>>>>> 2. Is there a way to use the solvers in ojAlgo to do the linear
>>>>>> regression? Or is that not something that is provided?
>>>>>
>>>>> How are you currently doing it?
>>>>>
>>>>> I imagine the best way would be to simply build an over-determined
>>>>> set of linear equations - Ax=B - and call A.solve(B). Internally
>>>>> BigMatrix would then use the QR decomposition to give a least-squares
>>>>> estimation of x. That would be my first attempt.
>>>>>
>>>>> You could use the QP solver to do the same thing, but I don't see
>>>>> how that would simplify or improve anything.
>>>>>
>>>>> What version of ojAlgo are you using - v1, v2 or CVS?
>>>>>
>>>>> I recommend working from CVS. Unfortunately I have moved things
>>>>> around a bit between versions...
>>>>>
>>>>> I don't have access to a profiling tool, but I know they can be
>>>>> very useful in finding bottlenecks.
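The scale trade-off Anders describes can be seen with java.math.BigDecimal directly (this is plain JDK code, not ojAlgo's internal scale handling): a larger scale keeps more digits of an inexact quotient like 1/3, but every subsequent operation then works on longer numbers and costs more.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrates how the chosen scale bounds both precision and cost
// when dividing with BigDecimal.
public class ScaleDemo {

    /** 1/3 rounded to the given number of decimal places. */
    public static BigDecimal oneThird(int scale) {
        return BigDecimal.ONE.divide(
                BigDecimal.valueOf(3), scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println(oneThird(9));    // 9 digits after the point
        System.out.println(oneThird(24));   // 24 digits - the minimum
                                            // decomposition scale mentioned above
    }
}
```

With doubles the equivalent rounding happens silently at ~15-17 significant digits, which is the source of the Excel/LINEST discrepancy discussed earlier in the thread.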
>>>>> If you can identify a problem, and/or suggest a specific
>>>>> improvement, I'll be happy to help you implement it.
>>>>>
>>>>> /Anders
>>>>>
>>>>>> ----------------------
>>>>>> Matthew Inger
>>>>>> Design Architect
>>>>>> Synygy, Inc
>>>>>> 610-664-7433 x7770
>>>>>>
>>>>>> _______________________________________________
>>>>>> ojAlgo-user mailing list
>>>>>> ojA...@li...
>>>>>> https://lists.sourceforge.net/lists/listinfo/ojalgo-user