From: Rob <eu...@ho...> - 2001-08-28 00:40:56
|
Hi Tim, Yes this is part of a FEM-MOM hybrid EM simulator. It was originally written in C, so I've had to deal with the C way of doing things. I'd like to consolidate many of these types of operations so that the program becomes streamlined and easier to understand- "The Python Way" TM. :) I will try your method to see how it works. By contrast, I have a FDTD simulator which is a true speed demon with Numpy. But it was organized in a way that I would easily use slicing to time march the 6 scalar Maxwell's equations. In any case I'm having fun. I do this at home. At work I have Ansoft Serenade and Agilent ADS at my disposal. Rob. Tim Hochberg wrote: > > Hi Rob, > > It looks like you are trying to use the function below to integrate over a > single triangle. I'm almost certain that you will _not_ be able to get this > to run fast; there's just too much overhead per triangle from all the > function calls and what not. Instead you need to structure things so that > you do more work per call. One way is to pass a list of triangles and get > back a list of answers. This way you spread your area over many triangles. > Another possibility, assuming that you need to evaluate the integral for all > seven possible values of QPoint each time, is to compute the answer for all > values of QPoint at once so that you reduce the overhead per triangle by > seven. For example (minimally tested): > > Qpnt=reshape(arange(7*3), (7,3)) > # Other code from other messages elided.... > > def Quads(TrngleNode, Qpnt, NodeCord): > c = take(NodeCord, TrngleNode) > SrcPointCols= add.reduce(Qpnt[...,NewAxis] * c[NewAxis,],1) > return SrcPointCols > > # Initialize stuff to minimize effects of timing.... > from time import clock > q1 = [None]*7 > q2 = [None]*7 > rng = range(7) > > # Time the three methods.... > > t0 = clock() > for i in rng: > q1[i] = ComputeGaussQuadPoint(i,TrngleNode,Qpnt,NodeCord) > t1 = clock() > for i in rng: > q2[i] = Quad(i,TrngleNode,Qpnt,NodeCord) > t2 = clock() > q3 = Quads(TrngleNode, Qpnt, NodeCord) > t3 = clock() > > # And the results... > > print "Times (us):", (t1-t0)*1e6, (t2-t1)*1e6, (t3-t2)*1e6 > for i in range(7): > print "The Answers:", q1[i], q2[i], q3[i] > > If this code is still a bottleneck, you could do both (compute the values > for all nodes and all values of QPoint at once, but this may be enough to > move the bottleneck elsewhere; by my timing it's over 7 times as fast. > > -tim > > [snip] > > > Evidently transpose is not very fast even for > > smal matrices. > > Actually, transpose should be fast, and should look progressively faster for > larger matrices. Typically all that happens is that the strides are changed. > > ----- Original Message ----- > From: "Rob" <eu...@ho...> > To: <num...@li...> > Sent: Monday, August 27, 2001 7:49 AM > Subject: [Numpy-discussion] Gaussian Quadrature routine Numpy-ization :) > > > I finally got it to work, but the Numpy-ized version runs slower than > > the plain Python one. I think that I can transpose the NodeCord matrix > > once in the program and feed that in, rather than the scratch matrix > > that is generated here. Evidently transpose is not very fast even for > > smal matrices. Here is my test program: > > > > > > from Numeric import * > > > > > > > > Qpnt=array(([20,21,22],[23,24,25],[26,27,28])) > > NodeCord=array(([1,2,3],[4,5,6],[7,8,9])) > > TrngleNode=array((1,2,0)) > > > > > > #the original routine > > def ComputeGaussQuadPoint(QuadPoint,TrngleNode,Qpnt,NodeCord): > > > > SrcPointCol=zeros((3)) > > > > > > SrcPointCol[0] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],0]\ > > + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],0]\ > > + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],0] > > > > SrcPointCol[1] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],1]\ > > + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],1]\ > > + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],1] > > > > SrcPointCol[2] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],2]\ > > + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],2]\ > > + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],2] > > > > return SrcPointCol > > > > > > #the yet-to-be-faster routine > > > > def Quad(QuadPoint, TrngleNode, Qpnt,NodeCord): > > s = Qpnt[QuadPoint,:] > > c= take(NodeCord, TrngleNode) > > SrcPointCol= add.reduce(s * > > transpose(c),1) > > > > return SrcPointCol > > > > QuadPoint=1 > > > > print "The Correct:" > > print ComputeGaussQuadPoint(QuadPoint,TrngleNode,Qpnt,NodeCord) > > > > print "The New" > > print Quad(QuadPoint,TrngleNode,Qpnt,NodeCord) > > > > > > > > > > > > -- > > The Numeric Python EM Project > > > > www.members.home.net/europax > > > > _______________________________________________ > > Numpy-discussion mailing list > > Num...@li... > > http://lists.sourceforge.net/lists/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > http://lists.sourceforge.net/lists/listinfo/numpy-discussion -- The Numeric Python EM Project www.members.home.net/europax |