Re: [Flashforth-devel] C stack - passing the params (pic24/33)
From: pito <pi...@vo...> - 2014-07-16 09:51:30
And a final benchmark for this week: fsin, fcos, ftan and fatan written in forth, and in forth but using the 4 Cfloat primitives:

forth:
bsin 315.614E3 CPU INSTRs per fsin ok<#,ram>
bcos 317.457E3 CPU INSTRs per fcos ok<#,ram>
btan 645.600E3 CPU INSTRs per ftan ok<#,ram>
batan 95.9004E3 CPU INSTRs per fatan ok<#,ram>

Cfloat 4 primitives:
bsin 20.9120E3 CPU INSTRs per fsin ok<#,ram>
bcos 20.6356E3 CPU INSTRs per fcos ok<#,ram>
btan 42.2846E3 CPU INSTRs per ftan ok<#,ram>
batan 2.11884E3 CPU INSTRs per fatan ok<#,ram>

15x faster for sin, cos and tan, 45x for atan (a simple algorithm only).
These trig functions will be faster still when called from C directly .. :)

Pito

______________________________________________________________
> From: "pito" <pi...@vo...>
> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
> Date: 16.07.2014 11:31
> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>
>4 float primitives added, now the benchmarks (empty loop not subtracted):
>
>Cfloats:
>be 35.0069E0 CPU INSTRs per empty loop ok<#,ram>
>b+ 279.134E0 CPU INSTRs per f+ ok<#,ram>
>b- 277.291E0 CPU INSTRs per f- ok<#,ram>
>b* 171.349E0 CPU INSTRs per f* ok<#,ram>
>b/ 435.743E0 CPU INSTRs per f/ ok<#,ram>
>
>forth:
>be 35.0069E0 CPU INSTRs per empty loop ok<#,ram>
>b+ 2.49101E3 CPU INSTRs per f+ ok<#,ram>
>b- 2.72317E3 CPU INSTRs per f- ok<#,ram>
>b* 26.6107E3 CPU INSTRs per f* ok<#,ram>
>b/ 13.2768E3 CPU INSTRs per f/ ok<#,ram>
>
>Interestingly, the forth version of f/ is faster than its f*... :)
>
>From 11x to 190x speedup..
>
>Nice!
>
>The next step is to add another 30 floating point functions.. :)
>
>Pito
>
>______________________________________________________________
>> From: "pito" <pi...@vo...>
>> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
>> Date: 16.07.2014 10:29
>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>
>>Ok, reordered:
>>
>>100000. d>f 7 s>f Cf/ ok<#,ram> 14043 18015
>>f. 14285.7 ok<#,ram>
>>
>>1000000. d>f 7 s>f Cf/ fs. 1.42857E5 ok<#,ram>
>>
>>forth:
>>bench/ 13.2768E3 CPU INSTRs per f/ ok<#,ram>
>>
>>Cf/:
>>bench/ 435.743E0 CPU INSTRs per f/ ok<#,ram>
>>
>>13276 / 436 = 30x faster
>>Nice..
>>Pito.
>>
>>______________________________________________________________
>>> From: "pito" <pi...@vo...>
>>> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
>>> Date: 16.07.2014 09:49
>>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>>
>>>It basically works, but the order of the operands and of their words is weird, so I have to swap everything before calling Cfsub.
>>>So the order of the shorts passed needs to be rearranged..
>>>
>>>"30000 - 10000" =
>>>
>>>10000 s>f swap 30000 s>f swap Cf- f. 20000.0 ok<#,ram>
>>>
>>>However, even with 2 additional swaps I get the following result for Cfsub:
>>>
>>>forth version:
>>>bench- 2.72317E3 CPU INSTRs per f- ok<#,ram>
>>>
>>>Cf- version:
>>>bench- 283.740E0 CPU INSTRs per f- ok<#,ram>
>>>
>>>So the C version of f- is ~10x faster than the forth version. Nice...
>>>
>>>Pito
>>>
>>>______________________________________________________________
>>>> From: Mikael Nordman <mik...@pp...>
>>>> To: <pi...@vo...>, <fla...@li...>
>>>> Date: 16.07.2014 08:53
>>>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>>>
>>>>Probably, but check the xc16 help section. I think it describes the parameter passing.
>>>>
>>>>Mike
>>>>
>>>>Sent from my LG Mobile
>>>>
>>>>------ Original message------
>>>>From: pito
>>>>Date: Wed, 16/07/2014 02:52
>>>>To: flashforth-devel;
>>>>Subject:[Flashforth-devel] C stack - passing the params (pic24/33)
>>>>
>>>>In the C example we pass a single 16bit short param:
>>>>
>>>>mov [W14], W0
>>>>.extern C4add
>>>>call _C4add
>>>>mov W0, [W14]
>>>>
>>>>So my current understanding is that passing 4 shorts (2 floats) would mean something like this:
>>>>
>>>>( f1l f1h f2l f2h -- fl fh )
>>>>mov [W14--], W0 ; f2h
>>>>mov [W14--], W1 ; f2l
>>>>mov [W14--], W2 ; f1h
>>>>mov [W14], W3 ; f1l
>>>>.extern Cfadd
>>>>call _Cfadd
>>>>mov W0, [W14++] ; fl
>>>>mov W1, [W14] ; fh
>>>>
>>>>Am I correct??
>>>>Thanks,
>>>>P.
>>>>
>>>>_______________________________________________
>>>>Flashforth-devel mailing list
>>>>Fla...@li...
>>>>https://lists.sourceforge.net/lists/listinfo/flashforth-devel