Re: [Flashforth-devel] C stack - passing the params (pic24/33)
From: pito <pi...@vo...> - 2014-07-16 10:57:06
I cannot resist adding the others:

777 s>f 1000 s>f f/ fconstant arg \ = 0.777 ok<#,ram>
arg fsin fs. 7.01144E-1 ok<#,ram>
arg fcos fs. 7.13020E-1 ok<#,ram>
arg ftan fs. 9.83343E-1 ok<#,ram>
arg fsqrt fs. 8.81476E-1 ok<#,ram>
arg fexp fs. 2.17494E0 ok<#,ram>
arg flog fs. -2.52315E-1 ok<#,ram>
arg arg fpow fs. 8.21972E-1 ok<#,ram>

Pure C functions called from FF:
bsin 2.26716E3 CPU INSTRs per fsin ok<#,ram>
bcos 3.03178E3 CPU INSTRs per fcos ok<#,ram>
btan 2.90925E3 CPU INSTRs per ftan ok<#,ram>
blog 2.53247E3 CPU INSTRs per flog ok<#,ram>
bexp 3.04007E3 CPU INSTRs per fexp ok<#,ram>
bsqrt 525.103E0 CPU INSTRs per fsqrt ok<#,ram>

7-9x faster than the same with 4 Cfloat primitives and forth.. ;)
Very nice!
Pito.

______________________________________________________________
> From: "pito" <pi...@vo...>
> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
> Date: 16.07.2014 11:51
> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>
>And a final benchmark this week - the fsin fcos ftan fatan written in forth, and in forth but with 4 Cfloat primitives:
>
>forth:
>bsin 315.614E3 CPU INSTRs per fsin ok<#,ram>
>bcos 317.457E3 CPU INSTRs per fcos ok<#,ram>
>btan 645.600E3 CPU INSTRs per ftan ok<#,ram>
>batan 95.9004E3 CPU INSTRs per fatan ok<#,ram>
>
>Cfloat 4 primitives:
>bsin 20.9120E3 CPU INSTRs per fsin ok<#,ram>
>bcos 20.6356E3 CPU INSTRs per fcos ok<#,ram>
>btan 42.2846E3 CPU INSTRs per ftan ok<#,ram>
>batan 2.11884E3 CPU INSTRs per fatan ok<#,ram>
>
>15x faster for sin cos tan, 45x for atan (a simple algorithm only).
>These trig functions will be even much faster when called from C directly ..
:)
>Pito
>
>______________________________________________________________
>> From: "pito" <pi...@vo...>
>> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
>> Date: 16.07.2014 11:31
>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>
>>4 float primitives added, now the benchmarks (empty loop not subtracted):
>>
>>Cfloats:
>>be 35.0069E0 CPU INSTRs per empty loop ok<#,ram>
>>b+ 279.134E0 CPU INSTRs per f+ ok<#,ram>
>>b- 277.291E0 CPU INSTRs per f- ok<#,ram>
>>b* 171.349E0 CPU INSTRs per f* ok<#,ram>
>>b/ 435.743E0 CPU INSTRs per f/ ok<#,ram>
>>
>>
>>forth:
>>be 35.0069E0 CPU INSTRs per empty loop ok<#,ram>
>>b+ 2.49101E3 CPU INSTRs per f+ ok<#,ram>
>>b- 2.72317E3 CPU INSTRs per f- ok<#,ram>
>>b* 26.6107E3 CPU INSTRs per f* ok<#,ram>
>>b/ 13.2768E3 CPU INSTRs per f/ ok<#,ram>
>>
>>Interestingly the forth version of f/ is faster than f*... :)
>>
>>From 11x to 190x speedup..
>>
>>Nice!
>>
>>The next step is to add another 30 floating point functions.. :)
>>
>>Pito
>>
>>______________________________________________________________
>>> From: "pito" <pi...@vo...>
>>> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
>>> Date: 16.07.2014 10:29
>>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>>
>>>Ok, reordered:
>>>
>>>100000. d>f 7 s>f Cf/ ok<#,ram> 14043 18015
>>>f. 14285.7 ok<#,ram>
>>>
>>>1000000. d>f 7 s>f Cf/ fs. 1.42857E5 ok<#,ram>
>>>
>>>forth:
>>>bench/ 13.2768E3 CPU INSTRs per f/ ok<#,ram>
>>>
>>>Cf/:
>>>bench/ 435.743E0 CPU INSTRs per f/ ok<#,ram>
>>>
>>>13276 / 436 = 30x faster
>>>Nice..
>>>Pito.
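The primitive benchmarks quoted above are reported with the empty-loop cost still included, so the ratios shift a little once that cost is removed from both sides. A minimal sketch of that arithmetic follows; the function and parameter names are made up for illustration and are not part of FlashForth or the benchmark words used in the thread.

```c
/* Hypothetical helpers (names are illustrative, not from the thread). */

/* Speedup of a C primitive over its Forth version, with the measured
   empty-loop cost subtracted from both instruction counts. */
double net_speedup(double forth_instrs, double c_instrs, double loop_instrs)
{
    return (forth_instrs - loop_instrs) / (c_instrs - loop_instrs);
}

/* Plain ratio of the raw counts, as in "13276 / 436 = 30x". */
double raw_speedup(double forth_instrs, double c_instrs)
{
    return forth_instrs / c_instrs;
}
```

With the posted figures (empty loop 35.0069 instructions), `net_speedup(2491.01, 279.134, 35.0069)` comes out at about 10 for f+ and `net_speedup(26610.7, 171.349, 35.0069)` at about 195 for f*, close to the "11x to 190x" range quoted in the message.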
>>>
>>>
>>>______________________________________________________________
>>>> From: "pito" <pi...@vo...>
>>>> To: Mikael Nordman <mik...@pp...>, "flashforth-devel" <fla...@li...>
>>>> Date: 16.07.2014 09:49
>>>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>>>
>>>>It works basically, but the order of the operands and their words is weird, so I have to swap everything before calling Cfsub.
>>>>So the order of the shorts passed needs to be rearranged..
>>>>
>>>>"30000 - 10000" =
>>>>
>>>>10000 s>f swap 30000 s>f swap Cf- f. 20000.0 ok<#,ram>
>>>>
>>>>However, even with 2 additional swaps I get the following result for Cfsub:
>>>>
>>>>forth version:
>>>>bench- 2.72317E3 CPU INSTRs per f- ok<#,ram>
>>>>
>>>>Cf- version:
>>>>bench- 283.740E0 CPU INSTRs per f- ok<#,ram>
>>>>
>>>>So the C version of the f- is ~10x faster than the forth version. Nice...
>>>>
>>>>Pito
>>>>
>>>>______________________________________________________________
>>>>> From: Mikael Nordman <mik...@pp...>
>>>>> To: <pi...@vo...>, <fla...@li...>
>>>>> Date: 16.07.2014 08:53
>>>>> Subject: Re: [Flashforth-devel] C stack - passing the params (pic24/33)
>>>>>
>>>>>Probably, but check the xc16 help section. I think it describes the parameter passing.
>>>>>
>>>>>Mike
>>>>>
>>>>>Sent from my LG Mobile
>>>>>
>>>>>------ Original message------
>>>>>From: pito
>>>>>Date: Wed, 16/07/2014 02:52
>>>>>To: flashforth-devel;
>>>>>Subject:[Flashforth-devel] C stack - passing the params (pic24/33)
>>>>>
>>>>>In the C example we pass a single 16bit short param:
>>>>>
>>>>>  mov [W14], W0
>>>>>  .extern C4add
>>>>>  call _C4add
>>>>>  mov W0, [W14]
>>>>>
>>>>>So my current understanding is that passing 4 shorts (2 floats) would mean something like this:
>>>>>
>>>>>  ( f1l f1h f2l f2h -- fl fh )
>>>>>  mov [W14--], W0 ; f2h
>>>>>  mov [W14--], W1 ; f2l
>>>>>  mov [W14--], W2 ; f1h
>>>>>  mov [W14], W3 ; f1l
>>>>>  .extern Cfadd
>>>>>  call _Cfadd
>>>>>  mov W0, [W14++] ; fl
>>>>>  mov W1, [W14] ; fh
>>>>>
>>>>>Am I correct??
>>>>>Thanks,
>>>>>P.
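The question above comes down to which 16-bit cell of each float ends up where. A minimal host-side sketch in C can make the ( f1l f1h f2l f2h -- fl fh ) stack picture concrete. It assumes a 32-bit IEEE-754 `float` and one-line bodies for `Cfadd` and `Cfsub`, whose real sources are not shown in the thread; `cells_t`, `float_to_cells`, `cells_to_float` and `call_Cfsub` are hypothetical names. It deliberately says nothing about which W registers xc16 assigns to each argument; that has to come from the xc16 documentation, as suggested in the reply.

```c
#include <stdint.h>
#include <string.h>

/* Assumed C side of the Forth words: plain two-float functions.
   The names follow the thread; the bodies are a guess. */
float Cfadd(float a, float b) { return a + b; }
float Cfsub(float a, float b) { return a - b; }

/* One 32-bit float seen as the two 16-bit cells kept on the Forth stack
   (hypothetical helper type, low cell first). */
typedef struct { uint16_t lo, hi; } cells_t;

/* Split a float into its low and high 16-bit halves. */
cells_t float_to_cells(float f)
{
    uint32_t bits;
    cells_t c;
    memcpy(&bits, &f, sizeof bits);      /* assumes sizeof(float) == 4 */
    c.lo = (uint16_t)(bits & 0xFFFFu);
    c.hi = (uint16_t)(bits >> 16);
    return c;
}

/* Rebuild a float from its two 16-bit halves. */
float cells_to_float(cells_t c)
{
    uint32_t bits = ((uint32_t)c.hi << 16) | c.lo;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Host model of a wrapper with stack effect ( f1l f1h f2l f2h -- fl fh ):
   f1 is the first operand, f2 the second. */
cells_t call_Cfsub(cells_t f1, cells_t f2)
{
    return float_to_cells(Cfsub(cells_to_float(f1), cells_to_float(f2)));
}
```

In this model the "30000 - 10000" case from the thread is `call_Cfsub(float_to_cells(30000.0f), float_to_cells(10000.0f))`. If the two cells of an operand, or the two operands themselves, are handed over in the other order, the result changes, which is consistent with the swaps reported in the follow-up.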