From: Kalus M. <mic...@on...> - 2010-09-16 17:44:05
Hi.

On 16.09.2010 at 14:39, pito wrote:
..
> PS: I am still thinking why the amforth overhead is so big?? It
> seems from my naive measurements a typical forth word takes ~7us
> plus minus.
> This is about 175 clock cycles @25MHz, or aprox 100 instructions -
> could it be so much? Just a stupid Q. Pito

I have a real-time clock (RTC) implemented, so I can use time@ to put the current time on the stack as hours, minutes and seconds. Defining this:

    new
    : one ;
    : tt0 1000 0 do loop ;
    : tt1 1000 0 do one loop ;
    : tt20 0 do tt0 loop ;
    : tt21 0 do tt1 loop ;

    time@ 10000 tt20 time@ 10000 tt21 time@ .s

I get:

    » time@ 10000 tt20 time@ 10000 tt21 time@ .s
    0 1181 26
    1 1183 6
    2 1185 0
    3 1187 26
    4 1189 5
    5 1191 0
    6 1193 57
    7 1195 4
    8 1197 0

    00:04:57
    00:05:26 --> 29 s for 10^7 passes of the empty loop
    00:06:26 --> 60 s for 10^7 passes of the loop calling one

    » time@ 10000 tt20 time@ 10000 tt21 time@ .s
    0 1181 18
    1 1183 40
    2 1185 16
    3 1187 19
    4 1189 39
    5 1191 16
    6 1193 49
    7 1195 38
    8 1197 16

    16:38:49
    16:39:19 --> 30 s for 10^7 passes of the empty loop
    16:40:18 --> 59 s for 10^7 passes of the loop calling one

So calling one 10^7 times costs about 30 s more than the empty loop, or 3 us for a single entry into and exit from a word. On a 20 MHz atmega168 one clock cycle takes 0.05 us, so I have 3 / 0.05 = 60 cycles for 'one word'. The inner interpreter is: ...

(The two numbers at the end of each instruction line are the cycles for that instruction and the running total.)

    DO_COLON:
    C:001c0a 93bf    push XH                     2    2
    C:001c0b 93af    push XL        ; PUSH IP    2    4
    C:001c0c 01db    movw XL, wl                 1    5
    C:001c0d 9611    adiw xl, 1                  2    7
    DO_NEXT:
    .endif
    C:001c0e 01fd    movw zl, XL    ; READ IP    1    8
    C:001c0f +  readflashcell wl, wh
    C:001c0f 0fee    lsl zl                      1    9
    C:001c10 1fff    rol zh                      1   10
    C:001c11 9165    lpm wl, Z+                  3   13
    C:001c12 9175    lpm wh, Z+                  3   16
    C:001c13 9611    adiw XL, 1     ; INC IP     2   18
    DO_EXECUTE:
    C:001c14 01fb    movw zl, wl                 1   19
    C:001c15 +  readflashcell temp0,temp1
    C:001c15 0fee    lsl zl                      1   20
    C:001c16 1fff    rol zh                      1   21
    C:001c17 9105    lpm temp0, Z+               3   24
    C:001c18 9115    lpm temp1, Z+               3   27
    C:001c19 01f8    movw zl, temp0              1   28
    C:001c1a 9409    ijmp                        2   30

    ; ( -- ) Compiler
    ; R( xt -- )
    ; end of current colon word
    VE_EXIT:
        .dw $ff04
        .db "exit"
        .dw VE_HEAD
        .set VE_HEAD = VE_EXIT
    XT_EXIT:
        .dw PFA_EXIT
    PFA_EXIT:
        pop XL                                   2   32
        pop XH                                   2   34
        rjmp DO_NEXT                             2   36

A total of 36 cycles, right? So where does amforth spend the other 24 cycles? On average that is the overhead caused by the time-ticker ISR of my RTC, I guess.

To see a word "as is", connect an oscilloscope to a port pin. Let an empty loop toggle your port pin. Then add your word to the loop and run it again. From the resulting frequency difference you get the execution time of your word.

Michael
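
For anyone who wants to re-check the arithmetic in this message, here is a minimal Python sketch. It only reuses numbers from the post (the time@ readings, 10000 x 1000 loop passes, the 20 MHz clock and the 36 counted cycles). All function and constant names are made up for illustration and are not part of amforth. The oscilloscope helper additionally assumes that the loop toggles the pin exactly once per pass, so one pass lasts half a period of the observed square wave; the post does not spell that out.

```python
from datetime import datetime

CLOCK_HZ = 20_000_000          # atmega168 clock stated in the post
ITERATIONS = 10_000 * 1_000    # outer count * inner count = 10^7 passes
COUNTED_CYCLES = 36            # running total at the end of the listing


def seconds_between(start, end):
    """Elapsed seconds between two hh:mm:ss readings on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


def cycles_per_call(t_start, t_empty_done, t_one_done):
    """Clock cycles spent per call of the word, from three time@ readings."""
    empty_s = seconds_between(t_start, t_empty_done)        # empty loop run
    with_word_s = seconds_between(t_empty_done, t_one_done)  # loop calling the word
    return (with_word_s - empty_s) * CLOCK_HZ / ITERATIONS


def word_time_from_scope(f_empty_hz, f_with_word_hz):
    """Execution time of the word in seconds from two measured frequencies.

    Assumption: one pin toggle per loop pass, so a pass is half a period.
    """
    return 1.0 / (2.0 * f_with_word_hz) - 1.0 / (2.0 * f_empty_hz)


# The two measurement runs quoted in the post.
runs = [
    ("00:04:57", "00:05:26", "00:06:26"),
    ("16:38:49", "16:39:19", "16:40:18"),
]
per_run = [cycles_per_call(*run) for run in runs]
mean_cycles = sum(per_run) / len(per_run)
unexplained = mean_cycles - COUNTED_CYCLES

print(per_run)       # [62.0, 58.0]
print(mean_cycles)   # 60.0
print(unexplained)   # 24.0
```

The two runs differ by a few cycles because time@ only resolves whole seconds; averaging them gives the 60 cycles used above.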