
Assembly language in, vs inn,

2024-04-01
2024-04-05
  • Lief Koepsel

    Lief Koepsel - 2024-04-01

    Hi Mikael,
    I'm finally getting back to FlashForth and I am really enjoying it! Again, many thanks for your great work!

    I saw your comment on my blog (thank you) and added an optimized test using assembly language. The speed roughly doubled, from 56 kHz to 108 kHz; see "Comparing Board and Language Speeds".
    I also checked the Pico diagrams: the Pico output is not in parallel, it's serial, as you can see because the waveforms don't align.

    In working with asm.fs (the latest version, 03.06.2023), I am confused by the in, and inn, words. What is the difference between the two? I'm attempting to read a pin on port D and push it onto the stack. I've attempted the following (using both versions):

    : readf ( -- )
      [ R18 pind-io in, ]
      [ R18 push, ]
    ; 
    

    However, this isn't working. I'll either get nothing on the stack or an abort.

    I also get an 18 pushed onto the stack when I define the above word.

    Could I get some guidance on this?

    Thanks,
    Lief

     
  • Mikael Nordman

    Mikael Nordman - 2024-04-01

    To use the machine IN instruction you have to use the INN, word. There was a name conflict with IN, which is used for copying a whole assembler word inline by the forth compiler. I could switch those names, IN, is just internal to the compiler anyway.

    You cannot use PUSH. It pushes a register to the return stack.
    To make room on the data stack you must first push the TOS registers to memory with DUP or with the equivalent code in assembler.

      [ R25 -Y st, ]
      [ R24 -Y st, ]
    

    And then read portd to R24, the low byte of the data stack.

      [ R24 $b inn, ]
    

    After that you must clear the high byte of the stack top with

    [ R25 clr, ]
    

    So the word is defined like this

    : portd@ ( -- c )
      \ Make room in the stack top registers. 
      \ Or just use DUP, it inlines the same code automatically.
      [ R25 -Y st,  ]
      [ R24 -Y st,  ]
      [ R24 $b inn, ]  \ put the low byte on the stack
      [ R25 clr,    ]  \ clear the high byte
    ; 
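
A rough host-side model of the sequence above, written in Python purely for illustration. It assumes only what the post states: the top of stack (TOS) lives in R25:R24, and the two `-Y st,` stores spill it to memory. The class and method names are hypothetical and are not FlashForth internals.

```python
# Illustrative model only: DataStack and its methods are made-up names,
# not part of FlashForth or asm.fs.
class DataStack:
    """16-bit data stack whose top cell is cached in a register pair."""

    def __init__(self):
        self.r24 = 0      # low byte of TOS
        self.r25 = 0      # high byte of TOS
        self.memory = []  # rest of the stack as bytes, most recent push last

    def dup(self):
        # Mirrors "R25 -Y st," then "R24 -Y st,": spill the cached TOS to memory.
        self.memory.append(self.r25)
        self.memory.append(self.r24)

    def read_io_byte(self, value):
        # Mirrors "R24 <io-addr> inn," followed by "R25 clr,".
        self.r24 = value & 0xFF
        self.r25 = 0

    def tos(self):
        return (self.r25 << 8) | self.r24

    def second(self):
        # The last two bytes pushed are the high byte, then the low byte.
        return (self.memory[-2] << 8) | self.memory[-1]


stack = DataStack()
stack.r24, stack.r25 = 0x34, 0x12  # pretend TOS already holds $1234
stack.dup()                        # the old TOS now lives in memory
stack.read_io_byte(0xA5)           # pretend the port reads $A5
print(hex(stack.tos()), hex(stack.second()))
```

The point of the model: without the spill step, loading R24 would overwrite the previous top of stack instead of pushing a new value above it.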
    

    See also https://sourceforge.net/p/flashforth/code/ci/master/tree/avr/forth/asm-examples.fs

     
  • Mikael Nordman

    Mikael Nordman - 2024-04-01

    Toggle portd should be like this (not tested)

    : toggle-portd ( c -- )
      [ R18 portd inn,   ]
      [ R18 R24 eor,     ]
      [ R18 portd out,  ]
      drop
    ; inlined
    

    Or even faster

    : toggle-portd-fast ( c -- )
      [ R24 pind out,   ]
      drop
    ; inlined
    
     

    Last edit: Mikael Nordman 2024-04-01
  • Mikael Nordman

    Mikael Nordman - 2024-04-01

    Or even like this. Should be the fastest version, but maybe not so elegant. It just generates the hardcoded toggling without using the stack. Also not tested.

    : toggle-portd-fastest ( c -- )
      R18 swap ldi,
      R18 PIND out,
    ; 
    : toggle-pin1 ( c -- ) [ 0x01 toggle-portd-fastest ] ; inlined
    : toggle-pin8 ( c -- ) [ 0x80 toggle-portd-fastest ] ; inlined
    : main begin toggle-pin1 toggle-pin8 again ;
    
     

    Last edit: Mikael Nordman 2024-04-02
  • Lief Koepsel

    Lief Koepsel - 2024-04-01

    Thank you! Very helpful.

    First, thank you for the code. For others looking at this: reading PIND needs to be:

    [ R24 $9 inn, ] \ use the IO register address of PIND
    

    I misspoke when I called it port D...

    And after spending more time on your DUP explanation, I now understand it.
    R24/R25 are TOS
    memory is the rest of the stack

    DUP pushes TOS down into memory and I fill R24/R25 with PIND and $0.

    I was racking my brains over the "TOS is cached in R24:R25", now I get it.

    Thanks!
    Lief

     
  • Mikael Nordman

    Mikael Nordman - 2024-04-01

    And you can just write to the PINx register. If you write a high bit the pin will toggle in the hardware. No need to read or xor anything.
    Actually you can use the SBI instruction to toggle with just one instruction. Smart hardware:-)

    : toggled-pin1 ( -- ) PIND 0 sbi, ; inlined
    : toggled-pin8 ( -- ) PIND 7 sbi, ; inlined
    
    : main begin toggled-pin1 toggled-pin8 again ;
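
To see why no read or XOR is needed, here is a small host-side model in Python. It assumes only the behaviour described in the post (writing a 1 to a PINx bit toggles the matching output bit) and ignores data direction and pin synchronisation. The `Port` class and its method names are hypothetical.

```python
# Illustrative model only: Port and its methods are made-up names.
class Port:
    def __init__(self):
        self.port = 0  # models the PORTx output latch

    def write_port(self, value):
        self.port = value & 0xFF

    def write_pin(self, value):
        # Writing a 1 to a PINx bit toggles the matching PORTx bit;
        # bits written as 0 are left alone.
        self.port ^= value & 0xFF

    def sbi_pin(self, bit):
        # Models "PINx <bit> sbi,": set one bit in PINx.
        self.write_pin(1 << bit)

    def toggle_via_eor(self, mask):
        # Models the read / eor / write sequence of toggle-portd.
        self.write_port(self.port ^ mask)


a = Port()
b = Port()
a.write_port(0b1010_0000)
b.write_port(0b1010_0000)

a.toggle_via_eor(0x81)  # three instructions on the real part
b.sbi_pin(0)            # one instruction per bit on the real part
b.sbi_pin(7)
print(bin(a.port), bin(b.port))
```

Both ports end up in the same state, which is the equivalence the SBI version relies on.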
    
     

    Last edit: Mikael Nordman 2024-04-02
  • Lief Koepsel

    Lief Koepsel - 2024-04-01

    Thanks, Mikael, I was waiting for you to get to the last one. :)

    That is the technique I used. It's a great instruction, and I've been surprised that so many people haven't used it...(Arduino?)

    I use the asm version now in my timing loops, where I'm trying to understand overhead. It's great because this takes 11 clock cycles, 689 ns:

    \ use toggle command to flip bits, allows for symmetry
    : tog_D8 ( -- )
      [
        pinb-io #0 sbi,  \ one toggle per loop
      ]
    ;
    
    : time_it ( -- )
        D8 out
        begin
            tog_D8
            \ stick code to be timed here, and use a scope to measure pulse width
        again
    ;
    
     

    Last edit: Lief Koepsel 2024-04-02
    • Mikael Nordman

      Mikael Nordman - 2024-04-02

      I use the asm version now in my timing loops, where I'm trying to understand overhead. It's great because this takes 11 clock cycles, 689 ns

      And if you inline the toggle word you can cut 7 cycles from that. Then on an Arduino the loop should run in 250 ns, i.e. at 4 MHz. The call/ret overhead is 7 cycles on ATmega328 and 9 cycles on ATmega2560.
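
The arithmetic behind those figures, as a short Python sketch. It assumes a 16 MHz clock (typical for an Arduino Uno, but not stated in the thread); the function and variable names are made up for the example.

```python
F_CPU_HZ = 16_000_000  # assumption: 16 MHz system clock


def cycles_to_ns(cycles, f_cpu_hz=F_CPU_HZ):
    """Convert a CPU cycle count to nanoseconds."""
    return cycles * 1e9 / f_cpu_hz


called_loop_cycles = 11  # loop that calls the toggle word
call_ret_overhead = 7    # call/ret cost quoted for the ATmega328
inlined_loop_cycles = called_loop_cycles - call_ret_overhead

called_ns = cycles_to_ns(called_loop_cycles)
inlined_ns = cycles_to_ns(inlined_loop_cycles)
loop_rate_hz = F_CPU_HZ / inlined_loop_cycles

# One toggle per pass means a full high/low period takes two passes.
square_wave_hz = loop_rate_hz / 2

print(called_ns, inlined_ns, loop_rate_hz, square_wave_hz)
```

Under that clock assumption, 11 cycles come to 687.5 ns, close to the measured 689 ns, and the 4-cycle loop takes 250 ns per pass. With one toggle per pass, the pin itself would show a 2 MHz square wave.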

      I was waiting for you to get to the last one. :)

      It took some time since I use PIC18 for everything myself. PIC18 has a bit toggle instruction that can be used on any register and memory location.

       
  • Lief Koepsel

    Lief Koepsel - 2024-04-02

    Good points. I did make this change, and got it to 4 cycles...

    : time_ita ( -- )
        D8 out
        begin
            [
              pinb-io #0 sbi,  \ one toggle per loop
            ]
        again
    ;
    

    I didn't inline.
    To your credit, this is so absurdly easy to test and execute in Forth...thank you.

     
  • Mikael Nordman

    Mikael Nordman - 2024-04-02

    Lief, really nice of you to write these programming articles. You could update your article on compiling FlashForth. I just updated the source code to work with XC8 2.46.

     
  • Lief Koepsel

    Lief Koepsel - 2024-04-02

    Thank you, Mikael

    I did attempt 2.46 a few days ago (with the change you had noted) and continued to have the same problem. I have a Mac; I'm not sure whether that is the issue. I also have the issue where I need to change the include files from <> to "" (angle brackets to quotes)...

    I'll test again in a few days.
    Lief

     
    • Mikael Nordman

      Mikael Nordman - 2024-04-02

      I pushed the changes to git yesterday, so you tried the old code, if it was before yesterday.

       
  • Lief Koepsel

    Lief Koepsel - 2024-04-05

    Yes, the new version works with 2.46. Also posted a note on the original issue.

     
