#470 pdk mulchar

None
pending-accepted
pdk (4)
5
2025-08-10
2023-12-29
No

I wrote a tiny, fast 8x8->16 multiplication routine that does not use SLOCs.

unsigned int muluchar (unsigned char x, unsigned char y) __naked {
    x;
    y;
#if defined(__SDCC_pdk13) || defined(__SDCC_pdk14) || defined(__SDCC_pdk15) || defined(__SDCC_pdk16)
__asm   ; 8 loop iterations of 9 instruction cycles each, data in place, no SLOCs
    mov     a, #0x00
    clear   p

#if !defined(__SDCC_pdk13) // mulint/mullong/... are likely to issue multiplications by 0
    cneqsn  a,_test_muluchar_PARM_1     ;x==0 ?
    ret
1$:
    cneqsn  a,_test_muluchar_PARM_2     ;y==0 ?
    ret
2$:
#endif

    inc     p   ; {p,a} = 0x0100
0$:
    sl      a
    slc     p
    slc     _test_muluchar_PARM_1       ;   x <<= 1;
    t0sn.io f,c
    add     a, _test_muluchar_PARM_2    ;   result += y;
3$:
    addc    p
    t1sn    _test_muluchar_PARM_1, #0
    goto    0$
4$:
    ret
__endasm;
#endif
}

It was tested against the following code:

unsigned int muluchar (unsigned char x, unsigned char y)
{
  unsigned int result = 0;
  unsigned char i = 8;
  if (x|y)  // mulint/mullong/... are likely to issue multiplications by 0
  do {
      result <<= 1;
      if (x & 0x80)
        result += y;
      x <<= 1;
  } while (--i);
  return result;
}
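
As a minimal sketch of how the reference routine above can be sanity-checked on a host machine, the following compares it against the built-in `*` operator for all 65536 operand pairs. The names `muluchar_ref` and `count_mismatches` are hypothetical and do not come from the patch; the harness itself is an assumption, not the test setup actually used.

```c
/* Reference shift-and-add routine from the post, renamed for host testing.
 * muluchar_ref and count_mismatches are hypothetical names. */
static unsigned int muluchar_ref(unsigned char x, unsigned char y)
{
  unsigned int result = 0;
  unsigned char i = 8;
  if (x | y)
    do {
      result <<= 1;          /* make room for the next partial product */
      if (x & 0x80)          /* examine x from the most significant bit down */
        result += y;
      x <<= 1;
    } while (--i);
  return result;
}

/* Count operand pairs for which the routine disagrees with x * y. */
static unsigned long count_mismatches(void)
{
  unsigned long bad = 0;
  unsigned int x, y;
  for (x = 0; x < 256; x++)
    for (y = 0; y < 256; y++)
      if (muluchar_ref((unsigned char)x, (unsigned char)y) != x * y)
        bad++;
  return bad;
}
```

A mismatch count of zero only validates the C reference; the assembly version still has to be compared against it on the target or in a simulator.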

Related

Wiki: NGI0-Commons-SDCC

Discussion

  • Philipp Klaus Krause

    • assigned_to: Philipp Klaus Krause
    • Group: -->
     
  • Philipp Klaus Krause

    • status: open --> pending-accepted
     
  • Philipp Klaus Krause

    Thanks; applied to next branch in [r14584].

    P.S.: I am unsure if the zero check is worth it. While I agree that 0 may be a common operand, doing the checks takes 4 extra cycles for multiplications where no operand is 0. For now, I left the check out.
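
    The trade-off can be sketched as a break-even estimate. This is only a rough model: the 72-cycle loop cost comes from the "8 iterations of 9 cycles" comment in the patch, the 4-cycle check cost from the remark above, and the cost of the early-return path is an assumed parameter. The function name `break_even_zero_fraction` is hypothetical.

```c
/* Fraction of calls with a zero operand at which the zero check starts to pay off.
 *
 * Expected cost with check:    f * early + (1 - f) * (check + loop)
 * Expected cost without check: loop
 * Setting both equal and solving for f gives the expression below.
 *
 * loop_cycles:  cycles spent in the multiply loop (72 assumed: 8 x 9)
 * check_cycles: extra cycles for the checks when no operand is 0 (4, per the thread)
 * early_cycles: cycles spent on the early-return path (an assumption) */
static double break_even_zero_fraction(double loop_cycles,
                                       double check_cycles,
                                       double early_cycles)
{
  return check_cycles / (check_cycles + loop_cycles - early_cycles);
}
```

    With 72, 4 and an assumed 4 cycles for the early return, the estimate is 4/72, so under this model the check would pay off only if more than roughly 5.6% of calls had a zero operand.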

     

    Related

    Commit: [r14584]


    Last edit: Philipp Klaus Krause 2024-01-03
    • Konstantin Kim

      Konstantin Kim - 2024-01-03

      Nice point to start discussion ;)

       
  • Konstantin Kim

    Konstantin Kim - 2024-01-05

    Sorry for mixing topics. Please also have a look at the fast 16-bit multiplication implementation.

     
    • Philipp Klaus Krause

      This routine uses a non-zero p+1. Current SDCC relies on p+1 always being zero.

       
      • Konstantin Kim

        Konstantin Kim - 2024-01-31

        I had (wrongly, it seems) understood P to be a 16-bit scratchpad ;(

         
        • Philipp Klaus Krause

          For now, p holds 8 lower bits of data, with the upper 8 bits fixed at zero. It seemed like a good start when the pdk ports were created initially.
          Changing to full 16 bits would have advantages and disadvantages:
          * Obvious advantage of having 8 more bits for a pseudo-register.
          * Increased interrupt latency due to having to save and restore the upper bits in the interrupt handler.
          * Less efficient RAM access (via pointers), since we'd have to zero p+1 before RAM accesses via idxm.
          * Substantial rewrite of code generation required when changing this.
          IMO, it probably isn't worth the effort to look into changing it now. But once we get a pdk16 port, we should revisit this decision.

           
          • Konstantin Kim

            Konstantin Kim - 2024-02-01

            The interrupt handler should only care about P if it is used.
            Another way is to use a register frame assigned to a handler. Depends on the target trade-off between speed and size.
            In the meantime, we could adapt the fast mulint to use a non-stack local variable instead of P+1.

             
      • Konstantin Kim

        Konstantin Kim - 2024-01-31

        Another issue is labels inside a .rept macro.

         
  • Konstantin Kim

    Konstantin Kim - 2024-01-29

    Do I really need to open another separate patch case for 16 bit multiplication?

     
    • Philipp Klaus Krause

      Most work on SDCC is volunteer work, so it often happens that other stuff takes priority, and we SDCC developers don't find much time for SDCC. Recently, for me that meant that I decided to use the time I could spend on SDCC on the SDCC 4.4.0 release; dealing with patch tickets got postponed.

       
      • Konstantin Kim

        Konstantin Kim - 2024-01-31

        no discussion. life is too short. I really appreciate what you do.

         
  • Philipp Klaus Krause

    Today I found that apparently the new mulchar wasn't actually used, except for pdk13. I noticed when I saw pdk13 tests fail to link, due to the parameter name being wrong. I'll look into fixing it, and enabling it for all pdk ports today.

     
    • Philipp Klaus Krause

      The optimized muluchar is now finally working in [r15578]. I had to change it a bit:

      __muluchar::
      
          mov a, __muluchar_PARM_1
          mov p, a
          mov a, #0x08
          mov __muluchar_PARM_1, a
          mov a, #0x00
      
      0$:
          sl  a
          slc p
          t0sn.io f, c
          add a, __muluchar_PARM_2
      3$:
          addc    p
      
          dzsn    __muluchar_PARM_1
          goto    0$
      4$:
      
          ret
      

      This avoids the t1sn, which would result in link failures depending on how the linker arranges the parameters in memory (the pdk t1sn instruction is not available for the full pdk RAM address range).
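
      For readers who want to follow the revised routine step by step, here is a host-side C model of it. It is a sketch under assumptions: each instruction is modelled by hand (`sl`, `slc`, `t0sn.io f, c`, `add`, `addc`, `dzsn`), and the names `muluchar_model` and `model_mismatches` are hypothetical. It is not generated from, or checked against, real pdk hardware.

```c
/* Host-side model of the revised __muluchar loop (hypothetical helper names).
 * a and p model the accumulator and the p pseudo-register, carry models the
 * carry flag, and count models the loop counter kept in __muluchar_PARM_1. */
static unsigned int muluchar_model(unsigned char x, unsigned char y)
{
  unsigned char a = 0;        /* mov a, #0x00 */
  unsigned char p = x;        /* mov a, PARM_1 ; mov p, a */
  unsigned char count = 8;    /* mov a, #0x08 ; mov PARM_1, a */
  unsigned char carry, msb;
  unsigned int sum;

  do {
    carry = (unsigned char)(a >> 7);            /* sl a: top bit goes to carry */
    a = (unsigned char)(a << 1);
    msb = (unsigned char)(p >> 7);              /* slc p: shift carry in, top bit out */
    p = (unsigned char)((p << 1) | carry);
    carry = msb;
    if (carry) {                                /* t0sn.io f, c: skip the add if carry is 0 */
      sum = (unsigned int)a + y;                /* add a, PARM_2 */
      a = (unsigned char)sum;
      carry = (unsigned char)(sum >> 8);
    }
    p = (unsigned char)(p + carry);             /* addc p: propagate carry into the high byte */
  } while (--count);                            /* dzsn PARM_1 ; goto 0$ */

  return ((unsigned int)p << 8) | a;            /* result is returned in {p, a} */
}

/* Count operand pairs for which the model disagrees with x * y. */
static unsigned long model_mismatches(void)
{
  unsigned long bad = 0;
  unsigned int x, y;
  for (x = 0; x < 256; x++)
    for (y = 0; y < 256; y++)
      if (muluchar_model((unsigned char)x, (unsigned char)y) != x * y)
        bad++;
  return bad;
}
```

      In this model, p starts out holding x and a starts at zero. Each iteration shifts one bit of x out of the top of p while the partial product grows from the bottom of {p, a}, so the two never overlap and no extra storage beyond the counter is needed.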

       

      Related

      Commit: [r15578]

