The initial revision of the code is poorly written, as it was developed while debugging the hardware. The most problematic parts are the acceleration/deceleration phase and the code size, since the code is simply duplicated for each axis. The initial acceleration code comes from
Other interesting considerations can be found in
Factoring the code with structs for the axis data: this avoids code duplication, but accessing the data through generic struct pointers requires pointer arithmetic, which results in longer assembly code. sdcc cannot even compile it for a 16k PIC. Microchip XC8 succeeds with a code size of 15k (it claims the pro version could reduce the code size to less than 6k). Here is the code produced by sdcc for an access to a long int (4 bytes), axis->wormperiod, where the pointer axis is held in local registers r0x00, r0x01, r0x02:
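MOVF r0x00, W        ; low byte of the axis pointer
ADDLW 0x18           ; add the offset of wormperiod within the struct
MOVWF r0x03
MOVLW 0x00
ADDWFC r0x01, W      ; propagate the carry into the middle byte
MOVWF r0x04
MOVLW 0x00
ADDWFC r0x02, W      ; propagate the carry into the upper byte
MOVWF r0x05
MOVFF r0x03, FSR0L   ; pass the 3-byte generic pointer in FSR0L, PRODL, W
MOVFF r0x04, PRODL
MOVF r0x05, W
CALL __gptrget4      ; library helper: read 4 bytes through a generic pointer
MOVWF r0x03          ; result comes back in W, PRODL, PRODH, FSR0L
MOVFF PRODL, r0x04
MOVFF PRODH, r0x05
MOVFF FSR0L, r0x06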
Replaced the generic pointer with an array of structs: global replacement of axis->field with motors[axisindex].field. sdcc now compiles the code, with a final code size of 12.5k. Most of the code is in high_isr (4k) and main (2.5k).
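To make the difference between the two access styles concrete, here is a minimal sketch in C. Only wormperiod, axis and high_isr/main appear in the notes above; the struct name axis_state, the padding field, NUM_AXES and the two helper functions are hypothetical names chosen for illustration, and the layout is only arranged so that wormperiod sits at offset 0x18 as in the listing.

```c
#include <stdint.h>

#define NUM_AXES 2  /* hypothetical axis count */

/* Hypothetical per-axis state. Only wormperiod is taken from the notes;
   the padding stands in for the fields that precede it (offset 0x18). */
struct axis_state {
    uint8_t  padding[24];
    uint32_t wormperiod;   /* 4-byte value, "long int" on the PIC */
};

/* One global array instead of separately passed struct pointers. */
struct axis_state motors[NUM_AXES];

/* Pointer style: the callee only sees a pointer. With sdcc this is a
   generic pointer, so each field read goes through a helper call such
   as __gptrget4. */
uint32_t period_via_pointer(const struct axis_state *axis)
{
    return axis->wormperiod;
}

/* Index style: the array is a global in data memory, so the compiler
   does not need a generic pointer to reach the field. */
uint32_t period_via_index(uint8_t axisindex)
{
    return motors[axisindex].wormperiod;
}
```

Both functions return the same value for the same axis; the difference is only in the code the compiler has to emit for the access, which is presumably where the size reduction reported above comes from.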