Produce "DMA optimized" with new 18FXXQ43 PICs

ikonsgr74
2020-12-27
2021-09-29
(Page 1 of 2)
  • ikonsgr74

    ikonsgr74 - 2020-12-27

    I was looking at the features of the new 18FXXQ43 family, and one that looks very promising for boosting performance is the presence of 6 DMA controllers, which can transfer data to SFR/GPR space from Program Flash Memory, Data EEPROM, or SFR/GPR space.
    I suppose that Cow Basic functions like readtable or hsersend/hserreceive could benefit a lot from DMA.
    So I was wondering: is this new feature used by the compiler? If not, is there any schedule for adding it, so that Cow Basic produces "DMA optimized" code whenever a PIC equipped with DMA is used?

     

    Last edit: ikonsgr74 2020-12-27
  • Anobium

    Anobium - 2020-12-27

    As the new release candidates support the Q43s, you can investigate the methods that need to be adapted to support DMA.

    Bill Roth is working with DMA on a project and he will be sharing his insights in the coming days. The essentials are already in the compiler, but we may need to tweak it to enable DMA.

     
  • ikonsgr74

    ikonsgr74 - 2020-12-27

    Great! If you remember, I'm the guy who had some issues with large tables on a rather big project (currently using an 18F47Q10):
    https://sourceforge.net/p/gcbasic/discussion/596084/thread/5398dd8bc1/?limit=25&page=0

    Do you think that "DMA supported" methods will be available with the coming release of Cow Basic, then? I'm using the readtable method heavily in my project (by the way, I suppose that access to large table variables held in RAM could also benefit from DMA, right?), and if DMA offers a significant speed gain, it will surely boost performance a lot!
    Take a look here if you want, to see a small presentation I made.

     

    Last edit: ikonsgr74 2020-12-27
    • mkstevo

      mkstevo - 2020-12-28

      That disk interface you have made is incredible. Congratulations, I'm very, very impressed.

       
      • ikonsgr74

        ikonsgr74 - 2020-12-29

        Thanks my friend! True, the low-level emulation of the 765 Floppy Disk Controller in particular was a really tough job, but thanks to the covid19 quarantines and plenty of free time, I've managed to make it work! :-)
        I may also upload the CB code for this project in a new topic! ;-)

         
  • ikonsgr74

    ikonsgr74 - 2020-12-29

    @Anobium, I took a more thorough look at the DMA details in the 18F47Q43 datasheet. I don't know if I have this right, but it seems that using the DMA controllers can have a MAJOR impact on performance!
    For example, I looked at the asm code that Cow Basic generated for my project, for the interrupt triggered when a byte is received by the hardware UART module (On Interrupt UsartRX2Ready Call readUSART ):

    Interrupt
    ;Save Context
        movff   WREG,SysW
        movff   STATUS,SysSTATUS
        movff   BSR,SysBSR
    ;Store system variables
        movff   FSR0L,SaveFSR0L
        movff   FSR0H,SaveFSR0H
        movff   SysWordTempB,SaveSysWordTempB
        movff   SysWordTempB_H,SaveSysWordTempB_H
        movff   SysWordTempA,SaveSysWordTempA
        movff   SysWordTempA_H,SaveSysWordTempA_H
        movff   SysByteTempX,SaveSysByteTempX
    ;On Interrupt handlers
        banksel PIE3
        btfss   PIE3,RC2IE,BANKED
        bra NotRC2IF
        btfss   PIR3,RC2IF,BANKED
        bra NotRC2IF
        banksel 0
        call    READUSART
        banksel PIR3
        bcf PIR3,RC2IF,BANKED
        bra INTERRUPTDONE
    NotRC2IF
    ;User Interrupt routine
    INTERRUPTDONE
    ;Restore Context
    ;Restore system variables
        movff   SaveFSR0L,FSR0L
        movff   SaveFSR0H,FSR0H
        movff   SaveSysWordTempB,SysWordTempB
        movff   SaveSysWordTempB_H,SysWordTempB_H
        movff   SaveSysWordTempA,SysWordTempA
        movff   SaveSysWordTempA_H,SysWordTempA_H
        movff   SaveSysByteTempX,SysByteTempX
        movff   SysW,WREG
        movff   SysSTATUS,STATUS
        movff   SysBSR,BSR
        retfie  0
        banksel 0
    

    As you can see, some 20 movff instructions are needed to save the CPU state before executing the interrupt routine (which moves a byte from the UART input buffer to a large buffer table variable in RAM) and to restore it afterwards.
    Is it correct to assume that, if a DMA controller services the transfer instead of the interrupt routine, there is no need to save/restore the CPU state?
    Moreover, since the actual code of the interrupt routine:

      READUSART
    ;buffer(next_in) = HSerReceive2
        call    FN_HSERRECEIVE2
        lfsr    0,BUFFER
        movf    NEXT_IN,W,ACCESS
        addwf   AFSR0,F,ACCESS
        movf    NEXT_IN_H,W,ACCESS
        addwfc  AFSR0_H,F,ACCESS
        movff   HSERRECEIVE2,INDF0
    ;next_in = ( next_in + 1 )
        incf    NEXT_IN,F,ACCESS
        btfsc   STATUS,Z,ACCESS
        incf    NEXT_IN_H,F,ACCESS
    ;IF (NEXT_IN>BUFFER_SIZE) Then
        movff   NEXT_IN,SysWORDTempB
        movff   NEXT_IN_H,SysWORDTempB_H
        movlw   28
        movwf   SysWORDTempA,ACCESS
        movlw   12
        movwf   SysWORDTempA_H,ACCESS
        call    SysCompLessThan16
        btfss   SysByteTempX,0,ACCESS
        bra ENDIF234
    ;NEXT_IN=1
        movlw   1
        movwf   NEXT_IN,ACCESS
        clrf    NEXT_IN_H,ACCESS
    ;END IF
    ENDIF234
    ;next_in = ( next_in + 1 ) % BUFFER_SIZE
    ;DIR PORTA.5 IN
        return
    

    would be modified to use DMA, maybe it would execute faster too?
    (From what I read in the datasheet, you only need to set a handful of DMA registers, and then the actual DMA transfer of 1 byte takes only 2 instructions!)
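
    For reference, the store-advance-wrap logic that the READUSART listing above performs for every received byte can be modelled on a host PC as a minimal C sketch. This is not DMA code and not compiler output: the names are taken from the listing, the buffer size of 3100 is read off the movlw 12 / movlw 28 pair (12 * 256 + 28), and the starting index of 1 is an assumption.

```c
#include <stdint.h>

#define BUFFER_SIZE 3100u  /* 12 * 256 + 28, read off the movlw 12 / movlw 28 pair */

/* One extra slot so that the 1-based indices 1..BUFFER_SIZE fit (assumed layout). */
static uint8_t  buffer[BUFFER_SIZE + 1u];
static uint16_t next_in = 1u;  /* assumed starting index */

/* Host-side model of READUSART: store one received byte, then advance and wrap. */
void read_usart_model(uint8_t received)
{
    buffer[next_in] = received;           /* buffer(next_in) = HSerReceive2 */
    next_in = (uint16_t)(next_in + 1u);   /* next_in = ( next_in + 1 )       */
    if (next_in > BUFFER_SIZE) {          /* IF (NEXT_IN>BUFFER_SIZE) Then   */
        next_in = 1u;                     /* NEXT_IN=1                       */
    }
}
```

    Any DMA-based replacement would presumably have to reproduce this same behaviour, including the wrap back to index 1, which is why the comparison is worth having as a baseline.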

    By the way, I just ordered a PICKIT4 and a couple of 18F47Q43s from Microchip Direct, so when I get them I might be able to give you extra feedback on DMA testing! ;-)

     

    Last edit: ikonsgr74 2020-12-29
  • ikonsgr74

    ikonsgr74 - 2020-12-29

    And here is example code I found in the datasheet:
    This code example illustrates using DMA1 to transfer 10 bytes of data from 0x1000 in Flash memory to the UART transmit buffer.

    void initializeDMA(){
    //Select DMA1 by setting DMASELECT register to 0x00
     DMASELECT = 0x00;
    //DMAnCON1 - DPTR remains, Source Memory Region PFM, SPTR increments, SSTP
     DMAnCON1 = 0x0B;
    //Source registers
    //Source size
     DMAnSSZH = 0x00;
     DMAnSSZL = 0x0A;
    //Source start address, 0x1000
     DMAnSSAU = 0x00;
     DMAnSSAH = 0x10;
     DMAnSSAL = 0x00;
    //Destination registers
    //Destination size
     DMAnDSZH = 0x00;
     DMAnDSZL = 0x01;
    //Destination start address,
     DMAnDSA = &U1TXB;
    //Start trigger source U1TX. Refer the datasheet for the correct code
     DMAnSIRQ = 0xnn;
    //Change arbiter priority if needed and perform lock operation
     DMA1PR = 0x01; // Change the priority only if needed
     PRLOCK = 0x55; // This sequence
     PRLOCK = 0xAA; // is mandatory
     PRLOCKbits.PRLOCKED = 1; // for DMA operation
    //Enable the DMA & the trigger to start DMA transfer
     DMAnCON0 = 0xC0;
    }
    

    So, it seems that any routine implemented with DMA is practically just a series of DMA register writes! ;-)

     
  • Anobium

    Anobium - 2020-12-29

    This is pretty simple to write in Great Cow BASIC. It needs a few changes to support the word pointer addresses (by using the alias).

    #chip 18f26Q43
    
    
    initializeDMA
    'do stuff
    end
    
        sub initializeDMA
    
        'create an word alias to support the
         dim DMAnDSAWord as word alias DMAnDSAH, DMAnDSAL
    
        //Select DMA1 by setting DMASELECT register to 0x00
         DMASELECT = 0x00;
        //DMAnCON1 - DPTR remains, Source Memory Region PFM, SPTR increments, SSTP
         DMAnCON1 = 0x0B;
        //Source registers
        //Source size
         DMAnSSZH = 0x00;
         DMAnSSZL = 0x0A;
        //Source start address, 0x1000
         DMAnSSAU = 0x00;
         DMAnSSAH = 0x10;
         DMAnSSAL = 0x00;
        //Destination registers
        //Destination size
         DMAnDSZH = 0x00;
         DMAnDSZL = 0x01;
        //Destination start address,
    'change to word pointer - as this would have only pointed to the lower address byte of U1TXB
         DMAnDSAWord = @U1TXB;
        //Start trigger source U1TX. Refer the datasheet for the correct code
         DMAnSIRQ = 0xnn;
        //Change arbiter priority if needed and perform lock operation
         DMA1PR = 0x01; // Change the priority only if needed
         PRLOCK = 0x55; // This sequence
         PRLOCK = 0xAA; // is mandatory
    'remove  PRLOCKbits.
         PRLOCKED = 1; // for DMA operation
        //Enable the DMA & the trigger to start DMA transfer
         DMAnCON0 = 0xC0;
        end sub
    

    I used the latest release candidate (RC34) and PICInfo to work out that I needed to create the alias. The original pointer assignment is shown below.

    It would fail because & is invalid in Great Cow BASIC and, even with & changed to @, the assignment would move only the low address byte of U1TXB, because DMAnDSA is a byte (address 240/0x00F0).

    //Destination start address,
     DMAnDSA = &U1TXB;
    

    With & changed to @, it yields the following assembly:

    ;DMAnDSA = @U1TXB;
        movlw   low(U1TXB)
        movwf   DMANDSA,BANKED
    

    So, the changes:

    Create a word alias and then use a similar assignment.

    'create an word alias to support the
     dim DMAnDSAWord as word alias DMAnDSAH, DMAnDSAL
    

    creates the word at the correct address, as follows:

    ;Alias variables
    DMANDSAWORD EQU 240
    DMANDSAWORD_H   EQU 241
    

    And, the assignment.

    ;DMAnDSAWord = @U1TXB;
    

    This yields the following assembly, which shows the low and high address bytes being loaded into the correct DMA registers:

    ;Destination start address,
    ;change to word pointer - as this would have only pointed to the lower address byte of U1TXB
    ;DMAnDSAWord = @U1TXB;
        movlw   low(U1TXB)
        movwf   DMANDSAWORD,BANKED
        movlw   high(U1TXB)
        movwf   DMANDSAWORD_H,BANKED
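
    The difference between the byte-wide and the word-wide assignment can be illustrated with a short host-side C sketch. It only mimics the low()/high() split seen in the assembly above; EXAMPLE_ADDR is a hypothetical value chosen for illustration and is not the real address of U1TXB, which comes from the device files or PICInfo.

```c
#include <stdint.h>

/* Hypothetical 16-bit address, for illustration only (not the real U1TXB address). */
#define EXAMPLE_ADDR 0x02A1u

/* What a byte-wide destination keeps: only the low 8 bits survive. */
uint8_t store_as_byte(uint16_t addr)
{
    return (uint8_t)(addr & 0xFFu);
}

/* What the word alias achieves: low byte to the L register, high byte to the H register. */
void store_as_word(uint16_t addr, uint8_t *reg_l, uint8_t *reg_h)
{
    *reg_l = (uint8_t)(addr & 0xFFu);  /* corresponds to movlw low(addr)  */
    *reg_h = (uint8_t)(addr >> 8);     /* corresponds to movlw high(addr) */
}
```

    With the byte-wide store the high byte (0x02 in this made-up example) is silently lost, which is the failure the word alias avoids.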
    

    Enjoy. Hope this makes sense.

    Evan

     
  • Anobium

    Anobium - 2020-12-29

    PICInfo shows the addresses to cross-reference to the alias addresses.

     

    Last edit: Anobium 2020-12-29
  • ikonsgr74

    ikonsgr74 - 2020-12-30

    Thanks for the "insight" Evan!
    So it seems that modifying the various Cow Basic routines to use DMA (whenever the selected PIC supports it) would be rather easy and simple after all!
    Can you make a rough estimate of the performance increase when using DMA?
    For example, how much faster would an "on interrupt" HW UART byte read or a readtable byte read be on an 18FXXQ43, compared to the current routines used for the 18FXXQ10 (e.g. the ones I posted earlier)?

     

    Last edit: ikonsgr74 2020-12-30
    • Anobium

      Anobium - 2020-12-30

      I would have to test.

      Do you have any MPLAB-X code as a baseline?

       
    • Chris Roper

      Chris Roper - 2020-12-30

      In my experience using DMA (on a PIC32 device, not the 18F26Q43), trying to estimate a performance improvement is a moot point, as DMA is effectively hardware multitasking.
      On the PIC32, at least, when you executed the DMA transfer it was fire and forget, working in the background whilst the user program continued at full speed in the foreground. It was several years ago and my memory is not what it was, so I don't recall any of the C++ code that I used, but it was fast.

       
      • ikonsgr74

        ikonsgr74 - 2020-12-30

        Depending on the system arbitration used, this is true for the 18FXXQ43 family too. Quoted from the datasheet:
        Depending on the priority of the DMA with respect to CPU execution (Refer to section “Memory Access Scheme” in
        the “PIC18 CPU” chapter for more information), the DMA Controller can move data through two methods:
        • Stalling the CPU execution until it has completed its transfers (DMA has higher priority over the CPU in this
        mode of operation).
        • Utilizing unused CPU cycles for DMA transfers (CPU has higher priority over the DMA in this mode of
        operation). Unused CPU cycles are referred to as bubbles, which are instruction cycles available for use by the
        DMA to perform read and write operations. In this way, the effective bandwidth for handling data is increased; at
        the same time, DMA operations can proceed without causing a processor stall.

        If you use the second method, the DMA transfer is carried out with practically no speed penalty for the CPU.
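
        The trade-off between the two methods quoted above can be made concrete with a toy host-side model in C. It is a sketch under loud assumptions, not a cycle-accurate description of the Q43: every DMA access is taken to cost exactly one instruction cycle, and the CPU activity trace is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t elapsed;      /* cycles until the last DMA access completed          */
    size_t cpu_stalled;  /* cycles in which the CPU was held off by the DMA     */
    size_t remaining;    /* DMA accesses still pending when the trace ran out   */
} dma_cost;

/* cpu_busy[i] != 0: the CPU needs the bus in cycle i; 0 marks a "bubble".
   Toy assumption: each DMA read or write takes exactly one cycle. */
dma_cost simulate_dma(const uint8_t *cpu_busy, size_t n_cycles,
                      size_t dma_accesses, int dma_has_priority)
{
    dma_cost cost = {0, 0, dma_accesses};
    for (size_t i = 0; i < n_cycles && cost.remaining > 0; ++i) {
        cost.elapsed = i + 1;
        if (dma_has_priority) {
            cost.remaining--;    /* DMA takes the cycle...                  */
            cost.cpu_stalled++;  /* ...and the CPU is stalled for it        */
        } else if (!cpu_busy[i]) {
            cost.remaining--;    /* DMA slips into an unused CPU cycle      */
        }
    }
    return cost;
}
```

        With the made-up trace {1, 0, 1, 1, 0, 1, 0, 0} and three DMA accesses, the model finishes in 3 cycles with DMA priority (stalling the CPU for all 3) and in 7 cycles with CPU priority (no stalls): the transfer takes longer to complete but costs the foreground program nothing.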

         
  • ikonsgr74

    ikonsgr74 - 2020-12-30

    I have never developed any code using MPLAB, only Cow Basic :-)
    I do have MPLAB X IDE 5.20 installed and mainly use it for the MCC code configurator (to configure the various CLCs needed for my project), but I see that the new 18FXXQ43 parts are not supported there; maybe I need to install a newer version.

     

    Last edit: ikonsgr74 2020-12-30
  • ikonsgr74

    ikonsgr74 - 2021-05-10

    Any news about DMA support?
    I have received a couple of 18F47Q43s and have a PICkit4 programmer too, so I'm really looking forward to testing some... "DMA optimized" code! :-)

     
  • Anobium

    Anobium - 2021-05-10

    The post https://sourceforge.net/p/gcbasic/discussion/579125/thread/b0baec8294/#6acb shows the method. Unless someone writes a DMA editor (like PPSTool or the PICINFO tool), you will have to hack through the datasheet to set up the registers.

    And the Q43 is supported by PICKit2 & 3, using PICKitPlus. :-)

     
  • ikonsgr74

    ikonsgr74 - 2021-05-10

    Ok then, maybe you can write a "how to" code guide (based on the 18F4XQ43, as this is the 1st PIC family supporting DMA, and most probably everything that follows, like the Q83/Q84 and future PICs, will use the same methods too), with specific DMA examples like:
    - Read from the HW UART and place the byte in a single variable/array variable
    - Read a byte from a single variable/array variable/table and write it to the HW UART/PIC port
    Then I will try to incorporate this code into my GCB code and run tests to see whether it works properly and what impact it has on performance.

     

    Last edit: ikonsgr74 2021-05-10
  • ikonsgr74

    ikonsgr74 - 2021-05-10

    I was wondering, is there a way to access a table element directly, without using the "readtable" command? My project reads bytes from byte tables and places them on a PIC PORT or in a variable all the time, but to use DMA for that I need a way to read a specific element without calling readtable, as that command does the transfer directly, without using DMA.

     
    • William Roth

      William Roth - 2021-05-11

      To answer your question, Yes and No

      When a table is defined with TABLE, the compiler looks for a related Readtable command.
      If there is none, the table is never written to memory. So this is a "NO".

      However when a readtable is executed, even if only to initialize the table in memory then the table will then be written to program memory.

      But where in memory is the question.

      There is no way to tell the compiler where in memory to put the table;

      The compiler decides based upon how much memory the rest of the code uses. I cannot tell by looking at the ASM where in memory the table begins. Someone else might.

      However, if you know the data you are looking for, you can look at the hex file and see where the first byte of the table is located. But if your code changes, this memory address will change as well.

      But for the sake of argument, let's say the code never changes and the table never changes. You could then read the data directly via the TBLRD* instruction, as described in the chip's datasheet. See the section on the Nonvolatile Memory (NVM) module.

      Not worth the trouble IMO

       

      Last edit: William Roth 2021-05-11
  • William Roth

    William Roth - 2021-05-11

    My name was mentioned somewhere in regards to adding DMA support to GCB for chips that support it. To be clear, I have no plans now or in the future to do so.

    It would be a rather huge, time consuming effort that in the end would likely only be utilized by a handful of advanced users.

    I am not saying that it will not be done eventually, just that I will not be the one doing it.

    As far as an estimated time for adding DMA support, Anobium or Hugh can answer that better than I can. However, I would not think it would be any sooner than 6 months if not a year or more.

    Bill

     

    Last edit: William Roth 2021-05-11
    • Anobium

      Anobium - 2021-05-11

      My error in attributing some DMA work to you. I don't know what I was thinking.

       
  • mkstevo

    mkstevo - 2021-05-11

    Not fully understanding the concept, but...

    If the "Table" was written to storage area flash, could the location in the PIC be specified and therefore be a known value? The storage area flash looks to be limited to 128 words, which might restrain the size of any table.

     
  • Anobium

    Anobium - 2021-05-11

    We can look into this soon. It looks rather simple to use, but it would require a fundamental change to the serial write (in this example).

    But, there is nothing to stop you from using the code shown in the DMA posts (above) in the latest release candidate.

     
  • Anobium

    Anobium - 2021-05-11

    I read AppNote TB3164 today. This AppNote lays out the basics in a total vacuum of other practices used with an overall solution.

    Using DMA requires a full architectural approach and impact analysis. For example, moving data from a table to serial looks easy. But what is the data to be moved to the serial port, and in what format (byte or word data)? If it is byte data then it may work; if it is word data then the table data in Progmem would need to be formatted (laid out) so that the DMA can use it.

    Then, assuming the data is byte data, moving it out of the serial port would still happen one byte at a time. So, what is the time advantage of a RAM buffer read (loaded by the DMA activity) versus the existing table read? Is it really a huge benefit?
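
    The layout concern for word data can be sketched in host-side C. How the compiler actually stores word tables is not stated in this thread, so the byte order here (low byte first) is purely an assumption for illustration, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Flatten word values into the byte sequence a byte-wise mover would stream.
   Assumption for illustration: low byte first, then high byte. */
size_t pack_words_low_first(const uint16_t *words, size_t count, uint8_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < count; ++i) {
        out[n++] = (uint8_t)(words[i] & 0xFFu);  /* low byte  */
        out[n++] = (uint8_t)(words[i] >> 8);     /* high byte */
    }
    return n;  /* number of bytes written: 2 * count */
}
```

    A byte-wise transfer simply streams whatever order is stored, so if the receiver expects a different order the table would have to be laid out differently before DMA could be used on it.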

     