Function pointers and harvard architecture

Colonel-k
2010-04-26
2013-03-12
  • Colonel-k
    2010-04-26

    I have a general question about devices not blessed with the ability to directly load the PC (other than an absolute jump). Would it be possible to port SDCC or is any kind of C compiler out of the question?

    I thought a possible strategy might be: a) simply convert JUMPTABLEs back into a series of conditional branches, and b) automatically generate a special function which translates PCALLs into direct calls to the labels which are function entry points.

    Any thoughts? Is this entirely impractical?

    I am considering a port to the KCPSM3 (PicoBlaze) "soft-core" processor. It has a separate register set, (initialised) data RAM and program space. Program space is limited to 1k words, so there would be a reasonable upper limit to the size of any jump table.
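Strategy b) can be sketched in plain C. This is only an illustration of the idea, not SDCC output: the handle values, `led_on`, `led_off` and `pcall_dispatch` are hypothetical names, and the sketch assumes every function reachable through a "pointer" shares one signature.

```c
#include <stdint.h>

/* Hypothetical example functions; on a real port these would be ordinary
 * functions whose entry labels are known at compile time. */
static uint8_t led_on(uint8_t arg)  { return (uint8_t)(arg | 0x01u); }
static uint8_t led_off(uint8_t arg) { return (uint8_t)(arg & 0xFEu); }

/* A "function pointer" is represented as a small integer handle. */
enum fn_handle { FN_LED_ON = 0, FN_LED_OFF = 1 };

/* Generated dispatcher: every indirect call is routed through here and
 * resolved with compare-and-branch plus direct calls only, so the target
 * never needs to load the PC from a register. */
static uint8_t pcall_dispatch(uint8_t handle, uint8_t arg)
{
    if (handle == FN_LED_ON)  return led_on(arg);
    if (handle == FN_LED_OFF) return led_off(arg);
    return arg; /* unknown handle: returning arg unchanged is an assumption */
}
```

A call such as `pcall_dispatch(FN_LED_ON, 0x10)` then replaces `(*fp)(0x10)`. Functions with different signatures would presumably need one dispatcher per signature.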

     
  • Maarten Brock
    2010-04-27

    It should be possible. There is no need for the compiler to use a JUMPTABLE for a switch statement. Solution a) is already used when it's more efficient.

    To implement function pointers solution b) could work, but one can also choose not to implement PCALL at all. Function pointers are rare in small embedded applications IMHO. It is also possible to map a part of the RAMBLOCK as scratchpad memory and modify the contents to create an indirect jump.

    I've also thought about porting to the picoBlaze, but have not yet found the time and drive. Another problem is that there is no linker (with library support), and one also needs a simulator for the regression tests.

     
  • Colonel-k
    2010-04-27

    > I've also thought about porting to the picoBlaze, but did not yet find the time and drive. Another problem is that there is no linker (with library support) and one also needs a simulator for the regression tests.

    Great! Glad to hear I'm not the only one.

    I have no experience of writing linkers. Can we make do without one to begin with? (i.e. by #including the other source files where required.) I don't think the simulator will be a problem. I have actually just finished writing one, together with some tests to check the functionality.

    > … It is also possible to map a part of the RAMBLOCK as scratchpad memory and modify the contents to create an indirect jump.

    I wondered about that. Does it involve modifying the macro to support this behaviour or is there a version which already does this? (I guess the I/O bus could be mapped to a second port on the program memory without needing to modify anything.)

     
  • Maarten Brock
    2010-04-27

    Hi (do you have a real name?)

    I've already implemented this and posted to the picoBlaze forum.

    http://forums.xilinx.com/xlnx/board/crawl_message?board.id=PicoBlaze&message.id=634

    The instruction set is extended with HALT, and I have also experimented with indexed RAM access extensions, which can be useful for array access, struct access and stack access. The stack would be a software stack though.

    The pBlazAsm assembler by Mediatronix supports preinitialized scratchpad ram and can be used for CONSTs, STATICs and globals.

    Maarten

     
  • Colonel-k
    2010-04-27

    > The instruction set is extended with HALT and I also have experimented with indexed RAM access extensions which can be useful for array access, struct access and stack access. The stack would be a software stack though.

    That looks very useful. I'm not much of a VHDL-er though. How do you resolve the discrepancy between the program word width (18-bits) and the scratchpad RAM (8-bits)? Are some bits simply ignored?

    I think I'd be more interested in targeting an unmodified KCPSM, simply because it is well tested and I guess hand-placed to get the best out of the FPGA. Maybe there could be switches to tell the compiler whether the processor has any modifications to support PCALLs.

    Out of interest though, would a dedicated PCALL instruction not be a simpler way to add the functionality? Maybe the CALL instruction could be extended to take advantage of the unused bit 13, making up the target address out of the content of a register somehow? (PCALL targets would have to be .ORG'd on four-word boundaries, or be in a particular 256-word page, a la 6502 page 0 addressing, or the upper 2 bits of the address would simply be hard-coded into the PCALL.)
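The address schemes in the parenthesis above can be written out as arithmetic. A minimal sketch, assuming an 8-bit register value and the 10-bit KCPSM3 program counter; `target_aligned` and `target_paged` are hypothetical helper names, not part of any existing tool.

```c
#include <stdint.h>

#define PC_MASK 0x3FFu /* 1k words of program space: 10-bit address */

/* Targets on four-word boundaries: the 8-bit register value supplies
 * address bits 9..2, and bits 1..0 are always zero. */
static uint16_t target_aligned(uint8_t handle)
{
    return (uint16_t)(((uint16_t)handle << 2) & PC_MASK);
}

/* Targets inside one 256-word page: the register supplies bits 7..0 and
 * the upper two bits come from a fixed page number (0..3), for example
 * hard-coded into the PCALL instruction itself. */
static uint16_t target_paged(uint8_t page, uint8_t handle)
{
    return (uint16_t)(((uint16_t)(page & 0x3u) << 8) | handle);
}
```

The aligned variant reaches all of program memory but only 256 entry points spaced four words apart; the paged variant allows any entry point but only within a single 256-word page.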

    > The pBlazAsm assembler by Mediatronix supports preinitialized scratchpad ram and can be used for CONSTs, STATICs and globals.

    Yes, I've seen that and have been playing with it.

     
  • Maarten Brock
    2010-04-27

    > How do you resolve the discrepancy between the program word width (18-bits) and the scratchpad RAM (8-bits)?

    The second port of the BRAM is connected in 8-bit mode to the scratchpad memory bus, replacing the 64-byte distributed RAM, with the highest address bits fixed at "111". So the last 128 instruction addresses overlap the 256-byte scratchpad RAM (two bytes per instruction). The very last instruction is the interrupt vector, and therefore the last two bytes should not be used as RAM. The second-to-last instruction could be used for the PCALL by writing two bytes into the RAM.
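The overlap described above can be modelled in C to show which two scratchpad bytes would be written. This is a sketch under several assumptions that the post does not spell out: the low byte of an instruction sits at the even scratchpad address, the top two bits of the 18-bit word live in the BRAM parity bits and are preset by the initial memory image, and an unconditional CALL has all bits above the 10-bit address clear in its low 16 bits. The `scratchpad` array and both function names are hypothetical stand-ins.

```c
#include <stdint.h>

#define SCRATCHPAD_SIZE  256u
#define FIRST_OVERLAP_PC 0x380u /* 1024 - 128: first overlapped instruction */
#define PCALL_SLOT_PC    0x3FEu /* second-to-last instruction */

static uint8_t scratchpad[SCRATCHPAD_SIZE]; /* stand-in for the shared RAM */

/* Scratchpad offset of the low byte of the instruction at address pc
 * (two bytes per instruction, low byte first: an assumption). */
static uint8_t slot_offset(uint16_t pc)
{
    return (uint8_t)((pc - FIRST_OVERLAP_PC) * 2u);
}

/* Rewrite the address field of the CALL stored in the PCALL slot. */
static void patch_pcall(uint16_t target)
{
    uint8_t off = slot_offset(PCALL_SLOT_PC);
    scratchpad[off]     = (uint8_t)(target & 0xFFu);        /* address bits 7..0 */
    scratchpad[off + 1] = (uint8_t)((target >> 8) & 0x03u); /* address bits 9..8 */
}
```

An indirect call would then be `patch_pcall(target)` followed by a direct CALL to address 0x3FE, whose patched instruction calls the real target.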

    The current KCPSM3 would suffer severely in speed if a PCALL is added because of the 4 input limit on LUTs (IIRC: t_state, pc+1, jump/call, ret/reti). On the Spartan6, which has 6 input LUTs, this would be feasible I guess.

    You cannot use an unmodified KCPSM3 because the scratchpad ram cannot be initialized in it.

     
  • Colonel-k
    2010-04-27

    > You cannot use an unmodified KCPSM3 because the scratchpad ram cannot be initialized in it.

    Drat! Of course. But what if the startup code initialised the RAM? Rotten solution, I know, as it's two instructions per byte…
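To put a number on that cost, here is a small generator sketch that emits one LOAD/STORE pair per byte. `emit_ram_init` is a hypothetical helper, and the assembler text it prints assumes KCPSM3-style mnemonics with two-digit hex operands; other assemblers may want a different syntax.

```c
#include <stdio.h>
#include <stdint.h>

/* Print startup code that initialises scratchpad RAM from an image:
 * one LOAD (constant into s0) and one STORE (s0 to RAM) per byte.
 * Returns the number of instructions emitted. */
static unsigned emit_ram_init(FILE *out, const uint8_t *image, unsigned len)
{
    unsigned count = 0;
    for (unsigned addr = 0; addr < len; ++addr) {
        fprintf(out, "    LOAD  s0, %02X\n", (unsigned)image[addr]);
        fprintf(out, "    STORE s0, %02X\n", addr);
        count += 2;
    }
    return count;
}
```

Filling the whole 64-byte scratchpad this way takes 128 instructions, which is one eighth of the 1k-word program space.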

    > The very last instruction is the interrupt vector and therefore the last two bytes should not be used as ram. The second last instruction could be used for the PCALL by writing two bytes in the ram.

    I see. 30XXX == CALL

    > The current KCPSM3 would suffer severely in speed if a PCALL is added because of the 4 input limit on LUTs (IIRC: t_state, pc+1, jump/call, ret/reti). On the Spartan6 which has 6 input LUTs this would be feasible I guess.

    Pardon my ignorance, what's t_state? Isn't the clock enable used to gate the data into the PC register in hardware? I had also worked out four possible sources for the PC: PC+1, a jump/call target address, (SP) and (SP) + 1. (SP) is for RETURNI and (SP) + 1 is for RETURN, but I guess they could probably be rationalised.

     
  • Maarten Brock
    2010-04-27

    Using startup code to initialize ram really sucks. I really don't want to go there.

    t_state is the internal signal that toggles every clock tick and determines in which state the processor is. It is what makes all instructions take two clock ticks. Inverted it is called pc_enable. But there are a few other signals in play too, like instruction(13) and instruction(15). They really optimized that part making use of the holes (don't cares) in the instruction set. See kcpsm3.vhd.

    I like the idea of a jumptable in the zero page though to convert 8 bit "function handles" into 10 bit function pointers.

    But I think that function pointers should just be impossible for KCPSM3 at first. I would also start without a data stack and thus without reentrancy.

     
  • Colonel-k
    2010-04-27

    > I like the idea of a jumptable in the zero page though to convert 8 bit "function handles" into 10 bit function pointers.
    >
    > But I think that function pointers should just be impossible for KCPSM3 at first. I would also start without a data stack and thus without reentrancy.

    Are any of the existing SDCC targets like this (lacking a software stack and function pointers)? Just wondering where one would best start with the port.

     
  • Maarten Brock
    2010-04-27

    IIRC the pic14 has little or no data stack and uses a software stack. I'm not sure about the pic16. But also the mcs51 and hc08 don't use the stack by default since it is limited in both size and ways of access. Still the mcs51 is huge compared to the picoBlaze.

    I do not know of a target that has no indirect jump, since you can usually push the address bytes on the stack and perform a return. And even though the mcs51 has a JMP @DPTR, it is hardly used by SDCC as it needs DPTR for passing parameters.