#418 Reading ROMTABLEs with "dap info" seems to irrecoverably hang the DAP

0.10.0
new
nobody
None
2024-01-16
2024-01-10
Bill Paul
No

To debug most multi-core ARM SoCs, you generally need to know the base addresses of the CPUs on what I believe is the MEM-AP bus. Generally the easiest way to obtain these is to dump the ROM tables from the Arm debug interface and look at the component base address values. In OpenOCD, this can be done with the "dap info" command.

However, dumping the ROM tables always seems to lead to a "JTAG-DP STICKY ERROR" at some point, after which the debug interface is unusable until I hard reset the target board.
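
As a rough illustration of pulling the CPU base addresses out of such a dump, the sketch below scans "dap info" text for "Component base address" lines and pairs each one with the "Part is" line that follows it. The function names are hypothetical, the sample text is copied from the output later in this report, and the parsing assumes that exact layout; none of this is part of OpenOCD.

```python
import re

# Sample lines copied from the "dap info 1" output in this report.
SAMPLE = """\
        [L01] ROMTABLE[0x0] = 0x00001003
                Component base address 0x82150000
                Peripheral ID 0x04000bbc09
                Designer is 0x23b, ARM Ltd
                Part is 0xc09, Cortex-A9 Debug (Debug Unit)
                Component class is 0x9, CoreSight component
                Type is 0x15, Debug Logic, Processor
        [L01] ROMTABLE[0x4] = 0x00002003
                Component base address 0x82151000
                Peripheral ID 0x04000bb9a0
                Designer is 0x23b, ARM Ltd
                Part is 0x9a0, CoreSight PMU (Performance Monitoring Unit)
"""

BASE_RE = re.compile(r"Component base address (0x[0-9a-fA-F]+)")
PART_RE = re.compile(r"Part is 0x[0-9a-fA-F]+, (.*)")


def components(text):
    """Return (base_address, part_description) pairs in output order."""
    found = []
    base = None
    for line in text.splitlines():
        match = BASE_RE.search(line)
        if match:
            base = int(match.group(1), 16)
            continue
        match = PART_RE.search(line)
        if match and base is not None:
            found.append((base, match.group(1).strip()))
            base = None  # wait for the next component
    return found


def debug_unit_bases(text):
    """Hypothetical helper: keep only entries described as a Debug Unit."""
    return [base for base, desc in components(text) if "(Debug Unit)" in desc]


if __name__ == "__main__":
    for base in debug_unit_bases(SAMPLE):
        print(hex(base))  # prints 0x82150000
```

A component that could not be read has no "Part is" line, so it is simply dropped from the result.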

Here are the relevant details:

  • OpenOCD cloned from git as of yesterday

Open On-Chip Debugger 0.12.0+dev-01472-gadcc8ef87 (2024-01-08-12:50)

  • Host operating systems: FreeBSD 13.2-RELEASE and Ubuntu 20.04
  • Debugger: Olimex ARM-USB-OCD-H
  • Target hardware: NXP IMX8MQ-EVK board (4xCortex-A53 + 1xCortexM4) and NXP/Freescale IMX6Q SabreSD board (4xCortex-A9)

Basically, I do the following:

1) Connect the Olimex debugger to the host and target
2) Power up the target board, and stop it at the U-Boot prompt
3) Run OpenOCD to connect to the first Cortex-A core via the debugger
4) telnet to port 4444 to access the OpenOCD shell
5) Run "dap info 1"

This starts printing some of the ROM tables, but it never finishes successfully. I think the underlying reasons differ between the two boards, but in either case the symptom is the same: OpenOCD returns a JTAG-DP STICKY ERROR and is never able to recover. I can stop and restart OpenOCD, but then it reports that examination of the core failed.

From what I can tell, for each component discovered in the ROM table, OpenOCD runs rtp_read_cs_regs() to read the CoreSight registers so that it can identify it. I think the problem is that some components may not respond to these accesses because they may not be currently turned on. For example, with the SabreSD board, enumerating the Debug Unit and Performance Monitoring Unit of core 0 succeeds, but it fails for core 1. This is likely because U-Boot only launches core 0, and the remaining cores remain disabled until an OS starts them.
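
For context on what that identification step yields, here is a minimal sketch of how the "Peripheral ID" value printed by "dap info" breaks down into the "Designer" and "Part" fields. The bit layout is my understanding of the CoreSight PIDR registers and should be checked against the Arm specification; the function name is hypothetical and this is not OpenOCD's own code.

```python
def decode_peripheral_id(pid):
    """Split a combined Peripheral ID value into its main fields.

    Assumed layout (from the CoreSight PIDR0..PIDR4 registers):
      bits [11:0]  part number
      bits [18:12] JEP106 identity code
      bit  19      set when a JEP106 code is used
      bits [23:20] revision
      bits [35:32] JEP106 continuation code
    """
    part = pid & 0xFFF
    jep106_id = (pid >> 12) & 0x7F
    uses_jep106 = bool((pid >> 19) & 0x1)
    revision = (pid >> 20) & 0xF
    jep106_cont = (pid >> 32) & 0xF
    # The designer value shown by "dap info" combines continuation and identity.
    designer = (jep106_cont << 7) | jep106_id
    return {
        "part": part,
        "designer": designer,
        "uses_jep106": uses_jep106,
        "revision": revision,
    }


if __name__ == "__main__":
    # Values taken from the output in this report.
    for pid in (0x04000BBC09, 0x000008E88E):
        fields = decode_peripheral_id(pid)
        print(hex(pid), hex(fields["designer"]), hex(fields["part"]))
```

With the values from this report, 0x04000bbc09 gives designer 0x23b and part 0xc09, and 0x000008e88e gives designer 0x00e and part 0x88e, which matches what OpenOCD printed.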

The output looks like this:

> dap info 1
Timeout during WAIT recovery
JTAG-DP STICKY ERROR
JTAG-DP STICKY ERROR
AP # 0x1
                AP ID register 0x24770002
                Type is MEM-AP APB2 or APB3
MEM-AP BASE 0x82140000
                ROM table in legacy format
                Component base address 0x82140000
                Peripheral ID 0x0000080000
                Designer is 0x000, <invalid>
                Part is 0x000, Unrecognized 
                Component class is 0x1, ROM table
                MEMTYPE system memory not present: dedicated debug bus
        ROMTABLE[0x0] = 0x00001003
                Component base address 0x82141000
                Peripheral ID 0x04003bb907
                Designer is 0x23b, ARM Ltd
                Part is 0x907, CoreSight ETB (Trace Buffer)
                Component class is 0x9, CoreSight component
                Type is 0x21, Trace Sink, Buffer
        ROMTABLE[0x4] = 0x00002003
                Component base address 0x82142000
                Peripheral ID 0x04002bb906
                Designer is 0x23b, ARM Ltd
                Part is 0x906, CoreSight CTI (Cross Trigger)
                Component class is 0x9, CoreSight component
                Type is 0x14, Debug Control, Trigger Matrix
        ROMTABLE[0x8] = 0x00003003
                Component base address 0x82143000
                Peripheral ID 0x04004bb912
                Designer is 0x23b, ARM Ltd
                Part is 0x912, CoreSight TPIU (Trace Port Interface Unit)
                Component class is 0x9, CoreSight component
                Type is 0x11, Trace Sink, Port
        ROMTABLE[0xc] = 0x00004003
                Component base address 0x82144000
                Peripheral ID 0x04001bb908
                Designer is 0x23b, ARM Ltd
                Part is 0x908, CoreSight CSTF (Trace Funnel)
                Component class is 0x9, CoreSight component
                Type is 0x12, Trace Link, Funnel, router
        ROMTABLE[0x10] = 0x0000f003
                Component base address 0x8214f000
                Peripheral ID 0x04000bb4a9
                Designer is 0x23b, ARM Ltd
                Part is 0x4a9, Cortex-A9 ROM (ROM Table)
                Component class is 0x1, ROM table
                MEMTYPE system memory not present: dedicated debug bus
        [L01] ROMTABLE[0x0] = 0x00001003
                Component base address 0x82150000
                Peripheral ID 0x04000bbc09
                Designer is 0x23b, ARM Ltd
                Part is 0xc09, Cortex-A9 Debug (Debug Unit)
                Component class is 0x9, CoreSight component
                Type is 0x15, Debug Logic, Processor
        [L01] ROMTABLE[0x4] = 0x00002003
                Component base address 0x82151000
                Peripheral ID 0x04000bb9a0
                Designer is 0x23b, ARM Ltd
                Part is 0x9a0, CoreSight PMU (Performance Monitoring Unit)
                Component class is 0x9, CoreSight component
                Type is 0x16, Performance Monitor, Processor
        [L01] ROMTABLE[0x8] = 0x00003003
                Component base address 0x82152000
                Can't read component, the corresponding core might be turned off
JTAG-DP STICKY ERROR
Polling target imx6.cpu.0 failed, trying to reexamine
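
The relationship between each ROMTABLE[...] value and the "Component base address" printed under it can be sketched as follows. This assumes the legacy 32-bit ROM table entry format as I understand it (bit 0 marks the entry as present, bits [31:12] hold a signed offset from the table's own base); the function name is hypothetical.

```python
def decode_rom_entry(table_base, entry):
    """Return (present, component_base) for one 32-bit ROM table entry.

    Assumption: bits [31:12] are a two's-complement offset relative to the
    base address of the table that contains the entry, and bit 0 is the
    "present" flag. An entry of 0 marks the end of the table.
    """
    present = bool(entry & 0x1)
    offset = entry & 0xFFFFF000
    # Wrapping to 32 bits makes negative (two's-complement) offsets work.
    component_base = (table_base + offset) & 0xFFFFFFFF
    return present, component_base


if __name__ == "__main__":
    # Nested Cortex-A9 table at 0x8214f000, entry 0x00001003 from the log above.
    present, base = decode_rom_entry(0x8214F000, 0x00001003)
    print(present, hex(base))  # prints True 0x82150000
```

The entry that fails on the SabreSD board, 0x00003003 in the table at 0x8214f000, decodes to 0x82152000 without touching the component itself; the hang only happens when the component's own registers are read.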

For the IMX8 board, the situation is a little different: the ROM table entries for all 4 Cortex-A53 cores are displayed, but then it hangs later for some other device that I can't identify:

> dap info 1
Timeout during WAIT recovery
JTAG-DP STICKY ERROR
AP # 0x1
                AP ID register 0x44770002
                Type is MEM-AP APB2 or APB3
MEM-AP BASE 0x80000000
            ROM table in legacy format
            Component base address 0x80000000
            Peripheral ID 0x000008e88e
            Designer is 0x00e, Freescale (Motorola)
            Part is 0x88e, Unrecognized 
            Component class is 0x1, ROM table
            MEMTYPE system memory not present: dedicated debug bus
    ROMTABLE[0x0] = 0x00400003
            Component base address 0x80400000
            Peripheral ID 0x04004bb4a1
            Designer is 0x23b, ARM Ltd
            Part is 0x4a1, Cortex-A53 ROM (v8 Memory Map ROM Table)
            Component class is 0x1, ROM table
            MEMTYPE system memory not present: dedicated debug bus
    [L01] ROMTABLE[0x0] = 0x00010003
            Component base address 0x80410000
            Peripheral ID 0x04004bbd03
            Designer is 0x23b, ARM Ltd
            Part is 0xd03, Cortex-A53 Debug (Debug Unit)
            Component class is 0x9, CoreSight component
            Type is 0x15, Debug Logic, Processor
            Dev Arch is 0x47706a15, ARM Ltd "Processor debug architecture (v8.0-A)" rev.0
    [L01] ROMTABLE[0x4] = 0x00020003
            Component base address 0x80420000
            Peripheral ID 0x04004bb9a8
            Designer is 0x23b, ARM Ltd
            Part is 0x9a8, Cortex-A53 CTI (Cross Trigger)
            Component class is 0x9, CoreSight component
            Type is 0x14, Debug Control, Trigger Matrix
            Dev Arch is 0x47701a14, ARM Ltd "Cross Trigger Interface (CTI) architecture" rev.0
    [L01] ROMTABLE[0x8] = 0x00030003
            Component base address 0x80430000
            Peripheral ID 0x04004bb9d3
            Designer is 0x23b, ARM Ltd
            Part is 0x9d3, Cortex-A53 PMU (Performance Monitor Unit)
            Component class is 0x9, CoreSight component
            Type is 0x16, Performance Monitor, Processor
            Dev Arch is 0x47702a16, ARM Ltd "Processor Performance Monitor (PMU) architecture" rev.0
    [L01] ROMTABLE[0xc] = 0x00040003
            Component base address 0x80440000
            Peripheral ID 0x04004bb95d
            Designer is 0x23b, ARM Ltd
            Part is 0x95d, Cortex-A53 ETM (Embedded Trace)
            Component class is 0x9, CoreSight component
            Type is 0x13, Trace Source, Processor
            Dev Arch is 0x47704a13, ARM Ltd "Embedded Trace Macrocell (ETM) architecture" rev.0
    [L01] ROMTABLE[0x10] = 0x00110003
            Component base address 0x80510000
            Peripheral ID 0x04004bbd03
            Designer is 0x23b, ARM Ltd
            Part is 0xd03, Cortex-A53 Debug (Debug Unit)
            Component class is 0x9, CoreSight component
            Type is 0x15, Debug Logic, Processor
            Dev Arch is 0x47706a15, ARM Ltd "Processor debug architecture (v8.0-A)" rev.0
    [L01] ROMTABLE[0x14] = 0x00120003
            Component base address 0x80520000
            Peripheral ID 0x04004bb9a8
            Designer is 0x23b, ARM Ltd
            Part is 0x9a8, Cortex-A53 CTI (Cross Trigger)
            Component class is 0x9, CoreSight component
            Type is 0x14, Debug Control, Trigger Matrix
            Dev Arch is 0x47701a14, ARM Ltd "Cross Trigger Interface (CTI) architecture" rev.0
    [L01] ROMTABLE[0x18] = 0x00130003
            Component base address 0x80530000
            Peripheral ID 0x04004bb9d3
            Designer is 0x23b, ARM Ltd
            Part is 0x9d3, Cortex-A53 PMU (Performance Monitor Unit)
            Component class is 0x9, CoreSight component
            Type is 0x16, Performance Monitor, Processor
            Dev Arch is 0x47702a16, ARM Ltd "Processor Performance Monitor (PMU) architecture" rev.0
    [L01] ROMTABLE[0x1c] = 0x00140003
            Component base address 0x80540000
            Peripheral ID 0x04004bb95d
            Designer is 0x23b, ARM Ltd
            Part is 0x95d, Cortex-A53 ETM (Embedded Trace)
            Component class is 0x9, CoreSight component
            Type is 0x13, Trace Source, Processor
            Dev Arch is 0x47704a13, ARM Ltd "Embedded Trace Macrocell (ETM) architecture" rev.0
    [L01] ROMTABLE[0x20] = 0x00210003
            Component base address 0x80610000
            Peripheral ID 0x04004bbd03
            Designer is 0x23b, ARM Ltd
            Part is 0xd03, Cortex-A53 Debug (Debug Unit)
            Component class is 0x9, CoreSight component
            Type is 0x15, Debug Logic, Processor
            Dev Arch is 0x47706a15, ARM Ltd "Processor debug architecture (v8.0-A)" rev.0
    [L01] ROMTABLE[0x24] = 0x00220003
            Component base address 0x80620000
            Peripheral ID 0x04004bb9a8
            Designer is 0x23b, ARM Ltd
            Part is 0x9a8, Cortex-A53 CTI (Cross Trigger)
            Component class is 0x9, CoreSight component
            Type is 0x14, Debug Control, Trigger Matrix
            Dev Arch is 0x47701a14, ARM Ltd "Cross Trigger Interface (CTI) architecture" rev.0
    [L01] ROMTABLE[0x28] = 0x00230003
            Component base address 0x80630000
            Peripheral ID 0x04004bb9d3
            Designer is 0x23b, ARM Ltd
            Part is 0x9d3, Cortex-A53 PMU (Performance Monitor Unit)
            Component class is 0x9, CoreSight component
            Type is 0x16, Performance Monitor, Processor
            Dev Arch is 0x47702a16, ARM Ltd "Processor Performance Monitor (PMU) architecture" rev.0
    [L01] ROMTABLE[0x2c] = 0x00240003
            Component base address 0x80640000
            Peripheral ID 0x04004bb95d
            Designer is 0x23b, ARM Ltd
            Part is 0x95d, Cortex-A53 ETM (Embedded Trace)
            Component class is 0x9, CoreSight component
            Type is 0x13, Trace Source, Processor
            Dev Arch is 0x47704a13, ARM Ltd "Embedded Trace Macrocell (ETM) architecture" rev.0
    [L01] ROMTABLE[0x30] = 0x00310003
            Component base address 0x80710000
            Peripheral ID 0x04004bbd03
            Designer is 0x23b, ARM Ltd
            Part is 0xd03, Cortex-A53 Debug (Debug Unit)
            Component class is 0x9, CoreSight component
            Type is 0x15, Debug Logic, Processor
            Dev Arch is 0x47706a15, ARM Ltd "Processor debug architecture (v8.0-A)" rev.0
    [L01] ROMTABLE[0x34] = 0x00320003
            Component base address 0x80720000
            Peripheral ID 0x04004bb9a8
            Designer is 0x23b, ARM Ltd
            Part is 0x9a8, Cortex-A53 CTI (Cross Trigger)
            Component class is 0x9, CoreSight component
            Type is 0x14, Debug Control, Trigger Matrix
            Dev Arch is 0x47701a14, ARM Ltd "Cross Trigger Interface (CTI) architecture" rev.0
    [L01] ROMTABLE[0x38] = 0x00330003
            Component base address 0x80730000
            Peripheral ID 0x04004bb9d3
            Designer is 0x23b, ARM Ltd
            Part is 0x9d3, Cortex-A53 PMU (Performance Monitor Unit)
            Component class is 0x9, CoreSight component
            Type is 0x16, Performance Monitor, Processor
            Dev Arch is 0x47702a16, ARM Ltd "Processor Performance Monitor (PMU) architecture" rev.0
    [L01] ROMTABLE[0x3c] = 0x00340003
            Component base address 0x80740000
            Peripheral ID 0x04004bb95d
            Designer is 0x23b, ARM Ltd
            Part is 0x95d, Cortex-A53 ETM (Embedded Trace)
            Component class is 0x9, CoreSight component
            Type is 0x13, Trace Source, Processor
            Dev Arch is 0x47704a13, ARM Ltd "Embedded Trace Macrocell (ETM) architecture" rev.0
    [L01] ROMTABLE[0x40] = 0x00000000
    [L01]   End of ROM table
    ROMTABLE[0x4] = 0x00800003
            Component base address 0x80800000
            Can't read component, the corresponding core might be turned off
>

Again, at this point the debug interface is wedged and I have to reset the board to recover.

Now, I guess I can understand that accessing the CoreSight registers for a component that's not turned on might not work, but I don't think that should result in OpenOCD leaving the entire debug interface in an unusable state. Note that my typical use case is bring-up, meaning that I may not have an OS running on the target yet, but I still need to be able to set up the debugger configuration correctly before I can get to bootstrapping, and for that I need to be able to read the ROM tables correctly. Hence there is a bit of a "chicken and the egg" problem.

From my view, there are two possible issues:

1) Somehow the code needs to detect if a component is switched off in a way that doesn't require probing the CoreSight registers, and then just skip it and move to the next ROM table entry.
2) The error recovery code needs to be more robust. As it is now, it doesn't seem to actually reset the interface correctly.

I hope somebody has an idea how to fix this.

-Bill

Discussion

  • Antonio Borneo

    Antonio Borneo - 2024-01-11

    I also have a few SoCs that present the same issue.
    If one CoreSight device is not clocked, the debug bus (between the DAP AP and CoreSight) hangs, waiting for the transfer to complete.
    It is possible for OpenOCD to send an ABORT request (it works in JTAG, apparently it does not work well in SWD) so that the DAP stops waiting for the hung AP bus and we can send new commands to the DAP, but the bus above is locked forever, and the SoC must be reset or powered down to recover the bus.
    The only workaround I have found is to enable the missing clocks at OpenOCD connect, directly in OpenOCD script.

    I also have another SoC that recovers after a similar bus hang if, in JTAG mode, I send an ABORT. But I have to send it manually. It would be great to do this automatically in OpenOCD. Also, ABORT in SWD seems not to work well; this needs to be investigated.
    Quitting OpenOCD and reconnecting recovers the SoC, because an ABORT is sent at OpenOCD connect. This recovers SWD too.
    I did this a long time ago, and I don't remember the details.

    In your case, does disconnecting OpenOCD and re-connecting it "without reset" and, of course, "without power cycle" allow you to regain access to the SoC?

     
  • Bill Paul

    Bill Paul - 2024-01-11

    The only workaround I have found is to enable the missing clocks at OpenOCD connect, directly in OpenOCD script.

    I'm going to guess that your script pokes at SoC-specific registers and is therefore not a general-purpose solution.

    It is possible for OpenOCD to send an ABORT request[...]

    Can you please tell me how to do this? Which DAP are you sending the abort to?

    In your case, does disconnecting OpenOCD and re-connecting it "without reset" and, of course, "without power cycle" allow you to regain access to the SoC?

    Er... well, no. :(

    I've been trying to learn how to set up my own config files for new targets so as an experiment I created a simple configuration for the IMX8MQ-EVK board. Normally, running it when the target is sitting at the U-Boot prompt, I see this:

    Open On-Chip Debugger 0.12.0+dev-01472-gadcc8ef87 (2024-01-08-12:50)
    Licensed under GNU GPL v2
    For bug reports, read
            http://openocd.org/doc/doxygen/bugs.html
    Info : Hardware thread awareness created
    force hard breakpoints
    Info : clock speed 1000 kHz
    Error: interface can't drive 'nSRST' high
    Info : JTAG tap: chip.arm tap/device found: 0x5ba00477 (mfg: 0x23b (ARM Ltd), part: 0xba00, ver: 0x5)
    Info : a53.0: hardware has 6 breakpoints, 4 watchpoints
    Info : [a53.0] Examination succeed
    Info : [m4.0] Cortex-M4 r0p1 processor detected
    Info : [m4.0] target has 6 breakpoints, 4 watchpoints
    Info : [m4.0] Examination succeed
    Info : [ahb] Examination succeed
    Info : starting gdb server for a53.0 on 3333
    Info : Listening on port 3333 for gdb connections
    Info : starting gdb server for m4.0 on 3334
    Info : Listening on port 3334 for gdb connections
    Info : gdb port disabled
    Info : Putting Cortex-M4 into a spin loop
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : [m4.0] external reset detected
    Info : accepting 'telnet' connection on tcp/4444
    

    If I restart OpenOCD after "dap info 1" causes a lockup, now I get this:

    Open On-Chip Debugger 0.12.0+dev-01472-gadcc8ef87 (2024-01-08-12:50)
    Licensed under GNU GPL v2
    For bug reports, read
            http://openocd.org/doc/doxygen/bugs.html
    Info : Hardware thread awareness created
    force hard breakpoints
    Info : clock speed 1000 kHz
    Error: interface can't drive 'nSRST' high
    Info : JTAG tap: chip.arm tap/device found: 0x5ba00477 (mfg: 0x23b (ARM Ltd), part: 0xba00, ver: 0x5)
    Error: JTAG-DP STICKY ERROR
    Error: [a53.0] Examination failed
    Warn : target a53.0 examination failed
    Info : [m4.0] Cortex-M4 r0p1 processor detected
    Info : [m4.0] target has 6 breakpoints, 4 watchpoints
    Info : [m4.0] Examination succeed
    Info : [ahb] Examination succeed
    Info : starting gdb server for a53.0 on 3333
    Info : Listening on port 3333 for gdb connections
    Info : starting gdb server for m4.0 on 3334
    Info : Listening on port 3334 for gdb connections
    Info : gdb port disabled
    Info : Putting Cortex-M4 into a spin loop
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : [m4.0] external reset detected
    

    Here, examination of the A53 core has failed.

    It seems I am still able to access AP #0 (the MEM-AP access port) and AP #4 (the Cortex-M4 access port). However the CoreSight components on AP #1 seem to still be in an invalid state. In addition to failing to examine Cortex-A53 core 0, running "dap info 1" to examine the ROM tables again yields this:

    > dap info 1
    JTAG-DP STICKY ERROR
    AP # 0x1
                    AP ID register 0x44770002
                    Type is MEM-AP APB2 or APB3
    MEM-AP BASE 0x80000000
                    ROM table in legacy format
                    Component base address 0x80000000
                    Can't read component, the corresponding core might be turned off
    >
    

    So it seems that the TAP and DAP are still working, and AP #0, #2, #3 and #4 are still working ok, but AP #1 still seems to have problems. It seems like it is the CoreSight logic in particular that is stuck. So far the only way I've found to clear it is a hard reset or power-cycle of the SoC.
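
To make clearer which kind of access port AP #1 is, the sketch below decodes the "AP ID register" value from the output above. The field layout is my understanding of the ADIv5 AP IDR register, the type table lists only a few values I am fairly sure of, and the function name is hypothetical.

```python
# Partial, assumed mapping of MEM-AP type codes; not an exhaustive list.
MEM_AP_TYPES = {
    0x1: "AHB3",
    0x2: "APB2 or APB3",
    0x4: "AXI3 or AXI4",
}


def decode_ap_idr(idr):
    """Split an AP IDR value into fields.

    Assumed layout:
      bits [31:28] revision
      bits [27:17] designer (JEP106 continuation + identity code)
      bits [16:13] class (0x8 for a MEM-AP)
      bits [7:4]   variant
      bits [3:0]   type
    """
    ap_class = (idr >> 13) & 0xF
    ap_type = idr & 0xF
    return {
        "revision": (idr >> 28) & 0xF,
        "designer": (idr >> 17) & 0x7FF,
        "class": ap_class,
        "variant": (idr >> 4) & 0xF,
        "type": ap_type,
        "is_mem_ap": ap_class == 0x8,
        "type_name": MEM_AP_TYPES.get(ap_type, "unknown"),
    }


if __name__ == "__main__":
    fields = decode_ap_idr(0x44770002)  # AP #1 on the IMX8MQ
    print(fields["is_mem_ap"], fields["type_name"])  # prints True APB2 or APB3
```

Under these assumptions AP #1 is an APB MEM-AP designed by ARM (designer 0x23b), consistent with the "Type is MEM-AP APB2 or APB3" line printed by OpenOCD.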

    -Bill

     
  • Alex

    Alex - 2024-01-11

    Hello,

    I just want to chime in and say that I noticed the exact same thing today, so the timing of this bug report is quite fortuitous (I'm used to finding bug reports that are 5+ years old and unresolved haha).

    My setup is a little different. Relevant details:

    SW Version: xPack Open On-Chip Debugger 0.12.0+dev-01312-g18281b0c4-dirty (2023-09-04-22:32)
    Host operating systems: Windows 10 64-bit
    Debugger: TI XDS-110
    Target hardware: NXP IMX8MMINI-EVK board (4xCortex-A53 + 1xCortexM4)

    I'm still trying to get this eval kit totally up and running, but I've reached the point where I can see the DAP, create the dap via "dap create...", and then look at the info available. I was looking at the different APs by running $dap_name apsel [num] and then $dap_name info, but I've noticed the same behavior. I can get AP info for AP 0, 2 and 3 (still working on 4, but I think that's a different issue), but once I attempt to get AP info for AP 1, I can't go back and get the info for AP 0 or any others; it just says "AP ID register 0x00000000 No AP found at this AP#0x0". The only way I've found to fix this issue is by power-cycling the target board and reconnecting OpenOCD.

    Antonio, if you have a script that pokes at specific registers and want to share it, I'm sure that's as good a place as any to start figuring out what Bill and I need to do to get ours working. Hopefully eventually it can turn into a more general solution.

     
  • Bill Paul

    Bill Paul - 2024-01-13

    The bus is connected to an AP, so the whole AP becomes inaccessible. But the DAP itself has never hung in my tests, as far as I remember.

    Yes, I apologize: it occurred to me after I filed the bug that I was probably wrong when I wrote DAP in the subject line. The distinction between TAP, DAP and AP is a little tricky to grasp. You are correct that the DAP itself is not what hangs. I should have said that it's the APB AP that gets stuck instead.

    Unfortunately for my failing cases, I am using JTAG transport rather than SWD, so I'm not sure how to proceed with those.

    BTW, I also tested a Texas Instruments SK-AM62 board (ti_am625evm.cfg) and with that one doing "dap info 1" prints all the ROM table entries and doesn't get stuck.

    -Bill

     
  • Alex

    Alex - 2024-01-16

    I believe sending the ABORT command via JTAG is actually pretty simple. I've been looking at the SoC-400 TRM, and according to the table on this page https://developer.arm.com/documentation/ddi0480/g/Programmers-Model/DAP-register-summary/Debug-port-register-summary?lang=en ABORT should be as simple as sending a new IR instruction value. Side note: I'm not entirely sure the SoC-400 TRM applies to the i.MX 8M processors, but it hasn't let me down yet.
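
A minimal sketch of what such a JTAG ABORT request might look like at the bit level, assuming the ADIv5 JTAG-DP encoding as I understand it: a 4-bit IR value of 0b1000 selects the ABORT register, followed by a 35-bit DR scan with RnW in bit 0, A[3:2] in bits [2:1] and the data word in bits [34:3], where data bit 0 is DAPABORT. The names below are hypothetical, and the values should be verified against the TRM linked above before use.

```python
# Assumed JTAG-DP constants (check against the ADIv5 specification).
JTAG_DP_ABORT_IR = 0b1000  # 4-bit IR instruction selecting the ABORT register
DAPABORT = 1 << 0          # ABORT register bit 0: abort the current AP transfer


def dp_scan_value(data, addr, read):
    """Pack one 35-bit DR scan value.

    bit 0       RnW (1 = read, 0 = write)
    bits [2:1]  A[3:2] of the register address
    bits [34:3] 32-bit data word
    """
    rnw = 1 if read else 0
    a32 = (addr >> 2) & 0x3
    return ((data & 0xFFFFFFFF) << 3) | (a32 << 1) | rnw


if __name__ == "__main__":
    abort_scan = dp_scan_value(DAPABORT, addr=0x0, read=False)
    print(bin(JTAG_DP_ABORT_IR), hex(abort_scan))  # prints 0b1000 0x8
```

So under these assumptions the request is two steps rather than one: load the ABORT instruction into IR, then shift the 35-bit value 0x8 through DR. As Antonio notes, this only makes the DAP stop waiting; it does not necessarily unlock the bus behind the AP.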

    Antonio, thank you for the info! I wonder if our AP is also crashing because there is no clock going to that module. Do we know what module AP 1 is trying to communicate with? Is it the A53 cores? The M4 core? Something else?

     
