
#131 [patch] add -t command line option to specify i2c communication timeout

none
New
nobody
None
Medium
Patch
2023-12-21
2021-04-28
No

Hello,
This patch adds a -t timeout command-line option to set the I2C timeout.
This makes erasing and flashing work with the i2c designware IP + stm32f723 combo.
Without this patch, I was able to dump flash content but not erase/write.
I was getting i2c designware driver timeouts, due to long clock stretches (seen with logic analyzer).
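For context, on Linux the adapter timeout of an open /dev/i2c-N descriptor can be changed with the I2C_TIMEOUT ioctl from <linux/i2c-dev.h>, whose argument is expressed in units of 10 ms. The following is only a minimal sketch of that call, not the attached patch; the helper name i2c_set_timeout_units is hypothetical, and whether a given bus driver honours the value is an assumption.

```c
#include <linux/i2c-dev.h>  /* I2C_TIMEOUT */
#include <sys/ioctl.h>

/*
 * Hypothetical helper, not taken from the patch: set the adapter timeout
 * on an already opened /dev/i2c-N descriptor.
 * The I2C_TIMEOUT argument is expressed in units of 10 ms.
 * Returns the ioctl() result: -1 on failure, with errno set.
 */
int i2c_set_timeout_units(int fd, unsigned long units_of_10ms)
{
    return ioctl(fd, I2C_TIMEOUT, units_of_10ms);
}
```

Under this sketch, i2c_set_timeout_units(fd, 800) would request an 8 second timeout.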

1 Attachments

Discussion

  • Yann Sionneau

    Yann Sionneau - 2021-04-29

    A new version of the patch, with a helpful error message.

     
  • Tormod Volden

    Tormod Volden - 2021-08-18

    In the commit message you say that the default timeout is 10 ms, but I2C.txt says it can be milliseconds to seconds depending on HW and drivers. Can you clarify?

    Should we rather take a ms argument instead of 10ms multiples? I understand the ioctl uses the latter but e.g. your error message lists it in ms. I was thinking if the same -t option could apply for serial as well, there is a termios timeout there, currently hard-coded to 500 ms. That can be added later of course, but it would be good to use ms then.
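If one -t value in milliseconds were shared with the serial port, it would have to be mapped onto the termios read timeout, where VTIME counts tenths of a second. A rough sketch of that mapping follows; the helper name serial_set_read_timeout is hypothetical, and setting VMIN to 0 is an assumption of this sketch rather than something stated in the thread.

```c
#include <termios.h>

/*
 * Hypothetical helper: express a millisecond timeout as a termios read
 * timeout. VTIME counts tenths of a second, so round up and clamp to 255,
 * the largest value this sketch will store in c_cc[VTIME].
 */
void serial_set_read_timeout(struct termios *tio, unsigned int timeout_ms)
{
    unsigned int tenths = timeout_ms / 100u + (timeout_ms % 100u != 0u);

    if (tenths > 255u)
        tenths = 255u;
    tio->c_cc[VMIN] = 0;               /* return on timeout even with no data */
    tio->c_cc[VTIME] = (cc_t)tenths;   /* e.g. 500 ms becomes 5 */
}
```

The caller would still need to apply the modified settings with tcsetattr().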

     
  • Tormod Volden

    Tormod Volden - 2021-08-21

    Do I understand correctly that these timeouts don't happen in the case we are using the non-stretching commands (which are selected if the device supports both kinds)? So the message hinting the user to try setting a timeout should only be displayed in the stretching case? Can it be added to stm32_warn_stretching() instead?

     
  • Tormod Volden

    Tormod Volden - 2021-08-21

    Hmm, no, your example in the commit message is a no-stretch command causing the clock stretch, and I understand the stm32_warn_stretching() warns about another possible failure mode.

     
  • Tormod Volden

    Tormod Volden - 2021-08-21

    Well, the timeout discussion in I2C.c is related to your patch here and would need updating, whereas "controller not accepting clock stretching" sounds like something else than "controller times out if clock stretching is too long".

     
  • Tormod Volden

    Tormod Volden - 2021-08-23

    What are the return value and errno from read() in i2c_read() when the driver times out? The problem is that i2c_read() doesn't return PORT_ERR_TIMEDOUT like serial_posix_read() would do.
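One way to picture the missing piece is a small classifier for the result of read() on the I2C device node. This is only a sketch: the enum and function names are hypothetical (modelled on the PORT_ERR_* naming), and it assumes the kernel driver reports a bus timeout as ETIMEDOUT, which the thread has not confirmed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical error codes for this sketch, modelled on the PORT_ERR_* naming. */
enum sketch_err { SKETCH_OK = 0, SKETCH_ERR_TIMEDOUT, SKETCH_ERR_UNKNOWN };

/*
 * Classify the result of a read() on the I2C device node.
 * Assumption: the kernel driver surfaces a bus timeout as ETIMEDOUT;
 * this has not been confirmed in the thread.
 */
enum sketch_err classify_i2c_read(ssize_t ret, int saved_errno, size_t wanted)
{
    if (ret < 0)
        return saved_errno == ETIMEDOUT ? SKETCH_ERR_TIMEDOUT : SKETCH_ERR_UNKNOWN;
    if ((size_t)ret != wanted)
        return SKETCH_ERR_UNKNOWN;   /* short read */
    return SKETCH_OK;
}
```

The caller would pass the value of errno captured immediately after the failing read().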

     
  • Tormod Volden

    Tormod Volden - 2021-08-23

    Hmm, I2C.txt says "In this case the I2C controller will timeout and report error to stm32flash.
    There is no possibility for stm32flash to retry, so it can only signal the error and exit.".

    Does this mean the port must be reset somehow and that just reading again (like the serial code would do) does not work?

     
  • Tormod Volden

    Tormod Volden - 2021-08-28

    Here is a rebased version of Yann's patch, on top of current master. Any run-time suggestions about using the new option can probably be added to stm32_warn_stretching().

    I still don't understand why you get timeout with STM32F723 because this chip has the non-stretching commands and therefore long clock stretching shouldn't happen. Where exactly in the communication does it appear?

     

    Last edit: Tormod Volden 2021-08-28
    • Yann Sionneau

      Yann Sionneau - 2021-12-20

      Hello Tormod,

      About your v2, I prefer my version because I make it explicit about the "10 ms steps".
      For instance with the v2 patch if I do "-t 9" I get 0 timeout.
      If I do "-t 19" I get 10 ms.
      Or maybe round it up to the next multiple of 10 ms?

       
      • Tormod Volden

        Tormod Volden - 2021-12-20

        Yes, I agree, rounding up makes sense.
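The difference between the two conversions discussed above can be sketched as follows. The function names are hypothetical and neither is taken from either patch version; the unit is the 10 ms step mentioned in the thread.

```c
/* Truncating conversion: "-t 9" gives 0 units, "-t 19" gives 1 unit (10 ms). */
unsigned int ms_to_units_truncate(unsigned int ms)
{
    return ms / 10u;
}

/* Rounding up: any non-zero remainder costs one extra 10 ms unit. */
unsigned int ms_to_units_round_up(unsigned int ms)
{
    return ms / 10u + (ms % 10u != 0u);
}
```

With rounding up, "-t 9" would yield one unit (10 ms) and "-t 19" two units (20 ms), so a non-zero request never collapses to a zero timeout.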

         
  • Yann Sionneau

    Yann Sionneau - 2021-09-22

    Hello,
    I would need to find time to have a look again at what happens exactly. I will do that in the next ~30 days like for ticket 98.
    In my case when the master times out I need to reboot the board. After the timeout, the i2c Linux bus (/dev/i2c-XX) is un-usable/dead.

     
  • Tormod Volden

    Tormod Volden - 2021-09-22

    OK. From reading up on this, I understand that the STM32 bootloader needs reboot if the communication is stopped by the controller due to clock-stretching timeout. But the mystery is why is it clock-stretching in your case, when non-stretching commands should be used?

     
  • Yann Sionneau

    Yann Sionneau - 2021-10-13

    It is a bit of a mystery to me also.
    I can confirm that without the "-t timeout" patch, I get this behaviour:

    stm32flash -w flashdump -v -g 0x0 -a 0x4d /dev/i2c-1

    stm32flash 0.6

    http://stm32flash.sourceforge.net/

    Using Parser : Raw BINARY
    Size : 524288
    Warning: Not a tty: /dev/i2c-1
    Error probing interface "serial_posix"
    Interface i2c: addr 0x4d
    Version : 0x12
    Device ID : 0x0452 (STM32F72xxx/73xxx)

    - RAM : Up to 256KiB (16384b reserved by bootloader)
    - Flash : Up to 512KiB (size first sector: 1x16384)
    - Option RAM : 32b
    - System RAM : 59KiB
    Write to memory
    Erasing memory
    [ 296.082595] i2c_designware 20191000.i2c: controller timed out
    [ 297.090858] i2c_designware 20191000.i2c: timeout in disabling adapter
    Failed to read ACK byte
    Mass erase failed. Try specifying the number of pages to be erased.
    Failed to erase memory

    With the -t timeout patch I get:

    stm32flash -w flashdump -v -g 0x0 -a 0x4d /dev/i2c-1 -t 8000

    stm32flash 0.6

    http://stm32flash.sourceforge.net/

    Using Parser : Raw BINARY
    Size : 524288
    Warning: Not a tty: /dev/i2c-1
    Error probing interface "serial_posix"
    Interface i2c: addr 0x4d
    Version : 0x12
    Device ID : 0x0452 (STM32F72xxx/73xxx)

    - RAM : Up to 256KiB (16384b reserved by bootloader)
    - Flash : Up to 512KiB (size first sector: 1x16384)
    - Option RAM : 32b
    - System RAM : 59KiB
    Write to memory
    Erasing memory
    Wrote and verified address 0x08080000 (100.00%) Done.

    Starting execution at address 0x08000000... done.

    Also, please see attached screenshots of logic analyzer dump of the I2C lines:

    big_stretch.jpg shows the "unzoomed" situation, with the big 7+ seconds stretch.
    write_stretched.jpg shows the zoomed-in situation, right before the stretch, when the mass erase command is sent.

    we can see that a "non-stretched global mass erase" is sent:
    Write : 0x45 0xBA
    Read : 0x79 (y == ACK)
    Write : 0xFF 0xFF 0x00 (special mass erase + checksum)
    Read : the address of the i2c slave is not ACKnowledged until 7 seconds later; it is being clock stretched.
    Then after the ACK of the address is sent, the 0x79 (y == ACK) is read quickly.
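The bytes in the trace above follow two simple rules: a command byte is sent together with its bitwise complement (0x45 then 0xBA), and a payload is followed by the XOR of its bytes (0xFF 0xFF then 0x00). A small sketch of both rules follows; the function names are hypothetical and this is not stm32flash code.

```c
#include <stddef.h>
#include <stdint.h>

/* A command is sent as the command byte followed by its bitwise complement. */
void bl_command_frame(uint8_t cmd, uint8_t out[2])
{
    out[0] = cmd;
    out[1] = (uint8_t)~cmd;
}

/* The checksum appended to a payload is the XOR of all its bytes. */
uint8_t bl_xor_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum ^= data[i];
    return sum;
}
```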

     

    Last edit: Yann Sionneau 2021-10-13
  • Tormod Volden

    Tormod Volden - 2021-10-13

    It shouldn't stretch but respond with 0x76 (BUSY). Strange. Also, the bootloader protocol version is 1.2 (0x12). Table 3 in AN4221 (rev 10, June 2021) only lists V1.0 and V1.1.
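The expected non-stretching behaviour can be pictured as a polling loop on the host: read one reply byte, keep polling while it is 0x76 (BUSY), and stop on 0x79 (ACK). The sketch below is an illustration under assumptions, not stm32flash code: the callback type, the function names, and the scripted reader used to exercise it without hardware are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BL_ACK  0x79u
#define BL_BUSY 0x76u

/* Callback that reads one reply byte; returns 0 on success, -1 on I/O error. */
typedef int (*read_byte_fn)(void *ctx, uint8_t *out);

/*
 * Poll for the final ACK, tolerating up to max_busy BUSY replies.
 * Returns 0 on ACK, -1 on I/O error, unexpected byte, or too many BUSY.
 */
int wait_for_ack(read_byte_fn read_byte, void *ctx, unsigned int max_busy)
{
    unsigned int busy = 0;
    uint8_t reply;

    for (;;) {
        if (read_byte(ctx, &reply) != 0)
            return -1;
        if (reply == BL_ACK)
            return 0;
        if (reply != BL_BUSY || busy++ >= max_busy)
            return -1;
        /* A real host would pause here before polling again. */
    }
}

/* Scripted reader used only to exercise the sketch without hardware. */
struct script { const uint8_t *bytes; size_t len, pos; };

int script_read(void *ctx, uint8_t *out)
{
    struct script *s = ctx;

    if (s->pos >= s->len)
        return -1;
    *out = s->bytes[s->pos++];
    return 0;
}
```

Passing the reader as a callback keeps the loop independent of the transport, so the same logic could be tried against a recorded byte sequence before touching a real bus.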

     
  • Tormod Volden

    Tormod Volden - 2021-10-13

    What is the non-decoded traffic on the bus in write_stretched.jpg, before 45 BA, the R?, and before and after FF FF 00? The last part before the clock is held down, is it the bootloader starting to reply with something?

     
  • Yann Sionneau

    Yann Sionneau - 2021-10-21

    Here are zoomed-in pictures from the analyzer.
    I honestly don't know what the bootloader is doing since I don't have its source code :/
    I also don't have the spec of v1.2; like you, I can only see that the spec covers 1.0 and 1.1 :/
    If you have friends at STmicro, maybe it's the time to ask them :p

     

    Last edit: Yann Sionneau 2021-10-21
  • Tormod Volden

    Tormod Volden - 2021-10-21

    It actually looks the same as the start of the previous packets, that is, the 0x4D slave address written by the master. So if I understand correctly it is the master writing the address and a "read" bit (high) and then the slave immediately takes down the clock. AN2606 confirms that the 7-bit slave address on I2C2 should be 0x4D. So it is exactly what you would expect if stretching commands were in use. Maybe there is an erratum for your chips? I think the customer must ask ST :) (I don't have any connections there).

     
  • Tormod Volden

    Tormod Volden - 2021-10-21

    Maybe you can try inserting a small delay between "sending list of pages" and the stm32_get_ack_timeout? In case there would be a curious race condition in the bootloader.

     
  • Yann Sionneau

    Yann Sionneau - 2021-10-29

    I added a sleep(10) (since I get approx' 7 seconds of clock stretching usually) in stm32_mass_erase()
    between the write of FF FF 00
    and the call to stm32_get_ack_timeout()

    The result I get on the lines is that the stm32_get_ack_timeout() is NACKed.

    Meaning I see a long "pause" with no I2C activity, and then I can see a Read issued to address 0x4D; the I2C address is NACK'ed and the whole operation fails.

     
  • Yann Sionneau

    Yann Sionneau - 2021-10-29

    I replaced the sleep(10) with sleep(1) and I don't get the NACK anymore, I am back to the timeout due to clock being stretched a very long time.

    Now I get :
    Write FF FF 00 ; 1 second idle on the I2C lines (logical level 1) ; then the master tries to issue a Read, but the ACK bit is clock stretched (like the original issue).
    Then I end up seeing 0x76 on the lines but it's too late, the master abandoned because of a timeout in the Linux driver stack.
    btw 0x76 seems to mean "BUSY" so I get both a big clock stretch AND a "busy" at the end.
    It seems very weird.

    I checked silicon errata of the device and I saw nothing related to I2C or bootloader.

     
  • Yann Sionneau

    Yann Sionneau - 2021-11-02

    What can I do to make this patch progress forward?

     
  • Tormod Volden

    Tormod Volden - 2021-11-02

    When you added the sleep(10) you did exactly what AN4221 recommends for clock-stretching bootloaders:

    For Write, Erase and Read Unprotect commands, the host must respect the related timings
    (i.e. page write, sector erase) specified in product datasheets. As an example, when
    launching an Erase command, the host has to wait (before the last ACK of the command)
    for a duration equivalent to the maximum sector/page erase time specified in datasheet (or
    at least the typical sector/page erase time).

    I was actually going to propose an alternative patch where the user can provide chip specific timing for page writes, page erase, and mass erase and then stm32flash will calculate the time needed and wait before reading the ACK, since this would follow the above recommendation. However it seems this would not work in your case.
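The alternative described above, where the user supplies chip-specific timings and the tool waits before reading the ACK, might compute its delay roughly as follows. This is a sketch only: the struct, its fields, and the convention that zero pages means mass erase are hypothetical, and as noted above it may not help in the case reported here.

```c
/* Hypothetical per-chip timings supplied by the user, in milliseconds. */
struct flash_timing {
    unsigned int page_erase_ms;
    unsigned int mass_erase_ms;
};

/*
 * Time to wait between sending the erase request and reading the final ACK.
 * pages == 0 is used here to mean "mass erase" (a convention of this sketch).
 */
unsigned int erase_wait_ms(const struct flash_timing *t, unsigned int pages)
{
    if (pages == 0)
        return t->mass_erase_ms;
    return pages * t->page_erase_ms;
}
```

As an illustration, with the figures reported further down in this thread for an STM32H750 (2 s per page, 5 s for a full erase), the sketch yields 6000 ms for three pages and 5000 ms for a mass erase.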

    Is this issue seen on one single chip, or many from the same batch or even across batches? If this is just a bad batch from ST I am not sure we should change stm32flash for it. Do you have the chance to test also other STM32 models? I will try to see if I can buy an STM32 with I2C bootloader and test a bit myself. It seems like most models have bootloader 1.1 and /should/ work as-is and only a few have I2C bootloader 1.0 and would require a workaround for the clock-stretching. Now your 1.2 is mysterious.

     
  • Anonymous

    Anonymous - 2021-12-04

    Also getting this issue with an STM32H750 and v1.2 bootloader.

    From ST forum:

    I was running into an issue when the bootloader restarted the STM32 (can be seen by looking at the NRST line) if the sleep() before checking the result of the erase command was too long. Asking for ACK after that naturally returned NACK. I was able to experimentally figure out that a single page erase takes 2s max and a full erase of STM32H750 is 5s. I'm now able to erase and flash.

     

    Last edit: Tormod Volden 2021-12-04