[patch] add -t command line option to specify i2c communication timeout
Open source flash program for STM32 using the ST serial bootloader
Hello,
This patch adds a -t timeout command line option to set the I2C communication timeout.
This makes erasing and flashing work with the i2c designware IP + stm32f723 combo.
Without this patch, I was able to dump the flash content but not erase or write it.
I was getting i2c designware driver timeouts due to long clock stretches (seen with a logic analyzer).
Anonymous
A new version of the patch, with a helpful error message.
In the commit message you say that the default timeout is 10 ms, but I2C.txt says it can be milliseconds to seconds depending on HW and drivers. Can you clarify?
Should we rather take a ms argument instead of 10ms multiples? I understand the ioctl uses the latter but e.g. your error message lists it in ms. I was thinking if the same -t option could apply for serial as well, there is a termios timeout there, currently hard-coded to 500 ms. That can be added later of course, but it would be good to use ms then.
Do I understand correctly that these timeouts don't happen when we are using the non-stretching commands (which are selected if the device supports both kinds)? So the message hinting the user to try setting a timeout should only be displayed in the stretching case? Can it be added to stm32_warn_stretching() instead?
Hmm, no, your example in the commit message is a no-stretch command causing the clock stretch, and I understand the stm32_warn_stretching() warns about another possible failure mode.
Well, the timeout discussion in I2C.c is related to your patch here and would need updating, whereas "controller not accepting clock stretching" sounds like something other than "controller times out if clock stretching is too long".
What are the return value and errno from read() in i2c_read() when the driver times out? The problem is that i2c_read() doesn't return PORT_ERR_TIMEDOUT like serial_posix_read() would do.
Hmm, I2C.txt says "In this case the I2C controller will timeout and report error to stm32flash.
There is no possibility for stm32flash to retry, so it can only signal the error and exit.".
Does this mean the port must be reset somehow and that just reading again (like the serial code would do) does not work?
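To make the read() question above concrete, here is a minimal sketch of how a read result could be mapped to a timeout error. It is not code from stm32flash: the enum is a local stand-in for the port error codes (only PORT_ERR_TIMEDOUT is named in this thread), the function name is hypothetical, and it assumes the I2C driver reports a bus timeout by making read() fail with errno set to ETIMEDOUT, which would need to be checked on the actual hardware.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Local stand-in for stm32flash's port error codes (names are hypothetical;
 * SK_ERR_TIMEDOUT plays the role of PORT_ERR_TIMEDOUT from the thread). */
typedef enum { SK_ERR_OK, SK_ERR_TIMEDOUT, SK_ERR_UNKNOWN } sk_err_t;

/* Classify the outcome of read(fd, buf, wanted).
 *   r      : value returned by read()
 *   wanted : number of bytes that were requested
 *   err    : errno captured immediately after read()
 * ASSUMPTION: the driver signals a bus timeout with errno == ETIMEDOUT. */
sk_err_t classify_i2c_read(ssize_t r, size_t wanted, int err)
{
    if (r >= 0 && (size_t)r == wanted)
        return SK_ERR_OK;
    if (r < 0 && err == ETIMEDOUT)
        return SK_ERR_TIMEDOUT;
    return SK_ERR_UNKNOWN;   /* short read or any other errno */
}
```

Whether a retry after such a timeout can succeed is a separate question; the sketch only shows how the error could be reported distinctly.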
Here is a rebased version of Yann's patch, on top of current master. Any run-time suggestions about using the new option can probably be added to stm32_warn_stretching().
I still don't understand why you get timeout with STM32F723 because this chip has the non-stretching commands and therefore long clock stretching shouldn't happen. Where exactly in the communication does it appear?
Last edit: Tormod Volden 2021-08-28
Hello Tormod,
About your v2, I prefer my version because it is explicit about the "10 ms steps".
For instance, with the v2 patch, if I pass "-t 9" I get a timeout of 0.
If I pass "-t 19" I get 10 ms.
Or maybe round it up to the next multiple of 10 ms?
Yes, I agree, rounding up makes sense.
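The rounding agreed on here can be sketched in C as follows. This is only an illustration, not the code from either patch version: both helper names are hypothetical, and it assumes the Linux I2C_TIMEOUT ioctl from linux/i2c-dev.h, which takes its argument in units of 10 ms.

```c
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>   /* I2C_TIMEOUT: timeout in units of 10 ms */

/* Convert milliseconds to 10 ms units, rounding up, so that a small
 * non-zero request such as 9 ms never collapses to a zero timeout. */
unsigned long ms_to_i2c_timeout_units(unsigned int ms)
{
    return ((unsigned long)ms + 9) / 10;
}

/* Hypothetical helper: apply a timeout given in ms to an already opened
 * /dev/i2c-N descriptor. Returns 0 on success, -1 on failure (errno set). */
int set_i2c_timeout_ms(int fd, unsigned int ms)
{
    return ioctl(fd, I2C_TIMEOUT, ms_to_i2c_timeout_units(ms)) < 0 ? -1 : 0;
}
```

With this rounding, a request of 9 ms becomes one 10 ms unit and 19 ms becomes two units, instead of 0 and 1 with plain integer division.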
Hello,
I would need to find time to have a look again at what happens exactly. I will do that in the next ~30 days like for ticket 98.
In my case, when the master times out I need to reboot the board. After the timeout, the Linux i2c bus (/dev/i2c-XX) is unusable/dead.
OK. From reading up on this, I understand that the STM32 bootloader needs reboot if the communication is stopped by the controller due to clock-stretching timeout. But the mystery is why is it clock-stretching in your case, when non-stretching commands should be used?
It is a bit of a mystery to me too.
I can confirm that without the "-t timeout" patch, I get this behaviour:
stm32flash -w flashdump -v -g 0x0 -a 0x4d /dev/i2c-1
stm32flash 0.6
http://stm32flash.sourceforge.net/
Using Parser : Raw BINARY
Size : 524288
Warning: Not a tty: /dev/i2c-1
Error probing interface "serial_posix"
Interface i2c: addr 0x4d
Version : 0x12
Device ID : 0x0452 (STM32F72xxx/73xxx)
Write to memory
Erasing memory
[ 296.082595] i2c_designware 20191000.i2c: controller timed out
[ 297.090858] i2c_designware 20191000.i2c: timeout in disabling adapter
Failed to read ACK byte
Mass erase failed. Try specifying the number of pages to be erased.
Failed to erase memory
With the -t timeout patch I get:
stm32flash -w flashdump -v -g 0x0 -a 0x4d /dev/i2c-1 -t 8000
stm32flash 0.6
http://stm32flash.sourceforge.net/
Using Parser : Raw BINARY
Size : 524288
Warning: Not a tty: /dev/i2c-1
Error probing interface "serial_posix"
Interface i2c: addr 0x4d
Version : 0x12
Device ID : 0x0452 (STM32F72xxx/73xxx)
Write to memory
Erasing memory
Wrote and verified address 0x08080000 (100.00%) Done.
Starting execution at address 0x08000000... done.
Also, please see attached screenshots of logic analyzer dump of the I2C lines:
big_stretch.jpg shows the "unzoomed" situation, with the big 7+ seconds stretch.
write_stretched.jpg shows the zoomed-in situation, right before the stretch, when the mass erase command is sent.
We can see that a "non-stretched global mass erase" is sent:
Write : 0x45 0xBA
Read : 0x79 (y == ACK)
Write : 0xFF 0xFF 0x00 (special mass erase + checksum)
Read : the address of the i2c slave is not ACKnowledged until 7 seconds later; it is being clock stretched.
Then, after the ACK of the address is sent, the 0x79 (y == ACK) is read quickly.
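For reference, the two frames written in this trace can be reproduced with a small C sketch. It is not taken from stm32flash: the function names are hypothetical, and it assumes that the second command byte is the bitwise complement of the first and that the trailing byte is the XOR of the preceding bytes, which is consistent with the 0x45 0xBA and 0xFF 0xFF 0x00 seen on the bus.

```c
#include <stdint.h>

#define BL_CMD_NS_ERASE 0x45   /* command byte seen in the trace */
#define BL_ACK          0x79   /* response byte seen in the trace */

/* Frame 1: command byte followed by its bitwise complement. */
void build_cmd_frame(uint8_t cmd, uint8_t out[2])
{
    out[0] = cmd;
    out[1] = (uint8_t)(cmd ^ 0xFF);
}

/* Frame 2: 16-bit special erase code, most significant byte first,
 * followed by the XOR of those two bytes as checksum. */
void build_special_erase_frame(uint16_t code, uint8_t out[3])
{
    out[0] = (uint8_t)(code >> 8);
    out[1] = (uint8_t)(code & 0xFF);
    out[2] = (uint8_t)(out[0] ^ out[1]);
}
```

Calling build_cmd_frame(BL_CMD_NS_ERASE, ...) gives 0x45 0xBA, and build_special_erase_frame(0xFFFF, ...) gives 0xFF 0xFF 0x00, matching the writes listed above.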
Last edit: Yann Sionneau 2021-10-13
It shouldn't stretch but respond with 0x76 (BUSY). Strange. Also, the bootloader protocol version is 1.2 (0x12). Table 3 in AN4221 (rev 10, June 2021) only lists V1.0 and V1.1.
What is the non-decoded traffic on the bus in write_stretched.jpg, before 45 BA, the R?, and before and after FF FF 00 ? The last part before the clock is held down, is it the bootloader starting on replying something?
Here are zoomed-in pictures of the analyzer
I honestly don't know what the bootloader is doing since I don't have its source code :/
I don't have the v1.2 spec either; like you, I can only see that the spec covers 1.0 and 1.1... :/
If you have friends at STmicro, maybe it's time to ask them :p
Last edit: Yann Sionneau 2021-10-21
It actually looks the same as the start of the previous packets, that is, the 0x4D slave address written by the master. So if I understand correctly it is the master writing the address and a "read" bit (high) and then the slave immediately takes down the clock. AN2606 confirms that the 7-bit slave address on I2C2 should be 0x4D. So it is exactly what you would expect if stretching commands were in use. Maybe there is an erratum for your chips? I think the customer must ask ST :) (I don't have any connections there).
Maybe you can try inserting a small delay between "sending list of pages" and the stm32_get_ack_timeout? In case there would be a curious race condition in the bootloader.
Only reference I have on bootloader v1.2 is: https://community.st.com/s/question/0D53W00000bfOip/stm32g030f6-i2c-bootloader-v12
I added a sleep(10) (since I usually get approximately 7 seconds of clock stretching) in stm32_mass_erase(), between the write of FF FF 00 and the call to stm32_get_ack_timeout().
The result I get on the lines is that the read done by stm32_get_ack_timeout() is NACKed. Meaning I see a long "pause" with no I2C activity, and then I can see a Read issued to address 0x4D, and the I2C address is NACKed. And the whole operation fails.
I replaced the sleep(10) with sleep(1) and I don't get the NACK anymore; I am back to the timeout due to the clock being stretched for a very long time. Now I get:
Write FF FF 00 ; 1 second idle on the I2C lines (logical level 1) ; then the master tries to issue a Read, but the ACK bit is clock stretched (like the original issue).
Then I end up seeing 0x76 on the lines, but it's too late: the master has already given up because of a timeout in the Linux driver stack.
btw 0x76 seems to mean "BUSY" so I get both a big clock stretch AND a "busy" at the end.
It seems very weird.
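For comparison, this is roughly what handling a BUSY reply without clock stretching would look like on the host side: keep reading the status byte until it is no longer 0x76. The sketch below is not the stm32flash implementation. All names are hypothetical, the transport is abstracted behind a callback so the logic can be exercised without hardware, and 0x1F is assumed to be the NACK byte.

```c
#include <stddef.h>
#include <stdint.h>

enum { RESP_ACK = 0x79, RESP_NACK = 0x1F, RESP_BUSY = 0x76 };

typedef enum { WAIT_ACK, WAIT_NACK, WAIT_TIMEOUT, WAIT_IO_ERROR } wait_result_t;

/* Hypothetical transport hook: reads one byte, returns 0 on success. */
typedef int (*read_byte_fn)(void *ctx, uint8_t *byte);

/* Poll for the final response, tolerating up to max_polls BUSY replies.
 * A real caller would pause between polls; that is left out here. */
wait_result_t wait_for_ack(read_byte_fn read_byte, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t b;
        if (read_byte(ctx, &b) != 0)
            return WAIT_IO_ERROR;
        if (b == RESP_ACK)
            return WAIT_ACK;
        if (b != RESP_BUSY)
            return WAIT_NACK;    /* anything else is treated as NACK */
    }
    return WAIT_TIMEOUT;         /* still BUSY after max_polls reads */
}

/* Scripted reader for trying the loop without a device. */
struct script { const uint8_t *bytes; size_t len; size_t pos; };

int script_read(void *ctx, uint8_t *byte)
{
    struct script *s = ctx;
    if (s->pos >= s->len)
        return -1;               /* script exhausted: report an I/O error */
    *byte = s->bytes[s->pos++];
    return 0;
}
```

In the behaviour described above this loop never gets the chance to run, because the first read is itself clock stretched until the driver times out.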
I checked silicon errata of the device and I saw nothing related to I2C or bootloader.
What can I do to move this patch forward?
When you added the sleep(10) you did exactly what AN4221 recommends for clock-stretching bootloaders:
I was actually going to propose an alternative patch where the user can provide chip specific timing for page writes, page erase, and mass erase and then stm32flash will calculate the time needed and wait before reading the ACK, since this would follow the above recommendation. However it seems this would not work in your case.
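A rough sketch of the calculation such an alternative patch might do is shown below. Nothing here exists in stm32flash: the struct, enum, and function names are hypothetical, and the sketch assumes that per-page operations scale linearly with the number of pages while a mass erase takes a fixed time.

```c
/* Hypothetical per-chip timings, in milliseconds, supplied by the user. */
struct chip_timing {
    unsigned page_write_ms;
    unsigned page_erase_ms;
    unsigned mass_erase_ms;
};

enum op_kind { OP_PAGE_WRITE, OP_PAGE_ERASE, OP_MASS_ERASE };

/* Time to wait after sending a command and before reading the ACK.
 * ASSUMPTION: page operations scale linearly with the page count and
 * a mass erase has a fixed duration regardless of flash contents. */
unsigned wait_before_ack_ms(const struct chip_timing *t,
                            enum op_kind op, unsigned pages)
{
    switch (op) {
    case OP_PAGE_WRITE: return t->page_write_ms * pages;
    case OP_PAGE_ERASE: return t->page_erase_ms * pages;
    case OP_MASS_ERASE: return t->mass_erase_ms;
    }
    return 0;
}
```

The caller would sleep for the returned time and only then issue the read for the ACK byte, so that the bootloader has finished before the bus is touched again.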
Is this issue seen on one single chip, or many from the same batch or even across batches? If this is just a bad batch from ST I am not sure we should change stm32flash for it. Do you have the chance to test also other STM32 models? I will try to see if I can buy an STM32 with I2C bootloader and test a bit myself. It seems like most models have bootloader 1.1 and /should/ work as-is and only a few have I2C bootloader 1.0 and would require a workaround for the clock-stretching. Now your 1.2 is mysterious.
Hello,
I've asked the question on ST forum: https://community.st.com/s/question/0D53W00001BTueOSAT/stm32f723-i2c-bootloader-nostretch-erase-command-leads-to-long-clock-stretching-anyway-how-to-deal-with-it
Also getting this issue with an STM32H750 and v1.2 bootloader.
From ST forum:
Last edit: Tormod Volden 2021-12-04