From: Chuck M. <chu...@gm...> - 2017-03-13 21:24:15
|
Thank you Matthew, that is excellent. --Chuck On Mon, Mar 13, 2017 at 1:58 PM, Matthew Lai <m...@ma...> wrote: > Code: > > --------------------------------------------------------------------- > > #include <libopencm3/stm32/rcc.h> > #include <libopencm3/stm32/timer.h> > > int main(void) > { > rcc_periph_clock_enable(RCC_TIM3); > timer_set_period(TIM3, 1); > return 0; > } > > --------------------------------------------------------------------- > > Using default compiler options for f4: > > --------------------------------------------------------------------- > > 080001ac <main>: > 80001ac: b508 push {r3, lr} > 80001ae: f640 0001 movw r0, #2049 ; 0x801 > 80001b2: f000 f80b bl 80001cc <rcc_periph_clock_enable> > 80001b6: 2101 movs r1, #1 > 80001b8: 4802 ldr r0, [pc, #8] ; (80001c4 <main+0x18>) > 80001ba: f000 f805 bl 80001c8 <timer_set_period> > 80001be: 2000 movs r0, #0 > 80001c0: bd08 pop {r3, pc} > 80001c2: bf00 nop > 80001c4: 40000400 .word 0x40000400 > > 080001c8 <timer_set_period>: > 80001c8: 62c1 str r1, [r0, #44] ; 0x2c > 80001ca: 4770 bx lr > > 080001cc <rcc_periph_clock_enable>: > 80001cc: 0943 lsrs r3, r0, #5 > 80001ce: f103 4380 add.w r3, r3, #1073741824 ; 0x40000000 > 80001d2: f503 330e add.w r3, r3, #145408 ; 0x23800 > 80001d6: f000 021f and.w r2, r0, #31 > 80001da: 6819 ldr r1, [r3, #0] > 80001dc: 2001 movs r0, #1 > 80001de: 4090 lsls r0, r2 > 80001e0: 4308 orrs r0, r1 > 80001e2: 6018 str r0, [r3, #0] > 80001e4: 4770 bx lr > > --------------------------------------------------------------------- > The store in rcc_periph_clock_enable happens at 80001e2. After returning > to main(), we take 2 cycles to load arguments to timer_set_period, then > calls it, and timer_set_period writes to the timer register in the first > instruction. > > If my math is correct, the second write happens on the 5th cycle after the > first write. This violates the errata sheet constraint by one cycle. For > APB peripherals, the errata sheet says we need to wait 1 + (AHB/APB) > cycles, which in this case would be 5 (most people use (AHB/APB = 4). > > Now this is with default compiler options on f4, with the lowest APB > prescaler. With a higher prescaler, it would be even worse. With -flto, it > would be even worse. On f7, we are very very far from the 8 cycles required > (keeping in mind that Cortex-M7 can execute 2 instructions per cycle). > > So I believe we are hitting it on all series (that require this delay), at > default compiler settings and clock settings provided by libopencm3. > > Matthew > > > On 03/13/2017 08:15 PM, Chuck McManis wrote: > > Matthew, > > I'm really grateful you brought up the warning you read in the F7 manual. > Clearly if someone tried to read or write a register and it wasn't ready > the system would hard fault and that would be bad. > > I have reviewed the code and the warning in the datasheet and believe it > is impossible to hit this situation with the current library. Certainly on > the F0-F4 series and unlikely on the F7 series. > > It sounded like you still had concerns. If you do, it would be super > helpful if you had an example that demonstrated this problem and especially > helpful if that example could be recreated on one of the ST demo boards. > > --Chuck > > > On Mon, Mar 13, 2017 at 1:01 PM, Matthew Lai <m...@ma...> wrote: > >> Chuck, I'm not sure why you feel that way, but having disagreements and >> discussions about technical aspects of a project is usually a good thing. >> >> I am also not sure how I am not being helpful by pointing out something >> on the errata sheets that would be a potential source of bugs, some ways it >> can happen, and some ways to mitigate it, for different chips, based on the >> chip manufacturer's recommendations. >> >> Use any combination of compiler flags to do what? Prove that there will >> be a less than 16 instructions delay between clock enable and the first >> peripheral register write? Does that really need proving? You don't even >> need to change compiler flags for that. >> >> I also have no idea what ST's supplied software has anything to do with >> any of this. >> >> Can we go back to technical discussions now? >> >> If so, here is how Chromium OS fixed it: >> <https://chromium-review.googlesource.com/c/246181/> >> https://chromium-review.googlesource.com/c/246181/ >> >> Matthew >> On 03/13/2017 07:41 PM, Chuck McManis wrote: >> >> Matthew, it feels like you are trying to create an argument here. >> >> If you want to be helpful, and this is actually a problem given the way >> libopencm3 is built, then *reproduce the problem* with libopencm3. You can >> use any combination of compiler flags you want, you just can't change the >> source inside of libopencm3. >> >> *If* you can reproduce the problem then that will show what the window of >> opportunity is. If you cannot, then you will see that Karl is correct in >> that it won't occur in libopencm3. >> >> If having this issue is such a problem for you, then by all means use the >> ST Micro supplied software which is written presumably by their experts. >> >> ---Chuck >> >> On Mon, Mar 13, 2017 at 10:57 AM, Matthew Lai < <m...@ma...> >> m...@ma...> wrote: >> >>> On 03/13/2017 01:25 PM, Karl Palsson wrote: >>> >>>> Matthew Lai <m...@ma...> wrote: >>>> >>>> I would ignore it. There's very few places where it will really >>>>>> matter, and the people who will need it, will be handling it themselves >>>>>> anyway, not using high level helper functions. >>>>>> >>>>> It would mean things like: >>>>> rcc_periph_clock_enable(RCC_TIM3); >>>>> timer_set_period(TIM3, 1024); >>>>> >>>>> Would violate this, at high prescalar settings (or low >>>>> prescalar settings with LTO). That is actually the example >>>>> given here: >>>>> https://github.com/libopencm3/libopencm3/blob/master/lib/stm >>>>> 32/common/timer_common_all.c#L70 >>>>> >>>>> It would be a potential source of very difficult to find bugs. >>>>> >>>> There's plenty of edge cases if you really want to go looking. >>>> I'm not really entirely sure how far to go trying to protect >>>> people. [1] >>>> >>> That is true, but in this case, an user that uses libopencm3 functions >>> to setup clocks, enables clock for a timer, and uses it right away will be >>> bitten. >>> >>> I'd say we should protect the user in this case, since 1) it's a very >>> common sequence, 2) most users probably don't know about it (it's not >>> mentioned at all in reference manuals until f7), 3) resulting bugs will be >>> very hard to find, and 4) I think there's a pretty easy fix (see below). >>> >>> APB1 is usually AHB/4 on f2, f4, and f7 at least. That means 5 cycles >>> are required for f2 and f4, and 8 cycles on f7. I wouldn't want to rely on >>> the compiler to happen to generate that much delay, especially on f7 which >>> is dual-issue (8 cycles can mean 16 instructions). >>> >>>> Since I imagine rcc_periph_clock_enable() to not be >>>>> performance-critical in most applications, wouldn't adding the >>>>> wait states be better default behaviour, and people who really >>>>> need to enable clocks very quickly can go to registers >>>>> directly? >>>>> >>>> It's more that I'm not sure you can do it neatly, and be robust >>>> enough to be worth trying. You'll need to interpret what bus a >>>> peripheral is on, as well as what speed it's currently running >>>> at. Doing this using the rcc_axx_speed globals is... ok, but >>>> they're not set if people didn't use clock helpers to set them >>>> up. >>>> >>>> Adding your peripheral speed as a parameter is a gross user API. >>>> For the vast majority of use cases, this is simply never going to >>>> be a problem. Doing any sort of meaningful cycle delay is going >>>> to just add code that most people will never need. >>>> >>>> In my opinion at least :) >>>> >>> According to the errata sheets, it's very easy on pre-f7 chips. All >>> that's required is a DSB barrier instruction. We don't actually have to do >>> any cycle counting, or determining which bus the peripheral is on. >>> >>> Unfortunately that option is not mentioned for f7, which requires 2x >>> peripheral clock cycles (could be because of the new AXIM bus). However, >>> there's still no need to figure out which bus a peripheral is on - we can >>> always just do enough delay for APB1. There's no harm in delaying for >>> longer than necessary, unless someone needs to enable clocks really >>> quickly. Figuring out APB1 frequency is not easy. I would just use the >>> globals and add a note in the documentation that users are on their own if >>> they are setting up their own clocks. >>> >>> Matthew >>> >> >> >> > > |