openipmi-developer Mailing List for Open IPMI
Brought to you by:
cminyard
From: Corey M. <co...@mi...> - 2025-08-16 02:02:50
|
On Fri, Aug 15, 2025 at 04:23:08PM -0500, Frederick Lawler wrote:
> Hi Corey,
>
> On Thu, Aug 07, 2025 at 06:02:31PM -0500, Corey Minyard wrote:
> > I went ahead and did some patches for this, since it was on my mind.
> >
> > With these, if a reset is sent to the BMC, the driver will disable
> > messages to the BMC for a time, defaulting to 30 seconds. Don't
> > modify message timing, since no messages are allowed, anyway.
> >
> > If a firmware update command is sent to the BMC, then just reject
> > sysfs commands that query the BMC. Modify message timing and
> > allow direct messages through the driver interface.
> >
> > Hopefully this will work around the problem, and it's a good idea,
> > anyway.
> >
> > -corey
> >
>
> Thanks for the patches, and sorry for the delay in response.
> It's one of _those weeks_. Anyway, I backported the patch series
> to 6.12, and the changes seem reasonable to me overall. Ran it
> through our infra on a single node, and nothing seemed to break.
>
> I did observe with testing that resetting BMC via ipmitool on the host
> did kick out sysfs reads as expected.

Ok, I took the liberty of adding a "Tested-by" line with your name. If
that's not ok, I can pull it out.

>
> Resetting the BMC remotely, was not handled (this seems obvious given the state
> changes are handled via ipmi_msg handler). Would the BMC send an event
> to the kernel letting it know its resetting so that case could be
> handled?

Unfortunately not. It's one of the many things that would be nice to
have...

In general, dealing with a BMC being reset is a real pain. They tend to
do all kinds of different things. The worst is when they sort of act
like they are operational, but then do strange things.

I haven't thought of a good general purpose way to handle this. I'm
toying with the idea of making it so if the BMC gets an error, just shut
things down for a second or so and then test it to see if it's working.
During this time just return errors, like the new patches do during
reset.

Thanks for testing these.

-corey

>
> Best,
> Fred
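The quiesce behaviour described in this message (stop talking to the BMC for a hold time after a reset, 30 seconds by default, and return errors in the meantime) can be sketched in user-space C. This is only an illustration under stated assumptions: it is not the code from the patch series, which does not appear in this thread, and `struct bmc_gate`, `gate_on_reset()`, `gate_allows()` and `QUIESCE_DEFAULT_MS` are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 30 seconds, the default hold time mentioned in the message above. */
#define QUIESCE_DEFAULT_MS 30000

/* Hypothetical per-interface state; not a structure from the driver. */
struct bmc_gate {
	bool quiesced;             /* true while messages are being rejected */
	uint64_t quiesce_until_ms; /* time at which the gate reopens */
};

/* Call when a reset command has been sent to the BMC. */
static void gate_on_reset(struct bmc_gate *g, uint64_t now_ms, uint64_t hold_ms)
{
	assert(g);
	g->quiesced = true;
	g->quiesce_until_ms = now_ms + hold_ms;
}

/*
 * Returns true if a message may be sent to the BMC at time now_ms.
 * While this returns false, a caller would fail the request with an
 * error instead of touching the hardware.
 */
static bool gate_allows(struct bmc_gate *g, uint64_t now_ms)
{
	assert(g);
	if (g->quiesced && now_ms >= g->quiesce_until_ms)
		g->quiesced = false; /* hold time elapsed, reopen the gate */
	return !g->quiesced;
}
```

Passing the current time in as a parameter keeps the sketch testable without a real clock. As noted in the message, a gate like this only helps when the reset is issued through the host driver; a remotely triggered reset is never seen by it.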
From: Frederick L. <fr...@cl...> - 2025-08-15 21:23:17
|
Hi Corey,

On Thu, Aug 07, 2025 at 06:02:31PM -0500, Corey Minyard wrote:
> I went ahead and did some patches for this, since it was on my mind.
>
> With these, if a reset is sent to the BMC, the driver will disable
> messages to the BMC for a time, defaulting to 30 seconds. Don't
> modify message timing, since no messages are allowed, anyway.
>
> If a firmware update command is sent to the BMC, then just reject
> sysfs commands that query the BMC. Modify message timing and
> allow direct messages through the driver interface.
>
> Hopefully this will work around the problem, and it's a good idea,
> anyway.
>
> -corey
>

Thanks for the patches, and sorry for the delay in response.
It's one of _those weeks_. Anyway, I backported the patch series
to 6.12, and the changes seem reasonable to me overall. Ran it
through our infra on a single node, and nothing seemed to break.

I did observe with testing that resetting BMC via ipmitool on the host
did kick out sysfs reads as expected.

Resetting the BMC remotely, was not handled (this seems obvious given the state
changes are handled via ipmi_msg handler). Would the BMC send an event
to the kernel letting it know its resetting so that case could be
handled?

Best,
Fred
From: Corey M. <co...@mi...> - 2025-08-14 18:09:48
|
On Thu, Aug 14, 2025 at 06:23:23PM +0100, Mark Bannister wrote:
> > > Thanks for the bug report and debugging info. I think I know what is
> > > going on, I've attached a patch that should hopefully fix it.
> > > Basically, it looks like the BMC is alive enough that it sort of
> > > responds to the host, but not alive enough to actually complete a
> > > transaction. The driver needs to not immediately retry in that case, it
> > > needs to delay a bit.
> > >
> > > It passes all my tests, but the situation you are in would be hard to
> > > manufacture for me.
> > >
> > > Can you try this patch?
> >
> > Thanks for the super quick response, I'll try out this patch and report
> > back my findings.
> >
> > Best regards
> > Mark
>
> The patch looks good. Without the patch I was able to reproduce the
> problem on kernels 6.6 and 6.12 (but not 6.1) after 5-20 attempts of
> running 'ipmitool mc reset cold' every 2 minutes. With the patch, I have
> run it 50 times without incident.

Perfect, I'll queue it for the next kernel release. I can get it into
the current release if it's urgent.

The change that caused this was c608966f3f9c "ipmi: fix msg stack when
IPMI is disconnected" and it came in between 6.1 and 6.6. I'm adding
the author of that patch because this change may affect that.

In hindsight I think the fix that caused this is wrong. I'm not sure
how what the author said was happening could happen. There's a limit of
100 messages per user. I am inclined right now to revert that change.

> The hosed counter isn't as much of an
> indicator as I thought, I saw it in the tens of thousands with and without
> the patch, I have also seen it in the hundreds of thousands without the
> patch and on other hardware I have seen it reach 5 million in one hour
> without the patch (but also without incident).

Yeah, that's just a count of how many issues it has with the BMC. You
will still see it go up.

-corey

>
> We will incorporate your patch into our builds so that we avoid hitting
> this problem in production again.
>
> Best regards
> Mark
From: Mark B. <mba...@ja...> - 2025-08-14 17:23:42
|
> > Thanks for the bug report and debugging info. I think I know what is
> > going on, I've attached a patch that should hopefully fix it.
> > Basically, it looks like the BMC is alive enough that it sort of
> > responds to the host, but not alive enough to actually complete a
> > transaction. The driver needs to not immediately retry in that case, it
> > needs to delay a bit.
> >
> > It passes all my tests, but the situation you are in would be hard to
> > manufacture for me.
> >
> > Can you try this patch?
>
> Thanks for the super quick response, I'll try out this patch and report back my findings.
>
> Best regards
> Mark

The patch looks good. Without the patch I was able to reproduce the
problem on kernels 6.6 and 6.12 (but not 6.1) after 5-20 attempts of
running 'ipmitool mc reset cold' every 2 minutes. With the patch, I have
run it 50 times without incident. The hosed counter isn't as much of an
indicator as I thought, I saw it in the tens of thousands with and without
the patch, I have also seen it in the hundreds of thousands without the
patch and on other hardware I have seen it reach 5 million in one hour
without the patch (but also without incident).

We will incorporate your patch into our builds so that we avoid hitting
this problem in production again.

Best regards
Mark
From: Mark B. <mba...@ja...> - 2025-08-14 14:21:19
|
> Thanks for the bug report and debugging info. I think I know what is
> going on, I've attached a patch that should hopefully fix it.
> Basically, it looks like the BMC is alive enough that it sort of
> responds to the host, but not alive enough to actually complete a
> transaction. The driver needs to not immediately retry in that case, it
> needs to delay a bit.
>
> It passes all my tests, but the situation you are in would be hard to
> manufacture for me.
>
> Can you try this patch?

Thanks for the super quick response, I'll try out this patch and report back my findings.

Best regards
Mark
From: Corey M. <co...@mi...> - 2025-08-14 14:00:24
|
If the BMC is in a state where it is partially responding but not
really there, the driver could go into an infinite loop trying error
recovery over and over. The device should eventually come back, but
we don't want to be continually retrying. Add a delay between retries.

Signed-off-by: Corey Minyard <co...@mi...>
---
 drivers/char/ipmi/ipmi_kcs_sm.c  | 4 ++--
 drivers/char/ipmi/ipmi_si_intf.c | 9 +++++++--
 2 files changed, 9 insertions(+), 4 deletions(-)

Thanks for the bug report and debugging info. I think I know what is
going on, I've attached a patch that should hopefully fix it.
Basically, it looks like the BMC is alive enough that it sort of
responds to the host, but not alive enough to actually complete a
transaction. The driver needs to not immediately retry in that case, it
needs to delay a bit.

It passes all my tests, but the situation you are in would be hard to
manufacture for me.

Can you try this patch?

-corey

diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
index ecfcb50302f6..20f3611c5444 100644
--- a/drivers/char/ipmi/ipmi_kcs_sm.c
+++ b/drivers/char/ipmi/ipmi_kcs_sm.c
@@ -467,7 +467,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
 		if (state != KCS_READ_STATE) {
 			start_error_recovery(kcs,
 					     "Not in read state for error2");
-			break;
+			return SI_SM_CALL_WITH_TICK_DELAY;
 		}
 		if (!check_obf(kcs, status, time))
 			return SI_SM_CALL_WITH_DELAY;
@@ -481,7 +481,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
 		if (state != KCS_IDLE_STATE) {
 			start_error_recovery(kcs,
 					     "Not in idle state for error3");
-			break;
+			return SI_SM_CALL_WITH_TICK_DELAY;
 		}
 
 		if (!check_obf(kcs, status, time))
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 8b5524069c15..3f4747ae5ddb 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -790,7 +790,10 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
 			 */
 			return_hosed_msg(smi_info, IPMI_ERR_UNSPECIFIED);
 		}
-		goto restart;
+		/*
+		 * If the device isn't working, we want a delay before
+		 * trying again.
+		 */
 	}
 
 	/*
@@ -888,15 +891,17 @@ static void flush_messages(void *send_info)
 {
 	struct smi_info *smi_info = send_info;
 	enum si_sm_result result;
+	int loops_left = 10000; /* Don't try forever. */
 
 	/*
 	 * Currently, this function is called only in run-to-completion
 	 * mode. This means we are single-threaded, no need for locks.
 	 */
 	result = smi_event_handler(smi_info, 0);
-	while (result != SI_SM_IDLE) {
+	while (result != SI_SM_IDLE && loops_left > 0) {
 		udelay(SI_SHORT_TIMEOUT_USEC);
 		result = smi_event_handler(smi_info, SI_SHORT_TIMEOUT_USEC);
+		loops_left--;
 	}
 }
 
-- 
2.43.0
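The loop bound that the patch adds to flush_messages() can be tried in isolation in user space. The following is a minimal sketch, not driver code: `enum sm_result`, `poll_fn`, `flush_bounded()` and the two test devices are hypothetical stand-ins for `enum si_sm_result` and `smi_event_handler()`, and the per-iteration delay (udelay() in the patch) is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state machine result; not the driver's enum. */
enum sm_result { SM_IDLE, SM_BUSY, SM_HOSED };

/* Hypothetical poll callback: advances the simulated device by one step. */
typedef enum sm_result (*poll_fn)(void *ctx);

/*
 * Poll until the device reports idle or the loop budget runs out.
 * Returns true if idle was reached, false if we gave up.
 */
static bool flush_bounded(poll_fn poll, void *ctx, int max_loops)
{
	assert(max_loops > 0);
	enum sm_result r = poll(ctx);

	while (r != SM_IDLE && max_loops > 0) {
		/* A real driver would delay here before polling again. */
		r = poll(ctx);
		max_loops--;
	}
	return r == SM_IDLE;
}

/* Simulated device that stays busy for *ctx polls, then goes idle. */
static enum sm_result countdown_poll(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return SM_BUSY;
	}
	return SM_IDLE;
}

/* Simulated device that never recovers, like a BMC stuck mid-reset. */
static enum sm_result stuck_poll(void *ctx)
{
	(void)ctx;
	return SM_HOSED;
}
```

With `countdown_poll` the loop ends as soon as the device goes idle. With `stuck_poll` it ends once the budget is spent, where the unbounded `while (result != SI_SM_IDLE)` form would spin forever.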
From: Mark B. <mba...@ja...> - 2025-08-14 09:16:23
|
Hi Corey

I crashed a machine on 1st August after issuing 'ipmitool mc reset cold' to reset a BMC. I got a crash dump from this event which I have been analyzing. The crash occurred when the NMI watchdog detected a hard LOCKUP in an interrupt handler:

[144482.968722] CPU: 1 PID: 96220 Comm: process-finder Kdump: loaded Tainted: G W O 6.6.93-1.el8.x86_64 #1
[144482.968724] RIP: 0010:port_outb+0x13/0x20 [ipmi_si]
[144482.968735] Code: 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 66 0f af 77 18 89 d0 0f b7 57 28 01 f2 ee <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
[144482.968736] RSP: 0018:ff626798c007ce50 EFLAGS: 00000002
[144482.968737] RAX: 0000000000000000 RBX: ff2e8eaa120b1c00 RCX: ff2e8ee87e860640
[144482.968738] RDX: 0000000000000ca2 RSI: 0000000000000000 RDI: ff2e8ee98e8c0840
[144482.968738] RBP: 0000000000000001 R08: ff2e8ee87e860668 R09: ff626798c007cf08
[144482.968739] R10: 0000000000000006 R11: 000000000000044d R12: 0000000000000000
[144482.968739] R13: ff2e8ee98e8c0800 R14: ffffffffc27ad210 R15: ff626798c007cf00
[144482.968740] FS: 00007fffe8bff700(0000) GS:ff2e8ee87e840000(0000) knlGS:0000000000000000
[144482.968740] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[144482.968741] CR2: 00007ffff7ceb528 CR3: 000000047de9e001 CR4: 0000000000771ee0
[144482.968742] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[144482.968742] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[144482.968743] PKRU: 55555554
[144482.968743] Call Trace:
[144482.968745] <IRQ>
[144482.968746] kcs_event+0x253/0x960 [ipmi_si]
[144482.968751] smi_event_handler+0x5b/0x280 [ipmi_si]
[144482.968756] smi_timeout+0x3b/0xc0 [ipmi_si]
[144482.968760] ? __pfx_smi_timeout+0x10/0x10 [ipmi_si]
[144482.968764] call_timer_fn+0x24/0x130
[144482.968769] __run_timers.part.0+0x1d8/0x280
[144482.968771] ? enqueue_hrtimer+0x35/0x90
[144482.968772] ? __hrtimer_run_queues+0x141/0x2b0
[144482.968772] ? sched_clock+0xc/0x30
[144482.968775] run_timer_softirq+0x26/0x50
[144482.968776] handle_softirqs+0xdd/0x2d0
[144482.968779] irq_exit_rcu+0xa8/0xd0
[144482.968781] sysvec_apic_timer_interrupt+0x6e/0x90
[144482.968784] </IRQ>

I was able to reproduce the crash two days ago (12th August) by running 'ipmitool mc reset cold' in a loop with 2 minute sleeps between on identical test hardware running the same kernel version, although so far when I have reproduced the crash I have not been able to get another crash dump.

# c=0; while :; do ((c+=1)); echo $(date) - $c; ipmitool mc reset cold; sleep 120; done
Tue 12 Aug 07:02:28 EDT 2025 - 1
Sent cold reset command to MC
Tue 12 Aug 07:04:28 EDT 2025 - 2
Sent cold reset command to MC
Tue 12 Aug 07:06:28 EDT 2025 - 3
Sent cold reset command to MC
Tue 12 Aug 07:08:28 EDT 2025 - 4
Sent cold reset command to MC
Tue 12 Aug 07:10:28 EDT 2025 - 5
Sent cold reset command to MC
Tue 12 Aug 07:12:28 EDT 2025 - 6
Sent cold reset command to MC
Tue 12 Aug 07:14:28 EDT 2025 - 7
Sent cold reset command to MC
Tue 12 Aug 07:16:28 EDT 2025 - 8
Sent cold reset command to MC
Tue 12 Aug 07:18:28 EDT 2025 - 9
Sent cold reset command to MC
Tue 12 Aug 07:20:28 EDT 2025 - 10
Sent cold reset command to MC
Tue 12 Aug 07:22:28 EDT 2025 - 11
Sent cold reset command to MC
Tue 12 Aug 07:24:28 EDT 2025 - 12
Sent cold reset command to MC
Tue 12 Aug 07:26:28 EDT 2025 - 13
Sent cold reset command to MC
EXIT STATUS 255

I have tried (and so far failed) to reproduce the problem on kernel 6.1.144-1.el8.x86_64, but admittedly I haven't tried very hard yet so that might not be a reliable data point.
On the reproducer, I was gathering debug data from the ipmi_si module using 'echo 7 > /sys/module/ipmi_si/parameters/kcs_debug' and was running 'journalctl -f' in a terminal window at the time of the crash, where the terminal buffer is filled up with thousands of lines like this, which were produced as the BMC was resetting: Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: ipmi_kcs_sm: kcs hosed: Not in read state for error2 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 
Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: ipmi_kcs_sm: kcs hosed: Not in read state for error2 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 
07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: ipmi_kcs_sm: kcs hosed: Not in read state for error2 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 
kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: ipmi_kcs_sm: kcs hosed: Not in read state for error2 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 7, c9 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 8, c1 Aug 12 07:27:44 kernel: ipmi_si IPI0001:00: KCS: State = 6, c1 I collected some more debug data from the vmcore file collected on 1st August: $ crash --zero_excluded /usr/lib/debug/lib/modules/6.6.93-1.el8.x86_64/vmlinux vmcore ... 
crash> mod -s ipmi_si MODULE NAME TEXT_BASE SIZE OBJECT FILE ffffffffc27dde80 ipmi_si ffffffffc27ab000 86016 /usr/lib/debug/lib/modules/6.6.93-1.el8.x86_64/kernel/drivers/char/ipmi/ipmi_si.ko.debug crash> struct smi_info 0xff2e8ee98e8c0800 struct smi_info { si_num = 0, intf = 0xff2e8ee98fbaa000, si_sm = 0xff2e8eaa120b1c00, handlers = 0xffffffffc27e4240 <kcs_smi_handlers>, si_lock = { { rlock = { raw_lock = { { val = { counter = 257 }, { locked = 1 '\001', pending = 1 '\001' }, { locked_pending = 257, tail = 0 } } } } } }, waiting_msg = 0x0, curr_msg = 0x0, si_state = SI_NORMAL, io = { inputb = 0xffffffffc27b1940 <port_inb>, outputb = 0xffffffffc27b1970 <port_outb>, addr = 0x0, regspacing = 1, regsize = 1, regshift = 0, addr_space = IPMI_IO_ADDR_SPACE, addr_data = 3234, addr_source = SI_ACPI, addr_info = { acpi_info = { acpi_handle = 0xff2e8ee9891e2f30 } }, io_setup = 0xffffffffc27b1ac0 <ipmi_si_port_setup>, io_cleanup = 0xffffffffc27b1a60 <port_cleanup>, io_size = 2, irq = 0, irq_setup = 0x0, irq_handler_data = 0x0, irq_cleanup = 0x0, slave_addr = 32 ' ', si_type = SI_KCS, dev = 0xff2e8ee98ac6c010 }, oem_data_avail_handler = 0x0, msg_flags = 0 '\000', has_event_buffer = false, req_events = { counter = 0 }, run_to_completion = false, si_timer = { entry = { next = 0xdead000000000122, pprev = 0x0 }, expires = 4439136548, function = 0xffffffffc27ad210 <smi_timeout>, flags = 155189249 }, timer_can_start = true, timer_running = true, last_timeout_jiffies = 4439136547, need_watch = { counter = 0 }, interrupt_disabled = true, supports_event_msg_buff = false, cannot_disable_irq = false, irq_enable_broken = false, in_maintenance_mode = true, got_attn = false, device_id = { device_id = 32 ' ', device_revision = 2 '\002', firmware_revision_1 = 1 '\001', firmware_revision_2 = 0 '\000', ipmi_version = 2 '\002', additional_device_support = 191 '\277', manufacturer_id = 10876, product_id = 7496, aux_firmware_revision = "!\001\000 ", aux_firmware_revision_set = 1 }, dev_group_added 
= true, stats = {{ counter = 13470 }, { counter = 1809 }, { counter = 358202 }, { counter = 0 }, { counter = 0 }, { counter = 0 }, { counter = 24503 }, { counter = 357924 }, { counter = 0 }, { counter = 0 }, { counter = 0 }}, thread = 0xff2e8eaa82124100, link = { next = 0xffffffffc27dd780 <smi_infos>, prev = 0xffffffffc27dd780 <smi_infos> } } crash> struct si_sm_data 0xff2e8eaa120b1c00 struct si_sm_data { state = KCS_ERROR1, io = 0xff2e8ee98e8c0840, write_data = "\030\001\003\001\000\000&\030@\000\000\000\000\000\000\000\330\002\000\000\000\000\000\000\330\002\000\000\000\000\000\000\b\000\000\000\000\000\000\000\003\000\000\000\004\000\000\000\030\003\000\000\000\000\000\000\030\003\000\000\000\000\000\000\030\003\000\000\000\000\000\000\034\000\000\000\000\000\000\000\034\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\n\000\000\000\000\000\000\000\020\000\000\000\000\000\000\001\000\000\000\005\000\000\000\000\020\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\020\000\000\000\000\000\000E\005\000\000\000\000\000\000E\005\000\000\000\000\000\000\000\020\000\000\000\000\000\000\001\000\000\000\004\000\000\000\000 \000\000\000\000\000\000\000 \000\000\000\000\000\000\000 \000\000\000\000\000\000\000\003\000\000\000\000\000\000\000\003\000\000\000\000\000\000\000"..., write_pos = 0, write_count = 0, orig_write_count = 0, read_data = "\034\002\000 \002\001\000\002\277|*\000H\035!\001\000 \034\000@SDA Temp\000\a-C\374\177\200KF\000\000\006\000\000\000@ -\000\000\000\000\000\000@=\000\000\000\000\000\000@=\000\000\000\000\000\000 \002\000\000\000\000\000\000 \002\000\000\000\000\000\000\b\000\000\000\000\000\000\000\004\000\000\000\004\000\000\000\070\003\000\000\000\000\000\000\070\003\000\000\000\000\000\000\070\003\000\000\000\000\000\000 \000\000\000\000\000\000\000 
\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\004\000\000\000\004\000\000\000X\003\000\000\000\000\000\000X\003\000\000\000\000\000\000X\003\000\000\000\000\000\000D\000\000\000\000\000\000\000D\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000S\345td\004\000\000\000pz\350\320\023u\023\376\070\003\000\000\000\000\000\000\070\003\000\000\000\000\000\000 \000\000\000\000\000\000\000 \000\000\000\000\000\000\000\b\000\000\000\000\000\000\000"..., read_pos = 0, truncated = 0, error_retries = 6, ibf_timeout = 5000000, obf_timeout = 5000000, error0_timeout = 4439151592 } crash>

From the above it looks like, at the time of the crash, the state machine was at KCS_ERROR1 (si_sm_data.state) having at that moment in time handled 6 retries (si_sm_data.error_retries), but having a hosed counter of 24,503 (smi_info.stats[6]).

Looking in the smi_event_handler code, I wasn't immediately sure whether a result of SI_SM_HOSED would cause the interrupt handler to keep looping around and not allow other interrupts to fire, but the symptoms might suggest that? Although if that was the case I'm surprised we haven't seen the problem more often, we have lots of machines.

My presumption was that this:

[144482.968724] RIP: 0010:port_outb+0x13/0x20 [ipmi_si]
[144482.968735] Code: 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 66 0f af 77 18 89 d0 0f b7 57 28 01 f2 ee <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90

... as well as the thousands of state transitions I saw when reproducing the problem with debug output, and the hosed counter being very high in the vmcore, suggested that it wasn't actually stuck at a ret instruction (c3) in port_outb, but that's just where RIP was at the point the crash was taken.

Happy to collect more info from the vmcore as needed or test patches etc.

Best regards
Mark
From: Binbin Z. <zho...@lo...> - 2025-08-12 12:00:23
|
This patch adds Loongson-2K BMC IPMI support.

According to the existing design, we use software simulation to
implement the KCS interface registers: Status/Command/Data_Out/Data_In.

Also since both host side and BMC side read and write kcs status, fifo flag
is used to ensure data consistency.

The single KCS message block is as follows:

+-------------------------------------------------------------------------+
|FIFO flags| KCS register data | CMD data | KCS version | WR REQ | WR ACK |
+-------------------------------------------------------------------------+

Co-developed-by: Chong Qiao <qia...@lo...>
Signed-off-by: Chong Qiao <qia...@lo...>
Reviewed-by: Huacai Chen <che...@lo...>
Acked-by: Corey Minyard <co...@mi...>
Signed-off-by: Binbin Zhou <zho...@lo...>
---
 MAINTAINERS                      |   1 +
 drivers/char/ipmi/Kconfig        |   7 ++
 drivers/char/ipmi/Makefile       |   1 +
 drivers/char/ipmi/ipmi_si.h      |   7 ++
 drivers/char/ipmi/ipmi_si_intf.c |   4 +
 drivers/char/ipmi/ipmi_si_ls2k.c | 189 +++++++++++++++++++++++++++++++
 6 files changed, 209 insertions(+)
 create mode 100644 drivers/char/ipmi/ipmi_si_ls2k.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d50b2c3b2bb8..ce1fdc47e9f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14210,6 +14210,7 @@ LOONGSON-2K Board Management Controller (BMC) DRIVER
 M:	Binbin Zhou <zho...@lo...>
 M:	Chong Qiao <qia...@lo...>
 S:	Maintained
+F:	drivers/char/ipmi/ipmi_si_ls2k.c
 F:	drivers/mfd/ls2k-bmc-core.c
 
 LOONGSON EDAC DRIVER
diff --git a/drivers/char/ipmi/Kconfig b/drivers/char/ipmi/Kconfig
index f4adc6feb3b2..92bed266d07c 100644
--- a/drivers/char/ipmi/Kconfig
+++ b/drivers/char/ipmi/Kconfig
@@ -84,6 +84,13 @@ config IPMI_IPMB
 	  bus, and it also supports direct messaging on the bus using
 	  IPMB direct messages. This module requires I2C support.
 
+config IPMI_LS2K
+	bool 'Loongson-2K IPMI interface'
+	depends on LOONGARCH
+	select MFD_LS2K_BMC_CORE
+	help
+	  Provides a driver for Loongson-2K IPMI interfaces.
+
 config IPMI_POWERNV
 	depends on PPC_POWERNV
 	tristate 'POWERNV (OPAL firmware) IPMI interface'
diff --git a/drivers/char/ipmi/Makefile b/drivers/char/ipmi/Makefile
index e0944547c9d0..4ea450a82242 100644
--- a/drivers/char/ipmi/Makefile
+++ b/drivers/char/ipmi/Makefile
@@ -8,6 +8,7 @@ ipmi_si-y := ipmi_si_intf.o ipmi_kcs_sm.o ipmi_smic_sm.o ipmi_bt_sm.o \
 	ipmi_si_mem_io.o
 ipmi_si-$(CONFIG_HAS_IOPORT) += ipmi_si_port_io.o
 ipmi_si-$(CONFIG_PCI) += ipmi_si_pci.o
+ipmi_si-$(CONFIG_IPMI_LS2K) += ipmi_si_ls2k.o
 ipmi_si-$(CONFIG_PARISC) += ipmi_si_parisc.o
 
 obj-$(CONFIG_IPMI_HANDLER) += ipmi_msghandler.o
diff --git a/drivers/char/ipmi/ipmi_si.h b/drivers/char/ipmi/ipmi_si.h
index 508c3fd45877..687835b53da5 100644
--- a/drivers/char/ipmi/ipmi_si.h
+++ b/drivers/char/ipmi/ipmi_si.h
@@ -101,6 +101,13 @@ void ipmi_si_pci_shutdown(void);
 static inline void ipmi_si_pci_init(void) { }
 static inline void ipmi_si_pci_shutdown(void) { }
 #endif
+#ifdef CONFIG_IPMI_LS2K
+void ipmi_si_ls2k_init(void);
+void ipmi_si_ls2k_shutdown(void);
+#else
+static inline void ipmi_si_ls2k_init(void) { }
+static inline void ipmi_si_ls2k_shutdown(void) { }
+#endif
 #ifdef CONFIG_PARISC
 void ipmi_si_parisc_init(void);
 void ipmi_si_parisc_shutdown(void);
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index bb42dfe1c6a8..9c38aca16fd0 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2121,6 +2121,8 @@ static int __init init_ipmi_si(void)
 
 	ipmi_si_pci_init();
 
+	ipmi_si_ls2k_init();
+
 	ipmi_si_parisc_init();
 
 	mutex_lock(&smi_infos_lock);
@@ -2335,6 +2337,8 @@ static void cleanup_ipmi_si(void)
 
 	ipmi_si_pci_shutdown();
 
+	ipmi_si_ls2k_shutdown();
+
 	ipmi_si_parisc_shutdown();
 
 	ipmi_si_platform_shutdown();
diff --git a/drivers/char/ipmi/ipmi_si_ls2k.c b/drivers/char/ipmi/ipmi_si_ls2k.c
new file mode 100644
index 000000000000..45442c257efd
--- /dev/null
+++ b/drivers/char/ipmi/ipmi_si_ls2k.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* + * Driver for Loongson-2K BMC IPMI interface + * + * Copyright (C) 2024-2025 Loongson Technology Corporation Limited. + * + * Authors: + * Chong Qiao <qia...@lo...> + * Binbin Zhou <zho...@lo...> + */ + +#include <linux/bitfield.h> +#include <linux/ioport.h> +#include <linux/module.h> +#include <linux/types.h> + +#include "ipmi_si.h" + +#define LS2K_KCS_FIFO_IBFH 0x0 +#define LS2K_KCS_FIFO_IBFT 0x1 +#define LS2K_KCS_FIFO_OBFH 0x2 +#define LS2K_KCS_FIFO_OBFT 0x3 + +/* KCS registers */ +#define LS2K_KCS_REG_STS 0x4 +#define LS2K_KCS_REG_DATA_OUT 0x5 +#define LS2K_KCS_REG_DATA_IN 0x6 +#define LS2K_KCS_REG_CMD 0x8 + +#define LS2K_KCS_CMD_DATA 0xa +#define LS2K_KCS_VERSION 0xb +#define LS2K_KCS_WR_REQ 0xc +#define LS2K_KCS_WR_ACK 0x10 + +#define LS2K_KCS_STS_OBF BIT(0) +#define LS2K_KCS_STS_IBF BIT(1) +#define LS2K_KCS_STS_SMS_ATN BIT(2) +#define LS2K_KCS_STS_CMD BIT(3) + +#define LS2K_KCS_DATA_MASK (LS2K_KCS_STS_OBF | LS2K_KCS_STS_IBF | LS2K_KCS_STS_CMD) + +static bool ls2k_registered; + +static unsigned char ls2k_mem_inb_v0(const struct si_sm_io *io, unsigned int offset) +{ + void __iomem *addr = io->addr; + int reg_offset; + + if (offset & BIT(0)) { + reg_offset = LS2K_KCS_REG_STS; + } else { + writeb(readb(addr + LS2K_KCS_REG_STS) & ~LS2K_KCS_STS_OBF, addr + LS2K_KCS_REG_STS); + reg_offset = LS2K_KCS_REG_DATA_OUT; + } + + return readb(addr + reg_offset); +} + +static unsigned char ls2k_mem_inb_v1(const struct si_sm_io *io, unsigned int offset) +{ + void __iomem *addr = io->addr; + unsigned char inb = 0, cmd; + bool obf, ibf; + + obf = readb(addr + LS2K_KCS_FIFO_OBFH) ^ readb(addr + LS2K_KCS_FIFO_OBFT); + ibf = readb(addr + LS2K_KCS_FIFO_IBFH) ^ readb(addr + LS2K_KCS_FIFO_IBFT); + cmd = readb(addr + LS2K_KCS_CMD_DATA); + + if (offset & BIT(0)) { + inb = readb(addr + LS2K_KCS_REG_STS) & ~LS2K_KCS_DATA_MASK; + inb |= FIELD_PREP(LS2K_KCS_STS_OBF, obf) + | FIELD_PREP(LS2K_KCS_STS_IBF, ibf) + | FIELD_PREP(LS2K_KCS_STS_CMD, cmd); + } else { + inb = readb(addr + 
LS2K_KCS_REG_DATA_OUT); + writeb(readb(addr + LS2K_KCS_FIFO_OBFH), addr + LS2K_KCS_FIFO_OBFT); + } + + return inb; +} + +static void ls2k_mem_outb_v0(const struct si_sm_io *io, unsigned int offset, + unsigned char val) +{ + void __iomem *addr = io->addr; + unsigned char sts = readb(addr + LS2K_KCS_REG_STS); + int reg_offset; + + if (sts & LS2K_KCS_STS_IBF) + return; + + if (offset & BIT(0)) { + reg_offset = LS2K_KCS_REG_CMD; + sts |= LS2K_KCS_STS_CMD; + } else { + reg_offset = LS2K_KCS_REG_DATA_IN; + sts &= ~LS2K_KCS_STS_CMD; + } + + writew(val, addr + reg_offset); + writeb(sts | LS2K_KCS_STS_IBF, addr + LS2K_KCS_REG_STS); + writel(readl(addr + LS2K_KCS_WR_REQ) + 1, addr + LS2K_KCS_WR_REQ); +} + +static void ls2k_mem_outb_v1(const struct si_sm_io *io, unsigned int offset, + unsigned char val) +{ + void __iomem *addr = io->addr; + unsigned char ibfh, ibft; + int reg_offset; + + ibfh = readb(addr + LS2K_KCS_FIFO_IBFH); + ibft = readb(addr + LS2K_KCS_FIFO_IBFT); + + if (ibfh ^ ibft) + return; + + reg_offset = (offset & BIT(0)) ? LS2K_KCS_REG_CMD : LS2K_KCS_REG_DATA_IN; + writew(val, addr + reg_offset); + + writeb(offset & BIT(0), addr + LS2K_KCS_CMD_DATA); + writeb(!ibft, addr + LS2K_KCS_FIFO_IBFH); + writel(readl(addr + LS2K_KCS_WR_REQ) + 1, addr + LS2K_KCS_WR_REQ); +} + +static void ls2k_mem_cleanup(struct si_sm_io *io) +{ + if (io->addr) + iounmap(io->addr); +} + +static int ipmi_ls2k_mem_setup(struct si_sm_io *io) +{ + unsigned char version; + + io->addr = ioremap(io->addr_data, io->regspacing); + if (!io->addr) + return -EIO; + + version = readb(io->addr + LS2K_KCS_VERSION); + + io->inputb = version ? ls2k_mem_inb_v1 : ls2k_mem_inb_v0; + io->outputb = version ? 
ls2k_mem_outb_v1 : ls2k_mem_outb_v0; + io->io_cleanup = ls2k_mem_cleanup; + + return 0; +} + +static int ipmi_ls2k_probe(struct platform_device *pdev) +{ + struct si_sm_io io; + + memset(&io, 0, sizeof(io)); + + io.si_info = &ipmi_kcs_si_info; + io.io_setup = ipmi_ls2k_mem_setup; + io.addr_data = pdev->resource[0].start; + io.regspacing = resource_size(&pdev->resource[0]); + io.dev = &pdev->dev; + + dev_dbg(&pdev->dev, "addr 0x%lx, spacing %d.\n", io.addr_data, io.regspacing); + + return ipmi_si_add_smi(&io); +} + +static void ipmi_ls2k_remove(struct platform_device *pdev) +{ + ipmi_si_remove_by_dev(&pdev->dev); +} + +struct platform_driver ipmi_ls2k_platform_driver = { + .driver = { + .name = "ls2k-ipmi-si", + }, + .probe = ipmi_ls2k_probe, + .remove = ipmi_ls2k_remove, +}; + +void ipmi_si_ls2k_init(void) +{ + platform_driver_register(&ipmi_ls2k_platform_driver); + ls2k_registered = true; +} + +void ipmi_si_ls2k_shutdown(void) +{ + if (ls2k_registered) + platform_driver_unregister(&ipmi_ls2k_platform_driver); +} -- 2.47.3 |
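The head/tail flag scheme used by the v1 accessors above can be tried out in user space. The following is only a minimal sketch: the byte offsets are taken from the patch, but the `sim_` names, the plain array standing in for the shared memory block, and the two BMC-side helpers are hypothetical (the BMC firmware behaviour is an assumption, not something the patch shows). The command/data flag and the WR_REQ counter are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offsets taken from the patch; everything else here is hypothetical. */
enum {
	SIM_FIFO_IBFH    = 0x0,  /* input head flag, toggled by the host on write */
	SIM_FIFO_IBFT    = 0x1,  /* input tail flag, assumed to be updated by the BMC */
	SIM_FIFO_OBFH    = 0x2,  /* output head flag, assumed to be toggled by the BMC */
	SIM_FIFO_OBFT    = 0x3,  /* output tail flag, updated by the host on read */
	SIM_REG_DATA_OUT = 0x5,
	SIM_REG_DATA_IN  = 0x6,
	SIM_BLOCK_SIZE   = 0x1c,
};

/* A buffer counts as full whenever its head and tail flags differ. */
bool sim_ibf(const uint8_t *blk)
{
	return (blk[SIM_FIFO_IBFH] ^ blk[SIM_FIFO_IBFT]) != 0;
}

bool sim_obf(const uint8_t *blk)
{
	return (blk[SIM_FIFO_OBFH] ^ blk[SIM_FIFO_OBFT]) != 0;
}

/*
 * Host write, modelled on ls2k_mem_outb_v1(): the byte is dropped while IBF
 * is set. The bool return value is an addition for this sketch only.
 */
bool sim_host_write(uint8_t *blk, uint8_t val)
{
	if (sim_ibf(blk))
		return false;
	blk[SIM_REG_DATA_IN] = val;
	blk[SIM_FIFO_IBFH] = !blk[SIM_FIFO_IBFT];  /* make head differ from tail */
	return true;
}

/* Host read, modelled on ls2k_mem_inb_v1(): fetch, then acknowledge. */
uint8_t sim_host_read(uint8_t *blk)
{
	uint8_t val = blk[SIM_REG_DATA_OUT];

	blk[SIM_FIFO_OBFT] = blk[SIM_FIFO_OBFH];   /* tail catches up with head */
	return val;
}

/* Assumed BMC side: consume the input byte and clear IBF. */
uint8_t sim_bmc_consume(uint8_t *blk)
{
	uint8_t val = blk[SIM_REG_DATA_IN];

	blk[SIM_FIFO_IBFT] = blk[SIM_FIFO_IBFH];
	return val;
}

/* Assumed BMC side: publish an output byte and raise OBF. */
void sim_bmc_reply(uint8_t *blk, uint8_t val)
{
	blk[SIM_REG_DATA_OUT] = val;
	blk[SIM_FIFO_OBFH] = !blk[SIM_FIFO_OBFT];
}
```

Because each side only ever writes its own flag (the producer writes the head, the consumer writes the tail), neither side needs a read-modify-write of a shared status byte, which appears to be the data-consistency point made in the commit message.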
From: Binbin Z. <zho...@lo...> - 2025-08-12 12:00:14
|
The Loongson-2K Board Management Controller provides a PCIe interface through which the host accesses the features implemented in the BMC.

The BMC is fitted to servers such as those built around the Loongson-3 CPU. It supports multiple sub-devices, such as DRM and IPMI.

Co-developed-by: Chong Qiao <qia...@lo...>
Signed-off-by: Chong Qiao <qia...@lo...>
Reviewed-by: Huacai Chen <che...@lo...>
Acked-by: Corey Minyard <co...@mi...>
Signed-off-by: Binbin Zhou <zho...@lo...>
---
 MAINTAINERS | 6 ++
 drivers/mfd/Kconfig | 13 +++
 drivers/mfd/Makefile | 2 +
 drivers/mfd/ls2k-bmc-core.c | 189 ++++++++++++++++++++++++++++++++++++
 4 files changed, 210 insertions(+)
 create mode 100644 drivers/mfd/ls2k-bmc-core.c

diff --git a/MAINTAINERS b/MAINTAINERS index 0f84051ef044..d50b2c3b2bb8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14206,6 +14206,12 @@ S: Maintained F: Documentation/devicetree/bindings/thermal/loongson,ls2k-thermal.yaml F: drivers/thermal/loongson2_thermal.c +LOONGSON-2K Board Management Controller (BMC) DRIVER +M: Binbin Zhou <zho...@lo...> +M: Chong Qiao <qia...@lo...> +S: Maintained +F: drivers/mfd/ls2k-bmc-core.c + LOONGSON EDAC DRIVER M: Zhao Qunqin <zha...@lo...> L: lin...@vg... diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig index 425c5fba6cb1..55fbeba2ca33 100644 --- a/drivers/mfd/Kconfig +++ b/drivers/mfd/Kconfig @@ -2428,6 +2428,19 @@ config MFD_INTEL_M10_BMC_PMCI additional drivers must be enabled in order to use the functionality of the device. +config MFD_LS2K_BMC_CORE + bool "Loongson-2K Board Management Controller Support" + depends on PCI && ACPI_GENERIC_GSI + select MFD_CORE + help + Say yes here to add support for the Loongson-2K BMC which is a Board + Management Controller connected to the PCIe bus. The device supports + multiple sub-devices like display and IPMI. This driver provides common + support for accessing the devices.
+ + The display is enabled by default in the driver, while the IPMI interface + is enabled independently through the IPMI_LS2K option in the IPMI section. + config MFD_QNAP_MCU tristate "QNAP microcontroller unit core driver" depends on SERIAL_DEV_BUS diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile index f7bdedd5a66d..a950e670efba 100644 --- a/drivers/mfd/Makefile +++ b/drivers/mfd/Makefile @@ -286,6 +286,8 @@ obj-$(CONFIG_MFD_INTEL_M10_BMC_CORE) += intel-m10-bmc-core.o obj-$(CONFIG_MFD_INTEL_M10_BMC_SPI) += intel-m10-bmc-spi.o obj-$(CONFIG_MFD_INTEL_M10_BMC_PMCI) += intel-m10-bmc-pmci.o +obj-$(CONFIG_MFD_LS2K_BMC_CORE) += ls2k-bmc-core.o + obj-$(CONFIG_MFD_ATC260X) += atc260x-core.o obj-$(CONFIG_MFD_ATC260X_I2C) += atc260x-i2c.o diff --git a/drivers/mfd/ls2k-bmc-core.c b/drivers/mfd/ls2k-bmc-core.c new file mode 100644 index 000000000000..39cc481d9ba1 --- /dev/null +++ b/drivers/mfd/ls2k-bmc-core.c @@ -0,0 +1,189 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Loongson-2K Board Management Controller (BMC) Core Driver. + * + * Copyright (C) 2024-2025 Loongson Technology Corporation Limited. 
+ * + * Authors: + * Chong Qiao <qia...@lo...> + * Binbin Zhou <zho...@lo...> + */ + +#include <linux/aperture.h> +#include <linux/errno.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/mfd/core.h> +#include <linux/module.h> +#include <linux/pci.h> +#include <linux/pci_ids.h> +#include <linux/platform_data/simplefb.h> +#include <linux/platform_device.h> + +/* LS2K BMC resources */ +#define LS2K_DISPLAY_RES_START (SZ_16M + SZ_2M) +#define LS2K_IPMI_RES_SIZE 0x1C +#define LS2K_IPMI0_RES_START (SZ_16M + 0xF00000) +#define LS2K_IPMI1_RES_START (LS2K_IPMI0_RES_START + LS2K_IPMI_RES_SIZE) +#define LS2K_IPMI2_RES_START (LS2K_IPMI1_RES_START + LS2K_IPMI_RES_SIZE) +#define LS2K_IPMI3_RES_START (LS2K_IPMI2_RES_START + LS2K_IPMI_RES_SIZE) +#define LS2K_IPMI4_RES_START (LS2K_IPMI3_RES_START + LS2K_IPMI_RES_SIZE) + +enum { + LS2K_BMC_DISPLAY, + LS2K_BMC_IPMI0, + LS2K_BMC_IPMI1, + LS2K_BMC_IPMI2, + LS2K_BMC_IPMI3, + LS2K_BMC_IPMI4, +}; + +static struct resource ls2k_display_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_DISPLAY_RES_START, SZ_4M, "simpledrm-res"), +}; + +static struct resource ls2k_ipmi0_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_IPMI0_RES_START, LS2K_IPMI_RES_SIZE, "ipmi0-res"), +}; + +static struct resource ls2k_ipmi1_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_IPMI1_RES_START, LS2K_IPMI_RES_SIZE, "ipmi1-res"), +}; + +static struct resource ls2k_ipmi2_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_IPMI2_RES_START, LS2K_IPMI_RES_SIZE, "ipmi2-res"), +}; + +static struct resource ls2k_ipmi3_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_IPMI3_RES_START, LS2K_IPMI_RES_SIZE, "ipmi3-res"), +}; + +static struct resource ls2k_ipmi4_resources[] = { + DEFINE_RES_MEM_NAMED(LS2K_IPMI4_RES_START, LS2K_IPMI_RES_SIZE, "ipmi4-res"), +}; + +static struct mfd_cell ls2k_bmc_cells[] = { + [LS2K_BMC_DISPLAY] = { + .name = "simple-framebuffer", + .num_resources = ARRAY_SIZE(ls2k_display_resources), + .resources = ls2k_display_resources + }, + [LS2K_BMC_IPMI0] = { + 
.name = "ls2k-ipmi-si", + .num_resources = ARRAY_SIZE(ls2k_ipmi0_resources), + .resources = ls2k_ipmi0_resources + }, + [LS2K_BMC_IPMI1] = { + .name = "ls2k-ipmi-si", + .num_resources = ARRAY_SIZE(ls2k_ipmi1_resources), + .resources = ls2k_ipmi1_resources + }, + [LS2K_BMC_IPMI2] = { + .name = "ls2k-ipmi-si", + .num_resources = ARRAY_SIZE(ls2k_ipmi2_resources), + .resources = ls2k_ipmi2_resources + }, + [LS2K_BMC_IPMI3] = { + .name = "ls2k-ipmi-si", + .num_resources = ARRAY_SIZE(ls2k_ipmi3_resources), + .resources = ls2k_ipmi3_resources + }, + [LS2K_BMC_IPMI4] = { + .name = "ls2k-ipmi-si", + .num_resources = ARRAY_SIZE(ls2k_ipmi4_resources), + .resources = ls2k_ipmi4_resources + }, +}; + +/* + * Currently the Loongson-2K BMC hardware does not have an I2C interface to adapt to the + * resolution. We set the resolution by presetting "video=1280x1024-16@2M" to the BMC memory. + */ +static int ls2k_bmc_parse_mode(struct pci_dev *pdev, struct simplefb_platform_data *pd) +{ + char *mode; + int depth, ret; + + /* The last 16M of PCI BAR0 is used to store the resolution string. */ + mode = devm_ioremap(&pdev->dev, pci_resource_start(pdev, 0) + SZ_16M, SZ_16M); + if (!mode) + return -ENOMEM; + + /* The resolution field starts with the flag "video=". */ + if (!strncmp(mode, "video=", 6)) + mode = mode + 6; + + ret = kstrtoint(strsep(&mode, "x"), 10, &pd->width); + if (ret) + return ret; + + ret = kstrtoint(strsep(&mode, "-"), 10, &pd->height); + if (ret) + return ret; + + ret = kstrtoint(strsep(&mode, "@"), 10, &depth); + if (ret) + return ret; + + pd->stride = pd->width * depth / 8; + pd->format = depth == 32 ? 
"a8r8g8b8" : "r5g6b5"; + + return 0; +} + +static int ls2k_bmc_probe(struct pci_dev *dev, const struct pci_device_id *id) +{ + struct simplefb_platform_data pd; + resource_size_t base; + int ret; + + ret = pci_enable_device(dev); + if (ret) + return ret; + + ret = ls2k_bmc_parse_mode(dev, &pd); + if (ret) + goto disable_pci; + + ls2k_bmc_cells[LS2K_BMC_DISPLAY].platform_data = &pd; + ls2k_bmc_cells[LS2K_BMC_DISPLAY].pdata_size = sizeof(pd); + base = dev->resource[0].start + LS2K_DISPLAY_RES_START; + + /* Remove conflicting efifb device */ + ret = aperture_remove_conflicting_devices(base, SZ_4M, "simple-framebuffer"); + if (ret) { + dev_err(&dev->dev, "Failed to removed firmware framebuffers: %d\n", ret); + goto disable_pci; + } + + return devm_mfd_add_devices(&dev->dev, PLATFORM_DEVID_AUTO, + ls2k_bmc_cells, ARRAY_SIZE(ls2k_bmc_cells), + &dev->resource[0], 0, NULL); + +disable_pci: + pci_disable_device(dev); + return ret; +} + +static void ls2k_bmc_remove(struct pci_dev *dev) +{ + pci_disable_device(dev); +} + +static struct pci_device_id ls2k_bmc_devices[] = { + { PCI_DEVICE(PCI_VENDOR_ID_LOONGSON, 0x1a05) }, + { } +}; +MODULE_DEVICE_TABLE(pci, ls2k_bmc_devices); + +static struct pci_driver ls2k_bmc_driver = { + .name = "ls2k-bmc", + .id_table = ls2k_bmc_devices, + .probe = ls2k_bmc_probe, + .remove = ls2k_bmc_remove, +}; +module_pci_driver(ls2k_bmc_driver); + +MODULE_DESCRIPTION("Loongson-2K Board Management Controller (BMC) Core driver"); +MODULE_AUTHOR("Loongson Technology Corporation Limited"); +MODULE_LICENSE("GPL"); -- 2.47.3 |
From: Binbin Z. <zho...@lo...> - 2025-08-12 12:00:14
|
The display is a sub-function of the Loongson-2K BMC, so when the BMC resets, the entire BMC PCIe link is disconnected and the display is interrupted along with it.

Quick overview of the entire LS2K BMC reset process:

There are two types of reset: soft reset (BMC-initiated reboot of IPMI reset command) and BMC watchdog reset (watchdog timeout).

First, regardless of the method, an interrupt is generated (a PCIe interrupt for soft reset, a GPIO interrupt for watchdog reset).

Second, while handling the interrupt, the system enters bmc_reset_work, clears the bus/IO/mem resources of the LS7A PCI-E bridge, waits for the BMC reset to begin, restores the parent device's PCI configuration space, waits for the BMC reset to complete, and finally restores the BMC PCI configuration space.

Display restoration occurs last.

Co-developed-by: Chong Qiao <qia...@lo...>
Signed-off-by: Chong Qiao <qia...@lo...>
Reviewed-by: Huacai Chen <che...@lo...>
Acked-by: Corey Minyard <co...@mi...>
Signed-off-by: Binbin Zhou <zho...@lo...>
---
 drivers/mfd/ls2k-bmc-core.c | 336 ++++++++++++++++++++++++++++++++++++
 1 file changed, 336 insertions(+)

diff --git a/drivers/mfd/ls2k-bmc-core.c b/drivers/mfd/ls2k-bmc-core.c index 39cc481d9ba1..ec94526628aa 100644 --- a/drivers/mfd/ls2k-bmc-core.c +++ b/drivers/mfd/ls2k-bmc-core.c @@ -10,8 +10,12 @@ */ #include <linux/aperture.h> +#include <linux/bitfield.h> +#include <linux/delay.h> #include <linux/errno.h> #include <linux/init.h> +#include <linux/iopoll.h> +#include <linux/kbd_kern.h> #include <linux/kernel.h> #include <linux/mfd/core.h> #include <linux/module.h> @@ -19,6 +23,8 @@ #include <linux/pci_ids.h> #include <linux/platform_data/simplefb.h> #include <linux/platform_device.h> +#include <linux/stop_machine.h> +#include <linux/vt_kern.h> /* LS2K BMC resources */ #define LS2K_DISPLAY_RES_START (SZ_16M + SZ_2M) @@ -29,6 +35,48 @@ #define LS2K_IPMI3_RES_START (LS2K_IPMI2_RES_START + LS2K_IPMI_RES_SIZE) #define LS2K_IPMI4_RES_START
(LS2K_IPMI3_RES_START + LS2K_IPMI_RES_SIZE) +#define LS7A_PCI_CFG_SIZE 0x100 + +/* LS7A bridge registers */ +#define LS7A_PCIE_PORT_CTL0 0x0 +#define LS7A_PCIE_PORT_STS1 0xC +#define LS7A_GEN2_CTL 0x80C +#define LS7A_SYMBOL_TIMER 0x71C + +/* Bits of LS7A_PCIE_PORT_CTL0 */ +#define LS2K_BMC_PCIE_LTSSM_ENABLE BIT(3) + +/* Bits of LS7A_PCIE_PORT_STS1 */ +#define LS2K_BMC_PCIE_LTSSM_STS GENMASK(5, 0) +#define LS2K_BMC_PCIE_CONNECTED 0x11 + +#define LS2K_BMC_PCIE_DELAY_US 1000 +#define LS2K_BMC_PCIE_TIMEOUT_US 1000000 + +/* Bits of LS7A_GEN2_CTL */ +#define LS7A_GEN2_SPEED_CHANG BIT(17) +#define LS7A_CONF_PHY_TX BIT(18) + +/* Bits of LS7A_SYMBOL_TIMER */ +#define LS7A_MASK_LEN_MATCH BIT(26) + +/* Interval between interruptions */ +#define LS2K_BMC_INT_INTERVAL (60 * HZ) + +/* Maximum time to wait for U-Boot and DDR to be ready with ms. */ +#define LS2K_BMC_RESET_WAIT_TIME 10000 + +/* It's an experience value */ +#define LS7A_BAR0_CHECK_MAX_TIMES 2000 + +#define LS2K_BMC_RESET_GPIO 14 +#define LOONGSON_GPIO_REG_BASE 0x1FE00500 +#define LOONGSON_GPIO_REG_SIZE 0x18 +#define LOONGSON_GPIO_OEN 0x0 +#define LOONGSON_GPIO_FUNC 0x4 +#define LOONGSON_GPIO_INTPOL 0x10 +#define LOONGSON_GPIO_INTEN 0x14 + enum { LS2K_BMC_DISPLAY, LS2K_BMC_IPMI0, @@ -95,6 +143,281 @@ static struct mfd_cell ls2k_bmc_cells[] = { }, }; +/* Index of the BMC PCI configuration space to be restored at BMC reset. */ +struct ls2k_bmc_pci_data { + u32 pci_command; + u32 base_address0; + u32 interrupt_line; +}; + +/* Index of the parent PCI configuration space to be restored at BMC reset. 
*/ +struct ls2k_bmc_bridge_pci_data { + u32 pci_command; + u32 base_address[6]; + u32 rom_addreess; + u32 interrupt_line; + u32 msi_hi; + u32 msi_lo; + u32 devctl; + u32 linkcap; + u32 linkctl_sts; + u32 symbol_timer; + u32 gen2_ctrl; +}; + +struct ls2k_bmc_pdata { + struct device *dev; + struct work_struct bmc_reset_work; + struct ls2k_bmc_pci_data bmc_pci_data; + struct ls2k_bmc_bridge_pci_data bridge_pci_data; +}; + +static bool ls2k_bmc_bar0_addr_is_set(struct pci_dev *pdev) +{ + u32 addr; + + pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0, &addr); + + return addr & PCI_BASE_ADDRESS_MEM_MASK ? true : false; +} + +static bool ls2k_bmc_pcie_is_connected(struct pci_dev *parent, struct ls2k_bmc_pdata *ddata) +{ + void __iomem *base; + int val, ret; + + base = pci_iomap(parent, 0, LS7A_PCI_CFG_SIZE); + if (!base) + return false; + + val = readl(base + LS7A_PCIE_PORT_CTL0); + writel(val | LS2K_BMC_PCIE_LTSSM_ENABLE, base + LS7A_PCIE_PORT_CTL0); + + ret = readl_poll_timeout_atomic(base + LS7A_PCIE_PORT_STS1, val, + (val & LS2K_BMC_PCIE_LTSSM_STS) == LS2K_BMC_PCIE_CONNECTED, + LS2K_BMC_PCIE_DELAY_US, LS2K_BMC_PCIE_TIMEOUT_US); + if (ret) { + pci_iounmap(parent, base); + dev_err(ddata->dev, "PCI-E training failed status=0x%x\n", val); + return false; + } + + pci_iounmap(parent, base); + return true; +} + +static void ls2k_bmc_restore_bridge_pci_data(struct pci_dev *parent, struct ls2k_bmc_pdata *ddata) +{ + int base, i = 0; + + pci_write_config_dword(parent, PCI_COMMAND, ddata->bridge_pci_data.pci_command); + + for (base = PCI_BASE_ADDRESS_0; base <= PCI_BASE_ADDRESS_5; base += 4, i++) + pci_write_config_dword(parent, base, ddata->bridge_pci_data.base_address[i]); + + pci_write_config_dword(parent, PCI_ROM_ADDRESS, ddata->bridge_pci_data.rom_addreess); + pci_write_config_dword(parent, PCI_INTERRUPT_LINE, ddata->bridge_pci_data.interrupt_line); + + pci_write_config_dword(parent, parent->msi_cap + PCI_MSI_ADDRESS_LO, + ddata->bridge_pci_data.msi_lo); + 
pci_write_config_dword(parent, parent->msi_cap + PCI_MSI_ADDRESS_HI, + ddata->bridge_pci_data.msi_hi); + pci_write_config_dword(parent, parent->pcie_cap + PCI_EXP_DEVCTL, + ddata->bridge_pci_data.devctl); + pci_write_config_dword(parent, parent->pcie_cap + PCI_EXP_LNKCAP, + ddata->bridge_pci_data.linkcap); + pci_write_config_dword(parent, parent->pcie_cap + PCI_EXP_LNKCTL, + ddata->bridge_pci_data.linkctl_sts); + + pci_write_config_dword(parent, LS7A_GEN2_CTL, ddata->bridge_pci_data.gen2_ctrl); + pci_write_config_dword(parent, LS7A_SYMBOL_TIMER, ddata->bridge_pci_data.symbol_timer); +} + +static int ls2k_bmc_recover_pci_data(void *data) +{ + struct ls2k_bmc_pdata *ddata = data; + struct pci_dev *pdev = to_pci_dev(ddata->dev); + struct pci_dev *parent = pdev->bus->self; + u32 i; + + /* + * Clear the bus, io and mem resources of the PCI-E bridge to zero, so that + * the processor can not access the LS2K PCI-E port, to avoid crashing due to + * the lack of return signal from accessing the LS2K PCI-E port. + */ + pci_write_config_dword(parent, PCI_BASE_ADDRESS_2, 0); + pci_write_config_dword(parent, PCI_BASE_ADDRESS_3, 0); + pci_write_config_dword(parent, PCI_BASE_ADDRESS_4, 0); + + /* + * When the LS2K BMC is reset, the LS7A PCI-E port is also reset, and its PCI + * BAR0 register is cleared. Due to the time gap between the GPIO interrupt + * generation and the LS2K BMC reset, the LS7A PCI BAR0 register is read to + * determine whether the reset has begun. 
+ */ + for (i = LS7A_BAR0_CHECK_MAX_TIMES; i > 0 ; i--) { + if (!ls2k_bmc_bar0_addr_is_set(parent)) + break; + mdelay(1); + }; + + if (i == 0) + return false; + + ls2k_bmc_restore_bridge_pci_data(parent, ddata); + + /* Check if PCI-E is connected */ + if (!ls2k_bmc_pcie_is_connected(parent, ddata)) + return false; + + /* Waiting for U-Boot and DDR ready */ + mdelay(LS2K_BMC_RESET_WAIT_TIME); + if (!ls2k_bmc_bar0_addr_is_set(parent)) + return false; + + /* Restore LS2K BMC PCI-E config data */ + pci_write_config_dword(pdev, PCI_COMMAND, ddata->bmc_pci_data.pci_command); + pci_write_config_dword(pdev, PCI_BASE_ADDRESS_0, ddata->bmc_pci_data.base_address0); + pci_write_config_dword(pdev, PCI_INTERRUPT_LINE, ddata->bmc_pci_data.interrupt_line); + + return 0; +} + +static void ls2k_bmc_events_fn(struct work_struct *work) +{ + struct ls2k_bmc_pdata *ddata = container_of(work, struct ls2k_bmc_pdata, bmc_reset_work); + + /* + * The PCI-E is lost when the BMC resets, at which point access to the PCI-E + * from other CPUs is suspended to prevent a crash. + */ + stop_machine(ls2k_bmc_recover_pci_data, ddata, NULL); + + if (IS_ENABLED(CONFIG_VT)) { + /* Re-push the display due to previous PCI-E loss. */ + set_console(vt_move_to_console(MAX_NR_CONSOLES - 1, 1)); + } +} + +static irqreturn_t ls2k_bmc_interrupt(int irq, void *arg) +{ + struct ls2k_bmc_pdata *ddata = arg; + static unsigned long last_jiffies; + + if (system_state != SYSTEM_RUNNING) + return IRQ_HANDLED; + + /* Skip interrupt in LS2K_BMC_INT_INTERVAL */ + if (time_after(jiffies, last_jiffies + LS2K_BMC_INT_INTERVAL)) { + schedule_work(&ddata->bmc_reset_work); + last_jiffies = jiffies; + } + + return IRQ_HANDLED; +} + +/* + * Saves the BMC parent device (LS7A) and its own PCI configuration space registers + * that need to be restored after BMC reset. 
+ */ +static void ls2k_bmc_save_pci_data(struct pci_dev *pdev, struct ls2k_bmc_pdata *ddata) +{ + struct pci_dev *parent = pdev->bus->self; + int base, i = 0; + + pci_read_config_dword(parent, PCI_COMMAND, &ddata->bridge_pci_data.pci_command); + + for (base = PCI_BASE_ADDRESS_0; base <= PCI_BASE_ADDRESS_5; base += 4, i++) + pci_read_config_dword(parent, base, &ddata->bridge_pci_data.base_address[i]); + + pci_read_config_dword(parent, PCI_ROM_ADDRESS, &ddata->bridge_pci_data.rom_addreess); + pci_read_config_dword(parent, PCI_INTERRUPT_LINE, &ddata->bridge_pci_data.interrupt_line); + + pci_read_config_dword(parent, parent->msi_cap + PCI_MSI_ADDRESS_LO, + &ddata->bridge_pci_data.msi_lo); + pci_read_config_dword(parent, parent->msi_cap + PCI_MSI_ADDRESS_HI, + &ddata->bridge_pci_data.msi_hi); + + pci_read_config_dword(parent, parent->pcie_cap + PCI_EXP_DEVCTL, + &ddata->bridge_pci_data.devctl); + pci_read_config_dword(parent, parent->pcie_cap + PCI_EXP_LNKCAP, + &ddata->bridge_pci_data.linkcap); + pci_read_config_dword(parent, parent->pcie_cap + PCI_EXP_LNKCTL, + &ddata->bridge_pci_data.linkctl_sts); + + pci_read_config_dword(parent, LS7A_GEN2_CTL, &ddata->bridge_pci_data.gen2_ctrl); + ddata->bridge_pci_data.gen2_ctrl |= FIELD_PREP(LS7A_GEN2_SPEED_CHANG, 0x1) + | FIELD_PREP(LS7A_CONF_PHY_TX, 0x0); + + pci_read_config_dword(parent, LS7A_SYMBOL_TIMER, &ddata->bridge_pci_data.symbol_timer); + ddata->bridge_pci_data.symbol_timer |= LS7A_MASK_LEN_MATCH; + + pci_read_config_dword(pdev, PCI_COMMAND, &ddata->bmc_pci_data.pci_command); + pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0, &ddata->bmc_pci_data.base_address0); + pci_read_config_dword(pdev, PCI_INTERRUPT_LINE, &ddata->bmc_pci_data.interrupt_line); +} + +static int ls2k_bmc_pdata_initial(struct ls2k_bmc_pdata *ddata) +{ + struct pci_dev *pdev = to_pci_dev(ddata->dev); + int gsi = 16 + (LS2K_BMC_RESET_GPIO & 7); + void __iomem *gpio_base; + int irq, ret, val; + + ls2k_bmc_save_pci_data(pdev, ddata); + + 
INIT_WORK(&ddata->bmc_reset_work, ls2k_bmc_events_fn); + + ret = devm_request_irq(&pdev->dev, pdev->irq, ls2k_bmc_interrupt, + IRQF_SHARED | IRQF_TRIGGER_FALLING, "ls2kbmc pcie", ddata); + if (ret) { + dev_err(ddata->dev, "Failed to request LS2KBMC PCI-E IRQ %d.\n", pdev->irq); + return ret; + } + + /* + * Since gpio_chip->to_irq is not implemented in the Loongson-3 GPIO driver, + * acpi_register_gsi() is used to obtain the GPIO IRQ. The GPIO interrupt is a + * watchdog interrupt that is triggered when the BMC resets. + */ + irq = acpi_register_gsi(NULL, gsi, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW); + if (irq < 0) + return irq; + + gpio_base = ioremap(LOONGSON_GPIO_REG_BASE, LOONGSON_GPIO_REG_SIZE); + if (!gpio_base) { + ret = PTR_ERR(gpio_base); + goto acpi_failed; + } + + /* Disable GPIO output */ + val = readl(gpio_base + LOONGSON_GPIO_OEN); + writel(val | BIT(LS2K_BMC_RESET_GPIO), gpio_base + LOONGSON_GPIO_OEN); + + /* Enable GPIO functionality */ + val = readl(gpio_base + LOONGSON_GPIO_FUNC); + writel(val & ~BIT(LS2K_BMC_RESET_GPIO), gpio_base + LOONGSON_GPIO_FUNC); + + /* Set GPIO interrupts to low-level active */ + val = readl(gpio_base + LOONGSON_GPIO_INTPOL); + writel(val & ~BIT(LS2K_BMC_RESET_GPIO), gpio_base + LOONGSON_GPIO_INTPOL); + + /* Enable GPIO interrupts */ + val = readl(gpio_base + LOONGSON_GPIO_INTEN); + writel(val | BIT(LS2K_BMC_RESET_GPIO), gpio_base + LOONGSON_GPIO_INTEN); + + ret = devm_request_irq(ddata->dev, irq, ls2k_bmc_interrupt, + IRQF_SHARED | IRQF_TRIGGER_FALLING, "ls2kbmc gpio", ddata); + if (ret) + dev_err(ddata->dev, "Failed to request LS2KBMC GPIO IRQ %d.\n", irq); + + iounmap(gpio_base); + +acpi_failed: + acpi_unregister_gsi(gsi); + return ret; +} + /* * Currently the Loongson-2K BMC hardware does not have an I2C interface to adapt to the * resolution. We set the resolution by presetting "video=1280x1024-16@2M" to the BMC memory. 
@@ -134,6 +457,7 @@ static int ls2k_bmc_parse_mode(struct pci_dev *pdev, struct simplefb_platform_da static int ls2k_bmc_probe(struct pci_dev *dev, const struct pci_device_id *id) { struct simplefb_platform_data pd; + struct ls2k_bmc_pdata *ddata; resource_size_t base; int ret; @@ -141,6 +465,18 @@ static int ls2k_bmc_probe(struct pci_dev *dev, const struct pci_device_id *id) if (ret) return ret; + ddata = devm_kzalloc(&dev->dev, sizeof(*ddata), GFP_KERNEL); + if (IS_ERR(ddata)) { + ret = -ENOMEM; + goto disable_pci; + } + + ddata->dev = &dev->dev; + + ret = ls2k_bmc_pdata_initial(ddata); + if (ret) + goto disable_pci; + ret = ls2k_bmc_parse_mode(dev, &pd); if (ret) goto disable_pci; -- 2.47.3 |
From: Binbin Z. <zho...@lo...> - 2025-08-12 12:00:05
|
Hi all:

This patchset introduces the Loongson-2K BMC. It is a PCIe device found on servers such as those built around Loongson-3 CPUs. It is a multifunction device (MFD); the display, for example, is one of its sub-functions.

For IPMI, following the existing design, we implement the KCS interface registers (Status/Command/Data_Out/Data_In) through software simulation. Since both the host side and the BMC side read and write the KCS status, we use FIFO pointers to ensure data consistency.

For the display, which is based on simpledrm, the resolution is read from a fixed position in the BMC, since the hardware does not support auto-detection of the resolution. We will try to support multiple resolutions later through a vbios-like approach.

For the BMC reset function, since the display is disconnected when the BMC resets, we add special handling to re-push the display.

Based on this, the series consists of three patches:
patch-1: BMC device PCI resource allocation.
patch-2: BMC reset function support.
patch-3: IPMI implementation.

Thanks.

-------
V9:
Patch (2/3):
 - PCIE -> PCI-E in dev_err();
 - Separate the read from the write;

Link to V8:
https://lore.kernel.org/all/cov...@lo.../

V8:
Patch (1/3):
 - Similar to as3711_subdevs, identify elements in ls2k_bmc_cells.
Patch (2/3):
 - Rename variables using usual names, such as `priv` -> `ddata`;
 - Use if statements instead of #ifery;
 - Rewrite the error message to ensure it is easy to understand;
 - ls2k_bmc_pdata_initial(dev, priv); -> ls2k_bmc_pdata_initial(priv);

Link to V7:
https://lore.kernel.org/all/cov...@lo.../

V7:
Patch (1/3):
 - Fix build warning by lkp: Add depend on ACPI_GENERIC_GSI
   - https://lore.kernel.org/all/202...@in.../

Link to V6:
https://lore.kernel.org/all/cov...@lo.../

V6:
 - Add Acked-by tag from Corey, thanks;
Patch (1/3):
 - Fix build warning by lkp: Add depend on PCI
   - https://lore.kernel.org/all/202...@in.../
   - https://lore.kernel.org/all/202...@in.../
   - https://lore.kernel.org/all/202...@in.../
   - https://lore.kernel.org/all/202...@in.../

Link to V5:
https://lore.kernel.org/all/cov...@lo.../

V5:
Patch (1/3):
 - Rename ls2kbmc-mfd.c to ls2k-bmc-core.c;
 - Rename MFD_LS2K_BMC to MFD_LS2K_BMC_CORE and update its help text.
Patch (3/3):
 - Add an IPMI_LS2K config in the IPMI section that enables the IPMI interface and selects MFD_LS2K_BMC_CORE.

Link to V4:
https://lore.kernel.org/all/cov...@lo.../

V4:
 - Add Reviewed-by tag;
 - Change the order of the patches.
Patch (1/3):
 - Fix build warning by lkp: Kconfig tristate -> bool
   - https://lore.kernel.org/all/202...@in.../
 - Update commit message;
 - Move MFD_LS2K_BMC after MFD_INTEL_M10_BMC_PMCI in Kconfig and Makefile.
Patch (2/3):
 - Remove unnecessary newlines;
 - Rename ls2k_bmc_check_pcie_connected() to ls2k_bmc_pcie_is_connected();
 - Update comment message.
Patch (3/3):
 - Remove unnecessary newlines.

Link to V3:
https://lore.kernel.org/all/cov...@lo.../

V3:
Patch (1/3):
 - Drop "MFD" in title and comment;
 - Formatting code;
 - Add clearer comments.
Patch (2/3):
 - Rebase linux-ipmi/next tree;
 - Use readx()/writex() to read and write IPMI data instead of structure pointer references;
 - CONFIG_LOONGARCH -> MFD_LS2K_BMC;
 - Drop unused output.
Patch (3/3):
 - Inline the ls2k_bmc_gpio_reset_handler() function to ls2k_bmc_pdata_initial();
 - Add clearer comments.
 - Use proper multi-line commentary as per the Coding Style documentation;
 - Define all magic numbers.

Link to V2:
https://lore.kernel.org/all/cov...@lo.../

V2:
 - Drop ls2kdrm, use simpledrm instead.
Patch (1/3):
 - Use DEFINE_RES_MEM_NAMED/MFD_CELL_RES simplified code;
 - Add resolution fetching due to replacing the original display solution with simpledrm;
 - Add aperture_remove_conflicting_devices() to avoid efifb conflict with simpledrm.
Patch (3/3):
 - This part of the function, moved from the original ls2kdrm to mfd;
 - Use set_console to implement the Re-push display function.

Link to V1:
https://lore.kernel.org/all/cov...@lo.../

Binbin Zhou (3):
  mfd: ls2kbmc: Introduce Loongson-2K BMC core driver
  mfd: ls2kbmc: Add Loongson-2K BMC reset function support
  ipmi: Add Loongson-2K BMC support

 MAINTAINERS | 7 +
 drivers/char/ipmi/Kconfig | 7 +
 drivers/char/ipmi/Makefile | 1 +
 drivers/char/ipmi/ipmi_si.h | 7 +
 drivers/char/ipmi/ipmi_si_intf.c | 4 +
 drivers/char/ipmi/ipmi_si_ls2k.c | 189 +++++++++++
 drivers/mfd/Kconfig | 13 +
 drivers/mfd/Makefile | 2 +
 drivers/mfd/ls2k-bmc-core.c | 525 +++++++++++++++++++++++++++++++
 9 files changed, 755 insertions(+)
 create mode 100644 drivers/char/ipmi/ipmi_si_ls2k.c
 create mode 100644 drivers/mfd/ls2k-bmc-core.c

base-commit: 006aa8f57f55dd5bf68c4ada1e0d3f4e59027d71
--
2.47.3 |
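For readers who want to experiment with the resolution string mentioned in the core driver patch ("video=1280x1024-16@2M"), here is a user-space sketch of the parsing step. It is not the driver's code: the patch uses strsep() with kstrtoint(), while this sketch swaps in sscanf() for brevity. The `sim_` names are hypothetical, and the restriction to 16 and 32 bits per pixel is an extra check that the patch itself does not make. Whatever follows the depth (the "@2M" part) is ignored here.

```c
#include <stdio.h>
#include <string.h>

struct sim_mode {
	int width;
	int height;
	int depth;           /* bits per pixel */
	int stride;          /* bytes per scanline */
	const char *format;  /* format names as used in the patch */
};

/* Parse "[video=]<width>x<height>-<depth>..."; returns 0 on success, -1 on error. */
int sim_parse_mode(const char *s, struct sim_mode *m)
{
	/* The optional "video=" prefix is skipped, as in the patch. */
	if (strncmp(s, "video=", 6) == 0)
		s += 6;

	if (sscanf(s, "%dx%d-%d", &m->width, &m->height, &m->depth) != 3)
		return -1;

	/* Stricter than the patch: only accept the two depths it names. */
	if (m->width <= 0 || m->height <= 0 || (m->depth != 16 && m->depth != 32))
		return -1;

	m->stride = m->width * m->depth / 8;
	m->format = m->depth == 32 ? "a8r8g8b8" : "r5g6b5";
	return 0;
}
```

For the string quoted in the patch this yields a 1280x1024 mode at 16 bits per pixel with a stride of 2560 bytes.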
From: Sasha L. <sa...@ke...> - 2025-08-10 00:21:47
|
From: Breno Leitao <le...@de...> [ Upstream commit ec50ec378e3fd83bde9b3d622ceac3509a60b6b5 ] During BMC firmware upgrades on live systems, the ipmi_msghandler generates excessive "BMC returned incorrect response" warnings while the BMC is temporarily offline. This can flood system logs in large deployments. Replace dev_warn() with dev_warn_ratelimited() to throttle these warnings and prevent log spam during BMC maintenance operations. Signed-off-by: Breno Leitao <le...@de...> Message-ID: <202...@de...> Signed-off-by: Corey Minyard <co...@mi...> Signed-off-by: Sasha Levin <sa...@ke...> --- LLM Generated explanations, may be completely bogus: **Backport Status: YES** This commit is an excellent candidate for backporting to stable kernel trees for the following reasons: 1. **Fixes a Real User-Impacting Bug**: The commit addresses log flooding during BMC firmware upgrades or when BMC is temporarily offline. This is a practical operational issue that affects production systems, especially in large deployments where BMC maintenance is routine. 2. **Minimal and Safe Change**: The change is trivial - simply replacing `dev_warn()` with `dev_warn_ratelimited()`. This is: - A single-line change - Uses well-established kernel infrastructure (ratelimited logging) - Has no functional impact on the IPMI driver's operation - Cannot introduce regressions in driver functionality 3. **Clear Problem Statement**: The commit message clearly describes a specific scenario (BMC firmware upgrades) where excessive "BMC returned incorrect response" warnings flood system logs. This is a documented, reproducible issue. 4. **Follows Established Pattern**: The kernel already has precedent for this type of fix. Commit ca8c1c53b03b ("ipmi: rate limit ipmi smi_event failure message") from kernel v5.15 made similar changes to address log flooding in the same subsystem, and that was successfully integrated into stable trees. 5. 
**Low Risk**: The change only affects logging behavior, not the actual handling of IPMI messages. The error path (goto return_unspecified) remains unchanged, maintaining the same error handling logic. 6. **Addresses Operational Concerns**: In production environments with many servers, BMC maintenance is common. Without rate limiting, logs can grow excessively, potentially: - Filling up disk space - Making real issues harder to spot - Degrading system performance due to excessive logging I/O 7. **Code Context Shows Repeated Condition**: Looking at the code path (lines 4604-4615), this warning triggers when the BMC response doesn't match the expected NetFN and Command. During BMC firmware updates or reboots, this mismatch can occur repeatedly until the BMC comes back online, making rate limiting essential. The change meets all criteria for stable backporting: it fixes a bug affecting users, is small and contained, has minimal risk of regression, and addresses a clear operational problem without introducing new features or architectural changes. drivers/char/ipmi/ipmi_msghandler.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index 064944ae9fdc..8e9050f99e9e 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -4607,10 +4607,10 @@ static int handle_one_recv_msg(struct ipmi_smi *intf, * The NetFN and Command in the response is not even * marginally correct. */ - dev_warn(intf->si_dev, - "BMC returned incorrect response, expected netfn %x cmd %x, got netfn %x cmd %x\n", - (msg->data[0] >> 2) | 1, msg->data[1], - msg->rsp[0] >> 2, msg->rsp[1]); + dev_warn_ratelimited(intf->si_dev, + "BMC returned incorrect response, expected netfn %x cmd %x, got netfn %x cmd %x\n", + (msg->data[0] >> 2) | 1, msg->data[1], + msg->rsp[0] >> 2, msg->rsp[1]); goto return_unspecified; } -- 2.39.5 |
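For readers unfamiliar with rate-limited logging, the idea behind the change above can be sketched in userspace C. This is only an illustrative model of an interval/burst limiter, not the kernel's implementation: the names `rl_state`, `rl_allow` and `warn_ratelimited`, the abstract time units, and the window and burst values used in any caller are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of interval/burst rate limiting. */
struct rl_state {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start time of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Return true if a message arriving at time `now` may be emitted. */
static bool rl_allow(struct rl_state *rl, long now)
{
	assert(rl->interval > 0 && rl->burst > 0);

	if (now - rl->begin >= rl->interval) {
		/* Window expired: start a new one and reset the counters. */
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}

/* Emit at most `burst` warnings per window; returns 1 if emitted. */
static int warn_ratelimited(struct rl_state *rl, long now, const char *msg)
{
	if (!rl_allow(rl, now))
		return 0;
	fprintf(stderr, "warning: %s\n", msg);
	return 1;
}
```

With a limiter like this, a BMC that answers incorrectly thousands of times while it is offline produces only a bounded number of log lines per window, while the error handling around the message stays the same.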
From: Sasha L. <sa...@ke...> - 2025-08-10 00:21:34
|
From: Corey Minyard <co...@mi...> [ Upstream commit 8ffcb7560b4a15faf821df95e3ab532b2b020f8c ] The source and destination of some strcpy operations was the same. Split out the part of the operations that needed to be done for those particular calls so the unnecessary copy wasn't done. Reported-by: kernel test robot <lk...@in...> Closes: https://lore.kernel.org/oe-kbuild-all/202...@in.../ Signed-off-by: Corey Minyard <co...@mi...> Signed-off-by: Sasha Levin <sa...@ke...> --- LLM Generated explanations, may be completely bogus: **Backport Status: YES** This commit is an excellent candidate for backporting to stable kernel trees. Here's my extensive analysis: ## Bug Description and Impact The commit fixes a critical bug where `strcpy()` operations were copying a string to itself (source and destination being the same). Looking at the original code: 1. In `action_op()` at line 1213: `strcpy(action, inval);` - but when called from `ipmi_wdog_init()` line 1294 with `action_op(action, NULL)`, the first parameter `action` is passed, and if validation passes, it copies `action` to itself. 2. Similar issues in `preaction_op()` line 1237 and `preop_op()` line 1257 when called from initialization. This undefined behavior could lead to: - Memory corruption - Unpredictable watchdog behavior during initialization - Potential system instability in the IPMI watchdog subsystem ## Fix Analysis The fix properly refactors the code by: 1. Creating separate `*_set_val()` functions that only validate and set the value variables 2. Modifying the original functions to use these new helpers 3. In `ipmi_wdog_init()`, directly calling the `*_set_val()` functions for validation without the problematic strcpy This is a minimal, targeted fix that: - Doesn't introduce new features - Maintains existing functionality - Only restructures code to eliminate undefined behavior - Has clear boundaries within the IPMI watchdog driver ## Stable Tree Criteria Met 1. 
**Fixes a real bug**: Yes - strcpy with same source/destination is undefined behavior 2. **Small and contained**: Yes - changes are limited to one file and involve straightforward refactoring 3. **No major architectural changes**: Correct - only refactors existing functions 4. **Minimal regression risk**: Yes - the logic remains identical, just properly separated 5. **Critical subsystem**: IPMI watchdog is important for system reliability and recovery 6. **Automated detection**: The bug was found by kernel test robot, indicating it's a real issue ## Additional Evidence - The commit was reported by the kernel test robot with a specific bug report link - The fix is defensive programming that eliminates undefined behavior - IPMI watchdog is used in production servers for system recovery - The bug could manifest differently across compilers/architectures This meets all stable kernel criteria for backporting as it fixes a concrete bug with minimal risk. drivers/char/ipmi/ipmi_watchdog.c | 59 ++++++++++++++++++++++--------- 1 file changed, 42 insertions(+), 17 deletions(-) diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c index ab759b492fdd..a013ddbf1466 100644 --- a/drivers/char/ipmi/ipmi_watchdog.c +++ b/drivers/char/ipmi/ipmi_watchdog.c @@ -1146,14 +1146,8 @@ static struct ipmi_smi_watcher smi_watcher = { .smi_gone = ipmi_smi_gone }; -static int action_op(const char *inval, char *outval) +static int action_op_set_val(const char *inval) { - if (outval) - strcpy(outval, action); - - if (!inval) - return 0; - if (strcmp(inval, "reset") == 0) action_val = WDOG_TIMEOUT_RESET; else if (strcmp(inval, "none") == 0) @@ -1164,18 +1158,26 @@ static int action_op(const char *inval, char *outval) action_val = WDOG_TIMEOUT_POWER_DOWN; else return -EINVAL; - strcpy(action, inval); return 0; } -static int preaction_op(const char *inval, char *outval) +static int action_op(const char *inval, char *outval) { + int rv; + if (outval) - strcpy(outval, 
preaction); + strcpy(outval, action); if (!inval) return 0; + rv = action_op_set_val(inval); + if (!rv) + strcpy(action, inval); + return rv; +} +static int preaction_op_set_val(const char *inval) +{ if (strcmp(inval, "pre_none") == 0) preaction_val = WDOG_PRETIMEOUT_NONE; else if (strcmp(inval, "pre_smi") == 0) @@ -1188,18 +1190,26 @@ static int preaction_op(const char *inval, char *outval) preaction_val = WDOG_PRETIMEOUT_MSG_INT; else return -EINVAL; - strcpy(preaction, inval); return 0; } -static int preop_op(const char *inval, char *outval) +static int preaction_op(const char *inval, char *outval) { + int rv; + if (outval) - strcpy(outval, preop); + strcpy(outval, preaction); if (!inval) return 0; + rv = preaction_op_set_val(inval); + if (!rv) + strcpy(preaction, inval); + return 0; +} +static int preop_op_set_val(const char *inval) +{ if (strcmp(inval, "preop_none") == 0) preop_val = WDOG_PREOP_NONE; else if (strcmp(inval, "preop_panic") == 0) @@ -1208,7 +1218,22 @@ static int preop_op(const char *inval, char *outval) preop_val = WDOG_PREOP_GIVE_DATA; else return -EINVAL; - strcpy(preop, inval); + return 0; +} + +static int preop_op(const char *inval, char *outval) +{ + int rv; + + if (outval) + strcpy(outval, preop); + + if (!inval) + return 0; + + rv = preop_op_set_val(inval); + if (!rv) + strcpy(preop, inval); return 0; } @@ -1245,18 +1270,18 @@ static int __init ipmi_wdog_init(void) { int rv; - if (action_op(action, NULL)) { + if (action_op_set_val(action)) { action_op("reset", NULL); pr_info("Unknown action '%s', defaulting to reset\n", action); } - if (preaction_op(preaction, NULL)) { + if (preaction_op_set_val(preaction)) { preaction_op("pre_none", NULL); pr_info("Unknown preaction '%s', defaulting to none\n", preaction); } - if (preop_op(preop, NULL)) { + if (preop_op_set_val(preop)) { preop_op("preop_none", NULL); pr_info("Unknown preop '%s', defaulting to none\n", preop); } -- 2.39.5 |
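The shape of the refactoring above (validate and set the value in one helper, copy the string only in the wrapper) can be sketched as a small standalone C model. It is a simplified illustration, not the driver code: it handles a single `action` parameter with only two accepted values, and the names `ACT_RESET`, `ACT_NONE`, `action_set_val` and `action_init` are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's parameter string and value. */
enum { ACT_RESET, ACT_NONE };
static char action[16] = "reset";
static int action_val = ACT_RESET;

/* Validate `inval` and set the numeric value; never touches `action`. */
static int action_set_val(const char *inval)
{
	if (strcmp(inval, "reset") == 0)
		action_val = ACT_RESET;
	else if (strcmp(inval, "none") == 0)
		action_val = ACT_NONE;
	else
		return -EINVAL;
	return 0;
}

/* Get/set wrapper: copies the string only after validation succeeds. */
static int action_op(const char *inval, char *outval)
{
	int rv;

	if (outval)
		strcpy(outval, action);	/* caller provides >= 16 bytes */
	if (!inval)
		return 0;
	rv = action_set_val(inval);
	if (!rv)
		strcpy(action, inval);	/* inval is a validated short string */
	return rv;
}

/*
 * Init-time check: validate the current string in place, so `action`
 * is never copied onto itself. Returns 1 if a default was applied.
 */
static int action_init(void)
{
	if (action_set_val(action)) {
		action_op("reset", NULL);
		return 1;
	}
	return 0;
}
```

The point of the split is visible in `action_init()`: the current string is only validated, and a copy happens solely when a different, known-good default has to be installed.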
From: Corey M. <co...@mi...> - 2025-08-08 22:28:27
|
On Fri, Aug 08, 2025 at 03:37:51PM -0500, Frederick Lawler wrote: > Hi Corey, > > On Thu, Aug 07, 2025 at 06:02:33PM -0500, Corey Minyard wrote: > > If the driver goes into any maintenance mode, disable sysfs access until > > it is done. > > > > Why specifically sysfs reads during FW update state? Is there an expectation > that during a FW update, that redfish/ipmi/etc... are chunking/buffering the > FW payloads to the device, thus needs write access? I'm assuming that the > device is blocking waiting for payloads to finish, so sending additional messages > just get ignored? In my experience, when the BMC goes into firmware update mode, it doesn't behave normally. But it's just my experience. In general, it's best not to mess with something during an update. -corey > > > If the driver goes into reset maintenance mode, disable all messages > > until it is done. > > > > Signed-off-by: Corey Minyard <co...@mi...> > > --- > > drivers/char/ipmi/ipmi_msghandler.c | 11 +++++++++++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c > > index f124c0b33db8..72f5f4a0c056 100644 > > --- a/drivers/char/ipmi/ipmi_msghandler.c > > +++ b/drivers/char/ipmi/ipmi_msghandler.c > > @@ -2338,6 +2338,11 @@ static int i_ipmi_request(struct ipmi_user *user, > > > > if (!run_to_completion) > > mutex_lock(&intf->users_mutex); > > + if (intf->maintenance_mode_state == IPMI_MAINTENANCE_MODE_STATE_RESET) { > > + /* No messages while the BMC is in reset. */ > > + rv = -EBUSY; > > + goto out_err; > > + } > > if (intf->in_shutdown) { > > rv = -ENODEV; > > goto out_err; > > @@ -2639,6 +2644,12 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc, > > (bmc->dyn_id_set && time_is_after_jiffies(bmc->dyn_id_expiry))) > > goto out_noprocessing; > > > > + /* Don't allow sysfs access when in maintenance mode. 
*/ > > + if (intf->maintenance_mode_state) { > > + rv = -EBUSY; > > + goto out_noprocessing; > > + } > > + > > prev_guid_set = bmc->dyn_guid_set; > > __get_guid(intf); > > > > -- > > 2.43.0 > > > > Best, Fred |
From: Frederick L. <fr...@cl...> - 2025-08-08 20:38:05
|
Hi Corey, On Thu, Aug 07, 2025 at 06:02:33PM -0500, Corey Minyard wrote: > If the driver goes into any maintenance mode, disable sysfs access until > it is done. > Why specifically sysfs reads during FW update state? Is there an expectation that during a FW update, that redfish/ipmi/etc... are chunking/buffering the FW payloads to the device, thus needs write access? I'm assuming that the device is blocking waiting for payloads to finish, so sending additional messages just get ignored? > If the driver goes into reset maintenance mode, disable all messages > until it is done. > > Signed-off-by: Corey Minyard <co...@mi...> > --- > drivers/char/ipmi/ipmi_msghandler.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c > index f124c0b33db8..72f5f4a0c056 100644 > --- a/drivers/char/ipmi/ipmi_msghandler.c > +++ b/drivers/char/ipmi/ipmi_msghandler.c > @@ -2338,6 +2338,11 @@ static int i_ipmi_request(struct ipmi_user *user, > > if (!run_to_completion) > mutex_lock(&intf->users_mutex); > + if (intf->maintenance_mode_state == IPMI_MAINTENANCE_MODE_STATE_RESET) { > + /* No messages while the BMC is in reset. */ > + rv = -EBUSY; > + goto out_err; > + } > if (intf->in_shutdown) { > rv = -ENODEV; > goto out_err; > @@ -2639,6 +2644,12 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc, > (bmc->dyn_id_set && time_is_after_jiffies(bmc->dyn_id_expiry))) > goto out_noprocessing; > > + /* Don't allow sysfs access when in maintenance mode. */ > + if (intf->maintenance_mode_state) { > + rv = -EBUSY; > + goto out_noprocessing; > + } > + > prev_guid_set = bmc->dyn_guid_set; > __get_guid(intf); > > -- > 2.43.0 > Best, Fred |
From: Corey M. <co...@mi...> - 2025-08-08 14:48:20
|
On Fri, Aug 08, 2025 at 11:17:29AM +0930, Andrew Jeffery wrote: > On Thu, 2025-08-07 at 08:28 -0500, Rob Herring (Arm) wrote: > > The ASpeed kcs-bmc nodes have a "clocks" property which isn't > > documented. It looks like all the LPC child devices have the same clock > > source and some of the drivers manage their clock. Perhaps it is the > > parent device that should have the clock, but it's too late for that. > > > > Signed-off-by: Rob Herring (Arm) <ro...@ke...> > > Thanks Rob. > > Acked-by: Andrew Jeffery <an...@co...> Queued for 6.18, I'll add it to the next tree when 6.17-rc1 releases. Thanks, -corey |
From: Andrew J. <an...@co...> - 2025-08-08 02:06:19
|
On Thu, 2025-08-07 at 08:28 -0500, Rob Herring (Arm) wrote: > The ASpeed kcs-bmc nodes have a "clocks" property which isn't > documented. It looks like all the LPC child devices have the same clock > source and some of the drivers manage their clock. Perhaps it is the > parent device that should have the clock, but it's too late for that. > > Signed-off-by: Rob Herring (Arm) <ro...@ke...> Thanks Rob. Acked-by: Andrew Jeffery <an...@co...> |
From: Corey M. <co...@mi...> - 2025-08-07 23:31:54
|
So you can see if it's in maintenance mode and see how long is left. Signed-off-by: Corey Minyard <co...@mi...> --- drivers/char/ipmi/ipmi_msghandler.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index 72f5f4a0c056..5ff35c473b50 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -432,6 +432,7 @@ struct ipmi_smi { atomic_t nr_users; struct device_attribute nr_users_devattr; struct device_attribute nr_msgs_devattr; + struct device_attribute maintenance_mode_devattr; /* Used for wake ups at startup. */ @@ -3545,6 +3546,19 @@ static ssize_t nr_msgs_show(struct device *dev, } static DEVICE_ATTR_RO(nr_msgs); +static ssize_t maintenance_mode_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipmi_smi *intf = container_of(attr, + struct ipmi_smi, + maintenance_mode_devattr); + + return sysfs_emit(buf, "%u %d\n", intf->maintenance_mode_state, + intf->auto_maintenance_timeout); +} +static DEVICE_ATTR_RO(maintenance_mode); + static void redo_bmc_reg(struct work_struct *work) { struct ipmi_smi *intf = container_of(work, struct ipmi_smi, @@ -3681,6 +3695,14 @@ int ipmi_add_smi(struct module *owner, goto out_err_bmc_reg; } + intf->maintenance_mode_devattr = dev_attr_maintenance_mode; + sysfs_attr_init(&intf->maintenance_mode_devattr.attr); + rv = device_create_file(intf->si_dev, &intf->maintenance_mode_devattr); + if (rv) { + device_remove_file(intf->si_dev, &intf->nr_users_devattr); + goto out_err_bmc_reg; + } + intf->intf_num = i; mutex_unlock(&ipmi_interfaces_mutex); @@ -3788,6 +3810,7 @@ void ipmi_unregister_smi(struct ipmi_smi *intf) if (intf->handlers->shutdown) intf->handlers->shutdown(intf->send_info); + device_remove_file(intf->si_dev, &intf->maintenance_mode_devattr); device_remove_file(intf->si_dev, &intf->nr_msgs_devattr); device_remove_file(intf->si_dev, &intf->nr_users_devattr); -- 2.43.0 
|
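As a rough illustration of how a userspace tool might consume the attribute added by the patch above, here is a small C sketch that parses the `"%u %d\n"` text emitted by `maintenance_mode_show()`. The state numbering (0 off, 1 firmware, 2 reset) follows the `IPMI_MAINTENANCE_MODE_STATE_*` defines from the related patch in this series; the struct and function names are hypothetical, and the location of the file in sysfs is not stated in the patch, so the sketch works on a string rather than a path.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the two fields in the attribute text. */
struct mm_status {
	unsigned int state;	/* 0 = off, 1 = firmware, 2 = reset */
	int timeout;		/* remaining auto-maintenance timeout */
};

/* Parse "<state> <timeout>\n"; returns 0 on success, -1 if malformed. */
static int parse_maintenance_mode(const char *buf, struct mm_status *st)
{
	assert(buf && st);

	if (sscanf(buf, "%u %d", &st->state, &st->timeout) != 2)
		return -1;
	return 0;
}

/* Map the numeric state to a readable name for display purposes. */
static const char *mm_state_name(unsigned int state)
{
	switch (state) {
	case 0:
		return "off";
	case 1:
		return "firmware";
	case 2:
		return "reset";
	default:
		return "unknown";
	}
}
```

A monitoring script could use something along these lines to skip BMC queries while the reported state is non-zero, instead of waiting for them to fail.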
From: Corey M. <co...@mi...> - 2025-08-07 23:07:11
|
Now that maintenance mode rejects all messages, there's nothing to run the timer. Make sure the timer is running in maintenance mode. Signed-off-by: Corey Minyard <co...@mi...> --- drivers/char/ipmi/ipmi_msghandler.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index 5ff35c473b50..786c71eb00f4 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -50,6 +50,8 @@ static void intf_free(struct kref *ref); static bool initialized; static bool drvregistered; +static struct timer_list ipmi_timer; + /* Numbers in this enumerator should be mapped to ipmi_panic_event_str */ enum ipmi_panic_event_op { IPMI_SEND_PANIC_EVENT_NONE, @@ -1948,6 +1950,7 @@ static int i_ipmi_req_sysintf(struct ipmi_smi *intf, && intf->maintenance_mode_state < newst) { intf->maintenance_mode_state = newst; maintenance_mode_update(intf); + mod_timer(&ipmi_timer, jiffies + IPMI_TIMEOUT_JIFFIES); } spin_unlock_irqrestore(&intf->maintenance_mode_lock, flags); @@ -5136,6 +5139,7 @@ static bool ipmi_timeout_handler(struct ipmi_smi *intf, && (intf->auto_maintenance_timeout <= 0)) { intf->maintenance_mode_state = IPMI_MAINTENANCE_MODE_STATE_OFF; + intf->auto_maintenance_timeout = 0; maintenance_mode_update(intf); } } @@ -5158,8 +5162,6 @@ static void ipmi_request_event(struct ipmi_smi *intf) intf->handlers->request_events(intf->send_info); } -static struct timer_list ipmi_timer; - static atomic_t stop_operation; static void ipmi_timeout_work(struct work_struct *work) @@ -5183,6 +5185,8 @@ static void ipmi_timeout_work(struct work_struct *work) } need_timer = true; } + if (intf->maintenance_mode_state) + need_timer = true; need_timer |= ipmi_timeout_handler(intf, IPMI_TIMEOUT_TIME); } -- 2.43.0 |
From: Corey M. <co...@mi...> - 2025-08-07 23:07:07
|
If the driver goes into any maintenance mode, disable sysfs access until it is done. If the driver goes into reset maintenance mode, disable all messages until it is done. Signed-off-by: Corey Minyard <co...@mi...> --- drivers/char/ipmi/ipmi_msghandler.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index f124c0b33db8..72f5f4a0c056 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -2338,6 +2338,11 @@ static int i_ipmi_request(struct ipmi_user *user, if (!run_to_completion) mutex_lock(&intf->users_mutex); + if (intf->maintenance_mode_state == IPMI_MAINTENANCE_MODE_STATE_RESET) { + /* No messages while the BMC is in reset. */ + rv = -EBUSY; + goto out_err; + } if (intf->in_shutdown) { rv = -ENODEV; goto out_err; @@ -2639,6 +2644,12 @@ static int __bmc_get_device_id(struct ipmi_smi *intf, struct bmc_device *bmc, (bmc->dyn_id_set && time_is_after_jiffies(bmc->dyn_id_expiry))) goto out_noprocessing; + /* Don't allow sysfs access when in maintenance mode. */ + if (intf->maintenance_mode_state) { + rv = -EBUSY; + goto out_noprocessing; + } + prev_guid_set = bmc->dyn_guid_set; __get_guid(intf); -- 2.43.0 |
From: Corey M. <co...@mi...> - 2025-08-07 23:07:06
|
This allows later changes to have different behaviour during a reset versus a firmware update. Signed-off-by: Corey Minyard <co...@mi...> --- drivers/char/ipmi/ipmi_msghandler.c | 42 ++++++++++++++++++++--------- 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index 8e9050f99e9e..f124c0b33db8 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -539,7 +539,11 @@ struct ipmi_smi { /* For handling of maintenance mode. */ int maintenance_mode; - bool maintenance_mode_enable; + +#define IPMI_MAINTENANCE_MODE_STATE_OFF 0 +#define IPMI_MAINTENANCE_MODE_STATE_FIRMWARE 1 +#define IPMI_MAINTENANCE_MODE_STATE_RESET 2 + int maintenance_mode_state; int auto_maintenance_timeout; spinlock_t maintenance_mode_lock; /* Used in a timer... */ @@ -1534,8 +1538,15 @@ EXPORT_SYMBOL(ipmi_get_maintenance_mode); static void maintenance_mode_update(struct ipmi_smi *intf) { if (intf->handlers->set_maintenance_mode) + /* + * Lower level drivers only care about firmware mode + * as it affects their timing. They don't care about + * reset, which disables all commands for a while. + */ intf->handlers->set_maintenance_mode( - intf->send_info, intf->maintenance_mode_enable); + intf->send_info, + (intf->maintenance_mode_state == + IPMI_MAINTENANCE_MODE_STATE_FIRMWARE)); } int ipmi_set_maintenance_mode(struct ipmi_user *user, int mode) @@ -1552,16 +1563,17 @@ int ipmi_set_maintenance_mode(struct ipmi_user *user, int mode) if (intf->maintenance_mode != mode) { switch (mode) { case IPMI_MAINTENANCE_MODE_AUTO: - intf->maintenance_mode_enable - = (intf->auto_maintenance_timeout > 0); + /* Just leave it alone. 
*/ break; case IPMI_MAINTENANCE_MODE_OFF: - intf->maintenance_mode_enable = false; + intf->maintenance_mode_state = + IPMI_MAINTENANCE_MODE_STATE_OFF; break; case IPMI_MAINTENANCE_MODE_ON: - intf->maintenance_mode_enable = true; + intf->maintenance_mode_state = + IPMI_MAINTENANCE_MODE_STATE_FIRMWARE; break; default: @@ -1922,13 +1934,18 @@ static int i_ipmi_req_sysintf(struct ipmi_smi *intf, if (is_maintenance_mode_cmd(msg)) { unsigned long flags; + int newst; + + if (msg->netfn == IPMI_NETFN_FIRMWARE_REQUEST) + newst = IPMI_MAINTENANCE_MODE_STATE_FIRMWARE; + else + newst = IPMI_MAINTENANCE_MODE_STATE_RESET; spin_lock_irqsave(&intf->maintenance_mode_lock, flags); - intf->auto_maintenance_timeout - = maintenance_mode_timeout_ms; + intf->auto_maintenance_timeout = maintenance_mode_timeout_ms; if (!intf->maintenance_mode - && !intf->maintenance_mode_enable) { - intf->maintenance_mode_enable = true; + && intf->maintenance_mode_state < newst) { + intf->maintenance_mode_state = newst; maintenance_mode_update(intf); } spin_unlock_irqrestore(&intf->maintenance_mode_lock, @@ -5083,7 +5100,8 @@ static bool ipmi_timeout_handler(struct ipmi_smi *intf, -= timeout_period; if (!intf->maintenance_mode && (intf->auto_maintenance_timeout <= 0)) { - intf->maintenance_mode_enable = false; + intf->maintenance_mode_state = + IPMI_MAINTENANCE_MODE_STATE_OFF; maintenance_mode_update(intf); } } @@ -5099,7 +5117,7 @@ static bool ipmi_timeout_handler(struct ipmi_smi *intf, static void ipmi_request_event(struct ipmi_smi *intf) { /* No event requests when in maintenance mode. */ - if (intf->maintenance_mode_enable) + if (intf->maintenance_mode_state) return; if (!intf->in_shutdown) -- 2.43.0 |
From: Corey M. <co...@mi...> - 2025-08-07 23:07:04
|
I went ahead and did some patches for this, since it was on my mind. With these, if a reset is sent to the BMC, the driver will disable messages to the BMC for a time, defaulting to 30 seconds. Don't modify message timing, since no messages are allowed, anyway. If a firmware update command is sent to the BMC, then just reject sysfs commands that query the BMC. Modify message timing and allow direct messages through the driver interface. Hopefully this will work around the problem, and it's a good idea, anyway. -corey |
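The behaviour described in this cover note can be summarised as a small state machine. The sketch below is a simplified userspace model in C, not the driver code: it ignores locking and the manual maintenance-mode override, and the names `intf_model`, `mm_enter`, `mm_check_request`, `mm_check_sysfs` and `mm_tick` are hypothetical. The 30-second timeout is the default mentioned above.

```c
#include <assert.h>
#include <errno.h>

/* States ordered so that a reset outranks a firmware update. */
enum mm_state { MM_OFF = 0, MM_FIRMWARE = 1, MM_RESET = 2 };

#define MM_TIMEOUT_MS 30000	/* default from the note above */

/* Hypothetical model of the per-interface maintenance state. */
struct intf_model {
	enum mm_state state;
	int timeout_ms;
};

/* Called when a reset or firmware-update command is sent to the BMC. */
static void mm_enter(struct intf_model *intf, enum mm_state newst)
{
	assert(newst == MM_FIRMWARE || newst == MM_RESET);

	intf->timeout_ms = MM_TIMEOUT_MS;
	if (intf->state < newst)	/* never downgrade reset to firmware */
		intf->state = newst;
}

/* Direct messages: rejected only while the BMC is resetting. */
static int mm_check_request(const struct intf_model *intf)
{
	return intf->state == MM_RESET ? -EBUSY : 0;
}

/* sysfs queries that would talk to the BMC: rejected in any mode. */
static int mm_check_sysfs(const struct intf_model *intf)
{
	return intf->state != MM_OFF ? -EBUSY : 0;
}

/* Periodic timer: count the timeout down and leave maintenance mode. */
static void mm_tick(struct intf_model *intf, int elapsed_ms)
{
	if (intf->state == MM_OFF)
		return;
	intf->timeout_ms -= elapsed_ms;
	if (intf->timeout_ms <= 0) {
		intf->state = MM_OFF;
		intf->timeout_ms = 0;
	}
}
```

Ordering the states numerically keeps the "only move to a stricter state" rule to a single comparison, which mirrors the `maintenance_mode_state < newst` check in the patches.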
From: Corey M. <co...@mi...> - 2025-08-07 20:29:19
|
On Thu, Aug 07, 2025 at 02:43:14PM -0500, Frederick Lawler wrote:
>
> It occurred to me last night that I'd probably like a rate limit on the KCS
> messages as well. I didn't see if a patch for that was made. I can whip
> that up sometime next week, that could be of use to anyone.

That jogged my memory a bit; there is something called "maintenance mode" in the IPMI driver. It's used primarily for firmware updates, but it's triggered by reset commands in addition to firmware update commands. It has three basic effects:

* It turns off automatic messages sent to the BMC by the driver (only fetching flags, I think).
* It changes the way the timing works to check for the BMC being ready a lot more often. (This is a hardware check and shouldn't affect the BMC, but maybe it does on some.)
* It changes the timing for messages routed to the IPMB bus to give them more time.

It solved two problems:

* For systems without IPMI interrupts, firmware updates were taking forever.
* When you would reset the BMC, the driver's automatic messages would generally time out. And IPMB messages pending would time out.

The theory was that if the user reset the BMC, they wouldn't issue any IPMI commands, and the driver wouldn't either, so it would leave the BMC interface alone until it's done resetting. It's not perfect, the reset or firmware update can happen over the LAN interface, but it seemed to help a lot of people.

Anyway, after that long explanation, maybe that needs to be extended and if the driver goes into maintenance mode have all sysfs accesses to the BMC return an error.

It also might be a good idea to differentiate between resets and firmware update commands. After a reset nothing will probably work, but the BMC is still partially functional during a firmware update. So no IPMI commands at all for a little while after a reset. That is a behavioral change, but it's probably not a lot different than what would happen, anyway. The error just comes back faster.
None of this solves the basic issue, though. I'm not exactly sure what you mean by a rate limit on KCS messages. It would lower the probability, perhaps, but it wouldn't eliminate the problem, either. Just not allowing anything during these times is probably better. > > [1533534.869508] [Hardware Error]: Corrected error, no action required. > [1533534.884635] [Hardware Error]: CPU:1 (17:31:0) MC18_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b > [1533534.912122] [Hardware Error]: Error Addr: 0x0000000313c7a020 > [1533534.926641] [Hardware Error]: IPID: 0x0000009600350f00, Syndrome: 0x9fec08000a800a01 > [1533534.943278] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0 > [1533534.946635] EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#1channel#3 (csrow:1 channel:3 page:0x0 offset:0x0 grain:64 syndrome:0x800) > [1533535.369487] INFO: task cat:1844873 blocked for more than 10 seconds. > [1533535.385145] Tainted: G W O 6.12.35-cloudflare-2025.6.15 #1 > [1533535.401614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [1533535.418715] task:cat state:D stack:0 pid:1844873 tgid:1844873 ppid:1844872 task_flags:0x400000 flags:0x00004002 > [1533535.447475] Call Trace: > [1533535.458691] <TASK> > [1533535.469154] __schedule+0x4fa/0xbf0 > [1533535.481433] schedule+0x27/0xf0 > [1533535.493181] __get_guid+0xf4/0x130 [ipmi_msghandler] > [1533535.506325] ? __pfx_autoremove_wake_function+0x10/0x10 > [1533535.519910] __bmc_get_device_id+0xd6/0xa30 [ipmi_msghandler] Yeah, this is what I would expect to see if you are doing this operation and the BMC is in reset. It's going to sit there until it times out and returns an error. -corey > [1533535.534459] ? srso_return_thunk+0x5/0x5f > [1533535.546509] ? srso_return_thunk+0x5/0x5f > [1533535.558540] ? __memcg_slab_post_alloc_hook+0x21b/0x410 > [1533535.571722] aux_firmware_rev_show+0x38/0x90 [ipmi_msghandler] > [1533535.585304] ? 
__kmalloc_node_noprof+0x3f6/0x450 > [1533535.598144] ? seq_read_iter+0x376/0x460 > [1533535.609621] dev_attr_show+0x1c/0x40 > [1533535.621024] sysfs_kf_seq_show+0x8f/0xe0 > [1533535.632316] seq_read_iter+0x11f/0x460 > [1533535.643172] ? security_file_permission+0x9/0xb0 > [1533535.655102] vfs_read+0x260/0x330 > [1533535.665368] ksys_read+0x65/0xe0 > [1533535.675559] do_syscall_64+0x4b/0x110 > [1533535.686324] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [1533535.698530] RIP: 0033:0x7f72b587125d > [1533535.708857] RSP: 002b:00007ffccc21bb48 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 > [1533535.723411] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f72b587125d > [1533535.737361] RDX: 0000000000020000 RSI: 00007f72b5755000 RDI: 0000000000000003 > [1533535.751191] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000 > [1533535.764847] R10: 00007f72b5788b60 R11: 0000000000000246 R12: 00007f72b5755000 > [1533535.778536] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000000000 > [1533535.792210] </TASK> > > crash> bt -l 1781073 > PID: 1781073 TASK: ffff9d91c7040000 CPU: 81 COMMAND: "/usr/bin/python" > #0 [ffffb3a171683c00] __schedule at ffffffff9d559eea > /cfsetup_build/build/linux/kernel/sched/core.c: 5338 > #1 [ffffb3a171683c80] schedule at ffffffff9d55a617 > /cfsetup_build/build/linux/arch/x86/include/asm/preempt.h: 84 > #2 [ffffb3a171683c90] __get_guid at ffffffffc22aa574 [ipmi_msghandler] > #3 [ffffb3a171683ce8] __bmc_get_device_id at ffffffffc22aa696 [ipmi_msghandler] > #4 [ffffb3a171683da0] aux_firmware_rev_show at ffffffffc22ab1c8 [ipmi_msghandler] > #5 [ffffb3a171683dd0] dev_attr_show at ffffffff9d1175dc > /cfsetup_build/build/linux/drivers/base/core.c: 2425 > #6 [ffffb3a171683de8] sysfs_kf_seq_show at ffffffff9cc64caf > /cfsetup_build/build/linux/fs/sysfs/file.c: 60 > #7 [ffffb3a171683e10] seq_read_iter at ffffffff9cbddf7f > /cfsetup_build/build/linux/fs/seq_file.c: 230 > #8 [ffffb3a171683e68] vfs_read at ffffffff9cba8590 > 
/cfsetup_build/build/linux/fs/read_write.c: 489 > #9 [ffffb3a171683f00] ksys_read at ffffffff9cba9165 > /cfsetup_build/build/linux/fs/read_write.c: 713 > #10 [ffffb3a171683f38] do_syscall_64 at ffffffff9d550c8b > /cfsetup_build/build/linux/arch/x86/entry/common.c: 52 > #11 [ffffb3a171683f50] entry_SYSCALL_64_after_hwframe at ffffffff9d60012f > /cfsetup_build/build/linux/arch/x86/entry/entry_64.S: 130 > RIP: 00007f04e1b7c29c RSP: 00007ffea7aaf6c0 RFLAGS: 00000246 > RAX: ffffffffffffffda RBX: 0000000000a840f8 RCX: 00007f04e1b7c29c > RDX: 0000000000001001 RSI: 000000002fd06ef0 RDI: 00000000000000c1 > RBP: 00007f04e1a82fc0 R8: 0000000000000000 R9: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001001 > R13: 000000002fd06ef0 R14: 00000000000000c1 R15: 0000000000a41520 > ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b > > crash> files 1781073 > ... > 193 ffff9db5132e5800 ffff9dafb18bd200 ffff9da7b780bcf0 REG /sys/devices/platform/ipmi_bmc.0/aux_firmware_revision > > crash> log -c > ... > [1533553.998160] [ C7] ipmi_si IPI0001:00: KCS in invalid state 6 > [1533554.009156] [ C7] ipmi_si IPI0001:00: KCS in invalid state 8 > [1533554.019973] [T1844873] ipmi_si IPI0001:00: KCS in invalid state 9 > [1533554.031005] [ C81] ipmi_si IPI0001:00: IPMI message handler: device id fetch failed: 0xd5 > |
From: Frederick L. <fr...@cl...> - 2025-08-07 19:43:28
|
On Wed, Aug 06, 2025 at 05:51:29PM -0500, Corey Minyard wrote: > On Wed, Aug 06, 2025 at 04:36:41PM -0500, Frederick Lawler wrote: > > On Wed, Aug 06, 2025 at 04:16:18PM -0500, Corey Minyard wrote: > > > On Wed, Aug 06, 2025 at 03:19:02PM -0500, Fred Lawler wrote: > > > > + CC: Corey Minyard <co...@mi...> > > > > > > > > > I'm wondering if something is happening with the BMC resetting and > > > interactions with ACPI involved in that. Adding the extra part of > > > trying to talk to the BMC while it's being reset could cause the BMC to > > > get confused and do bad things? > > > > > > > Sure, it's a possibility we explored. We have a lot of automation. > > Predominately of which is a prometheus module exporting IPMI information > > from the sysfs files. And we also have config management that's querying > > sysfs files to regulate updates etc... Sometimes, the config management > > automation will attempt to reset the BMC. > > Ok. I have tests that do BMC resets, but I can't run at the scale you > do, and I'm running in a simulator so it's not going to behave the > same. > > The other possibility is the processor goes into the idle code while > interrupts are off, but I think the kernel has checks all around that. > I can't think of how else a processor would get stuck in idle. > Yes, it's a bit of an odd case. There's nothing obvious reported by the crash utility. By the time we get the NMI/panic, the CPUs are off doing something else in our typical crash case. That said, earlier this week I got a hard lockup outside of a BMC reset, but the node had too many MCE correctable memory errors. For the sake of completeness, I'll post that stack trace here anyway since that may provide some more context clues. In this case, I did catch two separate reads to sysfs files, and then they appear to have competed. The cat process seemed to already be off CPU, but the KCS message is still coming in at the same time the python script was being processed too. 
Only the python run was on CPU at time of crash. But NMI panic was still on an idle CPU. Unfortunately, I didn't write down all the logs for this one, so it's missing the idle state NMI for watchdog, but hopefully the snippets show what's happening. I posted this below. > > > > > > > > > > > > I also tried to load the CPUs with stress-ng, but the best I can do > > > > > are the hung tasks. > > > > > > > > > > I identified that sni_send()[1] could be locked behind the > > > > > spin_lock_irqsave() and within the KCS send handler, there's another irq > > > > > save lock. I suspect this is where we're getting hung up. Below is a > > > > > sample stack trace + log output. > > > > > > Yeah, I don't see that in the traceback. There is a lock in the KCS > > > sender, but I don't see how that could do anything. > > > > > > Maybe you could try changing the cpuidle handler? That would be at > > > least something to try. > > > > > > > Would that help in forming a reproducer? I'd need to deploy any kernel > > modifications fleet wide to cast a wide enough net. The lockups aren't > > extremely consistent. We may get a couple or more a week. > > Ah, so this isn't readily reproducible. Bummer. > > If the problem goes away if you change the cpuidle handler to something > non-ACPI, that would be a big clue that it's an ACPI issue. > > > > > Lastly, I have the rate limit patch backported. I'll be able to start > > testing with that tomorrow, and same with loading the IPMI watchdog > > module. > > Ok. I don't have much hope for it making much difference, but it's safe > and will be coming in the next kernel release. > It occurred to me last night that I'd probably like a rate limit on the KCS messages as well. I didn't see if a patch for that was made. I can whip that up sometime next week, that could be of use to anyone. [1533534.869508] [Hardware Error]: Corrected error, no action required. 
[1533534.884635] [Hardware Error]: CPU:1 (17:31:0) MC18_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b [1533534.912122] [Hardware Error]: Error Addr: 0x0000000313c7a020 [1533534.926641] [Hardware Error]: IPID: 0x0000009600350f00, Syndrome: 0x9fec08000a800a01 [1533534.943278] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0 [1533534.946635] EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#1channel#3 (csrow:1 channel:3 page:0x0 offset:0x0 grain:64 syndrome:0x800) [1533535.369487] INFO: task cat:1844873 blocked for more than 10 seconds. [1533535.385145] Tainted: G W O 6.12.35-cloudflare-2025.6.15 #1 [1533535.401614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1533535.418715] task:cat state:D stack:0 pid:1844873 tgid:1844873 ppid:1844872 task_flags:0x400000 flags:0x00004002 [1533535.447475] Call Trace: [1533535.458691] <TASK> [1533535.469154] __schedule+0x4fa/0xbf0 [1533535.481433] schedule+0x27/0xf0 [1533535.493181] __get_guid+0xf4/0x130 [ipmi_msghandler] [1533535.506325] ? __pfx_autoremove_wake_function+0x10/0x10 [1533535.519910] __bmc_get_device_id+0xd6/0xa30 [ipmi_msghandler] [1533535.534459] ? srso_return_thunk+0x5/0x5f [1533535.546509] ? srso_return_thunk+0x5/0x5f [1533535.558540] ? __memcg_slab_post_alloc_hook+0x21b/0x410 [1533535.571722] aux_firmware_rev_show+0x38/0x90 [ipmi_msghandler] [1533535.585304] ? __kmalloc_node_noprof+0x3f6/0x450 [1533535.598144] ? seq_read_iter+0x376/0x460 [1533535.609621] dev_attr_show+0x1c/0x40 [1533535.621024] sysfs_kf_seq_show+0x8f/0xe0 [1533535.632316] seq_read_iter+0x11f/0x460 [1533535.643172] ? 
security_file_permission+0x9/0xb0 [1533535.655102] vfs_read+0x260/0x330 [1533535.665368] ksys_read+0x65/0xe0 [1533535.675559] do_syscall_64+0x4b/0x110 [1533535.686324] entry_SYSCALL_64_after_hwframe+0x76/0x7e [1533535.698530] RIP: 0033:0x7f72b587125d [1533535.708857] RSP: 002b:00007ffccc21bb48 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [1533535.723411] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f72b587125d [1533535.737361] RDX: 0000000000020000 RSI: 00007f72b5755000 RDI: 0000000000000003 [1533535.751191] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000 [1533535.764847] R10: 00007f72b5788b60 R11: 0000000000000246 R12: 00007f72b5755000 [1533535.778536] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000000000 [1533535.792210] </TASK> crash> bt -l 1781073 PID: 1781073 TASK: ffff9d91c7040000 CPU: 81 COMMAND: "/usr/bin/python" #0 [ffffb3a171683c00] __schedule at ffffffff9d559eea /cfsetup_build/build/linux/kernel/sched/core.c: 5338 #1 [ffffb3a171683c80] schedule at ffffffff9d55a617 /cfsetup_build/build/linux/arch/x86/include/asm/preempt.h: 84 #2 [ffffb3a171683c90] __get_guid at ffffffffc22aa574 [ipmi_msghandler] #3 [ffffb3a171683ce8] __bmc_get_device_id at ffffffffc22aa696 [ipmi_msghandler] #4 [ffffb3a171683da0] aux_firmware_rev_show at ffffffffc22ab1c8 [ipmi_msghandler] #5 [ffffb3a171683dd0] dev_attr_show at ffffffff9d1175dc /cfsetup_build/build/linux/drivers/base/core.c: 2425 #6 [ffffb3a171683de8] sysfs_kf_seq_show at ffffffff9cc64caf /cfsetup_build/build/linux/fs/sysfs/file.c: 60 #7 [ffffb3a171683e10] seq_read_iter at ffffffff9cbddf7f /cfsetup_build/build/linux/fs/seq_file.c: 230 #8 [ffffb3a171683e68] vfs_read at ffffffff9cba8590 /cfsetup_build/build/linux/fs/read_write.c: 489 #9 [ffffb3a171683f00] ksys_read at ffffffff9cba9165 /cfsetup_build/build/linux/fs/read_write.c: 713 #10 [ffffb3a171683f38] do_syscall_64 at ffffffff9d550c8b /cfsetup_build/build/linux/arch/x86/entry/common.c: 52 #11 [ffffb3a171683f50] 
entry_SYSCALL_64_after_hwframe at ffffffff9d60012f /cfsetup_build/build/linux/arch/x86/entry/entry_64.S: 130 RIP: 00007f04e1b7c29c RSP: 00007ffea7aaf6c0 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000a840f8 RCX: 00007f04e1b7c29c RDX: 0000000000001001 RSI: 000000002fd06ef0 RDI: 00000000000000c1 RBP: 00007f04e1a82fc0 R8: 0000000000000000 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001001 R13: 000000002fd06ef0 R14: 00000000000000c1 R15: 0000000000a41520 ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b crash> files 1781073 ... 193 ffff9db5132e5800 ffff9dafb18bd200 ffff9da7b780bcf0 REG /sys/devices/platform/ipmi_bmc.0/aux_firmware_revision crash> log -c ... [1533553.998160] [ C7] ipmi_si IPI0001:00: KCS in invalid state 6 [1533554.009156] [ C7] ipmi_si IPI0001:00: KCS in invalid state 8 [1533554.019973] [T1844873] ipmi_si IPI0001:00: KCS in invalid state 9 [1533554.031005] [ C81] ipmi_si IPI0001:00: IPMI message handler: device id fetch failed: 0xd5 |
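The two competing sysfs reads described in the message above can be approximated from user space. The following is a minimal sketch, not a confirmed reproducer: the attribute path is the one shown in the `files` output, while the function names, the barrier-based start, and the timeout handling are illustrative assumptions that do not come from the thread.

```python
import threading
import time
from pathlib import Path

# Path taken from the `files` output in the thread; pass another path to test.
SYSFS_ATTR = "/sys/devices/platform/ipmi_bmc.0/aux_firmware_revision"


def read_attr(path, results, index):
    """Read one attribute file and record its contents or the error raised."""
    try:
        results[index] = Path(path).read_text().strip()
    except OSError as exc:
        results[index] = exc


def concurrent_reads(path=SYSFS_ATTR, readers=2, timeout=10.0):
    """Start `readers` threads that read `path` at roughly the same time.

    Returns one entry per reader: the string read, the OSError raised,
    or None if that reader had not finished within `timeout` seconds.
    """
    results = [None] * readers
    barrier = threading.Barrier(readers)

    def worker(i):
        barrier.wait()  # release all readers together
        read_attr(path, results, i)

    threads = [
        threading.Thread(target=worker, args=(i,), daemon=True)
        for i in range(readers)
    ]
    for t in threads:
        t.start()

    # Share one deadline so the total wait is bounded by `timeout`.
    deadline = time.monotonic() + timeout
    for t in threads:
        t.join(max(0.0, deadline - time.monotonic()))
    return results
```

Calling `concurrent_reads()` on a host with an IPMI BMC issues two overlapping reads of the attribute; a `None` entry in the result means that reader was still blocked when the timeout expired, which is the user-space view of the hung-task condition in the log. Whether this triggers the lockup at all is unknown.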
From: Rob H. (Arm) <ro...@ke...> - 2025-08-07 13:29:12
|
The ASpeed kcs-bmc nodes have a "clocks" property which isn't
documented. It looks like all the LPC child devices have the same clock
source and some of the drivers manage their clock. Perhaps it is the
parent device that should have the clock, but it's too late for that.

Signed-off-by: Rob Herring (Arm) <ro...@ke...>
---
 .../devicetree/bindings/ipmi/aspeed,ast2400-kcs-bmc.yaml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/ipmi/aspeed,ast2400-kcs-bmc.yaml b/Documentation/devicetree/bindings/ipmi/aspeed,ast2400-kcs-bmc.yaml
index 129e32c4c774..610c79863208 100644
--- a/Documentation/devicetree/bindings/ipmi/aspeed,ast2400-kcs-bmc.yaml
+++ b/Documentation/devicetree/bindings/ipmi/aspeed,ast2400-kcs-bmc.yaml
@@ -40,6 +40,9 @@ properties:
       - description: ODR register
       - description: STR register
 
+  clocks:
+    maxItems: 1
+
   aspeed,lpc-io-reg:
     $ref: /schemas/types.yaml#/definitions/uint32-array
     minItems: 1
-- 
2.47.2
|