From: Derek D. <z2...@po...> - 2022-05-12 16:21:34
I have been working on this as best I can. However, I confess that I am
not a kernel developer and have no real understanding of these tboot
internals. Nevertheless, here is a brief update. Anyone, please feel free
to share ideas on how to move forward toward a resolution.

I got a desktop machine with RS-232 serial output running tboot and
reproduced the suspend problem there. With this setup I can collect
kernel printk output and also CPU hotplug (cpuhp) tracing output. I
have also, thankfully, had quite a bit of help from Vincent Donnefort,
who wrote the cpuhp changes (the commit I posted) that exposed the
issue. He has been very helpful; let me try to tell you what we have
figured out.

On suspend, I get into the tboot callback:

    static int tboot_dying_cpu(unsigned int cpu)
    {
            atomic_inc(&ap_wfs_count);
            if (num_online_cpus() == 1) {
                    if (tboot_wait_for_aps(atomic_read(&ap_wfs_count)))
                            return -EBUSY;
            }
            return 0;
    }

but tboot_wait_for_aps() times out for me, so the callback returns
-EBUSY. The problem is that there is no rollback mechanism in place at
this point in the cpuhp sequence, so from the cpuhp point of view there
is no way to handle a failure of the tboot callback. Besides that, we
don't know what would be a sensible thing to do in the case of -EBUSY.
What does it mean that tboot is busy, and what should be done about it?
Please help us understand.
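
To make the failure path easier to discuss, here is a minimal user-space
model of the callback logic above. It is a sketch only, not the kernel
code. It assumes that tboot_wait_for_aps() polls a counter until it
matches the expected number of APs or a timeout expires; the real
implementation is not shown in this message. The names
model_num_in_wfs, model_wait_for_aps, model_dying_cpu and the max_polls
parameter are all hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the count of APs that have parked. */
static atomic_int model_num_in_wfs;

/*
 * Assumed behaviour: poll until the counter matches `expected` or the
 * poll budget runs out. Returns 0 on success, nonzero on timeout.
 */
static int model_wait_for_aps(int expected, unsigned int max_polls)
{
	while (atomic_load(&model_num_in_wfs) != expected && max_polls)
		max_polls--;	/* a real implementation would delay here */

	return atomic_load(&model_num_in_wfs) != expected;
}

/*
 * Mirrors the shape of tboot_dying_cpu(): only the last online CPU
 * waits, and a timeout is reported to the caller as -EBUSY.
 */
static int model_dying_cpu(int ap_wfs_count, int online_cpus)
{
	if (online_cpus == 1 && model_wait_for_aps(ap_wfs_count, 1000))
		return -EBUSY;

	return 0;
}
```

In this model, -EBUSY only means "the expected number of APs did not
show up in time". Whether that is also what it means for tboot is
exactly the open question.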