From: Timo L. <tim...@ik...> - 2020-05-08 21:56:01
|
Hi,

I get the following build failure on debian unstable with GCC 9.3.0:

tar xf tboot-1.9.12.tar.gz
cd tboot-1.9.12/
env CFLAGS="-g" make
...
cc -z noexecstack -z relo -z now -c -o obj/mem_primitives_lib.o
safeclib/mem_primitives_lib.c -g -Wall -Wformat-security -Werror
-Wstrict-prototypes -Wextra -Winit-self -Wswitch-default
-Wunused-parameter -Wwrite-strings -Wlogical-op
-Wno-missing-field-initializers -Wno-address-of-packed-member
-fno-strict-aliasing -std=gnu99 -Wno-array-bounds -O2 -U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=2 -m64 -I/home/lindi/tboot-1.9.12/safestringlib/include
-Wall -Wformat-security -Werror -Wstrict-prototypes -Wextra -Winit-self
-Wswitch-default -Wunused-parameter -Wwrite-strings -Wlogical-op
-Wno-missing-field-initializers -Wno-address-of-packed-member
-fno-strict-aliasing -std=gnu99 -Wno-array-bounds -O2 -U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=2 -m64 -I/home/lindi/tboot-1.9.12/safestringlib/include
-Iinclude -fstack-protector-strong -fPIE -fPIC -O2 -D_FORTIFY_SOURCE=2
-Wformat -Wformat-security -DSTDC_HEADERS
safeclib/mem_primitives_lib.c: In function ‘mem_prim_set’:
safeclib/mem_primitives_lib.c:111:25: error: this statement may fall through [-Werror=implicit-fallthrough=]
  111 |         case 15: *lp++ = value32;
      |                  ~~~~~~^~~~~~~~~
safeclib/mem_primitives_lib.c:112:9: note: here
  112 |         case 14: *lp++ = value32;
      |         ^~~~


It seems that Config.mk adds -Werror and -Wextra that cause this to
happen. Why doesn't this happen when CFLAGS is not set as an
environment variable? Apparently because

CFLAGS += $(CFLAGS_WARN) -fno-strict-aliasing -std=gnu99

behaves differently with recursive makefiles if CFLAGS is in the
environment:

"By default, only variables that came from the environment or the command
line are passed to recursive invocations."

https://www.gnu.org/software/make/manual/html_node/Environment.html

Is the intent here that CFLAGS_WARN should be used for the whole build? If
yes, then we need to add "export CFLAGS" to ensure that it is passed to
other makefiles and also fix that build failure.

If not, we need to add "unexport CFLAGS" and don't necessarily need to fix
the switch-case statement.


-Timo |
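Editorial note: the error in the log above comes from GCC's -Wimplicit-fallthrough, which -Wextra enables and -Werror turns into a hard failure. The following is only a minimal sketch of how an intentional fall-through chain can be annotated so that the warning stays enabled. It is not the change that went into tboot; the function name set_tail_words and the FALLTHROUGH macro are made up for illustration.

```c
#include <stdint.h>

/* Use the statement attribute where the compiler advertises it
 * (GCC 7 and later do); otherwise expand to a harmless no-op. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef FALLTHROUGH
#  define FALLTHROUGH ((void)0)
#endif

/* Hypothetical helper: store value32 into the first `count` words of lp
 * (count 0..3), written as a deliberately falling-through switch. */
void set_tail_words(uint32_t *lp, uint32_t value32, unsigned count)
{
    switch (count) {
    case 3: *lp++ = value32; FALLTHROUGH;
    case 2: *lp++ = value32; FALLTHROUGH;
    case 1: *lp++ = value32; break;
    default: break;   /* keeps -Wswitch-default quiet as well */
    }
}
```

Calling set_tail_words(buf, v, 3) writes buf[0..2] and leaves buf[3] alone. Whether to annotate the switch or to stop passing the warning flags to the sub-make is exactly the question raised in the mail.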
From: Lukasz H. <luk...@li...> - 2020-05-12 08:17:14
|
On Sat, 2020-05-09 at 00:55 +0300, Timo Lindfors wrote:
> Hi,
>
> I get the following build failure on debian unstable with GCC 9.3.0:
>
> tar xf tboot-1.9.12.tar.gz
> cd tboot-1.9.12/
> env CFLAGS="-g" make
> ...
> cc -z noexecstack -z relo -z now -c -o obj/mem_primitives_lib.o
> safeclib/mem_primitives_lib.c -g -Wall -Wformat-security -Werror
> -Wstrict-prototypes -Wextra -Winit-self -Wswitch-default
> -Wunused-parameter -Wwrite-strings -Wlogical-op
> -Wno-missing-field-initializers -Wno-address-of-packed-member
> -fno-strict-aliasing -std=gnu99 -Wno-array-bounds -O2 -U_FORTIFY_SOURCE
> -D_FORTIFY_SOURCE=2 -m64 -I/home/lindi/tboot-1.9.12/safestringlib/include
> -Wall -Wformat-security -Werror -Wstrict-prototypes -Wextra -Winit-self
> -Wswitch-default -Wunused-parameter -Wwrite-strings -Wlogical-op
> -Wno-missing-field-initializers -Wno-address-of-packed-member
> -fno-strict-aliasing -std=gnu99 -Wno-array-bounds -O2 -U_FORTIFY_SOURCE
> -D_FORTIFY_SOURCE=2 -m64 -I/home/lindi/tboot-1.9.12/safestringlib/include
> -Iinclude -fstack-protector-strong -fPIE -fPIC -O2 -D_FORTIFY_SOURCE=2
> -Wformat -Wformat-security -DSTDC_HEADERS
> safeclib/mem_primitives_lib.c: In function ‘mem_prim_set’:
> safeclib/mem_primitives_lib.c:111:25: error: this statement may fall
> through [-Werror=implicit-fallthrough=]
>   111 |         case 15: *lp++ = value32;
>       |                  ~~~~~~^~~~~~~~~
> safeclib/mem_primitives_lib.c:112:9: note: here
>   112 |         case 14: *lp++ = value32;
>       |         ^~~~
>
>
> It seems that Config.mk adds -Werror and -Wextra that cause this to
> happen. Why doesn't this happen when CFLAGS is not set as an
> environment variable? Apparently because
>
> CFLAGS += $(CFLAGS_WARN) -fno-strict-aliasing -std=gnu99
>
> behaves differently with recursive makefiles if CFLAGS is in the
> environment:
>
> "By default, only variables that came from the environment or the command
> line are passed to recursive invocations."
>
> https://www.gnu.org/software/make/manual/html_node/Environment.html
>
> Is the intent here that CFLAGS_WARN should be used for the whole build? If
> yes, then we need to add "export CFLAGS" to ensure that it is passed to
> other makefiles and also fix that build failure.
>
> If not, we need to add "unexport CFLAGS" and don't necessarily need to fix
> the switch-case statement.
>
>
> -Timo
>

Hi

Thanks for investigating that issue. Fixed in a6180f9e9e86

Lukasz |
From: Timo L. <tim...@ik...> - 2020-05-12 09:35:11
|
Hi,

On Tue, 12 May 2020, Lukasz Hawrylko wrote:
> Thanks for investigating that issue. Fixed in a6180f9e9e86

Thanks, seems to build now.

-Timo |
From: Timo L. <tim...@ik...> - 2020-05-23 16:27:17
|
Hi,

On Tue, 12 May 2020, Timo Lindfors wrote:
> On Tue, 12 May 2020, Lukasz Hawrylko wrote:
>> Thanks for investigating that issue. Fixed in a6180f9e9e86
>
> Thanks, seems to build now.

I perhaps said this a bit too soon. I am experiencing tboot getting stuck at
boot on Lenovo T430s when I boot the latest code from mercurial. 1.9.12 seems
to boot ok. Commenting out "export CFLAGS" seems to help. How should
I debug this?

-Timo |
From: Timo L. <tim...@ik...> - 2020-05-24 16:31:03
|
Hi,

On Sat, 23 May 2020, Timo Lindfors wrote:
> boot on Lenovo T430s when I boot the latest code from mercurial. 1.9.12 seems
> to boot ok. Commenting out "export CFLAGS" seems to help. How should
> I debug this?

Currently it seems that tboot actually only boots properly if I first boot
Linux and then reboot and select tboot. If I cold-boot tboot then it gets
stuck. I'm investigating options on how to test this in a more automatic
way.

-Timo |
From: Lukasz H. <luk...@li...> - 2020-05-25 08:20:27
|
Hi Timo

On Sun, 2020-05-24 at 19:15 +0300, Timo Lindfors wrote:
> Hi,
>
> On Sat, 23 May 2020, Timo Lindfors wrote:
> > boot on Lenovo T430s when I boot the latest code from mercurial. 1.9.12 seems
> > to boot ok. Commenting out "export CFLAGS" seems to help. How should
> > I debug this?
>
> Currently it seems that tboot actually only boots properly if I first boot
> Linux and then reboot and select tboot. If I cold-boot tboot then it gets
> stuck. I'm investigating options on how to test this in a more automatic
> way.

That is really strange behaviour. I have just built tip from mercurial
and run it on TPM1.2 and TPM2.0 PCs - it works (cold-booted too). Can
you please share more information about your test case? Do you see
anything on the screen?

Thanks,
Lukasz |
From: Timo L. <tim...@ik...> - 2020-05-25 10:09:24
|
On Mon, 25 May 2020, Lukasz Hawrylko wrote:
> That is really strange behaviour. I have just built tip from mercurial
> and run it on TPM1.2 and TPM2.0 PCs - it works (cold-booted too). Can
> you please share more information about your test case? Do you see
> anything on the screen?

I only see the "original e820 map:" listing. I'm trying to get a serial
console to make this easier to debug and to compare how warm-boot and
cold-boot differ without having to compare photos from the screen.

-Timo |
From: Timo L. <tim...@ik...> - 2020-05-27 23:23:09
|
Hi,

On Mon, 25 May 2020, Timo Lindfors wrote:
> I only see the "original e820 map:" listing. I'm trying to get a serial
> console to make this easier to debug and to compare how warm-boot and
> cold-boot differ without having to compare photos from the screen.

I bought a second-hand Thinkpad R400 and Thinkpad type 2504 dock that
includes a serial port. Then I soldered a relay to the power button and
wrote a tool that lets me say

baremetal_run -o foo.tar foo.img

to run "foo.img" on real hardware and to collect network, serial, audio
and video output automatically.

Internally this works by setting the laptop boot order so that it tries
to boot from network first and then from local hard disk. By changing the
DHCP configuration I can alternate between PXE booting an initrd that
writes an image to disk and booting from local disk.


Anyway, with the help of this I was able to run git bisect. It tells me
that the first bad commit is

changeset:   562:77bca150d0d5
user:        Lukasz Hawrylko <luk...@in...>
date:        Fri Feb 21 11:07:00 2020 +0100
summary:     Add support for EFI memory map parse/modification


Any idea on how to debug this further?

-Timo |
From: Lukasz H. <luk...@li...> - 2020-05-28 07:44:14
|
Hi Timo

On Thu, 2020-05-28 at 02:22 +0300, Timo Lindfors wrote:
> Hi,
>
> On Mon, 25 May 2020, Timo Lindfors wrote:
> > I only see the "original e820 map:" listing. I'm trying to get a serial
> > console to make this easier to debug and to compare how warm-boot and
> > cold-boot differ without having to compare photos from the screen.
>
> I bought a second-hand Thinkpad R400 and Thinkpad type 2504 dock that
> includes a serial port. Then I soldered a relay to the power button and
> wrote a tool that lets me say
>
> baremetal_run -o foo.tar foo.img
>
> to run "foo.img" on real hardware and to collect network, serial, audio
> and video output automatically.
>
> Internally this works by setting the laptop boot order so that it tries
> to boot from network first and then from local hard disk. By changing the
> DHCP configuration I can alternate between PXE booting an initrd that
> writes an image to disk and booting from local disk.
>
>
> Anyway, with the help of this I was able to run git bisect. It tells me
> that the first bad commit is
>
> changeset:   562:77bca150d0d5
> user:        Lukasz Hawrylko <luk...@in...>
> date:        Fri Feb 21 11:07:00 2020 +0100
> summary:     Add support for EFI memory map parse/modification
>
>
> Any idea on how to debug this further?
>
> -Timo
>

That's an awesome idea, creating such an environment for automated
testing.

I understand that you still see the same behaviour: cold boot failing,
reboot after Linux working, correct? Please add "dump_memmap=true" to
TBOOT's command line; it should enable dumping of the EFI memory map.
If you don't see this dump in the failing scenario, please add
"set debug=mmap" to grub.cfg; GRUB should then print it. Then you can
compare whether there are any differences in the EFI memory map between
the passing and failing scenarios. Please also send me both logs.

The commit that you mentioned adds EFI memory map parsing in TBOOT to
exclude memory occupied by EFI boot services from the internal allocator.
I had to do that because on some platforms the BIOS puts data there that
Linux wants to access, and TBOOT overwrites it.

Thanks,
Lukasz |
From: Timo L. <tim...@ik...> - 2020-05-28 18:30:58
|
On Thu, 28 May 2020, Lukasz Hawrylko wrote:
> I understand that you still see the same behaviour: cold boot failing,
> reboot after Linux working, correct? Please add "dump_memmap=true" to
> TBOOT's command line; it should enable dumping of the EFI memory map.

Correct. Unfortunately dump_memmap=true does not print anything before it
gets stuck on cold boot.

> If you don't see this dump in the failing scenario, please add
> "set debug=mmap" to grub.cfg; GRUB should then print it.

I added this after the serial console setup but this does not seem to print
anything? I also cannot find it in the grub2 source code. Is this the correct
syntax?

You can see the logs and other data here:

https://lindi.iki.fi/lindi/tboot/dump_memmap-cold.tar
https://lindi.iki.fi/lindi/tboot/dump_memmap-warm.tar

-Timo |
From: Timo L. <tim...@ik...> - 2020-05-29 09:36:46
Attachments:
warm-serial.log.gz
cold-serial.log.gz
|
On Thu, 28 May 2020, Timo Lindfors wrote:
>> If you don't see this dump in the failing scenario, please add
>> "set debug=mmap" to grub.cfg; GRUB should then print it.
>
> I added this after the serial console setup but this does not seem to print
> anything? I also cannot find it in the grub2 source code. Is this the correct
> syntax?

Assuming you meant "lsmmap" I am attaching here output from cold and warm
boot. Unfortunately as you can see they are identical until the cold boot
gets stuck but maybe this still helps?

-Timo |
From: Lukasz H. <luk...@li...> - 2020-05-29 14:24:52
|
On Fri, 2020-05-29 at 12:36 +0300, Timo Lindfors wrote:
> On Thu, 28 May 2020, Timo Lindfors wrote:
> > > If you don't see this dump in the failing scenario, please add
> > > "set debug=mmap" to grub.cfg; GRUB should then print it.
> >
> > I added this after the serial console setup but this does not seem to print
> > anything? I also cannot find it in the grub2 source code. Is this the correct
> > syntax?
>
> Assuming you meant "lsmmap" I am attaching here output from cold and warm
> boot. Unfortunately as you can see they are identical until the cold boot
> gets stuck but maybe this still helps?

I see the "Failed to get EFI memory map" message; did you configure the
BIOS to use legacy boot? "set debug=mmap" should enable the EFI memory map
print in grub_efi_mmap_iterate(), but this does not work when booted in
legacy mode.

I will set up my environment to test legacy boot and I will check if the
same problem occurs. If it is possible, please try EFI boot on your PC.

As printk is a blocking call, you can add a few additional prints somewhere
around tboot.c:384 and inside copy_e820_map() and efi_memmap_copy() to
find out where exactly it hangs.

Thanks,
Lukasz |
From: Timo L. <tim...@ik...> - 2020-06-01 20:28:36
|
On Fri, 29 May 2020, Lukasz Hawrylko wrote:
> I will set up my environment to test legacy boot and I will check if the
> same problem occurs. If it is possible, please try EFI boot on your PC.

I set a Thinkpad T430s (BIOS version 2.69) to UEFI-only mode without CSM
and installed a fresh Debian 10. I then upgraded to debian unstable and
installed tboot and a txt-enabled kernel.

The boot seems to get stuck with both warm and cold boot. What is worse, I
get no output from tboot at all, there is only a warning

"WARNING: no console will be available to OS"

supposedly from grub2? As the T430s does not have a serial port and none of
the docking stations for that model have a serial port, I am kind of blind
here.

-Timo |
From: Timo L. <tim...@ik...> - 2020-05-31 21:57:18
|
On Fri, 29 May 2020, Lukasz Hawrylko wrote:
> On Fri, 2020-05-29 at 12:36 +0300, Timo Lindfors wrote:
> I see the "Failed to get EFI memory map" message; did you configure the
> BIOS to use legacy boot? "set debug=mmap" should enable the EFI memory map
> print in grub_efi_mmap_iterate(), but this does not work when booted in
> legacy mode.

Yes, Thinkpad R400 does not seem to support EFI. Finding a laptop that
does TPM 2.0, UEFI, TXT and serial port seems to be a bit tricky but I'll
keep looking.

> As printk is a blocking call, you can add a few additional prints somewhere
> around tboot.c:384 and inside copy_e820_map() and efi_memmap_copy() to
> find out where exactly it hangs.

It seems to be stuck in the while loop in find_mb2_tag_type. Placing

printk(TBOOT_INFO"start=%p tag_type=%d start->type=%d start->size=%d\n",
       start,
       tag_type,
       start->type,
       start->size);

inside the while loop prints

TBOOT: start=0x10008 tag_type=17 start->type=3031684 start->size=-2147418113
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
TBOOT: start=0x80020008 tag_type=17 start->type=-1 start->size=-1
...

-Timo |
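Editorial note: the repeated address in the trace above can be reproduced with plain arithmetic. The sketch below assumes the usual Multiboot2 rule that each tag is padded to an 8-byte boundary, and 32-bit address arithmetic, which the printed pointer values suggest. It is not tboot's code: mb2_next_tag_addr, mb2_tag_size_ok and the info_end parameter are hypothetical names. Read as unsigned values, the printed sizes -2147418113 and -1 are 0x8000FFFF and 0xFFFFFFFF.

```c
#include <stdbool.h>
#include <stdint.h>

/* A tag header is a 32-bit type followed by a 32-bit size. */
#define MB2_TAG_HEADER_SIZE 8u

/* Address of the next tag: current address plus the size rounded up to
 * a multiple of 8. Everything wraps modulo 2^32 on purpose. */
uint32_t mb2_next_tag_addr(uint32_t addr, uint32_t size)
{
    uint32_t step = (uint32_t)(size + 7u) & ~(uint32_t)7;
    return (uint32_t)(addr + step);
}

/* Hypothetical sanity check a walker could apply before stepping:
 * the tag must hold at least its own header and must end at or before
 * info_end, the first address past the boot information area. */
bool mb2_tag_size_ok(uint32_t addr, uint32_t size, uint32_t info_end)
{
    if (size < MB2_TAG_HEADER_SIZE)
        return false;
    if (addr > info_end || size > info_end - addr)
        return false;
    return true;
}
```

With these definitions, mb2_next_tag_addr(0x10008, 0x8000FFFF) gives 0x80020008, matching the jump in the first two trace lines, and mb2_next_tag_addr(0x80020008, 0xFFFFFFFF) gives 0x80020008 again because the rounded step wraps to zero. A walker without a size check therefore never advances once it reads a size of -1.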
From: Timo L. <tim...@ik...> - 2020-05-31 22:27:40
|
On Mon, 1 Jun 2020, Timo Lindfors wrote:
> printk(TBOOT_INFO"start=%p tag_type=%d start->type=%d start->size=%d\n",
>        start,
>        tag_type,
>        start->type,
>        start->size);

On warm boot this prints just

TBOOT: start=0x0x10008 tag_type=17 start->type=3031684 start->size=-2147418113
TBOOT: start=0x0x80020008 tag_type=17 start->type=0 start->size=0

-Timo |
From: Lukasz H. <luk...@li...> - 2020-06-01 15:21:25
|
On Mon, 2020-06-01 at 01:27 +0300, Timo Lindfors wrote:
> On Mon, 1 Jun 2020, Timo Lindfors wrote:
> > printk(TBOOT_INFO"start=%p tag_type=%d start->type=%d start->size=%d\n",
> >        start,
> >        tag_type,
> >        start->type,
> >        start->size);
>
> On warm boot this prints just
>
> TBOOT: start=0x0x10008 tag_type=17 start->type=3031684 start->size=-2147418113
> TBOOT: start=0x0x80020008 tag_type=17 start->type=0 start->size=0
>

That looks like memory corruption... Does it work when you remove all
SINITs except the good one?

Could you please apply the following patch and send me a log?

One more test - please remove the 'memory' option from the 'logging'
parameter on the TBOOT command line in grub.cfg and check if that helps.

Thanks,
Lukasz

diff -r 1f912c52b1cc tboot/common/loader.c
--- a/tboot/common/loader.c	Sat May 23 20:32:48 2020 +0300
+++ b/tboot/common/loader.c	Mon Jun 01 17:17:01 2020 +0200
@@ -1907,10 +1907,11 @@
         return;
     } else {
         struct mb2_tag *start = (struct mb2_tag *)(lctx->addr + 8);
-        printk(TBOOT_INFO"MB2 dump, size %d\n", *(uint32_t *)lctx->addr);
+        printk(TBOOT_INFO"MB2 dump, size %d addr %p\n", *(uint32_t *)lctx->addr,
+               lctx->addr);
         while (start != NULL){
-            printk(TBOOT_INFO"MB2 tag found of type %d size %d ",
-                   start->type, start->size);
+            printk(TBOOT_INFO"MB2 tag found of type %d size %d addr %p ",
+                   start->type, start->size, start);
             switch (start->type){
             case MB2_TAG_TYPE_CMDLINE:
             case MB2_TAG_TYPE_LOADER_NAME:
@@ -1924,6 +1925,8 @@
                 {
                     struct mb2_tag_module *ts =
                         (struct mb2_tag_module *) start;
+                    printk(TBOOT_INFO"mod_start 0x%x, mod_end 0x%x ",
+                           ts->mod_start, ts->mod_end);
                     printk_long(ts->cmdline);
                 }
                 break;
diff -r 1f912c52b1cc tboot/common/tboot.c
--- a/tboot/common/tboot.c	Sat May 23 20:32:48 2020 +0300
+++ b/tboot/common/tboot.c	Mon Jun 01 17:17:01 2020 +0200
@@ -369,6 +369,8 @@
     print_loader_ctx(g_ldr_ctx);
     */
 
+    print_loader_ctx(g_ldr_ctx);
+
     /* clear resume vector on S3 resume so any resets will not use it */
     if ( !is_launched() && s3_flag )
         set_s3_resume_vector(&_tboot_shared.acpi_sinfo, 0);
|
From: Timo L. <tim...@ik...> - 2020-06-01 18:08:56
|
Hi,

On Mon, 1 Jun 2020, Lukasz Hawrylko wrote:
>> On warm boot this prints just
>>
>> TBOOT: start=0x0x10008 tag_type=17 start->type=3031684 start->size=-2147418113
>> TBOOT: start=0x0x80020008 tag_type=17 start->type=0 start->size=0
>>
>
> That looks like memory corruption... Does it work when you remove all
> SINITs except the good one?

Hmm, so do both the cold boot and warm boot prints look like memory
corruption, or just the cold boot where it gets stuck?

Listing only GM45_GS45_PM45_SINIT_51.BIN in grub.cfg still results in tboot
getting stuck.

> Could you please apply the following patch and send me a log?

tboot prints

"TBOOT: this routine only prints out multiboot 2"

and never enters the else block where the printk()s are...

> One more test - please remove the 'memory' option from the 'logging'
> parameter on the TBOOT command line in grub.cfg and check if that helps.

This does not seem to change the behavior either.

-Timo |
From: Timo L. <tim...@ik...> - 2020-06-01 22:16:15
|
On Mon, 1 Jun 2020, Timo Lindfors wrote:
> tboot prints
>
> "TBOOT: this routine only prints out multiboot 2"
>
> and never enters the else block where the printk()s are...

This gave me a hint: Using multiboot2/module2 seems to work with cold
boot. This might not mean anything of course if the issue is caused by
memory corruption.

-Timo |
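Editorial note: one reading consistent with the messages above, offered as a hypothesis and not something confirmed in this thread, is that a Multiboot 1 information structure was being walked as if it held Multiboot2 tags. In the Multiboot 1 layout the words at offsets 8 and 12 are mem_upper and boot_device, which is exactly where a tag walker starting at addr + 8 would read "type" and "size"; the value 0x8000FFFF would then be a plausible boot_device for the first BIOS disk. The sketch below shows a guard keyed on the boot magic. The magic constants are the ones defined by the two Multiboot specifications; every other name is hypothetical and none of this is tboot's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Magic values the boot loader passes in EAX, per the Multiboot specs. */
#define MB1_BOOTLOADER_MAGIC UINT32_C(0x2BADB002)
#define MB2_BOOTLOADER_MAGIC UINT32_C(0x36D76289)

enum boot_proto { BOOT_PROTO_UNKNOWN = 0, BOOT_PROTO_MB1, BOOT_PROTO_MB2 };

/* Hypothetical helper: map the boot magic to a protocol. */
enum boot_proto classify_boot_magic(uint32_t magic)
{
    switch (magic) {
    case MB1_BOOTLOADER_MAGIC: return BOOT_PROTO_MB1;
    case MB2_BOOTLOADER_MAGIC: return BOOT_PROTO_MB2;
    default:                   return BOOT_PROTO_UNKNOWN;
    }
}

/* First four fields of the Multiboot 1 information structure. */
struct mb1_info_head {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;    /* offset 8: where an MB2 walker reads "type" */
    uint32_t boot_device;  /* offset 12: where it reads "size" */
};

/* Hypothetical guard: only walk Multiboot2 tags for a Multiboot2 boot. */
int mb2_walk_allowed(uint32_t magic)
{
    return classify_boot_magic(magic) == BOOT_PROTO_MB2;
}
```

A tag search such as find_mb2_tag_type could return early when a check like mb2_walk_allowed() fails. Whether that matches the actual root cause here is not established by the thread.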