ERR_BADRESPONSETAG error after attempting to seal data.

2020-02-27
2020-03-31
  • Brian Gardner

    Brian Gardner - 2020-02-27

    Any ideas on why this occurs or how to resolve it?

    I've looked in tpmutil.c where this error is thrown but the comments only seem to indicate that it's a bad response from the TPM chip. "Bad tag in response message" is the string associated with this error.

    We're on Windows 10 using TPM 1.2 and using TPM_LOWLEVEL_TRANSPORT_TCP_SOCKET.

     
    • Ken Goldman

      Ken Goldman - 2020-02-28

      Can you describe your environment? You say 'TPM chip', but you're using a socket transport, which implies a SW TPM. Are you running a transport session? What command?

      Those utilities are quite old. If you're using Windows 10 with a HW TPM, the device driver interface will need some porting. It is likely that openssl 1.1.1 porting is needed as well. They were meant for experimenting, not product code.

      The code comments imply that some HW TPM had a quirk in Quote, but the code doesn't refer to Quote and the -3 is mysterious.

      Once I know your exact environment, I can see if anyone remembers the details.

       
  • Brian Gardner

    Brian Gardner - 2020-03-02

    Ken,

    This is our environment:
    1. Windows 10 IoT RS5 1809 running in an embedded environment.
    2. Hardware TPM chip (SLB9670VQ1.2) with firmware 6.43.
    3. Our custom TPM management code that leverages libtpm version 4769 (no TPM proxy), OpenSSL 1.0.2, Windows TBS service and the Windows TPM driver.

    Our TPM management code uses the RNG, sealing and unsealing functionality of the TPM. We leverage libtpm and Windows TBS like so:
    1. We initialize libtpm with TPM_LOWLEVEL_TRANSPORT_TCP_SOCKET (TPM_LowLevel_Transport_Init (0)) and set the transport with TPM_LowLevel_Transport_Set ().
    2. For the tpm_transport structure, we override the open, close and receive socket functions with our own dummy functions that don't do anything. However, we do provide our own transmit socket function that just calls the Windows TBS function to submit commands and receive data back from the TPM HW chip (Tbsip_Submit_Command ()).

    Basically, we use Windows TBS to create and close context and to send and receive data. We initialize libtpm and use its TPM_GetRandom (), TPM_PcrRead (), TPM_Extend (), TPM_SealCurrPCR (), TPM_Unseal () and TPM_GetErrMsg () functions.

    This has been working well except sometimes we receive the ERR_BADRESPONSETAG error while attempting to seal data with libtpm's TPM_SealCurrPCR () function.

    To help clarify, this is an example of what we're doing:

    Overriding libtpm socket functions:

    static uint32_t openClientSocket (int sock_fd) { return 0; }
    static uint32_t closeClientSocket (int sock_fd) { return 0; }
    static uint32_t receiveSocket (int sock_fd, struct tpm_buffer *tb) { return 0; }
    static uint32_t receiveBytes (int sock_fd, unsigned char *buffer, size_t nbytes) { return 0; }

    static uint32_t transmitSocket (int sock_fd, struct tpm_buffer *tb, const char *msg)
    {
    call Tbsip_Submit_Command (...) and process response.
    }

    Creating context for Windows TBS and initializing libtpm:

    bool initialize (...)
    {
        // Create TBS context.
        call Tbsi_Context_Create (...)

        // Initialize libtpm.
        static tpm_transport transport =
        {
            openClientSocket,  // overridden with dummy function
            closeClientSocket, // overridden with dummy function
            transmitSocket,    // overridden with our function to send/receive via TBS
            receiveSocket      // overridden with dummy function
        };

        TPM_LowLevel_Transport_Init (0);
        TPM_LowLevel_Transport_Set (&transport);
    }

    Sealing data:

    bool sealData (...)
    {
    call TPM_SealCurrPCR (...)
    check for error using TPM_GetErrMsg (...)
    if no error, return sealed data
    }

     

    Last edit: Brian Gardner 2020-03-02
    • Ken Goldman

      Ken Goldman - 2020-03-02

      I think I understand the setup now. I forwarded the thread to someone who was more familiar with that piece of the code. Also:

      1. When you say "sometimes", do some commands work but the seal always fails, or does the seal sometimes succeed and sometimes fail?
      2. I recall that the utilities all have a -v option that traces the command and response packets. Assuming that works in your setup, can you trace a success case and a failure case? Then we can see what's different.
       
  • Brian Gardner

    Brian Gardner - 2020-03-02

    All commands are working and we've only encountered this issue with sealing maybe 4 or 5 times. 99% of the time, sealing works as expected. And we haven't encountered any other issues with other commands yet.

    Thanks Ken, I'll investigate the -v option.

    Correction: We have seen this same error with unsealing as well. I wasn't able to see a TPM log for an unsealing failure until now and it's also throwing ERR_BADRESPONSETAG. 99% of the time, both sealing and unsealing are working.

     

    Last edit: Brian Gardner 2020-03-02
    • Stefan Berger

      Stefan Berger - 2020-03-03

      Do you have a script or program that triggers this issue? If so, could you run it in a Linux environment that doesn't have the changes you have made, to see whether it triggers the error there as well? Otherwise it may be helpful to crank up the logging on the client and server sides and see what's happening there. Could the connection be breaking when this happens?

      This is from tpmutil.c after TPM_Send, around line 1181. This seems to be the only place where such an error tag may originate on the recipient side, and my guess would be that for some reason not enough bytes were received.

                  if (0 == rc) {
                      tpm_buffer_load16(tb, tagoffset, &tag_in);
                      if ((tag_in - 3)  != tag_out) {
                          rc = ERR_BADRESPONSETAG;
                      }
                  }
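
The `- 3` in that check matches the TPM 1.2 tag numbering, where each response tag is the corresponding request tag plus 3 (0x00C1 to 0x00C3 for requests, 0x00C4 to 0x00C6 for responses). As a rough sketch of how a custom transmit function could tell a truncated response apart from a genuine tag mismatch before handing the buffer back to libtpm, something like the following might help. The `check_response` helper and its `hdr_check` result codes are hypothetical names for this example and are not part of libtpm; only the tag values and the 10-byte header layout come from the TPM 1.2 specification.

```c
#include <stddef.h>
#include <stdint.h>

/* TPM 1.2 command and response tags (TPM Main Specification, Part 2). */
#define TPM_TAG_RQU_COMMAND       0x00C1
#define TPM_TAG_RQU_AUTH1_COMMAND 0x00C2
#define TPM_TAG_RQU_AUTH2_COMMAND 0x00C3
#define TPM_TAG_RSP_COMMAND       0x00C4
#define TPM_TAG_RSP_AUTH1_COMMAND 0x00C5
#define TPM_TAG_RSP_AUTH2_COMMAND 0x00C6

/* Header layout: tag (2 bytes) + paramSize (4 bytes) + returnCode (4 bytes). */
#define TPM_HEADER_SIZE 10u

/* Hypothetical result codes for this sketch; these are not libtpm error values. */
enum hdr_check { HDR_OK = 0, HDR_TOO_SHORT, HDR_SIZE_MISMATCH, HDR_TAG_MISMATCH };

/* Big-endian loads, since TPM 1.2 packets use network byte order. */
static uint16_t load16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t load32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Check a raw response of `received` bytes against the tag that was sent. */
static enum hdr_check check_response(uint16_t tag_out,
                                     const unsigned char *rsp, size_t received)
{
    if (received < TPM_HEADER_SIZE)
        return HDR_TOO_SHORT;          /* not even a full header arrived */
    if (load32(rsp + 2) != received)
        return HDR_SIZE_MISMATCH;      /* paramSize disagrees with byte count */
    if ((uint16_t)(load16(rsp) - 3) != tag_out)
        return HDR_TAG_MISMATCH;       /* same comparison as in tpmutil.c */
    return HDR_OK;
}
```

Logging which of these three cases occurs on a failing device would show whether the bad tag comes from a short read or from the response content itself.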
      
       
    • Ken Goldman

      Ken Goldman - 2020-03-03

      That's a hint. Could you gather more statistics? Could it be working 99.6% of the time? A crypto operation that fails 1 in 256 times is often a bignum issue, where it fails when the upper byte is 0x00. I can run a createkey / loadkey / seal / unseal loop thousands of times without error.

      That doesn't explain the ERR_BADRESPONSETAG, but it may provide a clue.
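
To illustrate the kind of bug meant here: the top byte of a uniformly random value is 0x00 with probability 1/256 (about 0.39%), so code that handles such values incorrectly succeeds roughly 99.6% of the time. A common form is a bignum library reporting the minimal length of a value (leading zero bytes stripped) while the caller expects a fixed-width field. The sketch below is illustrative only and is not taken from libtpm or OpenSSL; `minimal_len` and `to_fixed_width` are hypothetical helper names.

```c
#include <stddef.h>
#include <string.h>

/* Minimal big-endian length: the byte count once leading 0x00 bytes are
   stripped, which is how many bignum libraries report a value's size. */
static size_t minimal_len(const unsigned char *be, size_t len)
{
    size_t skip = 0;
    while (skip < len && be[skip] == 0x00)
        skip++;
    return len - skip;
}

/* Write the value into a fixed-width field, left-padding with zeros.
   Returns 0 on success, -1 if the value does not fit in `width` bytes. */
static int to_fixed_width(const unsigned char *be, size_t len,
                          unsigned char *out, size_t width)
{
    size_t n = minimal_len(be, len);
    if (n > width)
        return -1;
    memset(out, 0, width - n);                     /* restore leading zeros */
    memcpy(out + (width - n), be + (len - n), n);  /* right-align the value */
    return 0;
}
```

Code that copies `minimal_len` bytes to the start of a fixed-width buffer, instead of right-aligning them as `to_fixed_width` does, shifts the value by one byte whenever the top byte happens to be zero.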

      As Stefan recommended, perhaps run your application on Linux with a SW TPM to see if it's the TPM, the library, or the application.

      And post the -v output on error.

       
  • Brian Gardner

    Brian Gardner - 2020-03-03

    Ken and Stefan,

    I will look at the -v option or some other way to log more information so I can compare successful and unsuccessful operations.
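
If the -v option turns out not to apply to a custom transport, one possible way to capture the same information is to hex-dump the command and response buffers inside the transmit function. The following is a minimal sketch; `hex_line` is a hypothetical helper name, and how its output is written to a log is left to the application.

```c
#include <stddef.h>
#include <stdio.h>

/* Format `len` bytes as lowercase hex pairs separated by single spaces.
   Returns the number of characters written (excluding the terminator),
   or -1 if `out` is too small to hold the next byte. */
static int hex_line(const unsigned char *buf, size_t len,
                    char *out, size_t out_size)
{
    size_t pos = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        /* Reserve room for a separator, two hex digits and the terminator. */
        if (pos + 4 > out_size)
            return -1;
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i ? " %02x" : "%02x", buf[i]);
    }
    return (int)pos;
}
```

Calling this once on the command before it is submitted and once on the raw response makes it possible to compare a successful exchange with a failing one byte by byte.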

    I don't know if we can get a true percentage of how often this is occurring. We have hundreds of engineers using our Windows 10 implementation on their target hardware and we'd need to set up some type of framework to record all operations. That's not a viable option at the moment.

    I don't have access to a Linux box but I can probably get a Windows environment up and running with a TPM proxy and SW TPM.

    Thanks for your help so far and I will post updates when available.

     
    • Ken Goldman

      Ken Goldman - 2020-03-04

      So 99% isn't really 99%? It's important to know the number - 1/256 is a pointer to the problem. Can you just run a loop in one failing platform like I did?
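
As a rough guide to how long such a loop needs to run, assuming independent trials with a fixed 1-in-256 failure probability: 1000 iterations would be expected to produce about 3.9 failures, and the chance of seeing none at all is around 2%. The helper names below are hypothetical and the code is only a sketch of that arithmetic.

```c
/* Expected number of failures in n trials with per-trial failure probability p. */
static double expected_failures(unsigned long n, double p)
{
    return (double)n * p;
}

/* Probability of observing zero failures in n independent trials,
   computed by repeated multiplication to avoid needing the math library. */
static double prob_no_failures(unsigned long n, double p)
{
    double q = 1.0;
    for (unsigned long i = 0; i < n; i++)
        q *= 1.0 - p;
    return q;
}
```

For example, `expected_failures(1000, 1.0 / 256.0)` and `prob_no_failures(1000, 1.0 / 256.0)` give the two figures above; a clean run of a few thousand iterations would make a 1-in-256 cause very unlikely on that platform.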

       
  • Brian Gardner

    Brian Gardner - 2020-03-05

    Ken,

    When I return next week, I'll set up a test to get a true failing percentage.

     
  • Brian Gardner

    Brian Gardner - 2020-03-31

    Ken,

    I ran an automated test to perform 45,000 iterations of the following and it succeeded without issue.
    1. TPM_GetRandom ()
    2. TPM_SealCurrPCR ()
    3. TPM_Unseal ()
    4. Compare the plain text random data against the unsealed data.
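
A loop of this shape can be sketched as below. The `tpm_ops` callbacks are hypothetical wrappers: in a real test they would call TPM_GetRandom (), TPM_SealCurrPCR () and TPM_Unseal () with whatever key handle and auth values the application uses. The `fake_*` functions are stand-ins so the harness can be exercised without a TPM, and the injected failure on every 256th unseal is an assumption chosen purely for demonstration.

```c
#include <stddef.h>
#include <string.h>

#define DATA_LEN 32u
#define BLOB_MAX 512u

/* Hypothetical wrappers around the TPM calls; each returns 0 on success. */
struct tpm_ops {
    unsigned (*get_random)(unsigned char *buf, size_t len);
    unsigned (*seal)(const unsigned char *in, size_t in_len,
                     unsigned char *blob, size_t *blob_len);
    unsigned (*unseal)(const unsigned char *blob, size_t blob_len,
                       unsigned char *out, size_t *out_len);
};

struct loop_stats {
    unsigned long iterations;
    unsigned long failures;
    unsigned long first_failure; /* 1-based iteration index, 0 if none */
    unsigned last_rc;            /* return code of the most recent failure */
};

/* Run n random -> seal -> unseal -> compare round trips and record failures. */
static void run_roundtrip_loop(const struct tpm_ops *ops, unsigned long n,
                               struct loop_stats *st)
{
    unsigned char plain[DATA_LEN], out[DATA_LEN], blob[BLOB_MAX];

    memset(st, 0, sizeof *st);
    for (unsigned long i = 1; i <= n; i++) {
        size_t blob_len = sizeof blob, out_len = sizeof out;
        unsigned rc = ops->get_random(plain, sizeof plain);
        if (rc == 0) rc = ops->seal(plain, sizeof plain, blob, &blob_len);
        if (rc == 0) rc = ops->unseal(blob, blob_len, out, &out_len);

        int ok = rc == 0 && out_len == sizeof plain &&
                 memcmp(plain, out, sizeof plain) == 0;
        st->iterations++;
        if (!ok) {
            st->failures++;
            st->last_rc = rc;
            if (st->first_failure == 0)
                st->first_failure = i;
        }
    }
}

/* Stand-in operations so the harness can run without a TPM. */
static unsigned fake_random(unsigned char *buf, size_t len)
{
    static unsigned long calls;
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(calls + i);
    calls++;
    return 0;
}

static unsigned fake_seal(const unsigned char *in, size_t in_len,
                          unsigned char *blob, size_t *blob_len)
{
    if (in_len > *blob_len) return 1;
    memcpy(blob, in, in_len);
    *blob_len = in_len;
    return 0;
}

static unsigned fake_unseal(const unsigned char *blob, size_t blob_len,
                            unsigned char *out, size_t *out_len)
{
    static unsigned long calls;
    if (++calls % 256 == 0) return 1; /* injected intermittent fault */
    if (blob_len > *out_len) return 1;
    memcpy(out, blob, blob_len);
    *out_len = blob_len;
    return 0;
}

static const struct tpm_ops fake_ops = { fake_random, fake_seal, fake_unseal };
```

Recording the first failing iteration and the return code, rather than stopping at the first error, gives the failure rate that was asked for earlier in the thread.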

    Right now we can't replicate this issue in a development environment so I'm going to run some different tests, log more information in our release environment and try some of your other suggestions.

     
