#1251 Form boundary string should be truly random

closed-fixed
None
5
2013-06-25
2013-06-24
Floris
No

The use of predicatable pseudo-random numbers to generate the multipart/form boundary can lead to security issues in software using libcurl.

See: http://localhost.re/p/solusvm-whmcs-module-316-vulnerability

Discussion

  • Linus Nielsen

    Linus Nielsen - 2013-06-24

    Looks interesting. But please explain a little more how libcurl is to blame here. The multipart boundary is not a security feature, and any security implementation that depends on the multipart boundary being secret is flawed IMHO.

     
  • Floris

    Floris - 2013-06-24

    RFC2046:

    As stated previously, each body part is preceded by a boundary
    delimiter line that contains the boundary delimiter. The boundary
    delimiter MUST NOT appear inside any of the encapsulated parts, on a
    line by itself or as the prefix of any line.
    --

    libcurl is the composing agent choosing the boundary, so also responsible for choosing one that MUST NOT appear inside any of the encapsulated parts.
    Either by searching the form fields for the boundary it intends to use, and choosing a different one if found, or using one that is so random it is unlikely to be in the data.
    In any case not a predicatable number.

     
    Last edit: Floris 2013-06-24
  • brim

    brim - 2013-06-24

    Well I figure there are a few ways to go here. One, find a way to present the multipart boundary back to the library user so that the user can check and escape their inputs and note with a big ol' caveat in the documentation that this is can be an issue to look out for (currently this is not mentioned at all). Two, check that the post contents including uploaded files passed by the user do not contain the selected multipart boundary and end up producing malformed requests; escape the input for the user or permute the multipart boundary so that the (none-too-bright) user input does not cause breakage. Three, do as Floris suggests and semi-securely randomize the seed value in Curl_srand.

    #1 helps and increases awareness of the potential pitfall, but pushes responsibility to the user who will inevitably make bad choices. The form boundary is also not created until Curl_getformdata is called, so that'd have to be shuffled around to be available earlier.
    #2 is probably the best solution for library usability in that user intervention is not necessary for the problem to go away, helping the most people with the least overall change. Unfortunately, it requires validating the form input similar to strstr when constructing the post data which is extra work that wouldn't be needed if the calling program disallowed strings in the same format as formboundary and pre-scanned file attachments for the same.
    #3 seems like the easiest effective solution to the problem listed by making the form boundary more difficult to guess and potentially corrupt post data. Use OS dependent mechanisms to get a random value to seed the PRNG with instead of time().

    What do you think is the best way to go here?

    edit: Oops, didn't read the RFC until now. The boundary is not escapable and a safe boundary must be chosen by inspection or by virtue of being sufficiently random to not have collisions.

     
    Last edit: brim 2013-06-24
  • Floris

    Floris - 2013-06-24

    If the uploaded files are base64 encoded they do not have to be searched.
    The boundary starts with "--" and "-" is not part of the base64 alphabet.

     
  • Linus Nielsen

    Linus Nielsen - 2013-06-24

    Yes, I can see how this might be a problem, but I still fail to recognize this as a security issue.

    I think #3 is the best way to solve this.

     
  • brim

    brim - 2013-06-24

    Well, let me explain it by analogy. Software PantsAdministrator uses libcurl to RPC multipart post messages to the Pants backend system. For this example it has two functions Remove_Pants and Store_In_Pocket_For_Significant_Other and it stores which function to call as the last argument of the post message list. The Remove_Pants function is only to be called by the user of PantsAdministrator, but Store_In_Pocket_For_Significant_Other takes a text field argument and proxies the call for an external program. PantsAdministrator is not very clever about curl, so it spawns it in its own process which initializes Curl_srand just before calling the curl_easy post mechanism.

    Significant_Other uses the TequilaShots program to predict the formboundary and precisely time the proxy event to the system clock of the machine PantsAdministrator is running on, including a %{formboundary}\r\nRemove_Pants line, which overrides the RPC command. Not knowing it should check for %{formboundary} because it doesn't know what curl will select for it, it passes the text field through unaltered.

    The RPC is executed with the permissions of PantsAdministrator and now you are wearing no pants, a win for Significant_Other.


    Basically, since the calling program doesn't know what libcurl is going to choose as form boundary, it can't be required to test for it and libcurl must test, or choose a suitably random form boundary that is very difficult to guess.

     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-24
    • assigned_to: Daniel Stenberg
     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-24

    I suggest we make sure this is properly documented to not surprise users.

    We can also consider adding something else to the Curl_srand() function to prevent it from generating the same value if repeatedly called within the same second.

     
  • Floris

    Floris - 2013-06-24

    Keep in mind that this is an security issue that affects virtually every PHP webapplication that uses libcurl to call RESTfull webservices, and passes through user provided input.

    PantsAdministrator is not very clever about curl, so it spawns it in its own process which initializes Curl_srand just before calling the curl_easy post mechanism.

    That is pretty standard in shared hosting environements, where PHP is started as CGI script for each request, so it can run under the privileges of the username of the webhosting customer.
    Webservers like Apache also sends the server's date & time with every request to the website visitor. So the end-user knows exactly what boundary is going to be used.

    I suggest we make sure this is properly documented to not surprise users.

    Documented instead of fixed?
    What do you recommend webapplications using libcurl do?
    Filter out every line of user input that starts with "--" as it could be a boundary?
    Remember that applications using libcurl do not know which boundary libcurl is going to use. That is an implementation detail inside libcurl.

     
    Last edit: Floris 2013-06-24
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-24

    Let me emphasize that implementors that rely on the boundary string being cryptographically secure should rethink their designs!

     
  • brim

    brim - 2013-06-24

    Thank you, Daniel. I applied your patch to my local git repository and it solves the problem we were having (predictability of the form boundary). I agree that the libcurl users should be sanitizing their inputs, however, there is no way to tell what form boundary libcurl may use and test for that specifically. It may be beneficial to share that information with the host program somehow, though I realize it changes with attachment depth. This patch is sufficient.

     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-24

    I found some minor mistakes in the patch that I've fixed locally so I'll merge a slightly updated version.

     
  • Dan Fandrich

    Dan Fandrich - 2013-06-24

    A couple of suggestions on the patch. There's a typo in the comments (acess). And maybe it's lame to even try, but in the #ifndef have_curlssl_random case, you can get a few more bits of randomness by adding getpid() to time() (and clock_gettime() when available, if you want to get fancy). Since with this patch the form generation uses a cryptographically-secure random number generator when available, by extending the multipart line generation code to call Curl_rand() twice and use both numbers, you get a 64-bit cryptographically-strong multipart separation line, which should solve this problem for good.

     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-24

    Thanks Dan!

    Due to my initial mistakes and Dan's added suggestions I'm posting my v2 of the patch and I'm calling it a night. I'll get a version of this merged into master tomorrow assuming no major nits found!

     
  • Dan Fandrich

    Dan Fandrich - 2013-06-24

    I can't think of any reason to strip off the bottom 8 bits of each random number when generating the multipart line. More randomness is better, after all, and those extra bits are free! In fact, some of those redundant dashes could be replaced with another 32 bits of entropy with little effort.

    On a side note, the previous use of a non cryptographically-secure PRNG in Curl_sasl_create_digest_md5_message gives me a bad feeling. I wonder what security impact that had?

     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-25

    You're right, I made it use 48 bits just because that's the length we used before. I'll make it use all 64 instead.

    And yes, the digest_md5 usage may indeed have some security impact but I haven't yet given that any deeper thoughts.

     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-25
    • status: open --> closed-fixed
     
  • Daniel Stenberg

    Daniel Stenberg - 2013-06-25

    This issue has now been fixed in git with commit 365c5ba39591fab2e. It relies on another independent patch though that fixes a printf() flaw this change brought into the light.

     

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks