#4 client overwrites UA string with binary

open-works-for-me
nobody
None
2
2002-03-26
2002-03-22
No

An earlier version of the client submitted arbitrary
binary "characters" as the User-Agent string, sometimes
followed by an "http". Eventually, this was changed to
"Mozilla/4.0 (compatible; grub-client-0.2.3; blah...".

However, I see an increasing number of entries in my
logfiles that are shortened to "Mozi!". This is
probably the same old bug, which now inserts an
exclamation mark and a null byte (and possibly more
garbage) into the internal UA string. I found this
confirmed today by a log entry that said "Mozi^Y". Note
that a <ctrl-Y> (or any other control character in
the UA) makes the HTTP request invalid, although
most servers seem to be fairly tolerant about it.
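
A minimal sketch of the failure mode suspected above, assuming the UA sits in a writable buffer that a stray write can reach. The function names are hypothetical, and the UA literal is a shortened stand-in for the real one, which is elided in this report. The first function checks a UA for control characters; the second shows how a stray '!' plus a null byte would turn the string into "Mozi!" for anything that reads it as a C string.

```cpp
#include <string>

// Returns true if `ua` looks safe to send as a header value: non-empty
// and free of control characters (bytes below 0x20 other than tab, and
// 0x7F DEL). Hypothetical helper, not part of the grub client.
bool is_clean_user_agent(const std::string& ua) {
    if (ua.empty()) return false;
    for (unsigned char c : ua) {
        if ((c < 0x20 && c != '\t') || c == 0x7F) return false;
    }
    return true;
}

// Hypothetical reproduction: a stray write into a writable UA buffer.
// Anything that treats the buffer as a C string now stops at the NUL.
std::string simulate_stray_write() {
    // Shortened stand-in for the real UA string (assumption).
    char ua[] = "Mozilla/4.0 (compatible; grub-client-0.2.3)";
    ua[4] = '!';   // garbage byte lands on the first 'l'
    ua[5] = '\0';  // followed by a null terminator
    return std::string(ua);  // copies only up to the first NUL
}
```

Under these assumptions, simulate_stray_write() yields "Mozi!", and is_clean_user_agent() rejects a string such as "Mozi" followed by byte 0x19 (ctrl-Y).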

Watch your pointers, folks!
Bogus UA strings will get you permanently banned from
many sites by webmasters who care about what happens to
their stuff. You don't want that to happen, do you?

-schorsch

Discussion

  • Kord Campbell

    Kord Campbell - 2002-03-26
    • priority: 5 --> 2
    • status: open --> open-works-for-me
     
  • Kord Campbell

    Kord Campbell - 2002-03-26

    Logged In: YES
    user_id=37362

    Although there *was* a bug in an earlier version of the
    client that affected the user agent field, the current
    version does not, to our knowledge, contain this same
    flaw. If you look at the Crawler.h file on CVS, you will
    see that in revision 1.6, we added DEFINEs to contain the
    client UA information. The Crawler.cpp file takes these
    defines, as well as the version define, and passes them to
    the cURL libraries without first copying them into variables.

    Defines are substituted at compile time, so using one is the
    same as typing "somestring" in your code. There simply
    aren't any pointers to keep track of with this logic, so
    it is *highly* unlikely that we are doing anything wrong
    with this code.
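
A minimal sketch of the define-based approach described above. The macro and function names are hypothetical (the real ones are in Crawler.h, revision 1.6), and the closing ")" stands in for the part of the UA that is elided in this report. The preprocessor substitutes the defines, and the compiler then merges the adjacent string literals into a single literal.

```cpp
#include <string>

// Hypothetical macro names; the real defines live in Crawler.h.
#define GRUB_UA_PREFIX "Mozilla/4.0 (compatible; grub-client-"
#define GRUB_VERSION   "0.2.3"

// Adjacent string literals are merged by the compiler into one literal,
// so no runtime buffer, copy, or pointer arithmetic is involved.
inline const char* user_agent() {
    return GRUB_UA_PREFIX GRUB_VERSION ")";
}
```

The returned pointer refers to a literal with static storage duration, so it stays valid for the whole run and can be handed to a library as-is.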

    That said, it is still possible that cURL is doing
    something weird, but I find that even more unlikely as
    cURL has had VERY active development on it in the past
    year, and someone would have caught this by now.

    Another possibility is that some other part of the code
    is overwriting the string in memory, but because these are
    defines, the program should crash when it (incorrectly)
    tries to write to this *protected* memory area. Just try
    to write to a string literal through a cast-away const
    char * and you will see an example of this in action.
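
A small sketch of the protection discussed above, with hypothetical names and a shortened stand-in for the UA. It shows only the compile-time half: a write through a pointer-to-const does not compile. Casting the const away and writing to a string literal is undefined behavior in C++; on platforms that place literals in read-only memory it typically crashes, but that is not guaranteed, so the sketch does not attempt it.

```cpp
#include <type_traits>

// Hypothetical macro standing in for the UA define.
#define GRUB_UA "Mozilla/4.0 (compatible; grub-client-0.2.3)"

const char* const kUserAgent = GRUB_UA;

// Writing through a pointer-to-const is rejected at compile time:
// the dereferenced type is `const char&`, which is not assignable.
static_assert(!std::is_assignable<decltype(*kUserAgent), char>::value,
              "writes through const char* must not compile");

// A plain `char&`, by contrast, is assignable.
static_assert(std::is_assignable<char&, char>::value,
              "writes through char* do compile");
```

If the UA were instead copied into a writable char array, neither check would protect it, which is the situation the original report suspects.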

    All this said, I am dubious that your "Mozi!" string is
    actually one of our crawlers in action. It is possible
    that someone is still running an older version of the
    client which would cause this, but everyone has been
    notified of the new release and there have been two
    releases since this bug was active.

    I did a search on Google and Altavista for your rogue
    string and came up with a couple of pages that log the UAs
    coming into their site and rank their visit percentages.
    Several pages contained a high occurrence of "Mozi!" (like
    1%), yet had no trace of the actual grub-client UA string.
    Others had both, and a few had only the correct grub-client
    UA string. All this leads me to think that there are other
    crawlers/browsers out there with bugs in them that truncate
    the correct UA, and that it isn't necessarily us doing it.

    Kord

     
