#841 Some keywords not parsed out

closed
None
8
2014-08-28
2002-07-29
James Leatherman
No

PARSED:

cbcba-137p228.ppp13.odn.ad.jp - - [26/Jul/2002:20:58:30 -0700] "GET /gen/ind00043.htm HTTP/1.1" "http://www.google.com/search?q=ZENRO&hl=ja&lr=&ie=UTF-8&inlang=ja&start=20&sa=N"; 200 1216

WILL NOT PARSE:

alb-66-67-155-237.nycap.rr.com - - [27/Jul/2002:09:41:02 -0700] "GET / HTTP/1.1" "http://google.yahoo.com/bin/query?p=james+leatherman&hc=0&hs=0" 200 1162

Why?

Discussion

  • Logged In: YES
    user_id=53583

    Why is one log entry logged with a semicolon after the referrer and the
    other not? In any case, which version of AWStats are you using? It
    could be a case of mistaken identity in where AWStats looks for the
    keywords. The second line probably matches Google as the search
    engine even though the query uses the Yahoo keyword parameter (p=).

    Once I know the version I can test it and let you know
    what the real problem is.

     
  • Logged In: YES
    user_id=584778

    Current version, 4.1.

     
  • Logged In: YES
    user_id=53583

    Well, the problem is as I suspected. AWStats looks through the
    second line and matches the Google search engine first because the name
    matches. However, p= is not the standard Google query parameter, so the
    keywords can't be pulled.

    Right now, there isn't an easy
    workaround for this since AWStats will match Google for the second
    line before it tries Yahoo as long as Google appears in the temporary
    search engine hash first (which is pulled from a previous log line). I'll
    forward this on to our lead developer to notify him of the problem. Look
    for a possible fix in a future release.

     
  • Logged In: YES
    user_id=53583

    Also, I assume that the semi-colon in the first log line you pasted in was a
    typo? The presence of the semi-colon under certain circumstances will
    cause an error in the log file processing and cause the line to be
    excluded as corrupt. I'm assuming that your logs don't have a mixture of
    lines with and without semi-colons after the referrer field?

     
  • Logged In: YES
    user_id=584778

    1: Should I be able to add another search engine ID (Google2) with the
    other syntax?

    2: The log with the semicolon was not a typo. And, it was the one that
    DID get parsed.

     
  • Logged In: YES
    user_id=53583

    Interesting, which log format are you using for AWStats and which log
    format are you using for your webserver?

    You might try using
    the unstable release AWStats 5.0 and changing the search order hash
    so that google. appears after yahoo. That would fix your
    problem.

    Alternatively, if you're married to 4.1 or don't want to try
    an unstable release, you could edit your 4.1 version's
    db/search_engines.pl file to use www.google.com as the Google
    identifier in each of the hashes.

     
  • Logged In: YES
    user_id=53583

    Actually, now that I think about it, changing the search_engines.pl
    Google hash identifiers to "www.google" would probably be better,
    since you'd still be able to capture foreign Googles, not just the .com.

     
  • Logged In: YES
    user_id=53583

    Also, Google rewrites requests to http://google.com as
    http://www.google.com anyway, so you shouldn't be losing any
    results by making that change.
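
The matching behaviour described in this thread can be illustrated with a small sketch. This is not AWStats code (AWStats is written in Perl). The table `KEYWORD_PARAM`, the function `extract_keywords`, and the assumption that identifiers are matched as substrings of the referrer host with the first match winning are all hypothetical, chosen only to mirror the explanation above.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table: engine identifier -> query parameter holding the keywords.
# Loosely modelled on the idea behind db/search_engines.pl, not copied from it.
KEYWORD_PARAM = {
    "google.": "q",
    "www.google": "q",
    "yahoo.": "p",
}


def extract_keywords(referrer, search_order):
    """Return (engine_id, keywords); the first identifier found in the host wins."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for engine_id in search_order:
        if engine_id in host:
            values = parse_qs(parsed.query).get(KEYWORD_PARAM[engine_id])
            # If the expected parameter is absent, no keywords are recovered.
            return engine_id, (values[0] if values else None)
    return None, None


YAHOO_REFERRER = "http://google.yahoo.com/bin/query?p=james+leatherman&hc=0&hs=0"
GOOGLE_REFERRER = ("http://www.google.com/search?"
                   "q=ZENRO&hl=ja&lr=&ie=UTF-8&inlang=ja&start=20&sa=N")

# Reported behaviour: "google." is tried first, so q= is looked up and missing.
google_first = extract_keywords(YAHOO_REFERRER, ["google.", "yahoo."])

# Suggested fix for 5.0: try "yahoo." before "google.".
yahoo_first = extract_keywords(YAHOO_REFERRER, ["yahoo.", "google."])

# Suggested fix for 4.1: a narrower Google identifier, "www.google".
narrow_yahoo = extract_keywords(YAHOO_REFERRER, ["www.google", "yahoo."])
narrow_google = extract_keywords(GOOGLE_REFERRER, ["www.google", "yahoo."])

print(google_first, yahoo_first, narrow_yahoo, narrow_google)
```

Under these assumptions, the google-first order identifies google.yahoo.com as Google and finds no keywords, while either reordering the identifiers or narrowing the Google identifier to "www.google" lets the Yahoo p= parameter be read.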