
#2 Option to count visitors based on cookies

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-08-12
Created: 2001-04-24
Creator: Anonymous
Private: No

Is any development planned for counting visitors based on a cookie field?

--
Pascal Landeau

Discussion

  • Kato Wulf

    Kato Wulf - 2002-06-18

    Logged In: YES
    user_id=169865

    If AWStats could use cookies to track visitors, making the
    system highly accurate for people coming from business
    environments and AOL-like platforms, this would be the
    program for us.

    As it stands, we must pay for some inferior product at my
    work so that we can have cookie tracking.

     
  • Justin Koivisto

    Justin Koivisto - 2002-08-08

    Logged In: YES
    user_id=466127

    You are all missing the point of this project: it is a log
    analyzer, which means it reads the server log files. It
    doesn't collect the information itself. The programs that do
    this (like SuperStats) also affect your site's performance
    while users are visiting because of the extra bandwidth
    needed to collect the required information.

     
  • Nobody/Anonymous

    Logged In: NO

    Actually, Apache will happily provide the cookie information
    in the log file, allowing AWStats to look at the
    cookies without any performance issues.

    This is simple to set up (uncomment a module in apache and
    add a field to the log format), and wouldn't be much
    different than determining the users based on IP address.

    The real advantage of such a feature is that it allows
    large groups sharing an IP address (AOL users, some dial-up
    users, etc.) to be counted as individuals rather than as one
    massive user.

    Best wishes, and thanks for the great software.

    </kato>

     
  • Nobody/Anonymous

    Logged In: NO

    This option should be provided by AWStats itself. I am
    submitting to awstats.pl the logs retrieved from our IIS web
    server with all the fields checked (Fields: date time c-ip
    cs-username s-sitename s-computername s-ip s-port cs-method
    cs-uri-stem cs-uri-query sc-status sc-win32-status sc-bytes
    cs-bytes time-taken cs-version cs-host cs(User-Agent)
    cs(Cookie) cs(Referer)).
    Since this is a personalized format, the fields s-sitename,
    s-port, cs-bytes, time-taken, cs-host, cs(User-Agent) and
    cs(Cookie) are not considered in the section "# Personalized
    log format" of "# GENERATING PerlParsingFormat".

    So how can AWStats generate a report that counts
    visitors based on the cookie field?

    Thanks,
    Osvaldo
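
For illustration, here is a minimal sketch of how a cs(Cookie) column can be located in a W3C extended log by reading the "#Fields:" directive. It is written in Python rather than the Perl of awstats.pl, it is not how AWStats parses logs, and the function names and the sample lines in the usage note are made up for the example.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the '#Fields:' directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names, separated by whitespace.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            # Skip lines whose column count does not match the directive.
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def cookie_column(lines):
    """Return the cs(Cookie) value of every request ('-' when the column is absent)."""
    return [row.get("cs(Cookie)", "-") for row in parse_w3c_log(lines)]
```

Calling `cookie_column` on a log whose "#Fields:" line includes cs(Cookie) returns one cookie string per request, which could then feed a visitor count keyed on a cookie instead of on c-ip.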

     
  • Sebastian Mendel

    Logged In: YES
    user_id=326580

    How about counting and following visitors based on the
    SESSIONID in the URL, or on any other user-defined string in
    the URL?

    Or, more generally, with just a regexp over the log line, so
    that users can easily define for themselves how to identify
    unique visitors?
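
A rough sketch of this idea, in Python rather than the Perl of awstats.pl: a user-supplied regular expression whose first capture group is the visitor key, with a fallback to the client host. The pattern, the function names, and the fallback rule are assumptions made for the example, not an existing AWStats option.

```python
import re

# Hypothetical user-defined pattern: capture group 1 is the visitor key.
SESSION_IN_URL = re.compile(r'[?&;]SESSIONID=([^&\s"]+)', re.IGNORECASE)


def visitor_key(log_line, pattern=SESSION_IN_URL):
    """Return the captured visitor key, or fall back to the client host."""
    match = pattern.search(log_line)
    if match:
        return match.group(1)
    # Fallback: first space-delimited field (the host in common log format).
    return log_line.split(" ", 1)[0]


def count_unique_visitors(lines, pattern=SESSION_IN_URL):
    """Count distinct visitor keys over an iterable of log lines."""
    return len({visitor_key(line, pattern) for line in lines})
```

Two requests from the same IP address with different SESSIONID values are counted as two visitors, while a request without the parameter is still counted by host.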

     
  • Antoine EMERIT

    Antoine EMERIT - 2006-12-01

    Logged In: YES
    user_id=1553726
    Originator: NO

    Hi,

    Since my previous patch, which used IP+UserAgent as the visitor key, I've worked on a new one that uses cookies to track visitors and sessions.

    To use this patch, add the following lines to the .config file:

    LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %extraquot1 %extraquot2"
    VisitorCookie = "idvisiteur"
    SessionCookie = "PHPSESSID"

    where extraquot1 and extraquot2 match the web server cookie fields. In my case, I use the following format in our Apache httpd.conf file:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" \"%{Set-Cookie}o\"" combined_cookie
    CustomLog /var/log/apache/access.log combined_cookie

    In this example we use the PHP auto cookie to follow the session, and an "idvisiteur" cookie generated by the home page (which lives for many days) for user/visitor tracking.

    Note: any page requested without the visitor cookie is ignored for visitor and session tracking.

    We're now testing this patch on our production servers. I think there is more work to do, and there are some limitations:

    • The visitor cookie is mandatory; otherwise AWStats would count too many session changes from proxies (AOL, ...).
    • When matching the creation of a cookie by the web server, Apache only logs the first Set-Cookie (fortunately PHPSESSID is the first in our configuration), so the first page may be badly tracked.
    • If a browser refuses (visitor) cookies, it won't be counted.
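
The tracking rules described above can be sketched roughly as follows. This is an illustration in Python, not the Perl patch itself: `VISIT_TIMEOUT`, the function names, and the record layout are assumptions, and records are expected in chronological order. Session ids are compared as strings.

```python
import re

VISIT_TIMEOUT = 3600  # seconds; assumed value for this sketch


def cookie_value(name, *cookie_strings):
    """Return the value of cookie `name` from the first string that carries it."""
    pattern = re.compile(r"(?:^|;\s*)" + re.escape(name) + r"=([^;]*)")
    for text in cookie_strings:
        match = pattern.search(text or "")
        if match:
            return match.group(1)
    return ""


def count_sessions(records, visitor_cookie="idvisiteur", session_cookie="PHPSESSID"):
    """records: iterable of (timestamp, host, client_cookies, server_cookies)."""
    last_seen = {}     # visitor id -> timestamp of the last counted request
    last_session = {}  # visitor id -> last session cookie value seen
    sessions = 0
    for timestamp, host, client_cookies, server_cookies in records:
        # Prefer the Set-Cookie value, then the Cookie header sent by the client.
        vcookie = cookie_value(visitor_cookie, server_cookies, client_cookies)
        if not vcookie:
            continue  # requests without the visitor cookie are ignored
        visitor_id = host + "+" + vcookie
        scookie = cookie_value(session_cookie, server_cookies, client_cookies)
        previous = last_session.get(visitor_id, "")
        if visitor_id not in last_seen:
            sessions += 1  # first request seen for this visitor
        elif timestamp - last_seen[visitor_id] > VISIT_TIMEOUT:
            sessions += 1  # timeout reached
        elif previous and scookie != previous:
            sessions += 1  # session cookie changed; a blank previous value passes
        last_seen[visitor_id] = timestamp
        last_session[visitor_id] = scookie
    return sessions
```

Because the visitor id combines the host with the visitor cookie, several people behind one proxy address are counted separately, and a request carrying no visitor cookie never opens a session.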

    The patch :

    --- awstats.pl.bak_before_cookies 2006-12-01 13:06:45.000000000 +0100
    +++ awstats.pl 2006-12-01 13:06:56.000000000 +0100
    @@ -61,6 +61,7 @@
    $pos_vh $pos_host $pos_logname $pos_date $pos_tz $pos_method $pos_url $pos_code $pos_size
    $pos_referer $pos_agent $pos_query $pos_gzipin $pos_gzipout $pos_compratio $pos_timetaken
    $pos_cluster $pos_emails $pos_emailr $pos_hostr @pos_extra
    +$pos_client_cookie $pos_server_cookie $client_cookies $server_cookies
    /;
    $DIR=$PROG=$Extension='';
    $Debug = $ShowSteps = 0;
    @@ -145,6 +146,10 @@
    $DecodeUA
    $IncludeUAInVisitors
    $VisitAllHosts
    +$VisitorCookie
    +$SessionCookie
    +$VCookie
    +$SCookie
    /;
    ($DebugMessages, $AllowToUpdateStatsFromBrowser, $EnableLockForUpdate, $DNSLookup, $AllowAccessFromWebToAuthenticatedUsersOnly,
    $BarHeight, $BarWidth, $CreateDirDataIfNotExists, $KeepBackupOfHistoricFiles,
    @@ -156,8 +161,8 @@
    $IncludeInternalLinksInOriginSection,
    $AuthenticatedUsersNotCaseSensitive,
    $Expires, $UpdateStats, $MigrateStats, $URLNotCaseSensitive, $URLWithQuery, $URLReferrerWithQuery,
    -$DecodeUA, $IncludeUAInVisitors, $VisitAllHosts)=
    -(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);
    +$DecodeUA, $IncludeUAInVisitors, $VisitAllHosts, $VisitorCookie, $SessionCookie, $VCookie, $SCookie)=
    +(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,'','');
    use vars qw/
    $DetailedReportsOnNewWindows
    $FirstDayOfWeek $KeyWordsNotSensitive $SaveDatabaseFilesWithPermissionsForEveryone
    @@ -322,6 +327,7 @@
    %_domener_p %_domener_h %_domener_k %_errors_h %_errors_k
    %_filetypes_h %_filetypes_k %_filetypes_gz_in %_filetypes_gz_out
    %_host_p %_host_h %_host_k %_host_l %_host_s %_host_u
    +%_host_sid
    %_waithost_e %_waithost_l %_waithost_s %_waithost_u
    %_keyphrases %_keywords %_os_h %_pagesrefs_p %_pagesrefs_h %_robot_h %_robot_k %_robot_l %_robot_r
    %_worm_h %_worm_k %_worm_l %_login_h %_login_p %_login_k %_login_l %_screensize_h
    @@ -5036,6 +5042,7 @@
    $pos_vh = $pos_host = $pos_logname = $pos_date = $pos_tz = $pos_method = $pos_url = $pos_code = $pos_size = -1;
    $pos_referer = $pos_agent = $pos_query = $pos_gzipin = $pos_gzipout = $pos_compratio = -1;
    $pos_cluster = $pos_emails = $pos_emailr = $pos_hostr = -1;
    + $pos_client_cookie = $pos_server_cookie = -1;
    @pos_extra=();
    @fieldlib=();
    $PerlParsingFormat='';
    @@ -5050,10 +5057,10 @@
    # WebStar: 05/21/00 00:17:31 OK 200 212.242.30.6 Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt) http://www.cover.dk/ "www.cover.dk" :Documentation:graphics:starninelogo.white.gif 1133
    # Squid extended: 12.229.91.170 - - [27/Jun/2002:03:30:50 -0700] "GET http://www.callistocms.com/images/printable.gif HTTP/1.1" 304 354 "-" "Mozilla/5.0 Galeon/1.0.3 (X11; Linux i686; U;) Gecko/0" TCP_REFRESH_HIT:DIRECT
    if ($Debug) { debug("Call To DefinePerlParsingFormat (LogType='$LogType', LogFormat='$LogFormat')"); }
    - if ($LogFormat =~ /^[1-6]$/) { # Pre-defined log format
    + if ($LogFormat =~ /^[1-6]$/) { # Pre-defined log format ;
    if ($LogFormat eq '1' || $LogFormat eq '6') { # Same than "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"".
    # %u (user) is "([^\[]+)" instead of "[^ ]+" because can contain space (Lotus Notes). referer and ua might be "".
    -# $PerlParsingFormat="([^ ]+) [^ ]+ ([^\[]+) \[([^ ]+) [^ ]+\] \\"([^ ]+) (.+) [^\\"]+\\" ([\d|-]+) ([\d|-]+) \\"(.?)\\" \\"([^\\"])\\"";
    +# $PerlParsingFormat="([^ ]+) [^ ]+ ([^\[]+) \[([^ ]+) [^ ]+\] \\"([^ ]+) (.+) [^\\"]+\\" ([\d|-]+) ([\d|-]+) \\"(.?)\\" \\"([^\\"])\\" ";
    $PerlParsingFormat="([^ ]+) [^ ]+ ([^\[]+) \[([^ ]+) [^ ]+\] \\"([^ ]+) ([^ ]+) [^\\"]+\\" ([\d|-]+) ([\d|-]+) \\"(.?)\\" \\"([^\\"])\\"";
    $pos_host=0;$pos_logname=1;$pos_date=2;$pos_method=3;$pos_url=4;$pos_code=5;$pos_size=6;$pos_referer=7;$pos_agent=8;
    @fieldlib=('host','logname','date','method','url','code','size','referer','ua');
    @@ -5273,6 +5280,11 @@
    $pos_extra[$1] = $i; $i++; push @fieldlib, "extra$1";
    $PerlParsingFormat .= "([^$LogSeparatorWithoutStar]+)";
    }
    + # Extra value between " (e.g. cookies string)
    + elsif ($f =~ /%extraquot(\d+)$/) {
    + $pos_extra[$1] = $i; $i++; push @fieldlib, "extra$1";
    + $PerlParsingFormat .= "\\"([^\\"]*)\\"";
    + }
    # Other tag
    elsif ($f =~ /%other$/) {
    $PerlParsingFormat .= "[^$LogSeparatorWithoutStar]+";
    @@ -6170,6 +6182,8 @@

    if ($IncludeUAInVisitors && $Debug) { debug("  Include User Agent in Visistor ID.",1); }
    if ($VisitAllHosts       && $Debug) { debug("  Visit all hosts.",1); }
    + if ($VisitorCookie && $Debug) { debug(" Use cookie '$VisitorCookie' as visitor identity.",1); }
    + if ($SessionCookie && $Debug) { debug(" Use cookie '$SessionCookie' as session identity.",1); }

    if ($EnableLockForUpdate) {
    # Trap signals to remove lock
    @@ -6886,8 +6900,10 @@
    my $UA;
    my $VisitorId;
    $UA = $UserAgent;
    - $UA =~ s/ //g;
    -
    + $UA =~ s/ /
    /g;
    + $VCookie = "";
    + $SCookie = "";
    +
    $VisitorId = $HostResolved;

    if ($IncludeUAInVisitors) {
    @@ -6898,15 +6914,59 @@
    if ($VisitAllHosts) {
    if ($Debug) { debug(" This is a second visit for $VisitorId.",4); }
    }
    +
    + # If we use any cookie get all of them from the log
    + if ($VisitorCookie || $SessionCookie) {
    + $client_cookies = $field[$pos_extra[1]];
    + $server_cookies = $field[$pos_extra[2]];
    + if ($Debug) {
    + debug("Client cookies = $client_cookies", 3);
    + debug("Server cookies = $server_cookies", 3);
    + }
    + }

    - if ($PageBool || $VisitAllHosts) {
    + # If we use a visitor cookie, add this cookie to the visitor id to separate 'real user'
    + if ($VisitorCookie) {
    + # Extract the visitor cookies from the client or server cookies
    + if ($server_cookies =~ /$VisitorCookie=([^;]*)(;|$)/) {
    + $VCookie = $1;
    + } elsif ($client_cookies =~ /$VisitorCookie=([^;]*)(;|$)/) {
    + $VCookie = $1;
    + } else {
    + $VCookie = "";
    + }
    +
    + $VisitorId .= "+" . $VCookie;
    + if ($Debug) { debug(" Include visitor cookie '$VisitorCookie = $VCookie' in Visistor ID",3); }
    + }
    +
    + # If we use a session cookie, extract it now
    + if ($SessionCookie) {
    + # Extract the session cookies from the client or server cookies
    + if ($server_cookies =~ /$SessionCookie=([^;]*)(;|$)/) {
    + $SCookie = $1;
    + } elsif ($client_cookies =~ /$SessionCookie=([^;]*)(;|$)/) {
    + $SCookie = $1;
    + } else {
    + $SCookie = "";
    + }
    + if ($Debug) { debug(" Session cookie : '$SessionCookie = $SCookie'",3); }
    + }
    +
    + # We count this if the Page are accepted (good extention or VisitAllHosts on),
    + # but if we use visitor cookie we exclude non cooked pages
    + if (($PageBool || $VisitAllHosts) && (!$VisitorCookie || $VCookie)) {
    my $timehostl=$_host_l{$VisitorId};
    if ($timehostl) {
    # A visit for this host was already detected
    # TODO everywhere there is $VISITTIMEOUT
    # $timehostl =~ /^\d\d\d\d\d\d(\d\d)/; my $daytimehostl=$1;
    # if ($timerecord > ($timehostl+$VISITTIMEOUT+($dateparts[3]>$daytimehostl?$NEWDAYVISITTIMEOUT:0))) {
    - if ($timerecord > ($timehostl+$VISITTIMEOUT)) {
    + # A new session started if we reach the VISITTIMEOUT delay
    + # or if the session cookie changed for this visitor
    + # However blank session pass with success to avoid first (unique) visit on the web site
    + if ($timerecord > ($timehostl+$VISITTIMEOUT) || ($_host_sid{$VisitorId} && $_host_sid{$VisitorId} != $SCookie)) {
    # This is a second visit or more
    if (! $_waithost_s{$VisitorId}) {
    # This is a second visit or more
    @@ -6969,6 +7029,7 @@
    else {
    # This is a new visit (may be). First new visit found for this host. We save in wait array the entry page to count later
    if ($Debug) { debug(" New session (may be) for $VisitorId. Save in wait array to see later",4); }
    + # If we use a cookie for the session, we revord rhe cookie value insted of the time
    $_waithost_e{$VisitorId}=$field[$pos_url];
    # Save new session properties
    $_host_u{$VisitorId}=$field[$pos_url];
    @@ -6979,6 +7040,7 @@
    }
    $_host_h{$VisitorId}++;
    $_host_k{$VisitorId}+=int($field[$pos_size]);
    + $_host_sid{$VisitorId} = $SCookie;

    # Analyze: Browser - OS
    #----------------------
      
     
