Analyze past log files

2012-10-17
  • John Callahan
    2012-10-17

    I've been using AWStats for a few years. About 6 months ago, I updated AWStats (I think it was to 7.0) and my version of Perl (to 5.14). Unfortunately, I was bitten by this bug: http://sourceforge.net/p/awstats/bugs/868/. The robots, SkipFiles, and OS (and maybe other things I don't know about) were not detected. So now I have 6 months of analysis with incorrect traffic numbers (visits, visitors, bandwidth, etc.).

    Does anyone have a suggestion on what to do? Ideally, now that the problem is fixed in AWStats, I would go back and reanalyze all 6 months. I use Apache 2.2 on Windows and rotate my log files daily. (I actually run awstats.pl hourly to give me up-to-the-hour stats.) So, I have a separate log file for every day. Is there a script or method I can use to help? Any thoughts or suggestions would be appreciated. Thanks.

    • John
     
    • walkoffhomerun
      2012-10-17

      This is no different than running a backup set of log files on a separate server. I had this problem because my hosting company's AWStats was not running correctly, so I downloaded all the daily files onto my own private server, where I have AWStats, GeoIP and Perl set up. When I run my own, I get accurate AWStats output that shows the correct traffic info, which my hosting company simply does not care to fix.

      So, to answer your question, it seems to me there are 2 solutions.

      1) Download the daily log files onto a separate computer and run AWStats on them again to create a separate version. This works well in a case like mine, where my hosting company never seems to want to update anything (like GeoIP). This way you control your own AWStats output and get the most accurate readings.

      2) Maybe for your exact application all you need to do is simply MOVE the daily log files you want to rerun out of their existing folder and into a temporary holding folder. More importantly, you want to do this for the daily files AWStats creates on its own, because it is these smaller files that AWStats reads every time it opens, and it fills in the graphics based on what it reads from them. If you move those files into a separate holding folder and refresh AWStats in your browser, you will see all those days are now gone from your HTML output screen. Do not delete your server log files, because they are created by the server daily and you can never get them back. You are also going to rerun them, as I mention below.
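
      As a rough illustration of the "move into a holding folder" step, here is a minimal Python sketch. The folder names and the file name pattern in the example call are hypothetical placeholders, not something AWStats prescribes; check what your own data folder actually contains before choosing a pattern.

```python
import shutil
from pathlib import Path


def move_matching(src_dir, dst_dir, pattern):
    """Move every file in src_dir whose name matches the glob pattern into dst_dir.

    Nothing is deleted. Returns the sorted list of file names that were moved.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the holding folder if needed

    moved = []
    # sorted() builds the full list first, so moving files does not disturb the scan.
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            shutil.move(str(path), str(dst / path.name))
            moved.append(path.name)
    return moved


# Example call (hypothetical paths and pattern, adjust to your own setup):
# move_matching(r"C:\awstats\data", r"C:\awstats\data_old", "awstats*.txt")
```

      Because the files are only moved, going back to the old output is the same call with the two folders swapped.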

      What I would do to keep AWStats from timing out when you do this is move the server log files back into their correct folder in batches - say 15 to 30 days at a time - because there are limitations on how many lines AWStats can read at one time, and this is set in the single config file - something like 20,000 log file lines. It can also take hours, like mine does, since I have a lot of lines per log file and I run 30 days at a time. Make sure the daily files AWStats creates are also moved out of their folder. Now when you rerun AWStats it will see that the daily files it creates are missing but that there are log files it needs to process. It will run as many log files as it sees, but if you try to run 6 months' worth it will time out and give an error message in the Perl screens saying it exceeded its maximum line limit.
      You will NOT get this error message if you update via the HTML screen. For some reason it won't give a timeout error there, so I do everything from the Perl scripts so I can see error messages. That is why I say to move in only as many log files as you can run at one time. Again, something like 15 to 30 days. AWStats will then recreate all the daily files it reads for the HTML output screen. Since you moved the original ones out into a temp holding folder, when AWStats outputs again via HTML you will be seeing the new, correct output now that you have fixed your other issues.
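
      The batch-by-batch routine described above could be scripted along these lines. This is only a sketch under stated assumptions: the log files are assumed to have names that sort in date order (AWStats expects records in chronological order), the `*.log` pattern and the batch size of 30 are placeholders, and `run_update` is a hypothetical callback that you supply to start the AWStats update from the command line.

```python
import shutil
from pathlib import Path


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def reprocess_in_batches(holding_dir, log_dir, run_update,
                         pattern="*.log", batch_size=30):
    """Move log files from holding_dir back into log_dir one batch at a time.

    After each batch is moved, run_update(batch) is called with the list of
    file names in that batch. If run_update raises, the loop stops and the
    remaining files stay in holding_dir. Returns the batches that completed.
    """
    holding = Path(holding_dir)
    logs = Path(log_dir)
    logs.mkdir(parents=True, exist_ok=True)

    # Assumption: sorting by name gives chronological order (e.g. dated names).
    names = sorted(p.name for p in holding.glob(pattern) if p.is_file())

    done = []
    for batch in chunked(names, batch_size):
        for name in batch:
            shutil.move(str(holding / name), str(logs / name))
        run_update(batch)  # e.g. start the AWStats update for this batch
        done.append(batch)
    return done
```

      A `run_update` callback might, for example, use `subprocess.run(["perl", "awstats.pl", "-config=mysite", "-update"], check=True)`, where `mysite` is a made-up config name. With `check=True` a failed run raises an exception, so the loop stops before the next batch is moved in and you can see the error message.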

      Note: you will now have 2 sets of daily output files that AWStats has created. The old ones, which you say have incorrect info due to setup issues, should be in a holding folder. The new ones were just created when you reran AWStats after you made your config corrections. These new files will be in their correct folder, and you can always go back to the old daily files by moving them back into their correct folder, just in case the corrections you made were not really correct. Never delete any of the daily files until you are sure everything is working correctly. And of course never delete any of your server log files at all, as you can never get those back.

      Again, this is the exact same process as running AWStats on a different machine using log files from, say, your hosting company. This is what I do for my own personal copies, where I keep GeoIP up to date and my hosting company does not. I simply transfer 30 days of server log files at a time to my own server, run the Perl scripts, then move another batch of log files in.

       
      Last edit: walkoffhomerun 2012-10-17