Website managers need a clear idea of how much traffic their site receives each day. While many log file analyzers are available on the Internet (many at no cost), they typically don’t combine solid statistical analysis with a polished layout and good graphs of that data. AWStats, SourceForge.net’s June 2004 Project of the Month, is a multiplatform, open source log analyzer that presents all the statistics a webmaster needs in a very professional, dynamic report. It only takes a quick glance at the project’s online demo to see why so many web professionals are downloading and deploying this software. AWStats has been on SF.net since October 2000 and has had 400,000 downloads since its inception. The SF.net team is proud to make AWStats SourceForge.net’s June 2004 Project of the Month.
Description of project:
AWStats is short for Advanced Web Statistics. It’s a free, powerful, and feature-rich tool that generates advanced Web (and FTP and email) server statistics, graphically. This log analyzer works as a CGI script or from the command line and shows you all the information your log contains in a few graphical Web pages. It uses a partial information data file so that it can process large log files frequently and quickly. It can analyze log files from nearly all Web and mail servers and some FTP servers. You can use AWStats as you like: real-time, dynamic, or static pages; one report for several sites or several reports for the same site; as a CGI script or from the command line; on the distant server or locally; etc.
AWStats is released under the GPL license. It’s a Perl tool so it works on nearly all operating systems and hardware.
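The description above mentions a partial information data file that lets large logs be processed frequently and quickly. The following is a minimal Python sketch of that general idea (remember how far the log was read, and parse only new lines on the next run). It is not AWStats’ actual algorithm or data file format: the function name `process_new_lines`, the JSON state layout, and the assumption that the log is in Common Log Format are all illustrative choices.

```python
import json
import os


def process_new_lines(log_path, state_path):
    """Count hits per URL, reading only lines added since the last run."""
    # Load the saved state (hypothetical JSON layout, not AWStats' own format).
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as f:
            state = json.load(f)
    else:
        state = {"offset": 0, "hits": {}}

    # If the log is shorter than the saved offset (e.g. it was rotated),
    # start reading again from the beginning and keep the existing counts.
    if os.path.getsize(log_path) < state["offset"]:
        state["offset"] = 0

    with open(log_path, "rb") as log:
        log.seek(state["offset"])
        for raw in log:
            if not raw.endswith(b"\n"):
                break  # incomplete last line; pick it up on the next run
            fields = raw.decode("utf-8", errors="replace").split()
            # Assumed Common Log Format: the request path is the 7th field.
            if len(fields) >= 7:
                url = fields[6]
                state["hits"][url] = state["hits"].get(url, 0) + 1
            state["offset"] += len(raw)

    # Persist the totals and the new offset for the next run.
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    return state["hits"]
```

Calling the function repeatedly on a growing log only parses the lines appended since the previous call, which is what makes frequent updates cheap even when the log itself is large.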
Why and how did you get started?
I wrote the first code for AWStats to provide better-looking statistics for my company’s Web site than I could get from Webalizer or Analog. After I learned how a log analyzer works, I noticed that values reported by the older-generation tools appeared not to be accurate, so I decided to include my own algorithm to reduce the error rate. Seeing that a lot of projects were hosted by SourceForge.net, I decided to add my tool to the SourceForge.net directory, just for fun. The growing success of AWStats (feedback by email) has helped me to enhance AWStats version after version.
When a major French media company became interested in a more efficient tool for their own use, they asked me to rewrite the code (3.x and 4.x series) to support their high visitor numbers and to add plugins. I rewrote it again for the 5.x series to support the largest Web sites and to clean up the code. I also added support for FTP and mail log files.
At the beginning of this year I started the 6.x series, which adds more features and plugins and improves AWStats’ speed, accuracy, and flexibility.
What is the intended audience?
All administrators of mail, FTP, and Web servers. AWStats can also build personalized reports useful for marketing services.
How many people do you believe are using your software?
When I chose the project name in 2000, I checked Google and got no results for “AWStats.” Before the 5.x series, the software’s generated statistics pages could be indexed by Google if they were online. A search on “AWStats” in 2002 returned 200,000 hits; most of these were users’ online statistics reports. In the two years since, AWStats has prevented Google from indexing AWStats report pages, so now the 400,000 hits Google returns are for Web sites speaking about AWStats and not online reports. However, there are five times as many downloads and hits on our official Web sites this year as in 2002, so I imagine there are about five times as many users. That means as many as a million Web sites might be analyzed by AWStats.
What are a couple of notable examples of how people are using your software?
Most Web hosting providers offer AWStats statistics to their customers.
What gave you an indication that your project was becoming successful?
For me, the number of results on Google and the amount of positive feedback by email were good indicators.
What has been your biggest surprise?
I was very pleased to see AWStats packages available for most Linux distributions.
What has been your biggest challenge?
Rewriting the code for the 5.x series was a big headache. The 4.x algorithm used too much memory and CPU time for very large Web sites (several million visitors a month). I think the 5.x algorithm, which is also used in the 6.x series, should last for a long time, since it is very efficient whatever the log size or Web site size, and it is stable code that’s easy to change.
Why do you think your project has been so well received?
AWStats is easy to use and configure. Statistics are accurate and clearly reported. There are a lot of different ways to use it that fit each user’s needs.
Where do you see your project going?
It seems that more and more users are replacing their old Webalizer, Analog, and commercial products with AWStats. I hope this will continue!
What’s on your project wish list?
A new direction for AWStats is to build statistics as XML files. Reports can already be built as XHTML files, but the AWStats database is still stored as a text file with indexes for different sections. A database in XML format could be used to build reports with third-party tools, XSLT, Cocoon, etc. Storing the AWStats database in a relational DBMS is also a future step.
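To illustrate what moving from a sectioned text file to XML could look like, here is a small Python sketch. It assumes a simplified, hypothetical layout in which each section is wrapped in `BEGIN_NAME` and `END_NAME` lines and holds whitespace-separated records. The real AWStats data file has more structure than this, and the function name `sections_to_xml` and the element names `stats`, `section`, and `row` are invented for the example.

```python
import xml.etree.ElementTree as ET


def sections_to_xml(text):
    """Convert a sectioned text database (assumed layout) to an XML string."""
    root = ET.Element("stats")
    current = None  # the <section> element being filled, if any
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0].startswith("BEGIN_"):
            # Open a new section named after the marker, e.g. BEGIN_DAY -> "day".
            name = parts[0][len("BEGIN_"):].lower()
            current = ET.SubElement(root, "section", name=name)
        elif parts[0].startswith("END_"):
            current = None
        elif current is not None:
            # First field is the record key; the rest is kept as the row text.
            row = ET.SubElement(current, "row", key=parts[0])
            row.text = " ".join(parts[1:])
    return ET.tostring(root, encoding="unicode")


sample = """BEGIN_DAY 2
20040601 120
20040602 95
END_DAY
"""
xml_text = sections_to_xml(sample)
```

Once the data is in XML, third-party tools such as an XSLT processor could turn `xml_text` into a report without knowing anything about the original text layout.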
What are you most proud of?
Offering more than commercial products, while keeping AWStats open and free.
If you could change something about the project, what would it be?
I don’t know, but I know what I will never change: Developing AWStats in Perl was the best idea I had, even though I didn’t know Perl at the time. It allows me to add features or change algorithms completely in a short time. Development of AWStats would probably have stopped at the 2.x version (when it was not popular) if it had been developed in C or Java.
How do you coordinate the project?
AWStats is mainly written by one contributor, so I make no particular task or bug assignments. Some developers help me by answering support requests, or by sending me “surprise” contributions by email. Translations (39 languages) are also sent to me spontaneously by email. I use SourceForge.net to centralize often-requested feature enhancements, bugs, and the tasks I plan to do. For regression testing I built my own test samples and tools (in which language? Perl of course!).
Do you work on the project full-time, or do you have another job?
I have a full-time job as a computer engineer in Paris.
If you work on the project part-time, how much time would you say you spend, per week, on it?
When I had a lot of time, I spent 16 hours a week on AWStats. Now I spend about four hours a week. That’s not a lot, but don’t forget, AWStats is in Perl, in which you can develop a new feature in one hour, when five or 10 would be required in another language.
What is your development environment like?
A simple text editor is enough to build Perl tools.
I add features but with no planning. A new version comes when there are enough new features to justify it. In most cases, there is a new version every three months with two to five new major features plus a lot of minor features.
How can others contribute?
The most important feature I want to add that I have no time to build is a Java applet to report Web visitor countries in a colored graphical world map. Some code for this already exists in the GeoTools project, but sample code is for standalone Java applications (and I need an applet) and the data used to color the map is read from a DBF file. I need an applet that reads values from the Java applet parameters. If you can provide such an applet, I’ll pay a bounty of 20 euros (via PayPal)!