Hello!
I'm using sarg-2.3.4 on CentOS-6.6 x86-64. Logs are rotated every day. I'm trying to make a report with useragent information.
Case 1.
The useragent_log option is not set in the config file.
Command
sarg -x -o /var/www/html/sarg/daily -d day-1 -l /var/log/squid/access.log-$(date +%Y%m%d).gz -b /var/log/squid/useragent.log-$(date +%Y%m%d).gz
Output:
SARG: Deleting temporary directory "/tmp/sarg"
SARG: Parameters:
SARG: Hostname or IP address (-a) =
SARG: Useragent log (-b) = /var/log/squid/useragent.log-20150525.gz
SARG: Exclude file (-c) =
SARG: Date from-until (-d) = 24/05/2015-24/05/2015
SARG: Email address to send reports (-e) =
SARG: Config file (-f) = /etc/sarg/sarg.conf
SARG: Date format (-g) = USA (mm/dd/yyyy)
SARG: IP report (-i) = No
SARG: Keep temporary files (-k) = No
SARG: Input log (-l) = /var/log/squid/access.log-20150525.gz
SARG: Resolve IP Address (-n) = Yes
SARG: Output dir (-o) = /var/www/html/sarg/daily/
SARG: Use Ip Address instead of userid (-p) = No
SARG: Accessed site (-s) =
SARG: Time (-t) =
SARG: User (-u) =
SARG: Temporary dir (-w) = /tmp/sarg
SARG: Debug messages (-x) = Yes
SARG: Process messages (-z) = No
SARG: Previous reports to keep (--lastlog) = 20
SARG:
SARG: sarg version: 2.3.4 Jan-05-2013
SARG: Decompressing log file "/var/log/squid/access.log-20150525.gz" with zcat
SARG: Reading access log file: /var/log/squid/access.log-20150525.gz
SARG: Records read: 781456, written: 401915, excluded: 373415
SARG: Squid log format
SARG: Period covered by log files: 24/05/2015-24/05/2015
SARG: Period: 2015 Май 24
SARG: Sorting log /tmp/sarg/iivanov.user_unsort
SARG: Making file: /tmp/sarg/iivanov
SARG: Sorting log /tmp/sarg/apetrov.user_unsort
SARG: Making file: /tmp/sarg/apetrov
...............
SARG: Sorting file: /tmp/sarg/192_168_1_141.utmp
SARG: Making report: 192.168.1.141
SARG: Sorting file: /tmp/sarg/192_168_1_164.utmp
SARG: Making report: 192.168.1.164
SARG: Making index.html
SARG: Purging temporary file sarg-general
SARG: End
and there is no useragent information in yesterday's statistics.
Case 2.
The useragent_log option is set in the config file:
useragent_log /var/log/squid/useragent.log
Command
sarg -x -o /var/www/html/sarg/daily -d day-1 -l /var/log/squid/access.log-$(date +%Y%m%d).gz -b /var/log/squid/useragent.log-$(date +%Y%m%d).gz
Output
SARG: Parameters:
SARG: Hostname or IP address (-a) =
SARG: Useragent log (-b) = **/var/log/squid/useragent.log-20150525.gz**
SARG: Exclude file (-c) =
SARG: Date from-until (-d) = 24/05/2015-24/05/2015
SARG: Email address to send reports (-e) =
SARG: Config file (-f) = /etc/sarg/sarg.conf
SARG: Date format (-g) = USA (mm/dd/yyyy)
SARG: IP report (-i) = No
SARG: Keep temporary files (-k) = No
SARG: Input log (-l) = /var/log/squid/access.log-20150525.gz
SARG: Resolve IP Address (-n) = Yes
SARG: Output dir (-o) = /var/www/html/sarg/daily/
SARG: Use Ip Address instead of userid (-p) = No
SARG: Accessed site (-s) =
SARG: Time (-t) =
SARG: User (-u) =
SARG: Temporary dir (-w) = /tmp/sarg
SARG: Debug messages (-x) = Yes
SARG: Process messages (-z) = No
SARG: Previous reports to keep (--lastlog) = 20
SARG:
SARG: sarg version: 2.3.4 Jan-05-2013
SARG: Decompressing log file "/var/log/squid/access.log-20150525.gz" with zcat
SARG: Reading access log file: /var/log/squid/access.log-20150525.gz
SARG: Records read: 781456, written: 401915, excluded: 373415
SARG: Squid log format
SARG: Period covered by log files: 24/05/2015-24/05/2015
SARG: Period: 2015 Май 24
**SARG: Reading useragent log: /var/log/squid/useragent.log**
**SARG: Records read: 455840**
**SARG: Sorting file: /tmp/sarg/squagent.int_log**
**SARG: Making Useragent report**
SARG: Sorting log /tmp/sarg/iivanov.user_unsort
SARG: Making file: /tmp/sarg/iivanov
SARG: Sorting log /tmp/sarg/apetrov.user_unsort
SARG: Making file: /tmp/sarg/apetrov
...............
SARG: Sorting file: /tmp/sarg/192_168_1_141.utmp
SARG: Making report: 192.168.1.141
SARG: Sorting file: /tmp/sarg/192_168_1_164.utmp
SARG: Making report: 192.168.1.164
SARG: Making index.html
SARG: Purging temporary file sarg-general
SARG: End
Sarg seems to ignore the -b option. I can get useragent information only when the useragent_log option is enabled in the config file and points to a specific file (no '*' can be used). One way to get useragent information is to turn off rotation of /var/log/squid/useragent.log, but that file can grow very large, which will slow down report generation.
Is it a bug?
Thanks for reporting this bug. It has been around since the beginning of the commit log. I guess it isn't a much used feature :-)
The patch is one line long [4b64c31617179fefa5452ecfef4deb418c83b03d] in case you want to backport it.
Related
Commit: [4b64c3]
There is one more thing with this parameter.
If I run
I get this output:
Okay. Now if I run
the output looks like this:
It seems that only one file is assigned to the useragent log (-b) and the others are treated as access logs (-l). The report then breaks with this error:
It can't be avoided. The getopt library (responsible for parsing the command line options as required by the POSIX standard) reads only one file name after an option requiring a file name. Every additional file name is taken as a non-option.
In this case, the shell where you type the command replaces "/var/log/squid/access.log" and "/var/log/squid/useragent.log" with the matching file names before sarg is even started. The result is that sarg actually sees the call like this (assuming only two access.log and two useragent.log files to shorten the example):
When parsed by getopt, the options are returned in that order:
Because getopt reorders the arguments (GNU getopt moves non-options to the end by default), the non-option files end up at the end, and it is no longer possible to tell that /var/log/squid/useragent.log-20150521.gz was meant for the -b option.
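The reordering can be reproduced outside sarg with the getopt(1) utility from util-linux, which relies on the same GNU getopt behaviour. This is only a sketch: the option string `l:b:` is a cut-down stand-in for sarg's real option list, `show_parse` is a made-up helper name, and the file names are placeholders rather than the ones from the original command.

```shell
#!/bin/sh
# Sketch only: "l:b:" is a reduced stand-in for sarg's option list and the
# file names are placeholders. Needs the util-linux getopt(1).
show_parse() {
    # -o declares -l and -b as options that each take exactly one argument.
    getopt -o l:b: -- "$@"
}

# Roughly what a program receives after the shell has expanded two wildcards.
show_parse -l access.log-1 access.log-2 -b useragent.log-1 useragent.log-2
```

In the printed result, everything after `--` is a non-option. access.log-2 and useragent.log-2 both land there, and nothing in that list records that the second one originally followed -b.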
Sarg 2.2 and earlier would have ignored the two lone file names but, to make it easier to process rotated access.log files, sarg 2.3 and later accept a file name without an option as an alias for -l.
You can therefore write a much simpler cron job simply calling
It automatically takes every rotated access.log file into account.
So I can specify only one file with the -b parameter.
Can it be a .gz file? Recently I got an error:
Should it be unpacked?
Last edit: Evgeniy Yakushev 2015-05-26
Unfortunately, the answer is yes to all the questions.
The useragent report is a very old feature that was completely overlooked after its initial development. It never benefited from the improvements made to the access.log processing.
There can only be one useragent log on the command line or in sarg.conf. The command line takes precedence over sarg.conf.
The file cannot be compressed.
I'll try to improve that with the next version. I'll leave this bug open as a reminder that the useragent log feature is lacking.
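Until compressed useragent logs are supported, one possible workaround is to unpack the rotated file before handing it to -b. The sketch below assumes a sarg build where -b is honoured (the one-line fix mentioned earlier in this thread); `unpack_useragent_log` is a hypothetical helper name, not something shipped with sarg.

```shell
#!/bin/sh
# Sketch only: unpack_useragent_log is a hypothetical helper.
# It decompresses a gzipped log into a temporary file and prints that path.
unpack_useragent_log() {
    plain=$(mktemp) || return 1
    if gzip -dc "$1" > "$plain"; then
        echo "$plain"
    else
        rm -f "$plain"      # do not leave a partial file behind
        return 1
    fi
}

# Example use, left commented out because it needs sarg and a real log:
# plain=$(unpack_useragent_log /var/log/squid/useragent.log-20150525.gz) &&
#     sarg -d day-1 -b "$plain"
# rm -f "$plain"
```

The temporary copy only lives for the duration of the report, so log rotation and compression can stay enabled.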
I uploaded patch [137eb6] to the master branch. With this change, sarg accepts several user agent log files.
Both command line option -b and configuration option useragent_log can be repeated as many times as necessary (-b takes precedence over useragent_log).
It is still not possible to use wildcards or shell globbing in the file name.
Compressed files are not yet supported.
Unfortunately, I have only one useragent.log file at my disposal. That's not enough to test this feature. If someone can test it for me, please report any success or failure!
Related
Commit: [137eb6]
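Since a wildcard after -b still does not work, a wrapper can expand the pattern itself and emit one -b per file. This is a rough sketch for a build that includes the patch above: `sarg_useragent_cmd` is a hypothetical helper, and the directory and the `useragent.log-*` naming are assumptions about the local log rotation. It only prints the command so it can be checked before being run.

```shell
#!/bin/sh
# Sketch only: sarg_useragent_cmd is a hypothetical helper, and the directory
# and "useragent.log-*" naming are assumptions about the local log rotation.
sarg_useragent_cmd() {
    dir=$1
    set --                              # start with an empty argument list
    for f in "$dir"/useragent.log-*; do
        [ -e "$f" ] || continue         # the pattern matched nothing
        case $f in
            *.gz|*.bz2) continue ;;     # compressed logs are not supported yet
        esac
        set -- "$@" -b "$f"             # one -b per file
    done
    # Dry run: print the command instead of running it.
    echo sarg -d day-1 "$@"
}

sarg_useragent_cmd /var/log/squid
```

Once the printed command looks right, the `echo` can be dropped so that sarg is actually started.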
I've tested. Works fine, thank you!
Is it possible to add support for compressed files?
I'm working on it in my spare time. I may have it ready by next week or, at least, before July 2015 :-)
That's great! Waiting for a new release!
Could you also add an option to strip the domain name from the names of users authenticated by Kerberos? Is that possible?
I just committed a set of changes to read gzipped and bzipped useragent files.
I completely rewrote the decompression functions. I dropped the old and outdated Z "compress" algorithm.
To benefit from the gz and bz2 code, the zlib and bzlib development packages must be installed on the system where sarg is built. If one of them is missing, the configure script will disable the corresponding decompression functions.
The big advantage is that sarg doesn't need zcat and bzcat to be in the path. It will handle the compressed log file itself.