I am trying to set up apache2 so that the access log can be piped directly into Scribe. However, I have had no luck with the CustomLog directive.
CustomLog "| /home/james/sandbox/scribe-2.0/examples/scribe_cat test" combined
That is what I have in my httpd.conf file. When I replace that line with one like the following,
CustomLog "| cat > simpletest" combined
simpletest contains the running Apache log. I am using this as a baseline to show that piping does work in my current Apache setup.
Can you advise on how we can use scribe_cat to log our Apache access logs?
I'm fairly certain that there is an issue with Apache writing to scribe_cat and that scribe_cat isn't flushing the buffers out to
scribe_cat is a tiny test program, so I wouldn't put too much faith in using it in production services.
I'm in the process of writing mod_log_scribe to log to Scribe from apache2, but it'll be a while before I finish that, as I'm not a Scribe developer and am new to the Facebook code base.
Perhaps it would be nice to share how Facebook writes to Scribe in production services.
I would suggest using "logger" instead of scribe_cat, assuming you already have syslog on the client machine configured to send all logs to the Scribe server. Logger ships with most Linux distros and was designed for this sort of thing:
CustomLog "|/bin/logger -tapache" combined
Also realize this will NOT log Apache errors to syslog; you need another Apache directive for that:
ErrorLog syslog
At Facebook, we just log all Apache errors locally to a file. Each machine is also running Scribe locally with Scribe configured to forward messages to a central location. We then run a simple script on each machine that tails the Apache log and writes the data to Scribe.
Scribe_cat is a simple example of sending a test message to Scribe. It may not do exactly what you need, but you should just use it as an example of how to call Scribe. You’ll notice that scribe_cat calls Python’s sys.stdin.read(). This will read until an EOF is reached. I’m not an expert at httpd logging configuration, but I’m guessing changing the read() to a readline() might work better when using CustomLog.
Also, I am assuming you’ve already tested Scribe by itself by running something like “echo ‘hello’ | scribe_cat test” or following the examples in the examples directory.
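To illustrate the read() versus readline() point: on a pipe that Apache keeps open, read() does not return until the writer closes the pipe, whereas readline() returns after each newline. Here is a minimal sketch of a line-by-line loop. The `send` callable is a hypothetical stand-in for whatever delivers a message to Scribe (it is not part of scribe_cat), and an in-memory stream takes the place of sys.stdin.

```python
import io


def forward_lines(stream, send):
    """Read `stream` one line at a time and hand each line to `send`.

    `send` is a hypothetical stand-in for the call that delivers a
    message to Scribe; it is not part of scribe_cat.
    Returns the number of lines forwarded.
    """
    count = 0
    while True:
        line = stream.readline()
        if not line:
            # readline() returns '' only at EOF; a blank line is '\n'.
            break
        send(line.rstrip('\r\n'))
        count += 1
    return count


# Usage with an in-memory stream in place of sys.stdin:
sent = []
forwarded = forward_lines(io.StringIO("GET /a\nGET /b\n"), sent.append)
```

Note that the EOF check happens before the newline is stripped, so an empty line in the input does not end the loop.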
I wanted to know how you keep tailing, because after a specific file size or interval the *_current symlink changes to point at a different file.
I've made a script which monitors the "scribe_stats" file and keeps track of all the 'rotated' files. All these rotated files are then processed into the DB.
I'd like to see what others are doing to keep track of this.
The unix command "tail --follow=name" current should do what you want. It will follow the current symlink and reopen the new log file when it gets rotated.
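For anyone who wants the same follow-by-name behaviour inside a script, here is a rough sketch of the idea: keep the file open, and reopen it whenever the path starts pointing at a different inode. `NameFollower` is a hypothetical helper written for this post, not part of Scribe or of tail, and it assumes a Unix-like filesystem. It does not deal with half-written lines.

```python
import os


class NameFollower:
    """Hypothetical helper: follow a path by name, like tail --follow=name."""

    def __init__(self, path):
        self.path = path
        self.fh = None
        self.inode = None

    def _open(self):
        self.fh = open(self.path, "r")
        self.inode = os.fstat(self.fh.fileno()).st_ino

    def poll(self):
        """Return the lines that appeared since the previous call."""
        if self.fh is None:
            self._open()
        # Drain whatever is left in the file we currently hold open.
        lines = self.fh.readlines()
        try:
            # os.stat follows symlinks, so a re-pointed link or a rotated
            # file shows up as a different inode than the one we hold.
            current = os.stat(self.path).st_ino
        except OSError:
            current = self.inode  # path briefly missing during rotation
        if current != self.inode:
            self.fh.close()
            self._open()
            lines.extend(self.fh.readlines())
        return [line.rstrip("\n") for line in lines]
```

In practice you would call poll() in a loop with a short sleep and pass each returned line on to Scribe.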
We use Ruby's File::Tail (http://file-tail.rubyforge.org/doc/classes/File/Tail.html).
Perl also has a similar CPAN module: http://search.cpan.org/dist/File-Tail/
There is also "logtail.c", which tails the file and also keeps a history of the last position that it read: http://www.drxyzzy.org/ntlog/logtail.c
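The "remember the last position" idea can be sketched in a few lines of Python. This is only an illustration of the approach, not a port of logtail.c (I have not checked how it works internally). The sidecar offset file and the function name `read_new_lines` are made up for the example.

```python
import os


def read_new_lines(log_path, offset_path):
    """Return complete lines added to log_path since the last call.

    The byte offset reached so far is stored in offset_path, a
    hypothetical sidecar file used only for this example.
    """
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except (IOError, ValueError):
        offset = 0  # no usable checkpoint yet: start from the beginning

    if os.path.getsize(log_path) < offset:
        # File is shorter than our checkpoint: assume rotation or truncation.
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Consume only up to the last newline so a half-written line is
    # picked up on a later call instead of being split in two.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", "replace").splitlines()

    with open(offset_path, "w") as f:
        f.write(str(offset + end))
    return lines
```

Run from cron or a loop, this survives restarts of the tailing script, because the position lives on disk rather than in memory. It only detects rotation when the new file is shorter than the saved offset, so treat that check as a simplification.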
So, this works for me. It is a modified version of scribe_cat that handles talking to scribe and newlines correctly.
In your apache.conf, add:
CustomLog "|/your-scribe-cat-location/scribe_cat -h localhost:1463 apache" combined
And save this as scribe_cat:

    #!/usr/bin/python
    '''scribe_cat: A simple script for sending messages to scribe.'''

    import sys
    from scribe import scribe
    from thrift.transport import TTransport, TSocket
    from thrift.protocol import TBinaryProtocol

    # Arguments: scribe_cat [-h host[:port]] category
    if len(sys.argv) == 2:
        category = sys.argv[1]
        host = '127.0.0.1'
        port = 1463
    elif len(sys.argv) == 4 and sys.argv[1] == '-h':
        category = sys.argv[3]
        host_port = sys.argv[2].split(':')
        host = host_port[0]
        if len(host_port) > 1:
            port = int(host_port[1])
        else:
            port = 1463
    else:
        sys.exit('usage (message is stdin): scribe_cat [-h host[:port]] category')

    socket = TSocket.TSocket(host=host, port=port)
    transport = TTransport.TFramedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(trans=transport, strictRead=False, strictWrite=False)
    client = scribe.Client(iprot=protocol, oprot=protocol)

    transport.open()
    # Start from OK so the check below still works if stdin was empty.
    result = scribe.ResultCode.OK
    while 1:
        # readline() returns after every line, so each entry is sent as
        # soon as Apache writes it instead of waiting for EOF.
        message = sys.stdin.readline()
        if not message:
            break  # '' means EOF; test before stripping so blank lines don't end the loop
        message = message.rstrip('\r\n')
        log_entry = scribe.LogEntry(dict(category=category, message=message))
        result = client.Log(messages=[log_entry])
    transport.close()

    # Only the result of the last Log call is reported.
    if result == scribe.ResultCode.OK:
        sys.exit()
    elif result == scribe.ResultCode.TRY_LATER:
        sys.stderr.write("TRY_LATER\n")
        sys.exit(84)  # 'T'
    else:
        sys.exit("Unknown error code.")
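One thing the script above does not do is react to TRY_LATER while the loop is still running; it only looks at the last result once stdin closes. Here is a rough sketch of how a per-message retry with backoff could look. Everything in it is an assumption made for illustration: `send` is a hypothetical wrapper around client.Log for a single entry, and the `OK` / `TRY_LATER` constants are placeholders standing in for scribe.ResultCode so the example runs without the Scribe bindings.

```python
import time

# Placeholder values standing in for scribe.ResultCode (assumption).
OK, TRY_LATER = 0, 1


def send_with_retry(send, message, attempts=3, delay=0.5, sleep=time.sleep):
    """Call send(message) until it returns OK or the attempts run out.

    `send` is a hypothetical wrapper around client.Log for one entry.
    Returns True if the message was accepted, False otherwise.
    """
    for attempt in range(attempts):
        if send(message) == OK:
            return True
        if attempt < attempts - 1:
            # Back off before retrying: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
    return False
```

With the real client, `send` could be something like `lambda m: client.Log(messages=[scribe.LogEntry(dict(category=category, message=m))])`, and the constants would be replaced by scribe.ResultCode.OK and scribe.ResultCode.TRY_LATER. What to do with a message that is still rejected after the last attempt (drop it, spool it to disk) is left open here.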
Similarly with Perl, using scribe_cat.pl from Log::Dispatch::Scribe ( http://search.cpan.org/perldoc?scribe_cat.pl ):
CustomLog "|/usr/local/bin/scribe_cat.pl --category=apache" combined
I created a simple Python script to tail an Apache log and pipe the results into Scribe. I use Supervisor to ensure the pipe is always running.
http://www.silassewell.com/blog/2009/05/12/pipe-apache-or-any-logs-to-scribe/