Hi,
I found a bug with the log4cpp library in my software, occurring only on OL7 (Oracle Linux 7): the file appender's fd is closed immediately after my service starts. I created this patch, which works against 1.0. I haven't found anything addressing this issue in the latest RCs, so I would like to share the patch with you and hope it can be included officially in the next release.
Thanks,
Leo
$ diff FileAppender.cpp FileAppender.cpp.patch
80,83c80,92
<     std::string message(_getLayout().format(event));
<     if (!::write(_fd, message.data(), message.length())) {
<         // XXX help! help!
<     }
---
>     std::string message(_getLayout().format(event));
>
>     int sizeWrite = ::write(_fd, message.data(), message.length());
>
>     if (sizeWrite < 0) {
>         if (errno == EBADF) { // the fd is closed
>             reopen(); // reopen the file
>             sizeWrite = ::write(_fd, message.data(), message.length()); // retry the log; if it fails again, ignore it
>
>             if (sizeWrite < 0) {
>             }
>         }
>     }
The reason I used sizeWrite to store the write result is simply that I was using SYSLOG for debugging.
Last edit: Leo Liu 2017-03-24
Hello Leo,
Can I ask how you determined that the file appender fd gets closed immediately after the service starts?
Alex.
Last edit: Alexander Perepelkin 2017-04-08
Hi Alex,
The symptom showed up when we deployed the service on Oracle Linux 7: when the service was restarted, the log file stopped updating. I checked the process's open file descriptors with the "lsof" utility, and the descriptor for the log file had disappeared from the list.
By checking for errno == EBADF (bad file descriptor) when a write fails, we can reopen the file and make one extra attempt.
Just a quick fix that works for me.
Leo
Leo,
Could it be that, after the restart, the service tried to write to the log file before the file had been opened?
Could you check that the log file was opened at least once before those write attempts?
Another possible concern: if for some reason the file keeps getting closed, and this patch reopens it on every write attempt, it could seriously impact the system's performance.
Maybe the file gets closed just before the write for some reason?
Alex.
Alex,
Yes, I can confirm the file is opened, because one log line is written before the fd is closed. Maybe it is because the process is being forked.
Regarding the performance impact: if you take a closer look at the patch, the pre-patch code already checks the result of the write, but its comment admits it does not know what to do with a failure:
<     if (!::write(_fd, message.data(), message.length())) {
<         // XXX help! help!
<     }
We just added one more condition: if the check fails because the file descriptor has been closed, we reopen it. So there is no extra processing unless we are already in the failure path.
We use the library in carrier-grade service software, and we have verified that this patch does not affect its performance.
Thanks,
Leo
Last edit: Leo Liu 2017-06-12
Leo,
Could you check which of the two processes writes that particular line to the log? It might help you analyze why the file is being closed.
As a general consideration, it is preferable not to write to a single log file from two separate processes; the lines could get interleaved.
The possible performance impact here is more than one added check. If the file cannot be written for some persistent reason, the library will continuously try to reopen it and attempt the write again. That is a sequence of not-so-cheap system calls, and they will occur every time the application attempts to make a log record, which may degrade the application's performance.
Alex.
Alex,
Sorry, I was occupied with a release effort for a while.
Basically, there are two things I wanted to point out.
Thanks,
Leo