Menu

#46 FileAppender handles file descriptor closed

open
nobody
None
5
2017-10-03
2017-03-24
Leo Liu
No

Hi,

I found this bug with log4cpp lib in my software running on OL7 only (Oracle Linux 7) that the file appender fd is closed immediately after my service started. I created this patch that works on 1.0. I still didn't find anything addressing this issue in the latest RCs, so I would like to share this patch with you and hope to get it officially from the next release.

Thanks,
Leo

$diff FileAppender.cpp FileAppender.cpp.patch
80,83c80,92
<         std::string message(_getLayout().format(event));
<         if (!::write(_fd, message.data(), message.length())) {
<             // XXX help! help!
<         }
---
>       std::string message(_getLayout().format(event));
>
>       int sizeWrite = ::write(_fd, message.data(), message.length());
>
>       if (sizeWrite < 0) {
>       if (errno == EBADF) { // the fd is closed
>         reopen(); // reopen the file
>         sizeWrite = ::write(_fd, message.data(), message.length()); // retry the log, if failed again, ignore it
>
>         if (sizeWrite < 0) {
>         }
>       }
>       }
1 Attachments

Discussion

  • Leo Liu

    Leo Liu - 2017-03-24

    The reason why I used sizeWrite to store the write result simply because I was using SYSLOG for debugging:

          int sizeWrite = ::write(_fd, message.data(), message.length());
    #ifdef SYSLOG_ENABLED
          setlogmask(LOG_UPTO(LOG_NOTICE));
          openlog("log4cpp", LOG_CONS|LOG_PID|LOG_NDELAY, LOG_LOCAL1);
          //syslog(LOG_NOTICE, "fd(%d) filename(%s) Write %d bytes!!!!\n", _fd, _fileName.c_str(), sizeWrite);
    #endif
          if (sizeWrite < 0) {
    #ifdef SYSLOG_ENABLED
        syslog(LOG_ERR, "write errno is %d(%s)\n", errno, strerror(errno));
    #endif
        if (errno == EBADF) { // the fd is closed
    #ifdef SYSLOG_ENABLED
          syslog(LOG_ERR, "reopen the fd and relog this message: %s", message.c_str());
    #endif
          reopen(); // reopen the file
          sizeWrite = ::write(_fd, message.data(), message.length()); // retry the log, if failed again, ignore it
    
          if (sizeWrite < 0) {
    #ifdef SYSLOG_ENABLED
            syslog(LOG_ERR, "failed to reopen the fd and relog this message: %s", message.c_str());
    #endif
          }
        }
          }
    #ifdef SYSLOG_ENABLED
          closelog();
    #endif
    
     

    Last edit: Leo Liu 2017-03-24
  • Alexander Perepelkin

    Hello Leo,

    Can I ask you how did you determine that the file appender fd gets closed immediately after the service started?

    Alex.

     

    Last edit: Alexander Perepelkin 2017-04-08
  • Leo Liu

    Leo Liu - 2017-04-27

    Hi Alex,

    The symptom of the problem started to show when we deployed the service on Oracle Linux 7. It appeared that when the service was restarted, the log file stopped updating. I checked the process's open file descriptor with "lsof" utility, it appeared that the file descriptor disappeared from the list.
    With an attempt to check for errno == EBADF (bad file descriptor) when encounter writing issue, we can reopen the file for an extra attempt.

    Just a quick fix that works for me.

    Leo

     
  • Alexander Perepelkin

    Leo,

    Would that be possible that the service after restart tried to make write into log file when the file had not been opened yet?
    Could you make a check that the log file was opened at least single time before that write attempts?

    Another possible reason: if for some reason the file gets closed, and we make this patch for reopening on each write attempt, it could seriousy impact system's performance.
    May be the file gets closed just before write for some reason?

    Alex.

     
  • Leo Liu

    Leo Liu - 2017-06-12

    Alex,

    Yes, I can confirm that is opened because one line of log is written before it's closed. Maybe is due to the process is being forked.
    Regarding the performance impact, if you take a closer look at the patch, the code before the patch has already done the check, but it comments that it doesn't know what to do with it.
    < if (!::write(_fd, message.data(), message.length())) {
    < // XXX help! help!
    < }
    We just added one more condition if it failed the check because of the file desriptor is being closed, then we reopen it. So there won't be any more process if we are not in the failure condition already.
    We used the lib in a carrier grade service software, and the performance is proven not affected by this patch.

    Thanks,
    Leo

     

    Last edit: Leo Liu 2017-06-12
  • Alexander Perepelkin

    Leo,

    Can you possibly make a check and ensure which of two processes writes the particular line in the log? It might help you to analyze the reason of file being closed.
    As a general consideration, it is preferred not to write information from two separate processes into single log file. The lines could get mixed.

    The possible performance impact here is more than one added check. If the file can not be written for the reason, then the library will continuously try to reopen it and attempt to write again. This is a sequence of not so cheap system calls and they will occur each time when an application attempts to make a log record. That is what may degrade performance of the running application.

    Alex.

     
  • Leo Liu

    Leo Liu - 2017-10-03

    Alex,

    Sorry that I was occupied by a release effort for a while.
    Basically, I think there are two things I wanted to point out.
    1. This patch works well for me, and my software is writing rotating log files of 100M files every 30 mins averagely on a busy deployment. There is no noticeable impact on performance and missing log issue.
    2. If you take a look at the section in question, it's obvious that the check is done no matter what in there anyways. (The original code is just surrender, do nothing when it fails) What the patch do is when the check fails, we will try an extra attempt. I believe the close file descriptor in the beginning is due to the system reason for OL7. This patch should add no more overhead to the existing process, regarding that section of codes.

    Thanks,
    Leo

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.