#188 Too many opened files

v1.4
open
nobody
None
1
2013-05-16
2013-05-16
OWurdak
No

Hi,

I am using OSS v1.4 (stable, rev 2274, build 240).
I use the file crawler (SMB/CIFS) to index our file server, but after a while the crawler hangs on some files.
Apparently the crawler keeps retrying to parse a file without closing it, until the SMB error STATUS_TOO_MANY_OPENED_FILES (0xC000011F) appears.

Here is a snippet of the smbstatus output and the OSS log:

....
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:42:47 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:43:01 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:43:16 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:43:30 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:43:44 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:43:59 2013
31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte long_file_name.doc Thu May 16 12:44:13 2013
....

14:05:23,630 root - jcifs.smb.SmbException: 0xC000011F
com.jaeksoft.searchlib.SearchLibException: jcifs.smb.SmbException: 0xC000011F
at com.jaeksoft.searchlib.crawler.file.process.fileInstances.SmbFileInstance.listFilesAndDirectories(SmbFileInstance.java:155)
at com.jaeksoft.searchlib.crawler.file.process.ItemDirectoryIterator.<init>(ItemDirectoryIterator.java:51)
at com.jaeksoft.searchlib.crawler.file.process.ItemIterator.create(ItemIterator.java:84)
at com.jaeksoft.searchlib.crawler.file.process.FilePathItemIterator.<init>(FilePathItemIterator.java:46)
at com.jaeksoft.searchlib.crawler.file.process.CrawlFileThread.runner(CrawlFileThread.java:87)
at com.jaeksoft.searchlib.process.ThreadAbstract.run(ThreadAbstract.java:261)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
Caused by: jcifs.smb.SmbException: 0xC000011F
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
at jcifs.smb.SmbTransport.send(SmbTransport.java:640)
at jcifs.smb.SmbSession.send(SmbSession.java:238)
at jcifs.smb.SmbTree.send(SmbTree.java:119)
at jcifs.smb.SmbFile.send(SmbFile.java:775)
at jcifs.smb.SmbFile.doFindFirstNext(SmbFile.java:1986)
at jcifs.smb.SmbFile.doEnum(SmbFile.java:1738)
at jcifs.smb.SmbFile.listFiles(SmbFile.java:1715)
at jcifs.smb.SmbFile.listFiles(SmbFile.java:1704)
at com.jaeksoft.searchlib.crawler.file.process.fileInstances.SmbFileInstance.listFilesAndDirectories(SmbFileInstance.java:151)

Discussion

  • Emmanuel Keller

    Emmanuel Keller - 2013-05-16

    Hi,

    Can you check the current open-file limit in your environment for the user that runs OpenSearchServer (Tomcat)? By default, Linux users are limited to 1024 open files, which is often too low for a large index.

    On our Debian-based SaaS platform we currently set this limit to 16384. Here is an extract of our /etc/security/limits.conf:

    *   hard    nofile  32768
    *   soft    nofile  16384
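
A minimal sketch for checking which nofile limits a process actually receives, written in Python for brevity. It reports the limits of the process that runs it, so it only reflects the Tomcat environment if it is started as the same user and in the same way; the helper names `nofile_limits` and `describe` are made up for this example.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile = {describe(soft)}, hard nofile = {describe(hard)}")
```

On Linux, the limits of an already running process can also be read from /proc/[pid]/limits.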
    

    It can also be useful to check which files are open when the issue happens.
    Locate the PID of Tomcat and use the lsof command:

    ps -ef | grep tomcat
    lsof -p [pid]
    

    Let me know.

     
  • OWurdak

    OWurdak - 2013-05-16

    Hi,
    Thanks for the quick answer.
    Here is the extract from my limits.conf:

      • nofile 16384

    In my smbstatus output there are about 16000 open files: 3 different files with about 5500 entries each.
    The last entry was added today at 12:46.
    Till 12:46 the oss.log shows the STATUS_TOO_MANY_OPENED_FILES (0xC000011F).
    So this error message is correct.
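
The per-file counts can be obtained by tallying the smbstatus lines. The following is a rough sketch that assumes the column layout of the snippet above (six leading columns, then share path and name, then a five-token timestamp); `count_open_entries` is a hypothetical helper, and other Samba versions may print a different layout.

```python
from collections import Counter


def count_open_entries(lines):
    """Count smbstatus lock entries per 'share path + file name'."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Need 6 leading columns, at least 1 path token and 5 timestamp
        # tokens; header and separator lines do not start with a PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        path = " ".join(fields[6:-5])
        counts[path] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte "
        "long_file_name.doc Thu May 16 12:42:47 2013",
        "31413 500 DENY_NONE 0x89 RDONLY NONE /smb_shares/04_Projekte "
        "long_file_name.doc Thu May 16 12:43:01 2013",
    ]
    for path, n in count_open_entries(sample).most_common():
        print(n, path)
```

A file that appears thousands of times in this tally is a likely candidate for a handle that is reopened but never closed.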

    lsof -p pid_of_tomcat shows only some jar and lib files.
    The files are opened by the smb daemon, so
    smbstatus -p | grep jcifs
    shows 2 processes, and
    lsof -p pid_of_smb
    shows me the open files.

    I looked at the oss.log entries from before 12:46 (scroll down to see them).

    That is where the real error from the file parser is.
    The parser tries to open a Word tilde file (~*.doc).
    These tilde files really are only 162 bytes long.
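
One possible pre-parse check, sketched in Python rather than OpenSearchServer's Java. It assumes the tilde files are the small temporary files Word keeps next to an open document, and it uses the 512-byte header size reported in the log below; `should_skip` and `OLE2_HEADER_SIZE` are hypothetical names, not part of OSS.

```python
OLE2_HEADER_SIZE = 512  # the "expected 512 bytes" from the POI error message


def should_skip(name, size):
    """Return True for .doc files that cannot be valid Word documents.

    `name` is the bare file name and `size` its length in bytes.
    """
    lower = name.lower()
    if not lower.endswith(".doc"):
        return False
    # Tilde-prefixed temporary files, or anything shorter than the header.
    return lower.startswith("~") or size < OLE2_HEADER_SIZE


if __name__ == "__main__":
    print(should_skip("~$report.doc", 162))   # temporary file
    print(should_skip("report.doc", 40960))   # regular document
```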

    Is it possible to include the file name in the log file so that it is easy to find?
    I think that after the file parser fails, it should close the file and continue with the next one. Perhaps the crawler could keep an ignore list of these failed files. That would solve my problem.
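
The behaviour suggested here can be sketched as follows. This is an illustration in Python, not the actual OpenSearchServer crawler code; `crawl`, `open_file`, `parse` and the logger name are hypothetical. The file is closed whether or not parsing succeeds, the failing path is written to the log, and the path is added to an ignore list so it is not retried.

```python
import logging

log = logging.getLogger("crawler-sketch")


def crawl(paths, open_file, parse, ignored=None):
    """Parse each path once, closing every file and remembering failures.

    `open_file(path)` must return a context manager (as file objects are);
    `parse(handle)` may raise on malformed input.
    """
    ignored = set() if ignored is None else ignored
    parsed = []
    for path in paths:
        if path in ignored:
            continue  # failed before: do not reopen it
        try:
            # The with-block closes the handle even if parse() raises.
            with open_file(path) as handle:
                parsed.append(parse(handle))
        except Exception as exc:  # broad on purpose for this sketch
            log.warning("parse failed for %s: %s", path, exc)
            ignored.add(path)
    return parsed, ignored
```

Passing the returned ignore list into the next call keeps a failing file from being opened again on later crawl runs.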

    I mentioned 3 open files in the smbstatus output. The second is also a ~*.doc. The third is a special PDF with an embedded 3D CAD object.

    regards
    Oliver

    12:46:08,289 root - Unable to read entire header; 162 bytes read; expected 512 bytes
    java.io.IOException: Unable to read entire header; 162 bytes read; expected 512 bytes
    at org.apache.poi.poifs.storage.HeaderBlock.alertShortRead(HeaderBlock.java:226)
    at org.apache.poi.poifs.storage.HeaderBlock.readFirst512(HeaderBlock.java:207)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
    at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:106)
    at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:53)
    at com.jaeksoft.searchlib.parser.DocParser.currentWordExtraction(DocParser.java:61)
    at com.jaeksoft.searchlib.parser.DocParser.parseContent(DocParser.java:102)
    at com.jaeksoft.searchlib.parser.Parser.doParserContent(Parser.java:105)
    at com.jaeksoft.searchlib.parser.ParserSelector.parserLoop(ParserSelector.java:390)
    at com.jaeksoft.searchlib.parser.ParserSelector.parseFileInstance(ParserSelector.java:452)
    at com.jaeksoft.searchlib.crawler.file.spider.CrawlFile.download(CrawlFile.java:95)
    at com.jaeksoft.searchlib.crawler.file.process.CrawlFileThread.crawl(CrawlFileThread.java:135)
    at com.jaeksoft.searchlib.crawler.file.process.CrawlFileThread.runner(CrawlFileThread.java:110)
    at com.jaeksoft.searchlib.process.ThreadAbstract.run(ThreadAbstract.java:261)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)

     
