indexing text files with no file extension

Bob
2012-09-01
2014-08-21
  • Bob
    2012-09-01

    Hello All,

    Thanks for this very useful program. I am just getting started with version
    1.1.3 on Windows XP.

    How can I include files that have no file extension in the index as text
    files? Thunderbird stores email in ASCII text files with no extension. I can
    read and search these files with Notepad or Word (or any text editor), but
    because they have no file extension, DocFetcher ignores their contents. I can
    change the folder name (in T'bird) to folder.txt and DocFetcher will index
    them, but it seems there should be a more straightforward way to do this.

    Regards and Best Wishes, Bob

     
  • Nam-Quang Tran
    2012-09-01

    Bob,

    Sorry, at the moment DocFetcher doesn't support indexing files with no
    extension. You might have some luck, though, with writing a regex pattern that
    matches all those files and turning on "mime-type detection" for it, which
    will cause DocFetcher to recognize them as plain text files. For more info
    about regexes, have a look at the manual subpage "Regular expressions".

    Best regards

    q:-) <= Quang

     
  • Bob
    2012-09-02

    Quang,

    No need to apologize that your very excellent program is not quite
    perfect. The real villain here is T'bird, which creates files without an
    extension.

    I followed your suggestion without success. Namely, I added a line to the
    exclude files/detect mime type list, put a single period . in the Pattern
    (regex) column, and changed the Action to "Detect mime type (slower)". But
    DocFetcher does not seem to have indexed the files.

    Regards and Best Wishes, Bob

    Sorry if this message is a duplicate; the first attempt didn't seem to go through.

     
  • Nam-Quang Tran
    2012-09-02

    The single period won't work, because it only matches filenames that are one
    character long.

    Try this:

    .*

    or this:

    [^\.]*

    The first pattern matches any filename, and the second one any filename that
    doesn't contain a period (= filename without extension).
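
    To see the difference between the three patterns, here is a minimal Python
    sketch. It assumes the pattern has to match the whole filename, which is
    what the explanation above implies; re.fullmatch is used to model that.
    The helper name "matches" and the sample filenames are made up for
    illustration and are not part of DocFetcher.

```python
import re


def matches(pattern: str, filename: str) -> bool:
    """Return True if the pattern matches the entire filename.

    Assumption: whole-name matching, modelled here with re.fullmatch.
    """
    return re.fullmatch(pattern, filename) is not None


if __name__ == "__main__":
    # Hypothetical sample filenames: a one-character name, a name without
    # an extension, and a name with an extension.
    names = ("a", "Inbox", "Inbox.msf")
    for pattern in (".", ".*", r"[^\.]*"):
        for name in names:
            print(f"{pattern!r:12} {name!r:12} {matches(pattern, name)}")
```

    Under that assumption, the single period only accepts the one-character
    name, .* accepts all three names, and [^\.]* accepts only the names that
    contain no period.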

     
    Last edit: Nam-Quang Tran 2014-03-27
  • Bob
    2012-09-02

    The second option seems to have worked.

    Thanks for the help.

    Bob

     
  • Mike
    2014-03-27

    Hey Bob,
    can you please explain the exact steps?

     
    • Nam-Quang Tran
      2014-03-27

      On the indexing dialog, there's a file exclusion table. Add a new exclusion rule to that table with the following values:

      Pattern: [^\.]*
      Match Against: Filename
      Action: Detect mime-type
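
      Before adding the rule, it can help to preview which files it would pick
      up. The following is a rough Python sketch, not anything DocFetcher
      provides: it walks a folder and lists the files whose name contains no
      period, using the same pattern. The function name
      "files_without_extension" and the folder path in the usage note are
      hypothetical, and whole-name matching is again an assumption.

```python
import re
from pathlib import Path

# Same pattern as in the rule above, applied to the filename only.
NO_EXTENSION = re.compile(r"[^\.]*")


def files_without_extension(folder: str) -> list[str]:
    """Return the sorted names of all files under `folder` (recursively)
    whose name contains no period."""
    return sorted(
        path.name
        for path in Path(folder).rglob("*")
        if path.is_file() and NO_EXTENSION.fullmatch(path.name)
    )
```

      Calling files_without_extension("path/to/mail/folder") with your own
      mail folder prints nothing by itself; wrap it in print() to see the
      list of names the rule should apply to.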

       
  • andrewg
    2014-04-23

    Cool, and thanks, it works! :) However, a feature request:
    if DocFetcher could index Unix plain-text mailbox-style emails
    http://en.wikipedia.org/wiki/Mbox
    that'd be great! :)
    Mozilla Thunderbird saves mail in that format (without extensions on Unix; not sure about Windows, though).

     
    • Nam-Quang Tran
      2014-04-23

      I'm aware of MBOX, but I don't have time to work on new features at the moment.

       
  • andrewg
    2014-04-25

    No problem, thanks a lot for writing the app in the first place :)

     
  • ardentperf
    2014-08-21

    I also vote for mbox support someday; I just bumped into the same issue trying to index my mail. Thanks for posting the workaround - it worked for me to get the files indexed.

    Thanks for a great app!