“It’s basically for the desktop what Google is for the Web.” That’s how Tran Nam Quang describes DocFetcher, an application that allows you to search the contents of documents on your computer.
DocFetcher has a few characteristics that make it unique among comparable alternatives, Tran says. Most desktop search apps are either Windows-only (Google Desktop Search, Copernic) or Linux-only (Beagle, Tracker). DocFetcher is cross-platform and portable. You can use the same DocFetcher instance on both your Windows and Linux machines, and you can put it on a USB stick, hide it in an encrypted volume, or burn it onto a CD-ROM – along with your documents.
In addition, it’s fast at indexing. “People often think that Java, in which DocFetcher was written, automatically means slow, but in the case of indexing, it’s really not,” Tran says. “In my experience, the way to get the biggest performance boost is to give users full control over what is indexed. I believe we did a better job with that than, say, Google Desktop, where it is a lot more difficult to narrow down the search scope.”
Finally, DocFetcher yields excellent search results – “at least that’s what my users tell me. This is closely related to the fact that a narrow search scope means both fast indexing and better search results, since there’s less ‘noise’ in the results.”
Tran began working on DocFetcher in 2007 when he got fed up with the chaos on his computer. “I had hundreds of important files scattered everywhere, and after several fruitless attempts to organize them into a neat folder hierarchy, I gave up and started looking for alternatives. I looked at Google Desktop Search and several open source implementations, but wasn’t really satisfied with any of them. Google Desktop seemed to be way too complex and bloated for a simple task like this, and the open source alternatives were all more or less immature and bug-ridden. I decided to write my own desktop search application – a simple one that wasn’t buggy and that worked on both Windows and Linux. I also thought it would be a good thing to have some real-world programming experience, since I had never written anything of this magnitude before.
“When I had the first working prototype and found it to be indeed very useful, I thought, ‘Maybe there are other people out there who would find this useful, too,’ so the logical next step was to share it. Hosting it on SourceForge.net was logical, too, since it was and still is the biggest repository of open source software on the Web. It was important to me to make it open source, since I had already gotten so many useful open source tools for free and I wanted to give something back to the world.”
To publicize the project, Tran submitted it to several download sites and wrote a Wikipedia article about it. “This really had a significant effect on the download counter,” he says.
For development, “we’re just using what everybody else uses – Java, Ant, Eclipse, Python – because we like to play it safe. The most unusual thing in our toolbox is AspectJ.”
The project typically offers new major releases every five to nine months, with critical bug fixes coming more often. The most recent version was released last week. DocFetcher already has quite a few advanced features, but Tran is planning a full rewrite of the program. “The current version has become something of a mess over the years, making it difficult to change fundamental parts of the program.”