CLucene is a C++ port of Lucene, the high-performance, full-featured text search engine written in Java. Because it is written in C++, CLucene is faster than Lucene.
An open source search engine with RESTful API and crawlers
OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
Disk manager is a CD/DVD archiving tool. It stores the directory contents of any media so you can search them later. It is also designed as a file explorer, which makes it easy to find big files. The Windows version supports native file context menus.
Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (file systems, web sites, mail boxes, ...) and the file formats (documents, images, ...) occurring in these systems.
DuMP3 is a duplicate and similar file finder.
DuMP3 is a duplicate and similar file finder. It finds exact duplicate binaries by hash, similar text files by substring content, images (JPG, BMP, GIF, PNG, etc.) by color, and audio files (MP3, WAV, OGG, etc.) by wave data. Planned for the future: fonts and video.
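Finding exact duplicates by hash can be sketched roughly as follows. This is a minimal illustration, not DuMP3's actual code: the function names `file_digest` and `find_duplicates` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash; return groups of 2+ files."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups[file_digest(path)].append(path)
    # Only hashes shared by more than one file indicate duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Files with identical bytes always share a digest, so every group returned holds candidates for byte-identical copies.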
OpenEphyra is an open framework for question answering (QA). It retrieves answers to natural language questions from the Web and other sources. Visit http://www.ephyra.info/ for more details and information on joining this open research initiative.
XPath Shell (XPsh) is a shell extension for selecting files with an XPath-inspired syntax, depending on file attributes and the metadata and/or content of individual files. XPsh can be used as a standalone command, or it can be integrated into a shell.
Digital Library Search Engine
SeerSuite is an application toolkit for digital libraries and search engines, e.g., CiteSeerX. CiteSeerX has moved to GitHub; please get the latest code from: https://github.com/SeerLabs/CiteSeerX
Strigi is a desktop search engine.
Puggle is a graphical desktop search engine written exclusively in Java. It provides full text and metadata search over files, folders, music, photos, web pages and more that are stored locally on your computer.
A Solr-based Semantic MediaWiki store
A simple BNF parser that produces XML markup of matches
bnf2xml is a simple BNF parser that takes text as input, searches it according to a BNF query file, and outputs the text marked up with XML labels that show context. bnf2xml is as simple to use as any text-processing binary, e.g. awk(1) or grep(1). It does not require a C API because it outputs simple XML labeling. The README is visible on the file download page.
EXAMPLE:
$ echo "hi" | bnf2xml patternfile
<word><alph>h</alph><alph>i</alph></word>
or
<gas>hydrogen iodide</gas>
The patternfile says how to find the needle in the haystack and what to show, i.e.:
<alph> ::= a | b | c | d ...
<word> ::= <alph>+
bnf2xml is a top-down recursive parser. Unlike bottom-up parsers such as gcc(1), or some top-down parsers, bnf2xml is completely unambiguous and resolves ALL conflicts. On average it is slower for parsing C, and slower than sed(1) for simple searches, but it is far easier than using flex/C to create a parser. Caveat: I do not suggest it is worthwhile to make a new gcc(1) using bnf2xml. bnf2xml is an nth BETA release, but there have been no complaints yet.
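The idea of a top-down recursive matcher that wraps each match in the name of its rule can be sketched as below. This is an illustrative toy, not bnf2xml's implementation: the `GRAMMAR` dictionary encoding and the `parse` function are hypothetical and cover only the two rules from the example (`<alph>` and `<word> ::= <alph>+`).

```python
import string

# Hypothetical in-memory grammar: rule name -> (kind, argument).
# "chars" matches one character from a set; "plus" matches a rule 1+ times.
GRAMMAR = {
    "alph": ("chars", set(string.ascii_lowercase)),
    "word": ("plus", "alph"),
}


def parse(rule, text, pos=0):
    """Try to match `rule` at text[pos:].

    Returns (xml, new_pos) on success, or None if the rule does not match.
    """
    kind, arg = GRAMMAR[rule]
    if kind == "chars":
        if pos < len(text) and text[pos] in arg:
            return f"<{rule}>{text[pos]}</{rule}>", pos + 1
        return None
    if kind == "plus":
        parts = []
        while True:
            match = parse(arg, text, pos)  # recurse into the sub-rule
            if match is None:
                break
            xml, pos = match
            parts.append(xml)
        if not parts:
            return None  # "+" requires at least one repetition
        return f"<{rule}>{''.join(parts)}</{rule}>", pos
    raise ValueError(f"unknown rule kind: {kind}")
```

With this sketch, `parse("word", "hi")` yields the same markup as the example above, together with the position where matching stopped.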
Search tool based on Tracker. Circare aims to offer a different way of searching your desktop: by mixing together graph-based data visualizations, RDF queries, metadata and descriptors, it gives you a complete interface to the (meta)Tracker indexer.
A document summarization system. Once document content has been added to the system, user queries generate a summary document containing the information available to the system.
PALOMA Suite allows the referencing of learning resources using metadata according to the international IEEE LOM standard, and SCORM, CANCORE and Normetic application profiles. PALOMA Suite contains: PALOMA (Standard), PALOMAWeb and PALOMARepository.
This software is designed to find files on a standalone PC or on other PCs on the network. It copies files without creating any folders and avoids duplicates. The software is free of charge, written in pure Java, and requires Java JRE 1.6.
This package contains different tools to add NLP capabilities to Lucene 4.x (it has been tested with Lucene versions 4.6.x through 4.8.1). Although it was originally developed for German, it is mostly language independent. It allows the user to lemmatize words to be indexed, to weight terms by their parts of speech (e.g. weighting nouns more heavily than pronouns), and to add synonyms taken from GermaNet or from a list you provide to the search index, thereby increasing the recall of Lucene.
CD Maze is an easy-to-use CD-ROM/DVD-ROM catalog system for the GNOME/Unix/Linux desktop.
Narrows the search results produced by popular Internet search engines by letting you add extra filtering conditions, such as requiring certain words to be present, excluding certain words, and so on.
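Filtering by required and excluded words might look like the following sketch. The function `narrow_results` and its parameters are hypothetical, and results are assumed to be plain strings split on whitespace; a real tool would work on result titles or snippets fetched from a search engine.

```python
def narrow_results(results, required=(), excluded=()):
    """Keep results containing every required word and no excluded word."""
    required = [w.lower() for w in required]
    excluded = [w.lower() for w in excluded]
    kept = []
    for text in results:
        words = set(text.lower().split())  # case-insensitive whole-word match
        has_required = all(w in words for w in required)
        has_excluded = any(w in words for w in excluded)
        if has_required and not has_excluded:
            kept.append(text)
    return kept
```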
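Implements commonly used containers in C, such as rbtree, hashtable, list, vector, deque, heap and map, as well as timers, an OS API layer and an application development framework. It also implements a file database based on a B-tree indexing algorithm, providing power-failure protection and interfaces for transaction commit and rollback.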
gDiscoverer is a search tool that provides inexact search: you can type in "lunex" and get "Linux" as a result.
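One common way to implement this kind of inexact matching is Levenshtein edit distance. The sketch below assumes that technique; the source does not say which algorithm gDiscoverer uses, and `edit_distance`, `fuzzy_search` and the `max_distance` threshold are hypothetical names and values.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_search(query, candidates, max_distance=2):
    """Return candidates within max_distance edits of query, closest first."""
    q = query.lower()
    scored = [(edit_distance(q, c.lower()), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]
```

Here "lunex" and "linux" differ by two substitutions, so `fuzzy_search("lunex", ["Linux", "Windows", "Solaris"])` keeps only "Linux" at the default threshold.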
Fast Linux duplicate finder
This small C# console program (Mono or MS .NET 3.5 required) generates text or HTML output that lists directories and files. Duplicate directory or file names are marked in the HTML output. I use it to find files in a messy company network.
Tool to find symbols in a library or executable
This tool is handcrafted for developers to find symbols listed inside libraries and executables. It is being upgraded to support demangling.
A graphical Java app that easily locates and removes duplicates. StopDuplicates finds true duplicates using MD5 anywhere in a directory and all its subdirectories.