This library implements several algorithms based on locality-sensitive hashing (LSH), including indexing data structures for high-dimensional spaces and metric spaces, sketch constructions, and set embedding algorithms.
Multi-encoding strings(1) replacement with language identification
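As an illustration of what a "sketch construction" can look like, here is a minimal MinHash sketch in Python. It is not this library's API (which is not shown in this listing); the function names `minhash_signature` and `estimate_jaccard` and the choice of BLAKE2b as the hash family are assumptions made for the example.

```python
import hashlib

def minhash_signature(items, num_hashes=64):
    """Return a MinHash signature for a non-empty set of strings.

    For each of num_hashes salted hash functions, keep the minimum
    hash value seen over all items. (Hypothetical helper, not the
    library's API.)
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a different salt gives a different hash function
        min_val = min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        )
        signature.append(min_val)
    return signature

def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

a = minhash_signature({"apple", "banana", "cherry", "date"})
b = minhash_signature({"apple", "banana", "cherry", "elderberry"})
print(estimate_jaccard(a, b))  # an estimate of the true Jaccard similarity (3/5)
```

The two sets share three of five distinct items, so the estimate should land near 0.6; with only 64 hash functions it will vary noticeably around that value.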
Enhanced version of the standard Unix strings(1) program which uses language models for automatic language identification and character-set identification, supporting over 1400 languages, dozens of character encodings, and 4800+ language/encoding pairs.
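For context, the baseline behavior of the classic strings(1) program (printing runs of printable characters found in binary data) can be sketched in a few lines of Python. This sketch covers plain ASCII only and does none of the language or character-set identification described above; the function name `extract_strings` and the default minimum length of 4 are assumptions for the example.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return runs of at least min_len printable ASCII bytes found in data.

    Hypothetical helper illustrating basic strings(1)-style extraction.
    """
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

blob = b"\x00\x01hello\x00ab\x00world!\xff"
print(extract_strings(blob))  # ['hello', 'world!']
```

The run `ab` is dropped because it is shorter than the minimum length.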
Ldap Browser is an open source LDAP browser. It has been tested and run on Windows, Solaris, Linux and OS390, and should run on any operating system that supports Java.
It is a desktop search tool, aimed at learning materials, that provides full-text search with a friendly GUI.
Lux is a simple, fast and extendible full-text search engine. It works as a library for now.
The MARKet for Open Source
MARKOS will realize the prototype of a service and an interactive application providing an integrated view of the Open Source projects available on the web, focusing on the functional, structural and licensing aspects of software code.
Nucular Archiving System for creating full text indices for fielded data. Python API, web, and command line interfaces. Fast. Very light weight. Concurrent read/writes with no possible locking issues. No server process. Proximity. Facets. Funny name.
Development and support of OCFA have been discontinued. The code has moved to these GitHub repositories: https://github.com/DNPA/OcfaLib https://github.com/DNPA/OcfaArch https://github.com/DNPA/OcfaJavaLib https://github.com/DNPA/OcfaModules https://github.com/DNPA/OcfaDoc If you are interested in contributing to ongoing work on the creation of a community-maintained, OCFA-inspired computer forensic framework, please join the Mattock/MattockFS community page on G+: https://plus.google.com/communities/102487198908055860744
An open source search engine with RESTful API and crawlers
OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
Tool to find symbols in libraries or executables
This tool is handcrafted for developers to find symbols listed inside libraries and executables. Demangling support is being added.
This project is to be released for parallel computing.
Plait (pronounced "play") is a command-line jukebox and music player front-end. It understands brief queries that pick a single song, mix queries that combine works from multiple artists, and stream queries that find Shoutcast radio streams.
Pwim uses Python to search a given set of directories for the file, or files, that you meant to play. For example, 'pwim pick ice' will yield 'maison ikoku - The Pillows - ice pick.wmv', to be played with the media player of your choice.
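The exact matching rule Pwim uses is not stated here. One rule that reproduces the example above is to require every query word to appear somewhere in the file name, in any order and ignoring case. The sketch below assumes that rule; the function names `matches` and `find_media` are hypothetical and are not taken from Pwim.

```python
def matches(query_words, filename):
    """True if every query word occurs in the file name (case-insensitive, any order)."""
    name = filename.lower()
    return all(word.lower() in name for word in query_words)

def find_media(query_words, filenames):
    """Return the file names that match all query words."""
    return [f for f in filenames if matches(query_words, f)]

library = [
    "maison ikoku - The Pillows - ice pick.wmv",
    "some other band - another song.mp3",
]
print(find_media(["pick", "ice"], library))
# ['maison ikoku - The Pillows - ice pick.wmv']
```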
Calculate primes by using extremely fast sorting
This project treats the problem of calculating primes as a sorting problem. It includes the most efficient tree-based sorting algorithm that is possible and shows that finding a new prime can be done by sorting the differences between the previous primes in the right way. Unfortunately, it has turned out that going this way is even slower than trying to find primes by brute force. So it can only be used as a heavy-load test for the sorting algorithm, which can be used for sorting any kind of data. And as already mentioned, it's just the most efficient tree-based sorting algorithm that you can get. Furthermore, this way of finding primes interestingly leaves a hard nut to crack for mathematicians: in very rare cases it finds numbers that are not primes. Below one million, this phenomenon arises in exactly two cases: 31213, which is 7 * 7 * 7 * 7 * 13, and 336141, which is 3 * 3 * 13 * 13 * 13 * 17. Who can explain why?
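The two factorizations quoted above can be checked independently with plain trial division. The sketch below only verifies that arithmetic; it does not implement or explain the project's sorting-based method, and the function name `factorize` is chosen for the example.

```python
def factorize(n):
    """Return the prime factors of n (n >= 2) in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors

print(factorize(31213))   # [7, 7, 7, 7, 13]
print(factorize(336141))  # [3, 3, 13, 13, 13, 17]
```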
Network grep, on steroids
Puggle is a graphical desktop search engine written exclusively in Java. It provides full text and metadata search over files, folders, music, photos, web pages and more that are stored locally on your computer.
A catalog application for various media types: CD, DVD, NetDrives, USB flash keys, etc. It can import data from the well-known WhereIsIt Windows application. In short, this is an attempt to make a WhereIsIt-like application for Linux.
searchd is an indexing and search daemon available for both Unix and Win32 platforms. It is a modular service, able to handle indexing and search requests using I/O modules. It is planned to be the basic indexing system for Mozilla and other plugins.
SF Ftp Search Engine: high speed, open source based, and no database required. Demo: http://sf.hit.edu.cn/
Sabuesonix is a desktop search engine. It can explore your PDF, TXT and HTML files (and more in the future) and create an index for quick document search.
Search Help is a very helpful and easy-to-use program. You can search for web pages, photos and lyrics, or you can translate words from German to English and vice versa. Search Help uses some of the well-known search engines to get results.
Fork of the Search Monkey project
Power searching without the pain. Perform powerful desktop searches using regular expressions, without having to index your system. Graphical equivalent to grep. This fork has new features such as displaying the context of matches in files and an "Open With" menu on Linux with KDE.
Graphical programming. Includes n-dimensional sorting.
Write programs as graphical dataflow charts instead of text. Compile them to any programming language you want. In addition, this project includes the most efficient tree-based sorting algorithm that is possible. Originally developed on a CTOS Color NGEN, at first in Pascal, later ported to C, and now, 20 years later, ported to Linux. Currently it is still not really system independent, but further releases are intended to fix this.
Narrows the search results produced by popular Internet search engines by letting you add extra filtering conditions, such as requiring certain words to be present or certain words to be absent.
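The filtering idea can be sketched as a post-processing step over result snippets: keep a result only if it contains every required word and none of the excluded words. This is a minimal sketch, not the tool's actual implementation; the function name `filter_results`, its parameters, and the whitespace-based word splitting are assumptions.

```python
def filter_results(results, required=(), excluded=()):
    """Keep results that contain all required words and no excluded words.

    Matching is case-insensitive and works on whitespace-separated words.
    """
    kept = []
    for text in results:
        words = set(text.lower().split())
        has_required = all(w.lower() in words for w in required)
        has_excluded = any(w.lower() in words for w in excluded)
        if has_required and not has_excluded:
            kept.append(text)
    return kept

results = [
    "Python search engine tutorial",
    "Java search engine",
    "Python snake facts",
]
print(filter_results(results, required=["python"], excluded=["snake"]))
# ['Python search engine tutorial']
```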
Lucene/Solr based search engine and workflow system
Important: This project has been moved to https://github.com/statsbiblioteket/summa/ Lucene (and Solr) based search engine with a very flexible setup and workflow system. It supports incremental updates, hierarchical faceting and index lookup with low memory overhead. Note: Although Summa is open source, the focus is on features used at Statsbiblioteket. No explicit resources have been allocated for support of external users.