Forked from https://sf.net/p/fmd/
The Free Manga Downloader (FMD) is an open source application written in Object-Pascal for managing and downloading manga from various websites. This is a mirror of the main repository on GitHub. For feedback and bug reports, visit https://github.com/riderkick/FMD
Free Manga Downloader
The Free Manga Downloader (FMD) is an open source application written in Object-Pascal for managing and downloading manga from various websites such as AnimeA, Batoto, MangaFox, MangaStream, ...
CLucene is a C++ port of Lucene, the high-performance, full-featured text search engine written in Java. CLucene is faster than Lucene as it is written in C++.
An open source search engine with RESTful API and crawlers
OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
PHPCrawl is a highly configurable webcrawler/webspider library written in PHP. It supports filters, limiters, cookie handling, robots.txt handling, multiprocessing and much more.
The stuff here has no documentation and some of it may never be completed. This is my playground, use at your own risk.
Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (file systems, web sites, mail boxes, ...) and the file formats (documents, images, ...) occurring in these systems.
Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.
Forum Downloader is a program that allows you to download forums and save them locally for offline viewing and searching. It can also save linked images, images linked using thumbnails, attachments, or other files linked in posts.
Quran Search Engine API
Alfanous (The Lantern - الفانوس) is an Arabic search engine API that provides simple and advanced search in the Holy Quran, more features and many interfaces...
The ht://Dig system is a complete indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista.
Free tool that extracts emails, phone numbers and custom text from the Web using Java regex
The Files section contains WebCrawlerMySQL.jar, which supports MySQL connections. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/
Free Web Spider & Crawler. Extracts information from the Web by parsing millions of pages. Data is stored in a Derby or MySQL database and is not lost when the spider is force-closed.
- Free Web Spider, Parser, Extractor, Crawler
- Extraction of emails, phone numbers and custom text from the Web
- Export to Excel file
- Data saved into Derby database
- Written in Java, cross-platform
See also Free Email Sender at this link: https://sourceforge.net/projects/gitst-free-email-ender/
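The kind of regex-based extraction described above can be illustrated with a minimal sketch. The project itself is written in Java; this Python version is only an illustration, and the `EMAIL` and `PHONE` patterns and the `extract` helper are hypothetical, not taken from the project's code.

```python
import re

# Deliberately simple patterns (assumptions for illustration only);
# real-world email and phone formats are far more varied.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract(text, custom_pattern=None):
    """Return the unique emails, phones and optional custom matches in text."""
    found = {
        "emails": sorted(set(EMAIL.findall(text))),
        "phones": sorted(set(PHONE.findall(text))),
    }
    if custom_pattern:
        # The custom pattern is supplied by the caller, e.g. r"#\w+".
        found["custom"] = sorted(set(re.findall(custom_pattern, text)))
    return found


if __name__ == "__main__":
    page = "Contact alice@example.org or bob@example.com, tel. +1 555-010-9999."
    print(extract(page))
```

A crawler would run such a function over the text of each fetched page and store the results; the storage step is left out here.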
scr_ipfm is a script written in PHP that graphically shows the amount of data downloaded by users on a local network. To do that, it uses logs generated by the ipfm program (ipfm is available at http://robert.cheramy.net/ipfm/).
Fusker is a tool to create entire image galleries from a single specially constructed URL.
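As a rough illustration of how a single URL can describe a whole gallery, the sketch below expands bracketed numeric ranges into concrete image URLs. The `[start-end]` syntax, the `expand_url` helper and the example host are assumptions made for this sketch, not details confirmed by the project description.

```python
import itertools
import re

# Matches a numeric range such as [01-12] (assumed syntax).
RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_url(pattern):
    """Expand every [start-end] range in pattern into a list of URLs."""
    # With two capture groups, split() yields:
    # literal, start, end, literal, start, end, ..., literal
    parts = RANGE.split(pattern)
    literals = parts[0::3]
    ranges = []
    for start, end in zip(parts[1::3], parts[2::3]):
        # Preserve zero padding when the range start is written with it.
        width = len(start) if start.startswith("0") else 0
        ranges.append(
            [str(n).zfill(width) for n in range(int(start), int(end) + 1)]
        )
    urls = []
    for combo in itertools.product(*ranges):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        urls.append("".join(pieces))
    return urls


if __name__ == "__main__":
    for url in expand_url("http://example.com/img[01-03].jpg"):
        print(url)
```

A pattern with several ranges expands to every combination, and a pattern without any range is returned unchanged as a one-element list.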
Imgur Gallery Downloader
Users can now search Imgur for any phrase and ImgurDL/Loadur will automatically search for matching images. ImgurDL/Loadur will download the images while displaying the progress to the user.
A function-testing, performance-measuring, site-mirroring web spider that is widely portable and capable of using scenarios to process a wide range of web transactions, including SSL and forms.
===NOTICE=== After releasing a few updates, but far fewer than we wanted, we’ve made the decision to stop the OptimizeGoogle Project. The reason for the decision is that there were not enough people on the team to keep it going. Google is changing things every day, and it has become more and more frustrating to watch all the functions break piece by piece. The code will remain GPL; perhaps another person or team is interested in picking this up. For now, thank you for all your patience, feedback and support. Description: OptimizeGoogle is a Firefox extension that enhances Google search results and other pages by adding extra information and removing unwanted information. Created to maintain and improve CustomizeGoogle, which seems to have been abandoned.
pyTube is a Python-based command-line YouTube search tool. One can search for videos and display them in the default web browser. Requires Python 2.5 and gdata.
OpenOffice Search is a document indexer and search engine for OpenOffice documents. It is Java-based, so it will run on any J2SE-enabled platform, and it uses an embedded Derby (Cloudscape) database.
OpenEphyra is an open framework for question answering (QA). It retrieves answers to natural language questions from the Web and other sources. Visit http://www.ephyra.info/ for more details and information on joining this open research initiative.
MultiSearch is a simple and fast search engine able to concatenate and organize multiple results. Easy to customize for the end users.
Group file share with advanced text parsing capability for easy search
Originally created as a church resource sharing system, phpShare&Search allows users to create accounts, share documents, search documents, and like or report documents. phpShare&Search's power comes from its advanced document parser, which extracts text from .PDF, .TXT, .DOC, and .DOCX files, and from its community features of liking resources and reporting them as inappropriate or spam. Users can also subscribe to weekly updates of new content. Users may choose to download and host/install/configure/modify/manage this code themselves, or contract the code's author to do these things for them. Contact me for a reasonable quote. eedrew <at> users <dot> sourceforge <dot> net To support future revisions and/or contribute based on the value you found in this code, check out the External Link drop-down in the menu. Also, if you do not wish to create and maintain your own installation, email firstname.lastname@example.org for a quote on a turnkey solution.
Classifier4J is a Java library that provides an API for automatic classification of text. The default (and only current) implementation of this API is a Bayesian classifier. This library can be used for multiple purposes - as a spam filter or a blog cl
Search the web for videos, audios, eBooks, torrents and much more
What is WebCrunch? WebCrunch is intended to provide a very powerful web server indexing and search service allowing you to find a file among millions of files located on public servers around the internet. The search engine is powered by a database that holds information about all the files web servers have. The information about the files is gathered by an intelligent web crawler that runs every 2 to 4 days. It keeps the database clean and up-to-date with the previous contents and new entries for each web server address submitted by members.
Oxyus is an open source search engine written in 100% Java, aimed at providing a search button for your website in an easy way. Oxyus uses Apache Lucene for indexing, Quartz for scheduling and other interesting software products.