Free tool that extracts emails, phone numbers, and custom text from the web using Java regex
The Files section includes WebCrawlerMySQL.jar, which supports MySQL connections
Please follow this link to get the latest version:
https://sourceforge.net/projects/web-spider-web-crawler-extract/
Free web spider and crawler. Extracts information from the web by parsing millions of pages. Stores data in a Derby or MySQL database, so data is not lost if the spider is force-closed.
- Free web spider, parser, extractor, and crawler
- Extraction of emails, phone numbers, and custom text from the web
- Export to Excel file
- Data saved into a Derby database
- Written in Java (cross-platform)
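As a rough illustration of regex-based extraction like that listed above, here is a minimal Java sketch. The class name `ContactExtractor` and both patterns are assumptions made for this example; they are deliberately simplified and are not taken from the project's source.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the crawler itself.
class ContactExtractor {
    // Simplified email pattern (assumption): local part, '@', domain, dot, TLD of 2+ letters.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Loose phone pattern (assumption): optional '+', a digit, then at least six
    // digits or separators (space, parentheses, dot, dash), ending in a digit.
    private static final Pattern PHONE =
        Pattern.compile("\\+?\\d[\\d\\s().-]{6,}\\d");

    // Collects every distinct match of the pattern, in order of first appearance.
    static Set<String> findAll(Pattern pattern, String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    static Set<String> extractEmails(String text) {
        return findAll(EMAIL, text);
    }

    static Set<String> extractPhones(String text) {
        return findAll(PHONE, text);
    }

    public static void main(String[] args) {
        String page = "Contact sales@example.com or call +1 (555) 010-2030.";
        System.out.println(extractEmails(page));
        System.out.println(extractPhones(page));
    }
}
```

Custom text extraction would presumably follow the same shape: compile a user-supplied expression with `Pattern.compile` and pass it to `findAll`.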
See also Free Email Sender at this link:
https://sourceforge.net/projects/gitst-free-email-ender/
Please install the Microsoft Build of OpenJDK to run the application:
https://www.microsoft.com/openjdk
Framework for search and display of heterogeneous document collections.
NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates.
Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
SeerSuite is an application toolkit for digital libraries and search engines, e.g., CiteSeerX.
CiteSeerX has moved to GitHub; please get the latest code from: https://github.com/SeerLabs/CiteSeerX
Desk.Now is a cross-platform Java client for the WhereIsNow WebService, which lets you find where the latest version of a document is located with just two clicks.
Fire.now is a Firefox plugin that automatically adds your documents to the WhereIsNow latest version discovery service. Every time you upload a document somewhere, Fire.now integrates the WhereIsNow keys into the file and adds its URL to WhereIsNow.
The Java-Sitemapper is a Java API for building sitemap files to improve search indexing on Google, Yahoo!, MSN, and Ask.com. This project strives to implement the latest in search technology for use on the Java platform.
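For readers unfamiliar with sitemap files, the following is a minimal sketch of what building one in Java can look like, following the sitemaps.org 0.9 format. It does not use the Java-Sitemapper API; the class `SitemapSketch` and its methods are hypothetical names invented for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; this is not the Java-Sitemapper API.
class SitemapSketch {
    // Escapes the five characters that the sitemap protocol requires to be entity-escaped.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Builds a sitemap document with one <url><loc> entry per URL.
    static String buildSitemap(List<String> urls) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            xml.append("  <url><loc>").append(escapeXml(url)).append("</loc></url>\n");
        }
        xml.append("</urlset>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "https://example.com/",
            "https://example.com/search?q=java&page=2");
        System.out.print(buildSitemap(urls));
    }
}
```

The `&` replacement runs first so that the entities produced by the later replacements are not escaped a second time.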
A Java library that parses the latest freely available RDF files from DMOZ (Open Directory Project) and inserts them into any JDBC-compliant relational database (e.g., MySQL and PostgreSQL, with others such as Oracle, MS Access, and SQLite to come).