Name                                   Modified     Size
webStraktor-20140420-R01.zip           2014-04-20   4.3 MB
webStraktor Manual 20140420-R01.pdf    2014-04-20   2.1 MB
README-20140420-R01.txt                2014-04-20   2.6 kB

RELEASE NOTE
Version : webStraktor Release 1.0
Date : 20-April-2014


Summary:
webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language that facilitates the collection, extraction and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. It has a small instruction set and a syntax that is easy to master.
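To illustrate the two extraction styles mentioned above, here is a minimal Python sketch. It is not webStraktor script syntax; it only shows the general idea of combining an XPath selection with a regular expression. The sample page and the names SAMPLE_PAGE, extract_titles and extract_prices are hypothetical, and the sketch assumes well-formed markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real-world HTML is often not
# well-formed XML and would need a tolerant HTML parser first.
SAMPLE_PAGE = """<html><body>
<div class="item"><h2>Blue widget</h2><span>Price: 12.50 EUR</span></div>
<div class="item"><h2>Red widget</h2><span>Price: 8.00 EUR</span></div>
</body></html>"""


def extract_titles(page):
    """Select item titles with an XPath expression (ElementTree's XPath subset)."""
    root = ET.fromstring(page)
    return [h2.text for h2 in root.findall(".//div[@class='item']/h2")]


def extract_prices(page):
    """Pull numeric prices out of the raw text with a regular expression."""
    return [float(m) for m in re.findall(r"Price:\s*(\d+\.\d+)", page)]


if __name__ == "__main__":
    print(extract_titles(SAMPLE_PAGE))
    print(extract_prices(SAMPLE_PAGE))
```

The XPath expression addresses elements by their position in the document tree, while the regular expression matches a textual pattern regardless of structure; a scraper typically needs both.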
The standard webStraktor output format is XML-based, encoded in ASCII, UTF-8 or ISO-8859-1 (Latin-1).
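The effect of the output encoding can be sketched in Python as follows. This is an illustration only, not the actual webStraktor output schema: the element names record and title and the function encode_record are hypothetical. It assumes Python 3.8 or later for the xml_declaration parameter.

```python
import xml.etree.ElementTree as ET


def encode_record(title, encoding):
    """Serialize one extracted value as an XML document in the given encoding."""
    root = ET.Element("record")  # hypothetical element names
    ET.SubElement(root, "title").text = title
    # xml_declaration=True writes the chosen encoding into the XML prolog.
    return ET.tostring(root, encoding=encoding, xml_declaration=True)


if __name__ == "__main__":
    for enc in ("us-ascii", "utf-8", "iso-8859-1"):
        print(encode_record("Café", enc))
```

In ASCII output, characters outside the ASCII range are written as numeric character references, so no information is lost; UTF-8 and ISO-8859-1 store the same character as two bytes and one byte respectively.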
webStraktor relies on the Apache HttpClient for retrieving content via the HTTP protocol. It adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through the predominant types of proxy servers.
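What honoring the Robots Exclusion Protocol and routing requests through a proxy can look like is sketched below with the Python standard library. It does not use Apache HttpClient and does not reflect webStraktor's own configuration. The robots.txt content, the user agent demo-bot, the proxy address and the function names are all hypothetical, and only a plain HTTP proxy is covered.

```python
import urllib.request
import urllib.robotparser

# Hypothetical robots.txt content, parsed offline for illustration.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_proxy_opener(proxy_url):
    """Build an opener that routes HTTP and HTTPS requests through a proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "demo-bot", "http://example.com/private/a.html"))
    print(is_allowed(ROBOTS_TXT, "demo-bot", "http://example.com/index.html"))
    # Placeholder proxy address; opener.open(url) would send the request
    # through the proxy, but no request is made in this sketch.
    opener = build_proxy_opener("http://proxy.example:8080")
```

A crawler would call a check like is_allowed before every fetch and skip any URL for which it returns False.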
webStraktor extends the functionality of web crawlers, web spiders and web bots by integrating scraping and crawling capabilities, and it provides exhaustive logging and tracing information.

Components:
The webStraktor crawler and script interpreter.
The webStraktor GUI builder, a Java Swing-based IDE (Integrated Development Environment).
The webStraktor monitor, a Java Swing application that displays webStraktor tracing information in real time.

Release history:
20/04/2014 - First release

Distribution: 
The webStraktor distribution comprises all required software, apart from a Java SDK and Ant.
Installation instructions and user manual: Download the latest version of the webStraktor User Manual (webStraktor Manual 20140420-R01.pdf).

Notices:
Copyright (c) 2014 - webStraktor
webStraktor is free software.
Permission is granted to copy, distribute and/or modify this software under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA to obtain the GNU General Public License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
GNU General Public License: www.gnu.org/copyleft/gpl.html
Contact details for copyright holder:  webstraktor@gmail.com

Source: README-20140420-R01.txt, updated 2014-04-20