Red Hat Ansible Automation Platform on Azure allows you to quickly deploy, automate, and manage resources securely and at scale.
Deploy Red Hat Ansible Automation Platform on Microsoft Azure for a strategic automation solution that allows you to orchestrate, govern and operationalize your Azure environment.
Ideal for conference and event planners, independent planners, associations, event management companies, non-profits, and more.
YesEvents offers a comprehensive suite of services that spans the entire conference lifecycle and ensures every detail is executed with precision. Our commitment to exceptional customer service extends beyond conventional boundaries, consistently exceeding expectations and enriching both organizer and attendee experiences.
HXPath is a command-line tool for extracting data from HTML documents. HXPath can select subtrees, like the standard xpath tool, but it can also read text contents and attributes and output them in a bash-friendly format. HTML Tidy and HTTP/HTTPS retrieval are built in as well.
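HXPath's own command syntax isn't shown here, but the kind of extraction it performs, selecting elements by path and then reading attributes and text as plain values, can be sketched in Python with the standard library (an illustrative example, not HXPath itself; `extract_links` is a hypothetical helper):

```python
import xml.etree.ElementTree as ET

def extract_links(markup):
    """Return (href, text) pairs for every <a> element, the sort of
    attribute-and-content extraction HXPath makes shell-friendly.
    Assumes well-formed markup (HXPath's built-in Tidy handles the
    messy real-world case)."""
    doc = ET.fromstring(markup)
    # ElementTree supports a subset of XPath; ".//a" selects all
    # <a> descendants, then we read an attribute and the text node.
    return [(a.get("href"), a.text) for a in doc.findall(".//a")]

markup = (
    "<html><body>"
    "<a href='https://example.org/a'>First</a>"
    "<a href='/local/b'>Second</a>"
    "</body></html>"
)
# Emit one tab-separated record per line, easy to consume from bash.
for href, text in extract_links(markup):
    print(href, text, sep="\t")
```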
The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations; a promising resource for NLP and related research.
The aw script lets you browse web sites from the command line by concisely specifying what to look at. It can also be used to extract excerpts from web sites.
A document summarization system. Once document content has been added to the system, user queries generate a summary document containing the information available in the system.
Lurker is a mailing list archiver designed for capacity, speed, simplicity, and configurability in that order.
Noteworthy features include Google-style searching on all fields, chronology-preserving threads, multilingual support, and attachment support.
=NO LONGER WORKS, AS THE DSA HAS ADDED A CAPTCHA= DSA Practical Driving Test Monitor helps you find any available practical driving test slot within a specified date range. It runs on Linux/Mac/Windows and automates the manual task of finding a test slot.
Network monitoring and troubleshooting is hard. TotalView makes it easy.
This means every device on your network, and every interface on every device, is automatically analyzed for performance, errors, QoS, and configuration.
A Web application for searching for files on FTP servers. Users can query files by part of the file name, the entire file name, a regular expression, or a shell pattern. File indexes are stored in PostgreSQL or MySQL.
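The four query modes described above can be illustrated with a small Python sketch (an illustration of the matching semantics only, not the application's own code; `matches` is a hypothetical helper):

```python
import fnmatch
import re

def matches(filename, query, mode="substring"):
    """Match a filename against a query using one of the search modes
    described above: part of the name, the entire name, a regular
    expression, or a shell pattern."""
    if mode == "substring":
        return query in filename
    if mode == "exact":
        return filename == query
    if mode == "regex":
        return re.search(query, filename) is not None
    if mode == "shell":
        # fnmatch gives shell-style wildcard semantics (*, ?, [...]).
        return fnmatch.fnmatch(filename, query)
    raise ValueError(f"unknown mode: {mode}")
```

In a database-backed index like this one, the substring and shell-pattern modes map naturally onto SQL `LIKE`, while the regex mode needs the database's own regex operator.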
The censorship tools are a collection of bash scripts for a) comparing DNS server answers to derive the blacklist from the censoring server, b) downloading censored URLs, and c) other tasks such as opening all censored pages of a blacklist in a browser.
Ex-Crawler is divided into three subprojects (crawler daemon, distributed GUI client, and (web) search engine), which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net
Law Leecher is a multi-threaded web crawling tool that extracts laws from the EU law database PreLex (http://ec.europa.eu/prelex/). It is written in Ruby.
A spider that collects data from the MySpace social network. Currently, it is designed only to extract information about Native American people, because it is used for a social science study at UNAM (Universidad Nacional Autónoma de México).
Desk.Now is a cross-platform Java client for the WhereIsNow WebService, which lets you find out where the latest version of a document is with just two clicks.
Other spiders have a limited link depth, follow links in a fixed rather than randomized order, or are combined with heavy indexing machines. This spider has no link depth limit and randomizes the next URL to be checked for new URLs.
Glue 2 is a Semantic Web Service discovery engine fully compatible with the WSMO meta-model and the WSML language that aims at solving polarization problems by using mediators.
A small collection of Python 3000 scripts/modules used to automate searching craigslist.org cities and categories for interesting items. These scripts currently use HTML screen scraping, since Craigslist currently has no API.
Sgrep (sorted grep) is a much faster alternative to traditional Unix grep when searching large files, because sgrep searches sorted input files using a fast binary search to find matching lines.
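The binary-search idea behind sgrep can be sketched in a few lines of Python (a simplified in-memory illustration of the technique; the real tool seeks within a sorted file on disk rather than loading it into a list):

```python
import bisect

def sorted_grep(sorted_lines, prefix):
    """Return every line starting with `prefix` from an already-sorted
    list of lines. Because the input is sorted, all matches form one
    contiguous run, so two binary searches find it in O(log n) instead
    of grep's O(n) linear scan."""
    lo = bisect.bisect_left(sorted_lines, prefix)
    # "\xff" sorts after any ASCII continuation of the prefix, so this
    # finds the position just past the last matching line.
    hi = bisect.bisect_left(sorted_lines, prefix + "\xff")
    return sorted_lines[lo:hi]

lines = sorted(["apple", "apricot", "banana", "cherry"])
print(sorted_grep(lines, "ap"))  # matches the two "ap..." lines
```

This is why sgrep requires its input to be sorted: the speedup comes entirely from the matches being contiguous in a sorted file.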
bee-rain is a web crawler that harvests and indexes files over the network. You can see the results on the bee-rain website: http://bee-rain.internetcollaboratif.info/
A threaded C application that searches torrent trackers/indexers for .torrent files and sorts the results according to user-defined criteria. Uses GLib 2.0 and libcurl4.
Multicraigs is a tool that lets a user search many Craigslist cities for specific terms with a single script. It stores records to a MySQL database and is configured to insert only new listings. It parses Craigslist's publicly available RSS feeds.
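The "insert only new listings" behaviour is a common dedup pattern: key each stored record on a stable identifier from the feed, such as the item link, and let the database reject duplicates. A sketch of that pattern (using SQLite for self-containment; the tool itself uses MySQL, and the table layout here is an assumption):

```python
import sqlite3

def store_new_listings(conn, listings):
    """Insert only listings not already stored, keyed on the item link.
    Re-running with overlapping data inserts just the new rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "link TEXT PRIMARY KEY, title TEXT)"
    )
    before = conn.total_changes
    for link, title in listings:
        # OR IGNORE skips rows whose primary key already exists
        # (MySQL's equivalent is INSERT IGNORE).
        conn.execute(
            "INSERT OR IGNORE INTO listings (link, title) VALUES (?, ?)",
            (link, title),
        )
    conn.commit()
    return conn.total_changes - before  # number of rows actually added

conn = sqlite3.connect(":memory:")
store_new_listings(conn, [("u1", "bike"), ("u2", "sofa")])
```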
Nucular Archiving System for creating full-text indices for fielded data. Python API, web, and command-line interfaces. Fast. Very lightweight. Concurrent reads/writes with no possible locking issues. No server process. Proximity. Facets. Funny name.