Ex-Crawler is divided into 3 subprojects (Crawler Daemon, distributed GUI Client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net
Open data mining platform. Provides a common architecture for algorithms of various types. Efficient processing of arbitrarily large volumes of data thanks to data streaming. Weka and Rseslib partially integrated. (www.debellor.org)
Cyberinfrastructure Shell (CIShell) is an open source, community-driven framework/application for the integration and utilization of datasets, algorithms, tools, and computing resources. Algorithms can be integrated using most programming languages.
With the "xix" library, GATE functionality is available in XQuery (via an MXQuery extension). OpenCalais invocation is supported, too. Source code at http://sgv-jenkins-01.ethz.ch/job/xixlib/ws/ (see "Show project details" for instructions).
ViSBARD (Visual System for Browsing, Analysis, and Retrieval of Data) is an interactive visualization and analysis tool for space physics data. It provides an integrated 3-D/2-D environment to analyze measurements across many spacecraft and MHD models.
easy fusion is a Java-based framework that aims to automatically deploy and control information fusion systems (IFS) on distributed and dynamic resources.
Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. Its GUI lets you select a set of input files containing raw algorithm results. The analysis is presented in tables and charts.
The PARSEC CEE is the primary achievement of several years of effort at NASA's Marshall Space Flight Center. The CEE was developed to allow engineers in the Advanced Concepts Department to rapidly prototype launch vehicle and spacecraft concepts.
D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
GmanDA is GPL-licensed software for performing qualitative data analysis on mailing lists and mboxes. It was developed to ease work with large-scale mailing-list archives taken from Gmane.org.
SPASE Model is a collection of tools for working with structured data model information. The tools can convert the relational version of the data model into various expressions, including XSD, XMI and PDF documentation.
The DimReduction project provides an open-source, multiplatform (Java) graphical environment for bioinformatics problems that supports many feature selection algorithms, pattern recognition techniques, criterion functions and graphic visualization tools.
A Java tool for anytime and interactive sequence mining. It aims to give users a way to analyze their activity traces and extract activity schemes from them.
Fuzzy logic add-in for OpenOffice.org Calc. InrecoLAN FuzzyMath lets you perform ordinary arithmetic operations and use ordinary mathematical and financial functions with fuzzy numbers. Ideas for improving this project are welcome!
Monte Carlo portfolio simulation. It can be used as a stand-alone command-line application: it takes a simple XML file with the needed data as input and creates a simple XML file with the output. It also has JNI and ISAPI interfaces.
This project is a compilation of tools and libraries, mainly in Java, to help with tasks related to text analytics. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
The Minervan project aims at aiding intelligent software development. It integrates reporting, analysis and data mining to support better decision making.
Blogspread is an open platform for the development of applications that analyze data from websites like blogs and forums. Blogspread is jointly developed by the University of Mannheim, Germany, the UFPE, Recife, Brazil and the UFAL, Maceio, Brazil.
JUNG provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network.
New version now available on GitHub: https://github.com/jrtom/jung/releases/tag/jung-2.1
Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.