86 projects for "data processing" with 2 filters applied:

  • 1
    mzitu

    Python crawler that downloads image galleries and analyzes titles

    ...Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. This makes the repository both a scraping example and a small data analysis experiment built around the collected content. Overall, mzitu serves as a learning-oriented implementation of Python web scraping, data processing, and visualization techniques.
    Downloads: 3 This Week
  • 2
    Command-line, Ant-task, and embeddable text file preprocessor with macros, flow control, and expressions. Supports recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 86 This Week
  • 3
    gain

    Asyncio-based Python framework for building fast web crawling spiders

    Gain is a Python web crawling framework designed to simplify the process of building efficient and scalable web scrapers. It is built on top of asynchronous technologies such as asyncio, aiohttp, and uvloop to support high-performance crawling with concurrent network requests. It provides a structured framework for creating spiders that can navigate websites, extract structured data, and process the collected results. Developers define crawlers using components such as spiders, parsers, and...
    Downloads: 1 This Week
  • 4
    Gecco

    Lightweight Java web crawler framework with jQuery-style extraction

    ...It is designed to make crawler development straightforward by allowing developers to extract page elements using jQuery-style selectors rather than complex parsing logic. It integrates several well-known Java libraries and frameworks, including tools for HTTP requests, HTML parsing, JSON processing, and application development. Through its annotation-based design, developers can define crawling rules and data extraction logic directly within Java classes, reducing boilerplate code and improving readability. Gecco also provides mechanisms for handling dynamic web content, including support for asynchronous requests and extraction of JavaScript variables from pages. ...
    Downloads: 0 This Week
  • 5
    geog-server-embedded

    GeoG Embedded Server

    GeoG Embedded Server with GeoG's Own Database.
    Downloads: 0 This Week
  • 6
    SoyBeans

    A Task Processing Framework for Java

    Soy Beans is an HTTP request processing framework written in Java. Written as an alternative to frameworks like Struts and Stripes, it provides a robust and extremely flexible API enabling rapid deployment and dynamic configuration.
    Downloads: 0 This Week
  • 7
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 1 This Week
  • 8
    IMPORTANT NOTE: This project has moved to GitHub (https://github.com/pkozelka/libxml2-pas). Pascal units accessing the popular XML API from Daniel Veillard (http://www.xmlsoft.org). This should be usable at least from Kylix and Delphi, and hopefully also from other Pascal compilers (such as Free Pascal).
    Downloads: 1 This Week
  • 9
    XML Path Event API is a C#/.NET-focused programming interface that allows XPath-style processing of XML documents in a streaming (SAX-like) fashion.
    Downloads: 0 This Week
  • 10
    JODReports is a solution for generating dynamic documents and reports in Java based on the OpenDocument format (ODF). Templates can be easily composed with a word processor such as OpenOffice.org Writer. Data sources include POJOs and XML.
    Downloads: 15 This Week
  • 11
    DocFrac is a document converter that can convert between RTF, HTML, and ASCII text, including RTF to HTML and HTML to RTF. It supports text formatting (e.g. bold), tables, and most European languages. Available for Windows and Linux, and as an ActiveX control and DLL.
    Downloads: 1 This Week
  • 12
    edib allows execution of XQuery queries and XProc pipelines from a web page in an eXist XML database installed on the client. edib also allows complex processing and storage of data before sending it back to the server, thus balancing load between server and client.
    Downloads: 0 This Week
  • 13
    Syoncloud

    Log analytics system based on Hadoop, HBase, HBase Web Client, and Flume

    Syoncloud Logs enables you to process log files from various applications using Hadoop, Flume, and HBase. It has an easy installation and configuration interface and includes the Syoncloud HBase web client, which displays a tree of HBase tables and column families linked to a paginated grid of data.
    Downloads: 0 This Week
  • 14
    distributedPHP client

    A simple script for distributed computing through PHP

    distributedPHP client is a simple PHP script that can simultaneously activate or send data to as many web scripts as you want. You must open and configure the distributedPHP .php file before running it. distributedPHP client supports activating scripts without data, sending the same data to all scripts, sending unique data to each script, or sending user input to each script. Examples of use include: distributed math computation, encryption breaking, SETI@home/folding@home (well, if they made the projects in PHP), distributed brute-force attacks, DDoS attacks, distributed processing, etc. ...
    Downloads: 0 This Week
  • 15
    A stand-alone editor that uses the MediaWiki markup language to generate HTML code. You can create and preview pages written in MediaWiki markup (i.e. Wikipedia pages) while offline.
    Downloads: 0 This Week
  • 16
    creole/c is a Wiki Creole parser and an HTML converter. It implements Wiki Creole 1.0 and almost all of its additions. The parser is written in C++ and has a simple event-driven plain C API. The converter is a stand-alone console application.
    Downloads: 0 This Week
  • 17
    Java Valves
    This project aims to develop generic Valves for containers like Tomcat. Development will focus on providing detailed request-tracing valves based on the native logger valves. This project was created and architected by Arunn John Moothedathu (www.arunjohn.com).
    Downloads: 0 This Week
  • 18
    nanoWIME is a simple, flexible, easy-to-use JavaScript-based WikiMarkup editor.
    Downloads: 0 This Week
  • 19
    ServingXML is an open source, Apache 2.0-licensed framework for flat/XML data transformations. It defines an extensible markup vocabulary for expressing flat-XML, XML-flat, flat-flat, and XML-XML processing in pipelines.
    Downloads: 1 This Week
  • 20
    The concept of this project is "Goodbye, login form". YggDore Sky Gate provides login to various services using the same login ID and password. The authentication method is like POP before SMTP: it is very simple, so your service can join easily.
    Downloads: 0 This Week
  • 21
    Transform XML documents using XSL style sheets. Process embedded blocks of XXSLT (XSLT + include directive) commands in any document: download XML data and define XSL style sheets. Insert downloaded content directly into the page source. Cache support.
    Downloads: 0 This Week
  • 22
    NetMate Meter
    NetMate Meter is a flexible and extensible tool for network measurement. It can be used for accounting, delay/loss measurement, and packet capturing. It supports dynamically loadable packet processing and data export modules and a flexible packet classifier.
    Downloads: 0 This Week
  • 23
    Pypes is a framework that allows users to break complex data processing logic down into a series of smaller, less complex tasks. These tasks, referred to as components, can then be connected so that the output of one becomes the input to another.
    Downloads: 0 This Week
  • 24
    RTF2HTML is a cross-platform C++ library (DLL, OCX) and command-line utility for converting documents from Rich Text Format (e.g. Word, OO Writer) to HTML. Its features are tiny size, speed, low memory usage, and compact output.
    Downloads: 0 This Week
  • 25
    A JavaScript library for parsing Creole 1.0 wiki markup.
    Downloads: 1 This Week