Showing 258 open source projects for "apache"

  • 1
    NBi

    NBi is a testing framework (add-on to NUnit)

    NBi is a testing framework (add-on to NUnit) for Business Intelligence. It supports most relational databases (SQL Server, MySQL, PostgreSQL ...) and OLAP platforms (Analysis Services, Mondrian ...), as well as ETL and reporting components (Microsoft technologies). The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to write C# code to specify your tests! Nor do you need Visual...
    Downloads: 0 This Week
  • 2
    Tributary

    Streaming reactive and dataflow graphs in Python

    Tributary is a library for constructing dataflow graphs in Python. Unlike many other DAG libraries in Python (airflow, luigi, prefect, dagster, dask, kedro, etc.), tributary is not designed with data/ETL pipelines or scheduling in mind. Instead, tributary is more similar to libraries like mdf, loman, pyungo, streamz, or pyfunctional, in that it is designed to be used as the implementation for a data model. One such example is the greeks library, which leverages tributary to build data models...
    Downloads: 0 This Week
  • 3
    Amazon Kinesis Flink Connectors

    Contains various Apache Flink connectors to connect to AWS data

    ...An Apache Flink application is a Java or Scala application that is created with the Apache Flink framework. You author and build your Apache Flink application locally. Applications primarily use either the DataStream API or the Table API. The other Apache Flink APIs are also available for you to use, but they are less commonly used in building streaming applications.
    Downloads: 0 This Week
  • 4
    Aestel

    Applications for data management

    "Information is data in action", and, consequently, having good quality data is essential. The AESTEL package contains two highly configurable applications for data management: A data loader and a reporting application, i.e. DataLoader and AEREA, respectively. The data loader application applies user-defined instructions to validate, process and load data. The reporting application provides a query builder and spreadsheet template designer. Both applications work with any relational data...
    Downloads: 0 This Week
  • 5
    Feathr

    A scalable, unified data and AI engineering platform for enterprise

    Feathr is a data and AI engineering platform that has been widely used in production at LinkedIn for many years and was open sourced in 2022. It is currently a project under the LF AI & Data Foundation. Define data and feature transformations based on raw data sources (batch and streaming) using Pythonic APIs. Register transformations by name and get transformed data (features) for various use cases including AI modeling, compliance, go-to-market and more. Share transformations and data (features)...
    Downloads: 0 This Week
  • 6
    Gephi

    Gephi the open graph Viz platform

    ...Simple: easy to install and get started. A UI that is centered around the visualization. Like Photoshop™ for graphs. Modular: extend Gephi with plug-ins. The architecture is built on top of the Apache NetBeans Platform and can be extended or reused easily through well-written APIs.
    Downloads: 32 This Week
  • 7
    BitSail

    BitSail is a distributed high-performance data integration engine

    BitSail is ByteDance's open source data integration engine, built on a distributed architecture for high performance. It supports data synchronization between multiple heterogeneous data sources, and provides global data integration solutions in batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin, Toutiao, etc., and synchronizes hundreds of trillions of data every day. BitSail has been widely used and...
    Downloads: 0 This Week
  • 8
    SnappyData

    Memory optimized analytics database, based on Apache Spark

    SnappyData (aka TIBCO ComputeDB) is a distributed, in-memory optimized analytics database. SnappyData delivers high throughput, low latency, and high concurrency for a unified analytics workload. By fusing an in-memory hybrid database inside Apache Spark, it provides analytic query processing, mutability/transactions, access to virtually all big data sources, and stream processing, all in one unified cluster. One common use case for SnappyData is to provide analytics at interactive speeds over large volumes of data with minimal or no pre-processing of the dataset. For instance, there is often no need to pre-aggregate/reduce or generate cubes over your large data sets for ad-hoc visual analytics. ...
    Downloads: 0 This Week
  • 9
    CliMA Land

    Everything within the Land model

    CliMA Land is a next-generation Land Surface Model (LSM) designed to use broadly available remote sensing data as well as ground-based flux measurements. CliMA Land is a highly modular platform to promote research at different scales from tissue to organ, whole plant, and ecosystem. Therefore, we deliver CliMA Land via a patch of packages (also referred to as sub-modules) to reduce the time used to initialize a research project. As a result, the repository is more a collection of...
    Downloads: 0 This Week
  • 10
    Weave Scope

    Monitoring, visualization and management for Docker and Kubernetes

    Understand your application quickly by seeing it in a real-time interactive display. Pick open-source or cloud-hosted options. Weave Scope automatically detects processes, containers, hosts. No kernel modules, no agents, no special libraries, no coding. Seamless integration with Docker, Kubernetes, DCOS and AWS ECS. See your Docker hosts, containers and services in real-time. Easily identify and correct issues to ensure the stability and performance of your containerized applications. View...
    Downloads: 0 This Week
  • 11
    Julog.jl

    A Julia package for Prolog-style logic programming

    A Julia package for Prolog-style logic programming.
    Downloads: 0 This Week
  • 12
    Spark.jl

    Julia binding for Apache Spark

    A Julia interface to Apache Spark. Spark.jl provides an interface to the Apache Spark™ platform, including SQL / DataFrame and Structured Streaming. It closely follows the PySpark API, making it easy to translate existing Python code to Julia. Spark.jl supports multiple cluster types (in client mode), and can be considered an analog to PySpark or RSpark within the Julia ecosystem.
    Downloads: 0 This Week
  • 13
    Bloxs

    Build dashboards in Jupyter Notebook with numeric and chart boxes

    Bloxs is a simple Python package that helps you display information in an attractive way (formed in blocks). Perfect for building dashboards, reports and apps in the notebook.
    Downloads: 0 This Week
  • 14
    Remotery

    Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

    Remotery is a real-time CPU/GPU profiler implemented as a single C file, providing developers with immediate insights into the performance of their applications. It features a remote web-based viewer that runs in browsers like Chrome, Firefox, and Safari, allowing for cross-platform performance analysis. Remotery supports profiling multiple threads and GPU contexts, offering a comprehensive view of an application's performance characteristics.
    Downloads: 0 This Week
  • 15
    Log4jScanner

    A log4j vulnerability filesystem scanner and Go package

    log4jscanner is a filesystem scanner and Go package that helps organizations quickly identify vulnerable Log4j components inside JARs and shaded dependencies. Instead of probing networks, it walks directories and archives, including nested JARs, to find version fingerprints and risky classes associated with the Log4Shell family of issues. The focus on static analysis makes it suitable for container images, build artifacts, and offline systems where active scanning isn’t feasible. Clear,...
    Downloads: 0 This Week
  • 16
    Snowplow Analytics

    Enterprise-strength marketing and product analytics platform

    Snowplow is ideal for data teams who want to manage the collection and warehousing of data across all their platforms and products.
    Downloads: 0 This Week
  • 17
    AWS Step Functions Data Science SDK

    For building machine learning (ML) workflows and pipelines on AWS

    The AWS Step Functions Data Science SDK is an open-source library that allows data scientists to easily create workflows that process and publish machine learning models using Amazon SageMaker and AWS Step Functions. You can create machine learning workflows in Python that orchestrate AWS infrastructure at scale, without having to provision and integrate the AWS services separately. The best way to quickly review how the AWS Step Functions Data Science SDK works is to review the related...
    Downloads: 0 This Week
  • 18
    StreamAlert

    StreamAlert is a serverless, realtime data analysis framework

    StreamAlert is a serverless, real-time data analysis framework that empowers you to ingest, analyze, and alert on data from any environment, using data sources and alerting logic you define. Computer security teams use StreamAlert to scan terabytes of log data every day for incident detection and response. Incoming log data will be classified and processed by the rules engine. Alerts are then sent to one or more outputs. Rules are written in Python; they can utilize any Python libraries or...
    Downloads: 0 This Week
  • 19
    Hedgehog Lab

    Run, compile and execute JavaScript for Scientific Computing

    Hedgehog Lab is an open-source scientific computation tool that runs in the browser. Before starting development, please make sure you have installed and enabled yarn. Once cloned, switch to the dev branch, navigate to the folder by typing cd hedgehog-lab, and then run the provided commands. The program compiles on each run, which takes time. Please wait until it shows "Compiled successfully!" along with instructions on which IP:PORT to connect to.
    Downloads: 0 This Week
  • 20
    RDS - JS - Examples

    TypeScript/JavaScript example code using the RDS API

    Rich Data Services (or RDS) is a suite of REST APIs designed by Metadata Technology North America (MTNA) to meet various needs for data engineers, managers, custodians, and consumers. RDS provides a range of services including data profiling, mapping, transformation, validation, ingestion, and dissemination. For more information about each of these APIs and how you can incorporate or consume them as part of your workflow, please visit the MTNA website. RDS-JS-Examples is...
    Downloads: 0 This Week
  • 21
    CueLake

    Use SQL to build ELT pipelines on a data lakehouse

    ...To extract and load incremental data, you write simple select statements. CueLake executes these statements against your databases and then merges incremental data into your data lakehouse (powered by Apache Iceberg). To transform data, you write SQL statements to create views and tables in your data lakehouse. CueLake uses Celery as the executor and celery-beat as the scheduler. Celery jobs trigger Zeppelin notebooks. Zeppelin auto-starts and stops the Spark cluster for every scheduled run of notebooks.
    Downloads: 0 This Week
  • 22
    ML workspace

    All-in-one web-based IDE specialized for machine learning

    All-in-one web-based development environment for machine learning. The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. This workspace is the ultimate tool for developers, preloaded with a variety of popular data science libraries (e.g., TensorFlow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, TensorBoard)...
    Downloads: 0 This Week
  • 23
    TensorBase

    TensorBase is a new big data warehousing with modern efforts

    TensorBase hopes that open source does not become a copy game. TensorBase is clearly opposed to forking communities, reinventing wheels, or hacking traffic for so-called reputation (like GitHub stars). After some thought, we decided to temporarily leave the general data warehousing field. If you want to learn how a database system can be built, how to apply modern Rust to the high-performance field, or how to embed a lightweight data analysis system into your own larger one, you can still try, ask...
    Downloads: 0 This Week
  • 24
    Amadeus

    Harmonious distributed data analysis in Rust

    Amadeus is a high-performance, distributed data processing framework written in Rust, designed to offer an ergonomic and safe alternative to tools like Apache Spark. It provides both streaming and batch capabilities, allowing users to work with real-time and historical data at scale. Thanks to Rust’s memory safety and zero-cost abstractions, Amadeus delivers performance gains while reducing the complexity and bugs common in large-scale data pipelines. It emphasizes developer productivity through a fluent, expressive API and makes it easier to build composable and reliable data transformation pipelines without sacrificing speed or safety.
    Downloads: 0 This Week
  • 25
    Kale

    Kubeflow’s superfood for Data Scientists

    KALE (Kubeflow Automated pipeLines Engine) is a project that aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows. Kubeflow is a great platform for orchestrating complex workflows on top of Kubernetes, and Kubeflow Pipelines provides the means to create reusable components that can be executed as part of workflows. The self-service nature of Kubeflow makes it extremely appealing for Data Science use, as it provides easy access to advanced distributed jobs...
    Downloads: 0 This Week