Showing 39 open source projects for "java open source"

View related business solutions
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • Most modern and flexible cloud platform for MLM companies Icon
    Most modern and flexible cloud platform for MLM companies

    ERP-class software for multi-level marketing

    For direct selling (MLM) companies, from startup to well established enterprises with millions of distributors across the world
    Learn More
  • 1
    Datapipe

    Datapipe

    Real-time, incremental ETL library for ML with record-level depend

    Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking. Datapipe is designed to streamline the creation of data processing pipelines. It excels in scenarios where data is continuously changing, requiring pipelines to adapt and process only the modified data efficiently. This library tracks dependencies for each record in the pipeline, ensuring minimal and efficient data processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    DataGym.ai

    DataGym.ai

    Open source annotation and labeling tool for image and video assets

    DATAGYM enables data scientists and machine learning experts to label images up to 10x faster. AI-assisted annotation tools reduce manual labeling effort, give you more time to finetune ML models and speed up your go to market of new products. Accelerate your computer vision projects by cutting down data preparation time up to 50%. A machine learning model is only as good as its training data. DATAGYM is an end-to-end workbench to create, annotate, manage, and export the right training data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Tributary

    Tributary

    Streaming reactive and dataflow graphs in Python

    Tributary is a library for constructing dataflow graphs in Python. Unlike many other DAG libraries in Python (airflow, luigi, prefect, dagster, dask, kedro, etc), tributary is not designed with data/etl pipelines or scheduling in mind. Instead, tributary is more similar to libraries like mdf, loman, pyungo, streamz, or pyfunctional, in that it is designed to be used as the implementation for a data model. One such example is the greeks library, which leverages tributary to build data models...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Orchest

    Orchest

    Build data pipelines, the easy way

    Code, run and monitor your data pipelines all from your browser! From idea to scheduled pipeline in hours, not days. Interactively build your data science pipelines in our visual pipeline editor. Versioned as a JSON file. Run scripts or Jupyter notebooks as steps in a pipeline. Python, R, Julia, JavaScript, and Bash are supported. Parameterize your pipelines and run them periodically on a cron schedule. Easily install language or system packages. Built on top of regular Docker container...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Turn more customers into advocates. Icon
    Turn more customers into advocates.

    Fight skyrocketing paid media costs by turning your customers into a primary vehicle for acquisition, awareness, and activation with Extole.

    The platform's advanced capabilities ensure companies get the most out of their referral programs. Leverage custom events, profiles, and attributes to enable dynamic, audience-specific referral experiences. Use first-party data to tailor customer segment messaging, rewards, and engagement strategies. Use our flexible APIs to build management capabilities and consumer experiences–headlessly or hybrid. We have all the tools you need to build scalable, secure, and high-performing referral programs.
    Learn More
  • 5
    BitSail

    BitSail

    BitSail is a distributed high-performance data integration engine

    BitSail is ByteDance's open source data integration engine which is based on distributed architecture and provides high performance. It supports data synchronization between multiple heterogeneous data sources, and provides global data integration solutions in batch, streaming, and incremental scenarios. At present, it serves almost all business lines in ByteDance, such as Douyin, Toutiao, etc., and synchronizes hundreds of trillions of data every day.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    CueLake

    CueLake

    Use SQL to build ELT pipelines on a data lakehouse

    With CueLake, you can use SQL to build ELT (Extract, Load, Transform) pipelines on a data lakehouse. You write Spark SQL statements in Zeppelin notebooks. You then schedule these notebooks using workflows (DAGs). To extract and load incremental data, you write simple select statements. CueLake executes these statements against your databases and then merges incremental data into your data lakehouse (powered by Apache Iceberg). To transform data, you write SQL statements to create views and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Microsoft Integration

    Microsoft Integration

    Microsoft Integration, Azure, Power Platform, Office 365 and much more

    Microsoft Integration, Azure, BAPI, Office 365 and much more Stencils Pack it’s a Visio package that contains fully resizable Visio shapes (symbols/icons) that will help you to visually represent On-premise, Cloud or Hybrid Integration and Enterprise architectures scenarios (BizTalk Server, API Management, Logic Apps, Service Bus, Event Hub…), solutions diagrams and features or systems that use Microsoft Azure and related cloud and on-premises technologies in Visio 2016/2013.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    nonechucks

    nonechucks

    Deal with bad samples in your dataset dynamically

    nonechucks is a library that provides wrappers for PyTorch's datasets, samplers and transforms to allow for dropping unwanted or invalid samples dynamically. What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    DataKit

    DataKit

    Connect processes into powerful data pipelines

    Connect processes into powerful data pipelines with a simple git-like filesystem interface. DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept, with a modern twist: streams of tree-structured data instead of raw text. DataKit allows you to define complex build pipelines over version-controlled data. DataKit is currently used as the coordination layer for HyperKit, the hypervisor component of Docker for Mac and Windows, and for the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • HOA Software Icon
    HOA Software

    Smarter Community Management Starts Here

    Simplify HOA management with software that handles everything from financials to communication.
    Learn More
  • 10
    CloverDX

    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please, visit www.cloverdx.com for latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Suppors various input/output formats (CSV,FIXLEN,Excel,XML,JSON,Parquet, Avro,EDI/X12,HL7,COBOL,LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/Rest/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 11
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Next-generation data pipeline to statistically call methylated and differentially methylated loci See the manual in doc/manual.pdf Please note: Calling of (differentially) methylated _positions_ will soon be uploaded.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Apache Kafka

    Apache Kafka

    Mirror of Apache Kafka

    Apache Kafka is a distributed streaming platform built around durable, partitioned logs called topics, enabling high-throughput, fault-tolerant event pipelines. Producers append records to partitions, brokers replicate them for durability, and consumer groups read them at their own pace while balancing work across instances. The commit/offset model and retention policies support patterns from real-time processing to event sourcing and audit trails. Exactly-once processing semantics,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    A graphical data manipulation and processing system including data import, numerical analysis and visualisation. The software is written in Java and built upon the Netbeans platform to provide a modular desktop data manipulation application.
    Downloads: 0 This Week
    Last Update:
    See Project