Search Results for "pentaho data integration" - Page 2

Showing 1645 open source projects for "pentaho data integration"

  • 1
    reticulate

    R Interface to Python

    reticulate is an R package from Posit that creates seamless interoperability between R and Python. It lets you call Python modules, classes, and functions from within R, automatically translating between R and Python data structures. Useful for combining Python tooling with R projects, data analysis, and RMarkdown reports.
    Downloads: 0 This Week
  • 2
    nango

    A single API for all your integrations.

    Nango is a single API for interacting with all other external APIs. It should be the only API you need to integrate into your app. Nango is an open-source solution for integrating third-party APIs with applications, simplifying API authentication, data syncing, and management.
    Downloads: 0 This Week
  • 3
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. In addition to code editing and execution, RStudio offers extensive support for reproducible research via R Markdown, notebooks, and integration with version control systems like Git and SVN. ...
    Downloads: 80 This Week
  • 4
    CellTypist

    A tool for semi-automatic cell type classification, harmonization

    CellTypist is an automated tool for cell type classification, harmonization, and integration. Classification: transfer cell type labels from the reference to the query dataset. Harmonization: match and harmonize cell types defined by independent datasets. Integration: integrate cells and cell types with supervision from harmonization. CellTypist recapitulates cell type structure and biology of independent datasets. Regularised linear models with Stochastic Gradient Descent provide a fast and...
    Downloads: 0 This Week
  • 5
    SymbolicNumericIntegration.jl

    SymbolicNumericIntegration.jl: Symbolic-Numerics for Solving Integrals

    SymbolicNumericIntegration.jl is a hybrid symbolic/numerical integration package that works on the Julia Symbolics expressions.
    Downloads: 0 This Week
  • 6
    Positron

    Positron, a next-generation data science IDE

    ...The IDE supports notebook and script workflows, integration of data-app frameworks (such as Shiny, Streamlit, Dash), database and cloud connections, and built-in AI-assisted capabilities to help write code, explore data, and build models.
    Downloads: 1 This Week
  • 7
    SeedCrackerX

    Minecraft mod designed to reverse-engineer

    SeedcrackerX is a Minecraft mod designed to reverse-engineer and determine a world’s seed by analyzing in-game structures and environmental data. It operates by collecting information from structures such as shipwrecks, temples, and monuments, then using that data to progressively narrow down possible seeds until the correct one is identified. The mod automates much of this process, initiating cracking procedures once sufficient data has been gathered, often requiring only exploration of...
    Downloads: 201 This Week
  • 8
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it is regarded as a metaphor of the InLong system for reporting data streams. ...
    Downloads: 0 This Week
  • 9
    QSV

    Blazing-fast Data-Wrangling toolkit

    qsv is a fast, command-line CSV data toolkit written in Rust that extends the capabilities of xsv. It’s designed to make working with CSV files at scale easy and efficient, offering over 40 powerful subcommands for tasks like querying, sampling, splitting, deduplicating, and more. qsv is ideal for data engineers, analysts, and developers who need high-performance CSV manipulation on the command line.
    Downloads: 30 This Week
  • 10
    Alova.js

    Workflow-Streamlined next-generation request tools

    Streamlines the API integration workflow. Quickly find APIs in the editor, and enjoy full type hints even in JS projects, with API code automatically generated by Alova's extension. Make requests in a variety of complex scenarios with one line of code. Automatically manage paging data and data preloading, reduce unnecessary data refreshes, improve fluency by 300%, and reduce coding difficulty by 50%.
    Downloads: 0 This Week
  • 11
    Canal

    MySQL binlog

    Canal is an open-source project developed by Alibaba that simulates MySQL slave functionality to parse MySQL binlog files. It enables real-time data synchronization and change data capture (CDC) between MySQL and other systems such as Elasticsearch, Kafka, or HBase. Canal is widely used for data integration, replication, and monitoring across distributed systems, offering high performance and low-latency log parsing.
    Downloads: 2 This Week
  • 12
    harmonypy

    Integrate multiple high-dimensional datasets with fuzzy k-means

    Harmony is an algorithm for integrating multiple high-dimensional datasets. harmonypy is a port of the harmony R package by Ilya Korsunsky. Harmony is a general-purpose R package with an efficient algorithm for integrating multiple data sets. It is especially useful for large single-cell datasets such as single-cell RNA-seq.
    Downloads: 3 This Week
  • 13

    JSON for Modern C++

    JSON that's part of C++

    This is JSON for C++, a JSON library unlike any other that's packed with plenty of great features. While there may be dozens of JSON libraries out there, JSON for C++ stands out with a focus on three things: an intuitive syntax, trivial integration and serious testing. Using the operator magic of modern C++, this library makes JSON feel like a first class data type. With trivial integration, the entire code is made up of a single header file json.hpp, no dependencies, no complex build system required. It's been heavily unit-tested covering 100% of the code, and follows the Core Infrastructure Initiative (CII) best practices to ensure the highest quality at all times. ...
    Downloads: 30 This Week
  • 14
    Recordly

    Recordly is an open-source screen recorder for macOS/Windows/Linux

    Recordly is a lightweight recording and data capture tool designed to streamline the process of collecting, organizing, and replaying structured information, likely oriented toward developers or productivity workflows. The project focuses on simplicity and speed, allowing users to record interactions or data streams with minimal configuration while maintaining clarity and usability.
    Downloads: 31 This Week
  • 15
    Jan.ai

    Open source alternative to ChatGPT that runs 100% offline

    Jan.ai is an open-source, privacy-focused AI assistant that serves as an alternative to ChatGPT, running completely locally on your device. It allows you to download and run LLMs (large language models) offline while also offering optional integration with cloud-based model providers, giving you full control over your data and AI interactions. Download and run LLMs (Llama, Gemma, Qwen, GPT-oss etc.) from HuggingFace. Connect to GPT models via OpenAI, Claude models via Anthropic, Mistral, Groq, and others. Create specialized AI assistants for your tasks. MCP integration for agentic capabilities.
    Downloads: 58 This Week
  • 16
    Cassandra Spark Connector

    Apache Spark to Apache Cassandra connector

    The Apache Cassandra Spark Connector allows Spark jobs (RDDs or DataFrames/Datasets) to read from and write to Cassandra tables. Compatible with Apache Cassandra (v2.1+), Spark 1.0–3.5, and Scala 2.11–2.13, it supports mapping Cassandra rows to Scala case classes, saving results back to Cassandra, and executing arbitrary CQL within Spark applications.
    Downloads: 0 This Week
  • 17
    Coomer Downloader App

    Coomer downloader

    ...The application typically supports features such as authentication, rate limiting, and retry mechanisms to ensure reliable downloads even when dealing with unstable connections or restricted endpoints. It is often used for personal archiving, data collection, or offline access to content that may otherwise be difficult to manage manually. The tool operates through a command-line interface, making it suitable for scripting and integration into automated workflows.
    Downloads: 354 This Week
  • 18
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    LakeSoul is a high-performance, unified table storage framework for big data lakes, supporting both streaming and batch data in a single format. Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 0 This Week
  • 19
    OpenDataMCP

    Connect any Open Data to any LLM with Model Context Protocol

    An initiative aimed at connecting open datasets to Large Language Models (LLMs) using the Model Context Protocol, facilitating seamless access and integration of public data into AI applications.
    Downloads: 0 This Week
  • 20
    Calculus.jl

    Calculus functions in Julia

    The Calculus package provides tools for working with the basic calculus operations of differentiation and integration. You can use the Calculus package to produce approximate derivatives by several forms of finite differencing or to produce exact derivatives using symbolic differentiation. You can also compute definite integrals by different numerical methods.
    Downloads: 0 This Week
  • 21
    IronCalc

    Main engine of the IronCalc ecosystem

    IronCalc is a new, modern, work-in-progress spreadsheet engine and set of tools to work with spreadsheets in diverse settings. IronCalc is a lightweight, open-source computational engine designed for performing mathematical operations, formula calculations, and data-driven tasks.
    Downloads: 52 This Week
  • 22
    Suna

    Suna - Open Source Generalist AI Agent

    ...Designed to assist users in accomplishing real-world tasks through natural conversation, Suna combines powerful capabilities with an intuitive interface. It serves as a digital companion for research, data analysis, and everyday challenges, integrating tools like browser automation, file management, web crawling, command-line execution, website deployment, and API integration. Suna's architecture comprises a FastAPI-based backend, a Next.js/React frontend, an agent Docker environment, and a Supabase database for state management. This modular design allows for seamless interaction and task execution through simple conversations.
    Downloads: 4 This Week
  • 23
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust...
    Downloads: 4 This Week
  • 24
    Jitsu

    Jitsu is an open-source Segment alternative

    Jitsu is a fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days. Installing Jitsu is a matter of selecting your framework and adding a few lines of code to your app. Jitsu is built to be framework agnostic, so regardless of your stack, we have a solution that'll work for your team. Connect a data warehouse (Snowflake, Clickhouse, BigQuery, S3, Redshift or Postgres) and query your data instantly. Jitsu can either stream data in...
    Downloads: 1 This Week
  • 25
    Nelmio Alice

    Expressive fixtures generator

    Nelmio Alice is a PHP library designed to generate complex data fixtures for testing and development environments. It uses YAML, XML, or PHP files to define fixture templates, making it easy to create realistic and varied data sets. Alice integrates well with Doctrine ORM, allowing developers to quickly populate databases with test data, making it especially useful for automated testing and staging environments.
    Downloads: 0 This Week