Showing 22 open source projects for "you-get"

View related business solutions
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • Desktop and Mobile Device Management Software Icon
    Desktop and Mobile Device Management Software

    It's a modern take on desktop management that can be scaled as per organizational needs.

    Desktop Central is a unified endpoint management (UEM) solution that helps in managing servers, laptops, desktops, smartphones, and tablets from a central location.
    Learn More
  • 1
    Luigi

    Luigi

    Python module that helps you build complex pipelines of batch jobs

    ...These tasks can be anything, but are typically long running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else. You can build pretty much any task you want, but Luigi also comes with a toolbox of several common task templates that you use. It includes support for running Python mapreduce jobs in Hadoop, as well as Hive, and Pig, jobs. It also comes with file system abstractions for HDFS, and local files that ensures all file system operations are atomic.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Automated Tool for Optimized Modelling

    Automated Tool for Optimized Modelling

    Automated Tool for Optimized Modelling

    ...On the other hand, using multiple notebooks makes it harder to compare the results and to keep an overview. On top of that, refactoring the code for every test can be quite time-consuming. How many times have you conducted the same action to pre-process a raw dataset? How many times have you copy-and-pasted code from an old repository to re-use it in a new use case? ATOM is here to help solve these common issues. The package acts as a wrapper of the whole machine learning pipeline, helping the data scientist to rapidly find a good model for his problem.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    Conduit

    Conduit

    Conduit streams data between data stores. Kafka Connect replacement

    ...Sync data between your production systems using an extensible, event-first experience with minimal dependencies that fit within your existing workflow. Eliminate the multi-step process you go through today. Just download the binary and start building. Conduit connectors give you the ability to pull and push data to any production datastore you need. If a datastore is missing, the simple SDK allows you to extend Conduit where you need it. Conduit pipelines listen for changes to a database, data warehouse, etc., and allows your data applications to act upon those changes in real-time. ...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 4
    StarRocks

    StarRocks

    StarRocks is a next-gen sub-second MPP database for full analytics

    ...Complex data pipelines and de-normalized tables have always been a necessary evil. Processing any updates or deletes once data arrives has not been possible- until now. StarRocks solves these challenges and makes real-time analytics easy. Get amazing query performance on Star or Snowflake Schemas directly. From canceled orders to updated items, your analytics applications can easily handle them with StarRocks. From streaming data to change data capture, StarRocks meets the data ingestion demands of real-time analytics. Scale storage and computing power horizontally and support tens of thousands of concurrent users. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Free and Open Source HR Software Icon
    Free and Open Source HR Software

    OrangeHRM provides a world-class HRIS experience and offers everything you and your team need to be that HR hero you know that you are.

    Give your HR team the tools they need to streamline administrative tasks, support employees, and make informed decisions with the OrangeHRM free and open source HR software.
    Learn More
  • 5
    AutoGluon

    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    ...AutoGluon is modularized into sub-modules specialized for tabular, text, or image data. You can reduce the number of dependencies required by solely installing a specific sub-module via: python3 -m pip install <submodule>.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    gusty

    gusty

    Making DAG construction easier

    gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's create_dag function. gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Best-of Python

    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome python libraries for web development. Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. ...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 8
    Kestra

    Kestra

    Kestra is an infinitely scalable orchestration and scheduling platform

    ...Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate in the data pipeline creation process. The UI automatically adjusts the YAML definition any time you make changes to a workflow from the UI or via an API call. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    Elementary

    Elementary

    Open-source data observability for analytics engineers

    ...Elementary data monitors are configured and executed like native tests in dbt your project. Uploading and modeling of dbt artifacts, run and test results to tables as part of your runs. Get informative notifications on data issues, schema changes, models and tests failures. Inspect upstream and downstream dependencies to understand impact and root cause of data issues.
    Downloads: 0 This Week
    Last Update:
    See Project
  • No-code automation to improve your process workflows Icon
    No-code automation to improve your process workflows

    Pipefy is a digital automation software that centralizes data and standardizes workflows for teams like Finance and HR

    Transform your financial and HR operations and improve efficiency even remotely with digital, customized workflows that your team can automate and integrate with other software without the need of IT development.
    Try For Free
  • 10
    Backstage

    Backstage

    Backstage is an open platform for building developer portals

    ...So you can return to building and scaling, quickly and safely. Every team can see all the services they own and related resources (deployments, data pipelines, pull request status, etc.)
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    lakeFS

    lakeFS

    lakeFS - Git-like capabilities for your object storage

    ...Data is dynamic, it changes over time. Dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake is version controlled and you can easily time-travel between consistent snapshots of the lake. Easier ETL testing - test your ETLs on top of production data, in isolation, without copying anything. Safely experiment and test on full production data. Easily Collaborate on production data with your team. Automate data quality checks within data pipelines.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    The Tengo Language

    The Tengo Language

    A fast script language for Go

    ...Compiler/runtime written in native Go (no external deps or cgo). Executable as a standalone language / REPL. Use cases, rules engine, state machine, data pipeline, transpiler. If you need to evaluate a simple expression, you can use Eval function instead.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    rudderstack

    rudderstack

    Privacy and Security focused Segment-alternative, in Golang

    ...Our SDKs track anonymous and known users at the source and reconcile users in your warehouse and SaaS tools. Go beyond event streaming and control all of your customer data on your own terms. Learn how we can help you build a customer data platform. RudderStack treats your data warehouse as a first-class citizen among destinations, with advanced features and configurable, near real-time sync. RudderStack is built API-first. It integrates seamlessly with the tools that the developers already use and love.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    GenStage

    GenStage

    Producer and consumer actors with back-pressure for Elixir

    ...Developers implement callbacks like handle_demand and handle_events to control how items are emitted, transformed, and consumed across asynchronous boundaries. Because stages are OTP processes, you gain fault tolerance, supervised restarts, and concurrency tuned via configurable demand and partitioning. GenStage underpins higher-level libraries like Flow and Broadway, but it can also be used directly for custom pipelines where timing and throughput matter. Its clear separation of concerns encourages testable, composable stages that can be rearranged as requirements evolve. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Mage.ai

    Mage.ai

    Build, run, and manage data pipelines for integrating data

    ...Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow? That’s why we designed an easy developer experience that you’ll enjoy. Each step in your pipeline is a standalone file containing modular code that’s reusable and testable with data validations. No more DAGs with spaghetti code. Start developing locally with a single command or launch a dev environment in your cloud using Terraform. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    PipeRider

    PipeRider

    Code review for data in dbt

    You can compare two previously generated reports or use a single command to compare the differences between the current branch and the main branch. The latter is designed specifically for code review scenarios. In our pull requests on GitHub, we not only want to know which files have been changed, but also the impact of these changes on the data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DataGym.ai

    DataGym.ai

    Open source annotation and labeling tool for image and video assets

    ...Your image data can be imported into DATAGYM from your local machine, from any public image URL or directly from an AWS cloud S3 bucket. Machine learning teams spend up to 80% of their time on data preparation. DATAGYM provides AI-powered annotation functions to help you accelerate your labeling task. The Pre-Labeling feature enables turbo-labeling – it processes thousands of images in the background within a very short time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Tributary

    Tributary

    Streaming reactive and dataflow graphs in Python

    Tributary is a library for constructing dataflow graphs in Python. Unlike many other DAG libraries in Python (airflow, luigi, prefect, dagster, dask, kedro, etc), tributary is not designed with data/etl pipelines or scheduling in mind. Instead, tributary is more similar to libraries like mdf, loman, pyungo, streamz, or pyfunctional, in that it is designed to be used as the implementation for a data model. One such example is the greeks library, which leverages tributary to build data models...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    CueLake

    CueLake

    Use SQL to build ELT pipelines on a data lakehouse

    With CueLake, you can use SQL to build ELT (Extract, Load, Transform) pipelines on a data lakehouse. You write Spark SQL statements in Zeppelin notebooks. You then schedule these notebooks using workflows (DAGs). To extract and load incremental data, you write simple select statements. CueLake executes these statements against your databases and then merges incremental data into your data lakehouse (powered by Apache Iceberg).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    nonechucks

    nonechucks

    Deal with bad samples in your dataset dynamically

    nonechucks is a library that provides wrappers for PyTorch's datasets, samplers and transforms to allow for dropping unwanted or invalid samples dynamically. What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you have an AlternateIndexSampler, and you want to be able to move to dataset[6] after dataset[4] fails while attempting to load! ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to run Unzip the zip file Windows : java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json Mac UNIX java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json For those on windows, you need to have hadoop distribtion unzipped on local drive and HADOOP_HOME set. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22

    CCDLAB

    A FITS image data viewer & reducer, and UVIT Data Reduction Pipeline.

    ...The latest CCDLAB installer can be downloaded here: https://github.com/user29A/CCDLAB/releases The Visual Studio 2017 project files can be found here: https://github.com/user29A/CCDLAB/ Those may not be the latest code files as code is generally updated a few times a week. If you want the latest project files then let me know.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next