Showing 533 open source projects for "python-4suite-xml"

  • 1
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web UI. The web browser as the main tool for inspecting, running and debugging pipelines. GNU make semantics. Nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools as the main tool for interacting with databases and data. ...
    Downloads: 0 This Week
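The "GNU make semantics" in the description above, where a node runs only after its upstream nodes have completed, can be sketched with Python's standard-library `graphlib`. This is not Mara Pipelines' own API; the node names and the `upstreams` mapping are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each node maps to the set of upstream nodes
# that must complete before it may run.
upstreams = {
    "load_orders": set(),
    "load_customers": set(),
    "join_orders_customers": {"load_orders", "load_customers"},
    "build_report": {"join_orders_customers"},
}


def run_order(graph):
    """Return one valid execution order; raises graphlib.CycleError on cycles."""
    return list(TopologicalSorter(graph).static_order())


order = run_order(upstreams)
print(order)
```

Any order returned this way places every node after all of its upstream nodes, which is the only guarantee this sketch makes.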
  • 2
    Apache Airflow Provider

    Great Expectations Airflow operator

    Due to the removal of the apply_default decorator, this version of the provider requires Airflow 2.1.0+. If your Airflow version is earlier than 2.1.0 and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise, your Airflow package version will be upgraded automatically, and you will have to manually run airflow db upgrade to complete the migration. This operator currently works with the Great Expectations V3 Batch Request API only. If you would like to use the...
    Downloads: 0 This Week
  • 3
    CleanVision

    Automatically find issues in image datasets

    ...This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset, which you want to address before applying machine learning. CleanVision is super simple -- run the same couple lines of Python code to audit any image dataset! The quality of machine learning models hinges on the quality of the data used to train them, but it is hard to manually identify all of the low-quality data in a big dataset. CleanVision helps you automatically identify common types of data issues lurking in image datasets. This package currently detects issues in the raw images themselves, making it a useful tool for any computer vision task such as: classification, segmentation, object detection, pose estimation, keypoint detection, generative modeling, etc.
    Downloads: 1 This Week
  • 4
    Great Expectations

    Always know what to expect from your data

    Great Expectations helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations...
    Downloads: 3 This Week
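The idea that "expectations are assertions for data" can be illustrated with a minimal plain-Python sketch. It does not use the Great Expectations library; the function name `expect_values_between` and the result keys are assumptions made for this example only.

```python
def expect_values_between(values, min_value, max_value):
    """Check that every value lies in [min_value, max_value]; report offenders."""
    unexpected = [v for v in values if not (min_value <= v <= max_value)]
    return {"success": not unexpected, "unexpected_values": unexpected}


# Hypothetical column of ages with two implausible entries.
ages = [34, 27, -1, 51, 140]
result = expect_values_between(ages, 0, 120)
print(result)
```

Returning the offending values instead of raising an error lets a pipeline log or document the failure, not just halt on it.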
  • 5
    Population Shift Monitoring

    Monitor the stability of a Pandas or Spark dataframe

    popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets. popmon creates histograms of features binned in time-slices, and compares the stability of the profiles and distributions of those histograms using statistical tests, both over time and with respect to a reference. It works with numerical, ordinal, categorical features, and the histograms can be higher-dimensional, e.g. it can also track correlations between any two...
    Downloads: 0 This Week
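The approach described above (histograms per time slice, compared against a reference) can be sketched in plain Python. This is not popmon's API, and total variation distance is swapped in here as a simple stand-in for the statistical tests popmon applies; the slice names and values are made up.

```python
from collections import Counter


def histogram(values, bin_width):
    """Normalized histogram: maps bin index -> fraction of values in that bin."""
    counts = Counter(int(v // bin_width) for v in values)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two normalized histograms (0 = identical)."""
    bins = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in bins)


# Hypothetical feature values grouped by time slice.
slices = {
    "2024-01": [1.0, 1.5, 2.2, 2.8, 3.1, 3.9],
    "2024-02": [1.1, 1.4, 2.0, 2.9, 3.3, 3.7],
    "2024-03": [6.0, 6.5, 7.2, 7.8, 8.1, 8.9],  # shifted population
}

# Use the first slice as the reference and score every slice against it.
reference = histogram(slices["2024-01"], bin_width=1.0)
distances = {
    name: total_variation(histogram(vals, 1.0), reference)
    for name, vals in slices.items()
}
print(distances)
```

A distance near 0 means the slice resembles the reference, while a distance near 1 means the two histograms barely overlap.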
  • 6
    BertViz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    BertViz is an interactive tool for visualizing attention in Transformer language models such as BERT, GPT2, or T5. It can be run inside a Jupyter or Colab notebook through a simple Python API that supports most Huggingface models. BertViz extends the Tensor2Tensor visualization tool by Llion Jones, providing multiple views that each offer a unique lens into the attention mechanism. The head view visualizes attention for one or more attention heads in the same layer. It is based on the excellent Tensor2Tensor visualization tool. ...
    Downloads: 0 This Week
  • 7
    nb-clean

    Clean Jupyter notebooks of outputs, metadata, and empty cells

    ...It provides both a Git filter and pre-commit hook to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command line tool, and as a Python library. It can determine if a notebook is clean or not, which can be used as a check in your continuous integration pipelines. nb-clean can also be used as a pre-commit hook. You may prefer this to the Git filter if your project already uses the pre-commit framework. Note that the Git filter and pre-commit hook work differently, with different effects on your working directory. ...
    Downloads: 0 This Week
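What "cleaning" a notebook involves can be sketched against the notebook JSON structure directly. This is not nb-clean's implementation; it assumes the nbformat 4 layout (a `cells` list whose entries carry `cell_type`, `source`, `metadata`, and, for code cells, `outputs` and `execution_count`), and the function names are hypothetical.

```python
import copy


def clean_notebook(notebook):
    """Return a copy without outputs, execution counts, cell metadata, or empty cells."""
    cleaned = copy.deepcopy(notebook)
    kept = []
    for cell in cleaned.get("cells", []):
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        if not text.strip():
            continue  # drop empty cells
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
        kept.append(cell)
    cleaned["cells"] = kept
    return cleaned


def is_clean(notebook):
    """A notebook counts as clean if cleaning it would change nothing."""
    return clean_notebook(notebook) == notebook


# Hypothetical notebook with one executed code cell and one empty markdown cell.
dirty = {
    "cells": [
        {
            "cell_type": "code",
            "source": "print(1 + 1)",
            "metadata": {"scrolled": True},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "2\n"}],
            "execution_count": 7,
        },
        {"cell_type": "markdown", "source": "", "metadata": {}},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleaned = clean_notebook(dirty)
print(is_clean(dirty), is_clean(cleaned))
```

A check like `is_clean` is the kind of test a continuous integration job could run to reject notebooks committed with outputs.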
  • 8
    PySR

    High-Performance Symbolic Regression in Python and Julia

    PySR is an open-source tool for Symbolic Regression: a machine learning task where the goal is to find an interpretable symbolic expression that optimizes some objective. Over a period of several years, PySR has been engineered from the ground up to be (1) as high-performance as possible, (2) as configurable as possible, and (3) easy to use. PySR is developed alongside the Julia library SymbolicRegression.jl, which forms the powerful search engine of PySR. The details of these algorithms are...
    Downloads: 0 This Week
  • 9
    gusty

    Making DAG construction easier

    gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's create_dag function. gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define. All you have to do is provide a list of dependencies or external_dependencies inside of a task file, and gusty will automatically set each task's dependencies and create external task sensors for any external dependencies listed. gusty works with both Airflow 1.x and Airflow 2.x, and has even more features, all of which aim to make the creation, management, and iteration of DAGs more fluid, so that you can intuitively design your DAG and build your tasks.
    Downloads: 0 This Week
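How per-task `dependencies` and `external_dependencies` lists might be turned into DAG edges and sensor tasks can be sketched without Airflow. This is not gusty's code or API: the shape of `task_specs`, the `(dag_id, task_id)` tuple form, and the `wait_for_...` sensor naming are all assumptions made for illustration.

```python
# Hypothetical task specs, as they might look after parsing a directory of task files.
task_specs = {
    "extract": {
        "dependencies": [],
        "external_dependencies": [("ingest_dag", "land_files")],
    },
    "transform": {"dependencies": ["extract"], "external_dependencies": []},
    "load": {"dependencies": ["transform"], "external_dependencies": []},
}


def plan_dag(specs):
    """Return (edges, sensors): upstream -> downstream pairs and the sensors needed."""
    edges, sensors = [], []
    for task, spec in specs.items():
        for upstream in spec.get("dependencies", []):
            if upstream not in specs:
                raise ValueError(f"{task} depends on unknown task {upstream}")
            edges.append((upstream, task))
        for dag_id, task_id in spec.get("external_dependencies", []):
            sensor = f"wait_for_{dag_id}_{task_id}"  # hypothetical naming scheme
            if sensor not in sensors:
                sensors.append(sensor)
            # The sensor becomes an ordinary upstream node of the task.
            edges.append((sensor, task))
    return edges, sensors


edges, sensors = plan_dag(task_specs)
print(edges)
print(sensors)
```

Modelling each external dependency as a sensor node means the rest of the plan only has to deal with dependencies inside one DAG.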
  • 10
    ClearML

    Streamline your ML workflow

    ...It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package integrates ClearML into your existing scripts by adding just two lines of code, and optionally extends your experiments and other workflows with ClearML's powerful and versatile set of classes and methods. The ClearML Server stores experiment, model, and workflow data, and supports the Web UI experiment manager and MLOps automation for reproducibility and tuning. ...
    Downloads: 0 This Week
  • 11
    Panda-Helper

    Panda-Helper: Data profiling utility for Pandas DataFrames and Series

    Panda-Helper is a simple data-profiling utility for Pandas DataFrames and Series. Assess data quality and usefulness with minimal effort. Quickly perform initial data exploration, so you can move on to more in-depth analysis.
    Downloads: 0 This Week
  • 12
    whylogs

    The open standard for data logging

    whylogs is an open-source library for logging any kind of data. With whylogs, users are able to generate summaries of their datasets (called whylogs profiles) which they can use to track changes in their dataset, create data constraints to know whether their data looks the way it should, and quickly visualize key summary statistics about their datasets. whylogs profiles are the core of the whylogs library. They capture key statistical properties of data, such as the distribution (far beyond...
    Downloads: 1 This Week
  • 13
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis, capable of understanding data development and data analysis needs, understanding data, and generating SQL and Python code for data queries, data visualization, machine learning, and other tasks.
    Downloads: 0 This Week
  • 14
    NannyML

    Detecting silent model failure. NannyML estimates performance

    NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases.
    Downloads: 0 This Week
  • 15
    PySyft

    Data science on data without acquiring a copy

    Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data...
    Downloads: 0 This Week
  • 16
    classic.tplx

    A more accurate representation of Jupyter notebooks

    A more accurate representation of Jupyter notebooks when converting to PDFs. This template was designed to make converted Jupyter notebooks look (almost) identical to the actual notebook. If something doesn't exist in the original notebook, then it doesn't belong in the conversion. As of nbconvert 5.5.0, the majority of these improvements have been merged into nbconvert's default template. Version 3.x of this package will continue to support nbconvert 5.5.0 and lower, whereas in the future...
    Downloads: 0 This Week
  • 17
    CKAN

    CKAN is an open-source DMS for powering data hubs

    CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work with data. It's a data management system that provides a powerful platform for cataloging, storing and accessing datasets with a rich front-end, full API (for both data and catalog), visualization tools and more. CKAN is used by national and regional government organizations throughout the European Union, the Americas, Asia, and Oceania to power a variety of official and community data...
    Downloads: 2 This Week
  • 18
    NVIDIA Merlin

    Library providing end-to-end GPU-accelerated recommender systems

    NVIDIA Merlin is an open-source library that accelerates recommender systems on NVIDIA GPUs. The library enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools to address common feature engineering, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, which is all accessible through easy-to-use APIs. For more information, see NVIDIA...
    Downloads: 2 This Week
  • 19
    Timesketch

    Collaborative forensic timeline analysis

    Timesketch is a collaborative forensic timeline analysis platform used to investigate security incidents by turning diverse evidence into a single, searchable chronology. Analysts ingest logs and artifacts from many sources—endpoints, servers, cloud services—and Timesketch normalizes them into events on a unified timeline. Powerful search, aggregations, and saved views help you pivot quickly, highlight anomalies, and preserve investigative steps for later review. The system supports tagging,...
    Downloads: 1 This Week
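The normalization step described above, where events from several sources end up on one chronological timeline, can be sketched with the standard library. This is not Timesketch's code or data model; the event fields, source names, and log contents are hypothetical, and each input log is assumed to be sorted by time already.

```python
import heapq
from datetime import datetime, timezone


def normalize(source, timestamp, message):
    """Convert a raw record into a common event shape with a UTC datetime."""
    return {
        "datetime": datetime.fromisoformat(timestamp).astimezone(timezone.utc),
        "source": source,
        "message": message,
    }


# Hypothetical evidence from two sources, each already in time order.
endpoint_log = [
    normalize("endpoint", "2024-05-01T10:00:05+00:00", "process started"),
    normalize("endpoint", "2024-05-01T10:02:00+00:00", "file written"),
]
cloud_log = [
    normalize("cloud", "2024-05-01T12:01:00+02:00", "console login"),  # 10:01 UTC
]

# Merge the sorted logs into a single timeline ordered by UTC time.
timeline = list(heapq.merge(endpoint_log, cloud_log, key=lambda e: e["datetime"]))
for event in timeline:
    print(event["datetime"].isoformat(), event["source"], event["message"])
```

Converting every timestamp to UTC before merging is what lets events recorded in different time zones interleave correctly.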
  • 20
    TensorBoardX

    tensorboard for pytorch (and chainer, mxnet, numpy, etc.)

    The SummaryWriter class provides a high-level API to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously. This allows a training program to call methods to add data to the file directly from the training loop, without slowing down training. TensorBoardX now supports logging directly to Comet. Comet is a free cloud-based solution that allows you to automatically track, compare and explain your experiments. It adds a...
    Downloads: 1 This Week
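The asynchronous-update idea described above, where the training loop hands off records and a background worker writes them to disk, can be sketched with a queue and a thread. This is not TensorBoardX's implementation and does not produce TensorBoard event files; the class `AsyncScalarWriter` and its tab-separated output format are assumptions made for illustration.

```python
import os
import queue
import tempfile
import threading


class AsyncScalarWriter:
    """Queue records from the caller; a background thread appends them to a file."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, path):
        self._queue = queue.Queue()
        self._file = open(path, "a", encoding="utf-8")
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def add_scalar(self, tag, value, step):
        # Returns immediately; the file write happens on the worker thread.
        self._queue.put(f"{step}\t{tag}\t{value}\n")

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._file.write(item)

    def close(self):
        # Flush everything queued so far, then stop the worker and close the file.
        self._queue.put(self._STOP)
        self._thread.join()
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "events.tsv")
writer = AsyncScalarWriter(path)
for step in range(3):
    writer.add_scalar("loss", 1.0 / (step + 1), step)
writer.close()

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)
```

Because `add_scalar` only enqueues a string, slow disk writes do not stall the calling loop; calling `close` is what guarantees that queued records reach the file.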
  • 21
    Bayesian Optimization

    Python implementation of global optimization with Gaussian processes

    This is a constrained global optimization package built upon Bayesian inference and Gaussian processes that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited for optimization of high-cost functions and for situations where the balance between exploration and exploitation is important. More detailed information, other advanced features, and tips on usage/implementation can be found in the examples folder. Follow the basic...
    Downloads: 0 This Week
  • 22
    Cleanlab

    The standard data-centric AI package for data quality and ML

    cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you...
    Downloads: 1 This Week
  • 23
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...
    Downloads: 1 This Week
  • 24
    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components....
    Downloads: 0 This Week
  • 25
    dbt-re-data

    re_data - fix data issues before your users & CEO would discover them

    re_data is an open-source data reliability framework for the modern data stack. Currently, re_data focuses on observing the dbt project (together with the underlying data warehouse - Postgres, BigQuery, Snowflake, Redshift). Data transformations in re_data are implemented and exposed as models & macros in this dbt package. Gather all relevant outputs about your data in one place using our cloud. Invite your team and debug it easily from there. Go back in time, and see your past metadata. Set up...
    Downloads: 0 This Week