Search Results for "python-4suite-xml"

Showing 27 open source projects for "python-4suite-xml"

1

CSV Lint

CSV Lint plug-in for Notepad++ for syntax highlighting

CSV Lint is a plug-in for Notepad++ for syntax highlighting, CSV validation, automatic column and datatype detection, fixed-width datasets, changing datetime formats and decimal separators, sorting data, counting unique values, and converting to XML, JSON, SQL, etc. It is a plugin for data cleaning and working with messy data files. Use CSV Lint for metadata discovery, technical data validation, and reformatting of tabular data files. It is not meant to be a replacement for spreadsheet programs like Excel or SPSS, but rather...
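
To illustrate what automatic column and datatype detection can look like, here is a minimal sketch using only the Python standard library. It is not CSV Lint's implementation (CSV Lint is a Notepad++ plug-in); the function names `infer_type` and `detect_columns`, the type labels, and the single date format `%Y-%m-%d` are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value.

    Hypothetical helper: labels and the date format are illustrative only.
    """
    def all_match(parse):
        for v in values:
            if v == "":
                continue  # empty cells do not rule a type out
            try:
                parse(v)
            except ValueError:
                return False
        return True

    if all_match(int):
        return "integer"
    if all_match(float):
        return "decimal"
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "text"


def detect_columns(text):
    """Map each header name to an inferred type for comma-separated text."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = zip(*data)  # transpose rows into columns
    return {name: infer_type(col) for name, col in zip(header, columns)}


sample = "id,price,sold_on,note\n1,9.50,2024-06-25,ok\n2,12,2024-06-26,\n"
result = detect_columns(sample)
print(result)
```

A real tool would also have to handle other delimiters, fixed-width layouts, and locale-specific decimal separators, which this sketch ignores.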

Downloads: 29 This Week

Last Update: 2024-06-25
2

SDGym

Benchmarking synthetic data generation methods

The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the...

Downloads: 2 This Week

Last Update: 2024-08-29
3

NBi

NBi is a testing framework (add-on to NUnit)

NBi is a testing framework (add-on to NUnit) for Business Intelligence. It supports most relational databases (SQL Server, MySQL, PostgreSQL ...) and OLAP platforms (Analysis Services, Mondrian ...), as well as ETL and reporting components (Microsoft technologies). The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to write C# code to specify your tests! Nor do you need Visual Studio...

Downloads: 0 This Week

Last Update: 2023-08-10
4

SQLBucket

Lightweight library to write, orchestrate and test your SQL ETL

... with the project_folder parameter. That folder will contain all your SQL ETL. The python file where you create your SQLBucket object is also a good place to instantiate your command line interface.

Downloads: 0 This Week

Last Update: 2023-06-12
5

Apache Airflow Provider

Great Expectations Airflow operator

Due to the removal of the apply_default decorator, this version of the provider requires Airflow 2.1.0+. If your Airflow version is earlier than 2.1.0 and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise, your Airflow package version will be upgraded automatically, and you will have to manually run airflow upgrade db to complete the migration. This operator currently works with the Great Expectations V3 Batch Request API only. If you would like to use the...

Downloads: 0 This Week

Last Update: 2024-09-02
6

CleanVision

Automatically find issues in image datasets

CleanVision automatically detects potential issues in image datasets like images that are: blurry, under/over-exposed, (near) duplicates, etc. This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset, which you want to address before applying machine learning. CleanVision is super simple -- run the same couple lines of Python code to audit any image dataset! The quality of machine learning models hinges on the quality of the data used...

Downloads: 0 This Week

Last Update: 2024-02-13
7

ydata-profiling

Create HTML profiling reports from pandas DataFrame objects

The primary goal of ydata-profiling is to provide a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame, and it allows the analysis to be exported in different formats such as HTML and JSON.
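
As a rough illustration of the idea (summary statistics per column, exported as JSON), here is a minimal standard-library sketch. It does not use pandas or ydata-profiling and does not reflect their APIs; the names `describe` and `profile` and the sample table are assumptions for the example, and only a few of the fields that df.describe() reports are computed.

```python
import json
import statistics


def describe(column):
    """Summarise a list of numbers with a few describe()-style fields.

    Hypothetical helper; uses the sample standard deviation.
    """
    return {
        "count": len(column),
        "mean": statistics.mean(column),
        "std": statistics.stdev(column),
        "min": min(column),
        "max": max(column),
    }


def profile(table):
    """Build a per-column summary for a dict of column name -> values."""
    return {name: describe(values) for name, values in table.items()}


# Illustrative data, not taken from the project.
table = {"age": [20, 30, 40], "score": [1.0, 2.0, 3.0]}
report = profile(table)

# Export the summary as JSON, one of the formats named above.
report_json = json.dumps(report, indent=2)
print(report_json)
```

A profiling library goes well beyond this (correlations, missing-value analysis, HTML rendering); the sketch only shows the "describe, then export" shape.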

Downloads: 0 This Week

Last Update: 2024-10-29
8

dbt-re-data

re_data - fix data issues before your users & CEO would discover them

re_data is an open-source data reliability framework for the modern data stack. Currently, re_data focuses on observing the dbt project (together with the underlying data warehouse - Postgres, BigQuery, Snowflake, Redshift). Data transformations in re_data are implemented and exposed as models & macros in this dbt package. Gather all relevant outputs about your data in one place using our cloud. Invite your team and debug it easily from there. Go back in time, and see your past metadata. Set up...

Downloads: 0 This Week

Last Update: 2023-12-21
9

Encord Active

The toolkit to test, validate, and evaluate your models and surface

Encord Active is an open-source toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling to supercharge model performance. Encord Active has been designed as an all-in-one open source toolkit for improving your data quality and model performance. Use the intuitive UI to explore your data or access all the functionalities programmatically. Discover errors, outliers, and edge-cases within your data - all in one open source...

Downloads: 0 This Week

Last Update: 2024-04-19
10

Dagster

An orchestration platform for the development, production

Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust...

Downloads: 0 This Week

Last Update: 5 days ago
11

Arize Phoenix

Uncover insights, surface problems, monitor, and fine-tune your LLM

Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative)...

Downloads: 0 This Week

Last Update: 14 hours ago
12

Cleanlab

The standard data-centric AI package for data quality and ML

cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you...

Downloads: 0 This Week

Last Update: 2024-09-26
13

Gretel Synthetics

Synthetic data generators for structured and unstructured text

Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that is as good as or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....

Downloads: 0 This Week

Last Update: 2024-10-23
14

Diffgram

Training data (data labeling, annotation, workflow) for all data types

From ingesting data to exploring it, annotating it, and managing workflows, Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is the world’s first truly open source training data platform that focuses on giving its users an unlimited experience. It aims to reduce your data labeling bills and increase your Training Data Quality. Training Data is the art of supervising machines through data. This...

Downloads: 0 This Week

Last Update: 2024-10-14
15

FiftyOne

The open-source tool for building high-quality datasets

The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively. Improving data quality and understanding your model’s failure modes are the most impactful ways to...

Downloads: 0 This Week

Last Update: 5 days ago
16

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...

Downloads: 0 This Week

Last Update: 2024-10-29
17

data-diff

Efficiently diff rows across two different databases

We're excited to announce the launch of a new open-source product, data-diff, that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at modern data volume and complexity, systems go out of sync all the time. Until now, there has not been any tooling to ensure that the data is correctly copied. Replicating data at scale, across...

Downloads: 0 This Week

Last Update: 2024-02-20
18

Muse: Middleware Universal Scripting idE

DevOps Automate: WebSphere; WebLogic; JBoss; Glassfish; Tomcat; Linux.

Simplify... Aggregate... Automate... Simplify... *** OPEN SOURCE - GPL3/EPL. Use Python / Jython to automate WebSphere, WebLogic, JBoss, Glassfish and Tomcat Middleware Estates over JMX, both SSL and non-SSL + Linux SSH (agent-less). Target all 5 servers and Linux from the same workspace. Familiar Eclipse based Jython Development IDE, pre-configured and ready to go. 4-Click Installer. Win x64, Linux WINE x64. Built-In JVM. Java 8/9/10, Amazon Corretto, JETPack13/14/16, IBM SDK...

Downloads: 2 This Week

Last Update: 2024-03-29
19

Open Source Data Quality and Profiling

World's first open source data quality & data preparation project

This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing a high-performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data...

8 Reviews

Downloads: 4 This Week

Last Update: 2021-01-20
20

CloverDX

Design, automate, operate and publish data pipelines at scale

Please visit www.cloverdx.com for the latest product versions. Data integration platform; can be used to transform/map/manipulate data in batch and near-realtime modes. Supports various input/output formats (CSV, FIXLEN, Excel, XML, JSON, Parquet, Avro, EDI/X12, HL7, COBOL, LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/Rest/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd parties...

4 Reviews

Downloads: 4 This Week

Last Update: 2023-05-04
21

DataCleaner

Data quality analysis, profiling, cleansing, duplicate detection +more

DataCleaner is a data quality analysis application and a solution platform for DQ solutions. Its core is a strong data profiling engine, which is extensible and thereby adds data cleansing, transformations, enrichment, deduplication, matching and merging. Website: http://datacleaner.github.io

3 Reviews

Downloads: 89 This Week

Last Update: 2019-02-12
22

Toolsverse ETL Framework

Open source Extract Transform Load engine written in Java

ETL Framework is a standalone Extract Transform Load engine written in Java. It includes executables for all major platforms and can be easily integrated into other applications. Key Features: * embeddable, open source and free * fast and scalable * uses target database features to do transformations and loads * manual and automatic data mapping * data streaming * bulk data loads * data quality features using SQL, JavaScript and regex * data transformations Requirements *...

Downloads: 0 This Week

Last Update: 2013-05-30
23

NGS data quality evaluation

Python tool to evaluate the quality of high-throughput sequencing data

ngsdataqeval is a Python tool to evaluate the quality of high-throughput sequencing data, as produced by Next Generation Sequencing. Unlike other tools that analyze raw data, it is designed to evaluate the quality of the processed reads after mapping to a reference genome. The evaluation is performed in a genomic region defined by the user, and it provides some statistics computed from the reads that map to that region (i.e., a single gene). The program provides a graphical output embedded in an HTML...

Downloads: 0 This Week

Last Update: 2014-09-06
24

AMB Data Profiling Data Quality

AMB New Generation Data Empowerment offers a comprehensive approach to data governance needs, with groundbreaking features to locate, identify, discover, manage and protect your overall data infrastructure. Repeatable Process/Exposed Repository.

2 Reviews

Downloads: 0 This Week

Last Update: 2015-04-27
25

COBOL Data Definitions

Parse, analyze and -- most importantly -- use COBOL data definitions. This gives you access to COBOL data from Python programs. Write data analyzers, one-time data conversion utilities and Python programs that are part of COBOL systems. Really.

Downloads: 0 This Week

Last Update: 2013-04-26
© 2024 Slashdot Media. All Rights Reserved.
