Showing 934 open source projects for "data quality"

View related business solutions
  • Event Management Software Icon
    Event Management Software

    Ideal for conference and event planners, independent planners, associations, event management companies, non-profits, and more.

    YesEvents offers a comprehensive suite of services that spans the entire conference lifecycle and ensures every detail is executed with precision. Our commitment to exceptional customer service extends beyond conventional boundaries, consistently exceeding expectations and enriching both organizer and attendee experiences.
  • An All-in-One EMR Exclusively for Therapy and Rehab. Icon
    An All-in-One EMR Exclusively for Therapy and Rehab.

    Electronic Medical Records Software

    Managing your therapy and rehab practice is a time-consuming process. You spend hours on paperwork, billing, scheduling, and more. Raintree’s Therapy & Rehab EHR is here to help you manage your practice more efficiently. With our all-in-one solution, you’ll get the tools you need to streamline your therapy and rehab practice, improve patient care, and get back to doing what you love.
  • 1
    DQO Data Quality Operations Center

    DQO Data Quality Operations Center

    Data Quality Operations Center

    DQO is an DataOps friendly data quality monitoring tool with customizable data quality checks and data quality dashboards. DQO comes with around 100 predefined data quality checks which helps you monitor the quality of your data. Table and column-level checks which allows writing your own SQL queries. Daily and monthly date partition testing. Data segmentation by up to 9 different data streams. Build-in scheduling. Calculation of data quality KPIs which can be displayed on multiple built...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    data-diff

    data-diff

    Efficiently diff rows across two different databases

    We're excited to announce the launch of a new open-source product, data-diff that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at the modern data volume and complexity, systems go out of sync all the time. Until now, there has not been any tooling to ensure that when the data is correctly copied. Replicating data at scale, across...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Cookiecutter Data Science

    Cookiecutter Data Science

    Project structure for doing and sharing data science work

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! And we're not talking...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    dbt-re-data

    dbt-re-data

    re_data - fix data issues before your users & CEO would discover them

    re_data is an open-source data reliability framework for the modern data stack. Currently, re_data focuses on observing the dbt project (together with underlaying data warehouse - Postgres, BigQuery, Snowflake, Redshift). Data transformations in re_data are implemented and exposed as models & macros in this dbt package. Gather all relevant outputs about your data in one place using our cloud. Invite your team and debug it easily from there. Go back in time, and see your past metadata. Set up...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Cyber Risk Assessment and Management Platform Icon
    Cyber Risk Assessment and Management Platform

    ConnectWise Identify is a powerful cybersecurity risk assessment platform offering strategic cybersecurity assessments and recommendations.

    When it comes to cybersecurity, what your clients don’t know can really hurt them. And believe it or not, keep them safe starts with asking questions. With ConnectWise Identify Assessment, get access to risk assessment backed by the NIST Cybersecurity Framework to uncover risks across your client’s entire business, not just their networks. With a clearly defined, easy-to-read risk report in hand, you can start having meaningful security conversations that can get you on the path of keeping your clients protected from every angle. Choose from two assessment levels to cover every client’s need, from the Essentials to cover the basics to our Comprehensive Assessment to dive deeper to uncover additional risks. Our intuitive heat map shows you your client’s overall risk level and priority to address risks based on probability and financial impact. Each report includes remediation recommendations to help you create a revenue-generating action plan.
  • 5
    Synthetic Data Vault (SDV)

    Synthetic Data Vault (SDV)

    Synthetic Data Generation for tabular, relational and time series data

    The Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and timeseries datasets to later on generate new Synthetic Data that has the same format and statistical properties as the original dataset. Synthetic data can then be used to supplement, augment and in some cases replace real data when training Machine Learning models. Additionally, it enables the testing of Machine Learning or other data dependent...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    LosslessCut

    LosslessCut

    The swiss army knife of lossless video/audio editing

    ... losing quality. Or you can add a music or subtitle track to your video without needing to encode. Everything is extremely fast because it does an almost direct data copy, fueled by the awesome FFmpeg which does all the grunt work. Lossless merge/concatenation of arbitrary files (with identical codecs parameters, e.g. from the same camera). Lossless stream editing: Combine arbitrary tracks from multiple files (ex. add music or subtitle track to a video file).
    Downloads: 351 This Week
    Last Update:
    See Project
  • 7
    FastQC

    FastQC

    A quality control analysis tool for high throughput sequencing data

    FastQC is a quality control analysis tool designed to spot potential problems in high throughput sequencing datasets. Its goal is to provide a simple way by which to check the quality of raw sequence data coming from high throughput sequencing pipelines. It does this by running a modular set of analyses on one or more raw sequence files in fastq or bam format. It then produces a report summarizing the results, and highlighting any areas where the library may appear unusual. This should...
    Downloads: 85 This Week
    Last Update:
    See Project
  • 8
    DB Browser for SQLite

    DB Browser for SQLite

    The DB Browser for SQLite

    DB Browser for SQLite (DB4S) is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. DB4S is for users and developers who want to create, search, and edit databases. DB4S uses a familiar spreadsheet-like interface, and complicated SQL commands do not have to be learned. This program is not a visual shell for the sqlite command line tool, and does not require familiarity with SQL commands. It is a tool to be used by both developers and end...
    Downloads: 131 This Week
    Last Update:
    See Project
  • 9
    ZXing

    ZXing

    Barcode scanning library for Java, Android

    ZXing or “Zebra Crossing” is an open source multi-format 1D/2D barcode image processing library that’s been implemented in Java, and also comes with ports to other languages. It currently supports the following formats: UPC-A and UPC-E EAN-8 and EAN-13 Code 39 Code 93 Code 128 ITF Codabar RSS-14 (all variants) RSS Expanded (most variants) QR Code Data Matrix Aztec ('beta' quality) PDF 417 ('alpha' quality) MaxiCode ZXing is made up of several modules, including a core image...
    Downloads: 105 This Week
    Last Update:
    See Project
  • Eptura Workplace Software Icon
    Eptura Workplace Software

    From desk booking and visitor management, to space planning and office utilization data, Eptura Workplace helps your entire organization work smarter.

    With the world of work changed forever, it’s essential to manage your workplace and assets together to effectively create a high-performing environment. The Eptura experience combines the power of workplace management software with asset management, enabling you to effectively operate your building and facilitate hybrid work.
  • 10
    DevilutionX

    DevilutionX

    Diablo build for modern operating systems

    DevilutionX is a port of Diablo and Hellfire that strives to make it simple to run the game while providing engine improvements, bugfixes, and some optional quality-of-life features. Check out the manual for what features are available and how best to take advantage of them. You'll need access to the data from the original game. If you don't have an original CD then you can buy Diablo from GoG. Alternatively you can use spawn.mpq from the shareware version, in place of DIABDAT.MPQ, to play...
    Downloads: 53 This Week
    Last Update:
    See Project
  • 11
    Super-PDF-Editor

    Super-PDF-Editor

    World's most comprehensive, powerful, process-based PDF editor

    ... performs in pdf files, scanned pdf files and any pdf files. OCR performs in image files, and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three languages data set. Use Multiple Languages at once. International Languages: 127 Languages, High, Medium, and Fast Quality. Scanned Images (jpg, png, gif, tiff, bmp) Multi-Page and TIFF and GIF, Scanned PDFs.
    Downloads: 44 This Week
    Last Update:
    See Project
  • 12
    CSV Lint

    CSV Lint

    CSV Lint plug-in for Notepad++ for syntax highlighting

    CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files. Use CSV Lint for metadata discovery, technical data validation, and reformatting on tabular data files. It is not meant to be a replacement for spreadsheet programs like Excel or SPSS, but rather...
    Downloads: 26 This Week
    Last Update:
    See Project
  • 13
    sharp

    sharp

    High performance Node.js image processing module

    The typical use case for this high speed Node.js module is to convert large images in common formats to smaller, web-friendly JPEG, PNG, AVIF and WebP images of varying dimensions. Resizing an image is typically 4x-5x faster than using the quickest ImageMagick and GraphicsMagick settings due to its use of libvips. Colour spaces, embedded ICC profiles and alpha transparency channels are all handled correctly. Lanczos resampling ensures quality is not sacrificed for speed. As well as image...
    Downloads: 29 This Week
    Last Update:
    See Project
  • 14
    PaperQA2

    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a...
    Downloads: 30 This Week
    Last Update:
    See Project
  • 15
    Matplotlib

    Matplotlib

    matplotlib: plotting with Python

    Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. Matplotlib ships with several add-on toolkits, including 3D plotting with mplot3d, axes helpers in axes_grid1 and axis helpers in axisartist. A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, HoloViews, ggplot, ...), and a...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 16
    syslog-ng

    syslog-ng

    Log management solution that improves the performance of SIEM

    syslog-ng is the log management solution that improves the performance of your SIEM solution by reducing the amount and improving the quality of data feeding your SIEM. With syslog-ng Store Box, you can find the answer. Search billions of logs in seconds using full text queries with Boolean operators to pinpoint critical logs. syslog-ng Store Box provides secure, tamper-proof storage and custom reporting to demonstrate compliance. syslog-ng can deliver data from a wide variety of sources...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 17
    Animity

    Animity

    Use the Android app to watch anime on your phone without ads

    An Android app to watch anime on your phone without ads. Animity is an anime streaming app that provides high-quality streaming of your favorite anime shows. It's designed to be easy to use, so you can quickly find the anime you want to watch and start streaming. With Animity, you can also sync your account with Anilist, so you can keep track of your favorite shows and manage your anime watchlist. Whether you're a die-hard anime fan or a casual viewer, Animity has something for everyone...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 18
    Prophet

    Prophet

    Tool for producing high quality forecasts for time series data

    Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Prophet is used in many applications across Facebook for producing reliable forecasts for planning and goal setting...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 19
    Super-PDF-Editor-Lite

    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    .... Easy pdf imposition, booklet, n ups pages, and more. OCR performs in pdf files, scanned pdf files and any pdf files. OCR performs in image files, and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three languages data set. Use Multiple Languages at once. International Languages: 127 Languages, High, Medium, and Fast Quality. Scanned Images (jpg, png, gif, tiff, bmp) Multi-Page and TIFF and GIF, Scanned PDFs.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 20
    AG Grid

    AG Grid

    The best JavaScript Data Table for building enterprise applications

    The performance, feature set and quality of AG Grid have not been seen before in a JavaScript data grid. Many features in AG Grid are unique to AG Grid, and simply put AG Grid into a class of its own, without compromising on quality or performance. Over 600,000 monthly downloads of AG Grid Community and over 50% of the Fortune 500 using AG Grid Enterprise. AG Grid has become the JavaScript Datagrid of choice for Enterprise JavaScript developers. AG Grid Community is free and open-sourced under...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 21
    AdGuardHome

    AdGuardHome

    Network-wide ads and trackers blocking DNS server

    ... and are updated on a regular basis, guaranteeing the best filtering quality. Protecting your personal data is our top priority. With AdGuard, you and your sensitive data will be safe from any online tracker and analytics system that may attempt to steal your data while surfing the web. Use the Family protection mode to block access to all websites with adult content and enforce safe search in the browser, in addition to the regular perks of ad blocking and browsing security.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 22
    lakeFS

    lakeFS

    lakeFS - Git-like capabilities for your object storage

    Increase data quality and reduce the painful cost of errors. Data engineering best practices using git-like operations on data. lakeFS is an open-source data version control for data lakes. It enables zero-copy Dev / Test isolated environments, continuous quality validation, atomic rollback on bad data, reproducibility, and more. Data is dynamic, it changes over time. Dealing with that without a data version control system is error-prone and labor-intensive. With lakeFS, your data lake...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    VALL-E

    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 24
    Cleanlab

    Cleanlab

    The standard data-centric AI package for data quality and ML

    cleanlab helps you clean data and labels by automatically detecting issues in a ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you find...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    Cesium

    Cesium

    An open-source JavaScript library for world-class 3D globes and maps

    CesiumJS is an open source JavaScript library for creating world-class 3D globes and maps with the best possible performance, precision, visual quality, and ease of use. Developers across industries, from aerospace to smart cities to drones, use CesiumJS to create interactive web apps for sharing dynamic geospatial data. Built on open formats, CesiumJS is designed for robust interoperability and scaling for massive datasets. CesiumJS is released under the Apache 2.0 license and is free for both...
    Downloads: 9 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next