Showing 148 open source projects for "data quality"

  • 1
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    ...Mostly global details about the dataset (number of records, number of variables, overall missingness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, and constant values, among others).
    Downloads: 0 This Week
  • 2
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes. The tool combines deterministic parsing...
    Downloads: 4 This Week
  • 3
    RenderCV

    LaTeX CV generator from a YAML/JSON input file

    RenderCV is a LaTeX CV/resume framework. It allows you to create a high-quality CV as a PDF from a YAML file with full Markdown syntax support and complete control over the LaTeX code. RenderCV offers built-in LaTeX and Markdown templates ready to produce high-quality CVs. However, the templates are entirely arbitrary and can easily be updated to leverage RenderCV's capabilities with your custom CV themes.
    Downloads: 0 This Week
  • 4

    JSON for Modern C++

    JSON that's part of C++

    ...While there may be dozens of JSON libraries out there, JSON for C++ stands out with a focus on three things: an intuitive syntax, trivial integration and serious testing. Using the operator magic of modern C++, this library makes JSON feel like a first-class data type. With trivial integration, the entire code is made up of a single header file json.hpp, no dependencies, no complex build system required. It's been heavily unit-tested covering 100% of the code, and follows the Core Infrastructure Initiative (CII) best practices to ensure the highest quality at all times. Among its many features are JSON pointers, JSON patches, iterators, SAX parsing and various container operations.
    Downloads: 13 This Week
  • 5
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    ...But beyond manual editing, it also offers a programmable layer so developers can write scripts to batch process documents, generate templated reports, or extract structured data from PDFs for integration in workflows. The design emphasizes quality and compatibility: output PDFs render accurately across readers, preserve metadata, and support interactive elements like hyperlinks and form fields.
    Downloads: 16 This Week
  • 6
    LaTeX2e Kernel Code Repository
    LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. LaTeX is available as free software.
    Downloads: 1 This Week
  • 7
    HTMLProofer

    Test your rendered HTML files to make sure they're accurate.

    HTMLProofer is a set of tests to validate your HTML output. These tests check if your image references are legitimate, if they have alt tags, if your internal links are working, and so on. It's intended to be an all-in-one checker for your output. In scope for this project is any well-known and widely-used test for HTML document quality. A major use for this project is continuous integration, so we must have reliable results. We usually favor correctness over performance. And, if...
    Downloads: 0 This Week
  • 8
    Spectral

    A flexible JSON/YAML linter for creating automated style guides

    ...Spectral is open-source but is also baked into Stoplight, with extensions for VS Code and other integration options, giving you real-time feedback wherever you design APIs. Spectral can be used as a generic ruleset engine on any JSON or YAML data but was built with OpenAPI, AsyncAPI, and JSON Schema in mind. Use Spectral rules to target API descriptions for quality improvement or enforce API Style Guide rules, such as naming conventions for OpenAPI models or prohibiting integers in URLs.
    Downloads: 2 This Week
  • 9
    iLovePDF Api

    iLovePDF Rest Api - PHP Library

    Develop and automate PDF processing tasks like Compress PDF, merging PDF, Split PDF, converting Office to PDF, PDF to JPG, Images to PDF, adding Page Numbers, Rotate PDF, Unlocking PDF, stamping a Watermark, and Repair PDF. Each one with several settings to get your desired results. Strong infrastructure to offer the best-dedicated processing power. You might know us from ilovepdf.com where we process millions of PDFs daily. We offer a simple and concise API Reference and Guide as well as...
    Downloads: 10 This Week
  • 10
    mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python

    mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest CommonMark-compliant Markdown parser implementation in pure Python, mistletoe also supports easy definitions of custom tokens. Parsing Markdown into an abstract syntax tree also allows us to swap out renderers for different output formats, without touching any of the core components.
    Downloads: 0 This Week
  • 11
    PGFPlots

    A TeX package to draw normal and/or logarithmic plots directly in TeX

    PGFPlots, a TeX package to draw normal and/or logarithmic plots directly in TeX in two and three dimensions with a user-friendly interface, and PGFPlotstable, a TeX package to round and format numerical tables. Examples in manuals and/or on the website. PGFPlots draws high-quality function plots in normal or logarithmic scaling with a user-friendly interface directly in TeX. The user supplies axis labels, legend entries and the plot coordinates for one or more plots and PGFPlots applies axis...
    Downloads: 1 This Week
  • 12
    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any...
    Downloads: 2 This Week
  • 13
    dvisvgm

    A fast DVI, EPS, and PDF to SVG converter

    The command-line utility dvisvgm is a tool for TeX/LaTeX users. It converts DVI, EPS, and PDF files to the XML-based vector graphics format SVG. In contrast to bitmap graphics, vector graphics are arbitrarily scalable without loss of quality. All modern web browsers support a large part of the current SVG 1.1 standard. Furthermore, SVG files can also be displayed with the Java-based Squiggle SVG browser, which is part of the Apache Batik project, and the free vector graphics editor Inkscape.
    Downloads: 1 This Week
  • 14
    Pinyin

    A high-quality solution for converting Chinese to Pinyin

    A Chinese-to-Pinyin conversion tool based on the CC-CEDICT dictionary that handles polyphonic characters more accurately. Memory type: suitable for servers with more memory space; advantage: fast conversion. Small memory type (default): suitable for environments with tight memory; advantage: small memory footprint, though conversion is not as fast as the memory type. I/O type: suitable for virtual machines with strict memory restrictions. Advantages: very minimal...
    Downloads: 0 This Week
  • 15
    Vexip UI

    Vue 3 UI library, highly customizable, full TypeScript, performance

    Highly customizable, fully TypeScript, with pretty good performance. This library is based on Vue 3.0 and uses the Composition API; components are designed and coded in the traditional Vue way as far as possible, fully in TypeScript. Almost all default prop values for each component can be quickly modified through configuration, for easy customization. The component code is written with great attention to lowering the barrier to reading the source, and the code style is as close to the...
    Downloads: 0 This Week
  • 16
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    Vanilla.PDF is a modern, high-performance, open-source C++17 SDK designed for creating, editing, signing, and analyzing PDF documents across multiple platforms. It requires no external runtime dependencies, making it lightweight and ideal for embedding into desktop applications, servers, or automation pipelines. The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced...
    Downloads: 3 This Week
  • 17
    Asymptote

    2D & 3D TeX-Aware Vector Graphics Language

    Asymptote is a powerful descriptive vector graphics language for technical drawing, inspired by MetaPost but with an improved C++-like syntax. Asymptote provides for figures the same high-quality typesetting that LaTeX does for scientific text.
    Downloads: 139 This Week
  • 18

    RecordEditor

    Editor for fixed-width, CSV and existing XML files.

    The RecordEditor is a data file editor for flat files (delimited and fixed field position). It supports Unix / PC / legacy (e.g. mainframe) file formats, both text and binary files. The editor uses a record-layout description to format the files. This is ideal for fixed-width (text or binary) files, COBOL data files, mainframe files and complicated CSV files. COBOL copybooks can be used to format COBOL data files. In addition to the editor, the following utilities are supplied: * Formatted...
    Downloads: 40 This Week
  • 19
    Grassroots DICOM

    Cross-platform DICOM implementation

    Grassroots DiCoM is a C++ library for DICOM medical files. It is accessible from Python, C#, Java and PHP. It supports RAW, JPEG, JPEG 2000, JPEG-LS, RLE and deflated transfer syntax. It comes with a super fast scanner implementation to quickly scan hundreds of DICOM files. It supports SCU network operations (C-ECHO, C-FIND, C-STORE, C-MOVE). PS 3.3 & 3.6 are distributed as XML files. It also provides PS 3.15 certificates and a password-based mechanism to anonymize and de-identify DICOM datasets.
    Downloads: 162 This Week
  • 20
    Gerber2PDF

    Gerber to PDF converter

    Gerber2PDF is a command-line tool to convert Gerber files to PDF for proofing and hobbyist printing purposes. It converts multiple Gerber files at once, placing each resulting layer on its own page within the PDF. Each layer has a PDF bookmark for easy reference. Layers can optionally be combined onto a single page and rendered with custom colours and transparency. There is a Drill to Gerber converter available from the downloads page.
    Downloads: 28 This Week
  • 21
    Sprint PDF Editor (Smarter PDF Solution)

    Edit, Convert, Extract, Export, Secure and PDF Imposition.

    Sprint PDF Editor® The Productive, Modern, Innovative, Clean & Colourful GUI. Faster, Smarter & Seamless workflows, with 50+ functions. Sprint PDF Editor & Reader, Complete PDF Solution, Supercharge Your Workflows With Imposition, Extract, Compress, Watermark, Protect & Secure, Split & Merge, Crop Pages, Printing, Stamp & more. Your Privacy, Our Priority Protect Your Data with Complete Confidence. Our software is designed to keep your information 100% secure. Unlike cloud-based...
    Downloads: 5 This Week
  • 22
    x3d

    X3D is the open-standard format for 3D graphics scenes on the Web.

    Extensible 3D (X3D) Graphics is a royalty-free International Standard for real-time interactive 3D graphics on the Web, providing unsurpassed interoperability for 3D communications on the Web. This project includes source for example X3D scene libraries and multiple X3D codebases produced by Web3D Consortium members. All open-source contributions are welcome.
    Downloads: 2 This Week
  • 23
    MaTeX

    LaTeX labels in Mathematica

    Create LaTeX labels in Mathematica. In Mathematica 11.3 or later, simply evaluate ResourceFunction to install or upgrade MaTeX. A newer version can be safely installed when an older version is already present. MaTeX will always load the latest installed MaTeX that is compatible with your version of Mathematica. Mathematica is an excellent and flexible visualization tool, and even supports displaying complex mathematical formulae. However, its typesetting quality is not on par with LaTeX.
    Downloads: 0 This Week
  • 24
    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign PDF.

    Super PDF Editor - Powerful, superfast, lightweight PDF processor. All-in-one PDF solution, PDF editing with 80+ tools and functions. The easy-to-use software is complete with editing tools for modifying PDF files your way. Most comprehensive, powerful, process-based and lightning-fast batch processor software. OCR PDF. PDF Imposition, Reverse Pages, Resize Page, Scale Page, Booklet, N-up Pages, Merge, Split by page, Extract Page, Rotate Page. Replace Page, Insert Page, Delete Page....
    Downloads: 18 This Week
  • 25
    LaTeXML

    A TeX and LaTeX to XML/HTML/ePub/MathML translator

    LaTeXML is a tool that converts LaTeX documents into structured formats like HTML, MathML, and ePub. Unlike traditional TeX-to-PDF processors, LaTeXML preserves semantic content, making it suitable for web publishing, accessibility, and content reuse. It supports a wide range of LaTeX packages and is designed to enable high-quality rendering of mathematical and scientific documents.
    Downloads: 1 This Week