Showing 766 open source projects for "data quality"

  • 1
    hrbrthemes

    Opinionated, typographic-centric ggplot2 themes and theme components

    hrbrthemes is a focused ggplot2 theme package with an emphasis on typography, layout precision, and visual polish. It includes themes like theme_ipsum and font scales tailored for clean, high-quality production graphics.
    Downloads: 0 This Week
  • 2
    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms which use language models pretrained on large text corpora.
    Downloads: 0 This Week
  • 3
    The Object-Role Modeling (ORM) standard version 2, associated schemas and generation tools, and a reference implementation in the form of the Natural Object-Role Modeling Architect for Visual Studio (NORMA) product.
    Downloads: 0 This Week
  • 4
    QtiPlot
    QtiPlot is a user-friendly, platform independent data analysis and visualization application similar to the non-free Windows program Origin.
    Downloads: 18 This Week
  • 5
    Image Quality Assessment

    Convolutional Neural Networks to predict aesthetic quality of images

    Image Quality Assessment is an open-source deep learning project that implements neural models for predicting the aesthetic and technical quality of digital images. The repository provides an implementation inspired by the NIMA (Neural Image Assessment) research approach, which uses convolutional neural networks trained on human-annotated datasets to estimate image quality scores.
    Downloads: 0 This Week
  • 6
    textgenrnn

    Easily train your own text-generating neural network

    With textgenrnn you can easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code. It is a modern neural network architecture that uses new techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality. Train on and generate text at either the character level or the word level. Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text...
    Downloads: 0 This Week
  • 7
    NYCOpenData-Profiling-Analysis

    Open Data Profiling, Quality and Analysis on NYC OpenData dataset

    Open data often comes with little or no metadata. You will profile a large collection of open data sets and derive metadata that can be used for data discovery, querying, and identification of data quality problems. For each column, identify and summarize the semantic types present in the column. These can be generic types (e.g., city, state) or collection-specific types (NYU school names, NYC agency).
    Downloads: 0 This Week
  • 8
    EverydayWechat

    Python tool that automates WeChat messages, replies, & group utilities

    ...In addition to personal messaging automation, the project includes a group assistant that can respond to queries and provide useful information within chat groups. These group utilities can retrieve data such as weather conditions, calendar details, garbage classification information, movie box office statistics, delivery tracking updates, and air quality reports.
    Downloads: 4 This Week
  • 9
    PMHTTP

    Swift/Obj-C HTTP framework with a focus on REST and JSON

    PMHTTP is an HTTP framework built around URLSession and designed for Swift while retaining Obj-C compatibility. We think URLSession is great. But it was designed for Obj-C and it doesn't handle anything beyond the networking aspect of a request. This means no handling of JSON, and it doesn't even provide multipart/form-data uploads. PMHTTP leaves the networking to URLSession and provides everything else.
    Downloads: 0 This Week
  • 10
    MatchZoo

    Facilitating the design, comparison and sharing of deep text models

    The goal of MatchZoo is to provide a high-quality codebase for deep text matching research, such as document retrieval, question answering, conversational response ranking, and paraphrase identification. Equipped with a unified data processing pipeline, simplified model configuration, and automatic hyper-parameter tuning, MatchZoo is flexible and easy to use. Preprocess your input data in three lines of code and keep track of the parameters to be passed into the model. ...
    Downloads: 0 This Week
  • 11
    Chatito

    Dataset generation for AI chatbots, NLP tasks

    Chatito is a tool that helps generate datasets for training and validating chatbot models using a simple domain-specific language (DSL).
    Downloads: 0 This Week
  • 12
    DefinitelyTyped

    The repository for high quality TypeScript type definitions

    DefinitelyTyped is the repository for high quality TypeScript type definitions. TypeScript is an open source typed superset of JavaScript, with many different data types which allow JavaScript developers to use highly-productive development tools and practices. DefinitelyTyped packages are all type-checking/linting cleanly and published to npm in under an hour. Types packages have tags for versions of TypeScript that they explicitly support, and currently only versions 2.8 and above are tested.
    Downloads: 0 This Week
  • 13
    PyTorch-BigGraph

    Generate embeddings from large-scale graph-structured data

    PyTorch-BigGraph (PBG) is a system for learning embeddings on massive graphs—think billions of nodes and edges—using partitioning and distributed training to keep memory and compute tractable. It shards entities into partitions and buckets edges so that each training pass only touches a small slice of parameters, which drastically reduces peak RAM and enables horizontal scaling across machines. PBG supports multi-relation graphs (knowledge graphs) with relation-specific scoring functions,...
    Downloads: 0 This Week
  • 14
    PlusV

    An Open Source alternative to SBR (Spectral band replication)

    What is PlusV? With traditional MP3, a typical near-CD-quality audio file is encoded at a data rate of 128 kbit/s. While this is OK for people with big hard disks and fast Internet connections, this data rate has clearly been a bottleneck for people using modems or storing their music on 32 or 64 MB portable player flash cards. PlusV is a brand new audio compression enhancement technology that allows audio files to be compressed to as little as 64 or even 48 kbit/s. ...
    Downloads: 0 This Week
  • 15
    two-step-reason

    Annotation quality control via two-step reason selection (EMNLP-IJCNLP

    This project contains the annotation tool that we used for testing the quality control method via two-step reason selection, along with the annotation guideline, the acquired data, and the code to reproduce the experimental results. Detailed instructions are in the readme file within the zip file. GitHub: https://github.com/nlpcl-lab/two-step-reason
    Downloads: 0 This Week
  • 16
    Coursebook

    Introductory Systems Programming Textbook for University of Illinois

    Welcome to the systems programming coursebook! This repository houses a high-quality, open-source introductory systems programming textbook used by the CS 341: System Programming course at the University of Illinois at Urbana-Champaign. The book assumes that you have taken a programming language course and are familiar with assembly instructions. All of the code and instruction will be in C, as it is the de facto language of the Linux kernel.
    Downloads: 0 This Week
  • 17
    cglib

    High level API to generate and transform Java byte code

    Byte Code Generation Library is a high-level API to generate and transform Java byte code. It is used by AOP, testing, and data access frameworks to generate dynamic proxy objects and intercept field access. cglib is a powerful, high-performance, high-quality code generation library. It is used to extend Java classes and implement interfaces at runtime. See samples and API documentation to learn more about features. This library is free software, freely reusable for personal or commercial purposes.
    Downloads: 1 This Week
  • 18
    Medrechaincode

    Lifetime medical records in decentralized chain code-network

    Medrechaincode uses blockchain technology to store patients' lifetime medical records on decentralized nodes, maintaining a single true version of each patient's records. It provides consensus-based access to patient medical records for healthcare professionals such as R&D labs, doctors, hospitals, and pharmacists. The platform also gives healthcare professionals an integrated data quality tool for removing duplicate patient records, profiling medical records, metadata discovery, data cleansing, classification of medical records, bucketization, and anomaly discovery to find anomalous trends in medical records. The platform helps patients reduce their consultation time and monetize their medical records, and helps labs obtain historical records to develop new medicines and run clinical trials.
    Downloads: 0 This Week
  • 19
    bento-starter

    Full-Stack solution to quickly build PWA applications with Vue.js

    ...BentoStarter uses Firestore, which provides a cloud NoSQL database so you can focus on writing your front-end code. An optional continuous integration/delivery configuration helps you control your code quality before deployment. BentoStarter helps you get started by proposing a default app structure based on best practices. As this project is a template and not a CLI, you can modify the whole project according to your needs. Prerender your different app pages and boost SEO with a metadata description per page.
    Downloads: 0 This Week
  • 20

    xMSanalyzer

    An R package for metabolomics data extraction and quality assessment

    xMSanalyzer comprises utilities that can be classified into the following main modules: 1) merging apLCMS or XCMS sample processing results from multiple sets of parameter settings, 2) evaluation of sample quality, feature consistency, and batch effect, 3) feature matching, 4) characterization of m/z using KEGG REST, and 5) batch-effect correction using ComBat.
    Downloads: 0 This Week
  • 21
    SQL Exporter

    Database agnostic SQL exporter for Prometheus

    ...SQL queries are grouped into collectors: logical groups of queries, e.g., query stats or I/O stats, mapped to the metrics they populate. Collectors may be DBMS-specific (e.g., MySQL InnoDB stats) or custom and deployment-specific (e.g., pricing data freshness). This means you can quickly and easily set up custom collectors to measure data quality, whatever that might mean in your specific case.
    Downloads: 4 This Week
  • 22
    apache spark data pipeline osDQ

    osDQ subproject for creating Apache Spark based data pipelines using JSON

    This is an offshoot of the open source data quality (osDQ) project (https://sourceforge.net/projects/dataquality/). This subproject creates Apache Spark based data pipelines in which JSON-based metadata files are used to run data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark.
    Downloads: 1 This Week
  • 23
    CloverDX

    Design, automate, operate and publish data pipelines at scale

    Please visit www.cloverdx.com for the latest product versions. CloverDX is a data integration platform that can be used to transform, map, and manipulate data in batch and near-realtime modes. Supports various input/output formats (CSV, FIXLEN, Excel, XML, JSON, Parquet, Avro, EDI/X12, HL7, COBOL, LOTUS, etc.). Connects to RDBMS/JMS/Kafka/SOAP/REST/LDAP/S3/HTTP/FTP/ZIP/TAR. CloverDX offers 100+ specialized components which can be further extended by creation of "macros" - subgraphs - and libraries, shareable with 3rd...
    Downloads: 6 This Week
  • 24
    benerator is a framework for creating realistic and valid high-volume test data, used for load and performance testing and showcase setup. Data is generated from an easily configurable metadata model and exported to databases, XML, CSV or flat files.
    Downloads: 1 This Week
  • 25
    Radiance

    Highly accurate ray-tracing software system for UNIX computers

    Radiance is a free, highly accurate ray-tracing software system for UNIX computers. It is a suite of programs designed for the analysis and visualization of lighting in design. Radiance is superior to simpler lighting calculation and rendering tools in that there are no limitations on the geometry or the materials that may be simulated. Scene geometry, materials, luminaires, time, date and sky conditions (for daylight calculations) are specified; spectral radiance (i.e., luminance +...
    Downloads: 57 This Week