Showing 471 open source projects for "data modeling"

  • 1
    Pattern

    Web mining module for Python, with tools for scraping

    Pattern is an open-source Python library that provides tools for web mining, natural language processing, machine learning, and network analysis. The project integrates multiple capabilities into a single framework that allows developers to collect, process, and analyze textual data from the web. It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases. In addition to data mining features, the library offers natural language processing functionality including part-of-speech tagging, sentiment analysis, and n-gram extraction. ...
    Downloads: 1 This Week
  • 2
    geocompr

    Geocomputation with R: an open source book

    This repository hosts the source for Geocomputation with R, an open-source book covering spatial data analysis, visualization, and modeling using R. It teaches how to work with vector and raster data, coordinate systems, mapping, and geocomputation techniques using packages like sf, terra, tmap, and more. Actively maintained and updated for real-world geospatial workflows.
    Downloads: 0 This Week
  • 3
    Data Analysis for the Life Sciences

    Rmd source files for the HarvardX series PH525x

    This repository holds the R Markdown (.Rmd) source files for the PH525x / HarvardX course series (Data Analysis for the Life Sciences / Genomics) managed by GenomicsClass. It functions as the canonical source for course lab exercises, lecture modules, and reading materials in reproducible format. Students and learners use these R Markdown files to follow along, knit notebooks, run code samples, and complete the lab-based assignments. The repo is licensed under MIT, allowing reuse and...
    Downloads: 0 This Week
  • 4
    pyntcloud

    pyntcloud is a Python library for working with 3D point clouds

    This page will introduce the general concept of point clouds and illustrate the capabilities of pyntcloud as a point cloud processing tool. Point clouds are one of the most relevant entities for representing three dimensional data these days, along with polygonal meshes (which are just a special case of point clouds with connectivity graph attached). In its simplest form, a point cloud is a set of points in a cartesian coordinate system. Accurate 3D point clouds can nowadays be (easily and...
    Downloads: 1 This Week
  • 5
    The Python Computer Graphics Kit is a collection of Python modules that provide the basic types and functions needed to create 3D computer graphics images, with a focus on Pixar's RenderMan interface.
    Downloads: 1 This Week
  • 6
    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. ...
    Downloads: 0 This Week
  • 7

    Solvespace

    SOLVESPACE is a free (GPLv3) parametric 3d CAD tool.

    SOLVESPACE is a free (GPLv3) parametric 3d CAD tool. Applications include:
      - modeling 3d parts: draw with extrudes, revolves, helixes and Boolean (union / difference / intersection) operations
      - modeling 2d parts: draw the part as a single section, and export DXF, PDF, SVG; use 3d assembly to verify fit
      - 3d-printed parts: export the STL or other triangle mesh expected by most 3d printers
      - preparing CAM data: export 2d vector art for a waterjet machine or laser cutter; or generate STEP or STL, for import into third-party CAM software for machining
      - mechanism design: use the constraint solver to simulate planar or spatial linkages, with pin, ball, or slide joints
      - plane and solid geometry: replace hand-solved trigonometry and spreadsheets with a live dimensioned drawing
    ORIGINAL SOURCE: https://solvespace.com/index.pl
    Downloads: 10 This Week
  • 8
    MAE (Masked Autoencoders)

    PyTorch implementation of MAE

    MAE (Masked Autoencoders) is a self-supervised learning framework for visual representation learning using masked image modeling. It trains a Vision Transformer (ViT) by randomly masking a high percentage of image patches (typically 75%) and reconstructing the missing content from the remaining visible patches. This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the...
    Downloads: 0 This Week
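The random masking step described for MAE can be illustrated with a minimal sketch. This is not code from the MAE repository: the function name `random_mask_patches` and its parameters are hypothetical, and only index selection is shown (no encoder or decoder). It assumes the image has already been cut into a fixed number of patches.

```python
import random


def random_mask_patches(num_patches, mask_ratio=0.75, seed=None):
    """Pick which patch indices stay visible and which are masked.

    Hypothetical helper for illustration; returns two sorted index lists.
    """
    rng = random.Random(seed)
    num_visible = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)  # a random permutation decides which patches survive
    visible_idx = sorted(order[:num_visible])
    masked_idx = sorted(order[num_visible:])
    return visible_idx, masked_idx


# A 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible_idx, masked_idx = random_mask_patches(196, mask_ratio=0.75, seed=0)
print(len(visible_idx), len(masked_idx))  # 49 147
```

In this sketch only the 49 visible indices would be passed to the encoder, which is where the compute saving described above would come from.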
  • 9
    VANESA
    This project moved to GitHub in 2021 (https://cbrinkrolf.github.io/VANESA/). VANESA is platform-independent software for creating individual pathways and examining biological networks drawn from distributed, heterogeneous data sources, e.g. KEGG and BRENDA. It also offers Petri net modeling of extended hybrid Petri nets, which can be simulated using the OpenModelica framework.
    Downloads: 0 This Week
  • 10
    Statistical Rethinking 2022

    Statistical Rethinking course winter 2022

    This repository hosts the 2022 version of the Statistical Rethinking course. It contains course materials such as R scripts, notebooks, and worked examples aligned with McElreath’s textbook. The code emphasizes Bayesian data analysis using R, the rethinking package, and Stan models. It includes lecture code files, example datasets, and structured exercises that parallel the topics covered in the lectures (probability, regression, model comparison, Bayesian updating). The repo functions as a...
    Downloads: 1 This Week
  • 11
    Guia do Cientista de Dados das Galáxias

    Repository for gathering information on study materials

    Guia do Cientista de Dados das Galáxias is an open-source community repository that aggregates educational resources, tools, and references related to data science, machine learning, and analytics. The project was created by the Pizza de Dados community with the goal of organizing useful materials for people interested in learning or working in the data science ecosystem. The repository collects links to books, podcasts, tutorials, datasets, communities, and study groups that can help...
    Downloads: 0 This Week
  • 12
    IoT Technical Guide

    IoT Technical Guide - Building a High-Performance IoT Platform

    ...It walks readers through the architecture, concepts, and engineering stack behind IoT systems instead of only listing isolated tools. The guide covers topics such as IoT market context, device models, ThingsBoard source-code learning, MQTT broker setup, CoAP services, message peak shaving, data modeling, and database selection. It is intended for developers who want to learn how real IoT platforms organize devices, messages, protocols, and backend services. The repository is especially useful as a structured study path for readers preparing to build or customize an IoT platform. It functions more as a technical tutorial and roadmap than as a single installable software package.
    Downloads: 2 This Week
  • 13
    GiantMIDI-Piano

    Classical piano MIDI dataset

    ...The dataset contains thousands of piano works, spanning a large number of composers and styles, with each piece transcribed into high-precision MIDI files capturing note events, pedal usage, velocities, etc. It provides a resource for music information retrieval (MIR), symbolic music modeling, composer classification, music generation, analysis of classical piano repertoire, and data-driven research in musicology or AI-based composition. Because the dataset is machine-generated via an automated transcription pipeline, it offers consistency, scale, and accessibility that would be difficult to achieve manually — enabling researchers to work with large corpora of piano music without copyright restrictions on symbolic data.
    Downloads: 2 This Week
  • 14
    Machine Learning in Asset Management

    ...The project collects educational materials, code implementations, and experiments related to applying artificial intelligence methods in financial markets. It covers topics such as predictive modeling for asset prices, portfolio optimization strategies, and risk management using machine learning algorithms. The repository also includes references to academic research, tutorials, and datasets that help users understand how machine learning can enhance traditional investment strategies. Many of the experiments focus on applying supervised learning, reinforcement learning, and statistical modeling techniques to financial data.
    Downloads: 0 This Week
  • 15
    Zenoss Community Edition

    Zenoss - Intelligent IT Operations Management

    Zenoss provides software-defined IT operations for the world’s largest organizations. We deliver the ultimate level of IT service health with simplicity by providing the most granular and intelligent IT service modeling possible, at any scale, and sharing these unique insights with other IT operations management (ITOM) tools to make them more efficient. Zenoss Community Edition is not a “demo” or trial version of Zenoss Enterprise or Zenoss Cloud! Before you install Zenoss Community...
    Downloads: 16 This Week
  • 16
    Statistics for Data Scientists

    "Statistics for Data Scientists: 50 Essential Concepts"

    The “statistics-for-data-scientists” repository is a pedagogical resource designed to bridge rigorous statistics theory and practical data science workflows. The code and materials are intended to help data scientists and analysts grasp statistical principles (e.g. inference, regressions, hypothesis testing, probability, confidence intervals) in contexts relevant to real data analysis tasks.
    Downloads: 1 This Week
  • 17
    scikit-learn tips

    50 scikit-learn tips

    scikit-learn-tips is an educational repository that collects practical advice and best practices for using the scikit-learn machine learning library effectively. The project consists of short explanations and examples that highlight common patterns, pitfalls, and techniques used when building machine learning workflows in Python. Each tip typically demonstrates how specific components of scikit-learn, such as pipelines, preprocessing utilities, or model evaluation tools, should be applied in...
    Downloads: 0 This Week
  • 18
    SparrowRecSys

    A Deep Learning Recommender System

    SparrowRecSys is an open-source deep learning recommendation system framework designed to demonstrate the architecture and implementation of modern industrial-scale recommender systems. The project integrates multiple machine learning models and data processing pipelines to simulate how real-world recommendation platforms operate. It includes components for offline data processing, feature engineering, model training, real-time data updates, and online recommendation services. SparrowRecSys supports a wide range of state-of-the-art recommendation algorithms, including models for click-through rate prediction and user behavior modeling that are widely used in advertising and content recommendation systems. ...
    Downloads: 0 This Week
  • 19
    Cockpit Next

    Add content management functionality to any site

    Cockpit is an open-source headless content management system designed to add flexible content management functionality to websites, applications, and digital platforms. The system provides a lightweight backend where developers can create custom content models and manage structured data without imposing a specific front-end framework. Because Cockpit follows an API-first architecture, content can be delivered through REST or GraphQL APIs to any application, including websites, mobile apps,...
    Downloads: 0 This Week
  • 20

    dycomodel

    Dynamic Consumption Modeling

    Dynamic Consumption Modeling creates a model of the feature consumption’s flow, taking into consideration:
      - the data from the prior period of consumption (one time produced)
      - a specific model suggested by the system and customized by the customer
      - the everyday/periodic inventory quantity
    The model is used to:
      - predict the remaining quantity of items
      - set the reordering points (first point, with fixed quantity, for fixed dates)
    GitHub: https://github.com/surban1974/dycomodel
    Downloads: 0 This Week
  • 21
    Quan is designed to model physical quantities in C++ programs. Advantages include automated dimensional analysis checking, automatic unit conversions, and self-documenting code.
    Downloads: 0 This Week
  • 22
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset and provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 23
    Transformer TTS

    Implementation of a Transformer based neural network

    ...The repository ships with tooling to build datasets (especially LJSpeech) and create training data, plus scripts to train both the aligner and the TTS model, monitor training with TensorBoard, and resume or reset training runs.
    Downloads: 0 This Week
  • 24
    deps.cloud

    Index and query dependencies across your company's private repository

    ...A dependency defines a relationship between two modules: specifically, a directed, versioned, and historical relationship. It’s this type of relationship that makes modeling data in traditional data stores difficult.
    Downloads: 0 This Week
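To make "directed, versioned, and historical" concrete, here is a minimal in-memory sketch. It does not reflect the deps.cloud schema or API: the `Dependency` class, its field names, the `dependents_of` helper, and the sample module names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Dependency:
    source: str                        # directed: the module declaring the dependency
    target: str                        # directed: the module being depended on
    version: str                       # versioned: the constraint recorded on the edge
    first_seen: date                   # historical: when the edge appeared
    last_seen: Optional[date] = None   # None while the edge is still current


def dependents_of(edges, target, on):
    """Modules that depended on `target` on the given day."""
    return sorted(
        e.source
        for e in edges
        if e.target == target
        and e.first_seen <= on
        and (e.last_seen is None or on <= e.last_seen)
    )


edges = [
    Dependency("billing", "http-client", "^1.2.0", date(2020, 1, 1), date(2020, 12, 31)),
    Dependency("billing", "http-client", "^2.0.0", date(2021, 1, 1)),
    Dependency("search", "http-client", "^2.0.0", date(2021, 6, 1)),
]
print(dependents_of(edges, "http-client", date(2020, 6, 1)))  # ['billing']
print(dependents_of(edges, "http-client", date(2021, 7, 1)))  # ['billing', 'search']
```

Even in this toy form, the same pair of modules needs several edges over time, which hints at why a flat table of "A depends on B" rows is an awkward fit.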
  • 25
    JSONAPI

    jsonapi.org style payload serializer and deserializer

    jsonapi provides helpers and reference code for working with the JSON:API specification, focusing on predictable serialization, deserialization, and linkage of related resources. It enforces the spec’s conventions—data, attributes, relationships, included—so clients and servers exchange data in a consistent, cacheable way. By centralizing how resource identifiers, links, and pagination metadata are emitted, it reduces subtle incompatibilities between services. The library favors explicit...
    Downloads: 0 This Week
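The JSON:API members named above (data, attributes, relationships, included) fit together roughly as in the sketch below. It builds a document by hand with the standard library and does not use or mirror the listed library's API; the `resource` helper and the sample `articles` / `people` resources are hypothetical.

```python
import json


def resource(type_, id_, attributes, relationships=None):
    """Build a JSON:API resource object (hypothetical helper)."""
    obj = {"type": type_, "id": str(id_), "attributes": attributes}
    if relationships:
        # Each relationship carries resource linkage: a type/id pair.
        obj["relationships"] = {
            name: {"data": {"type": rel_type, "id": str(rel_id)}}
            for name, (rel_type, rel_id) in relationships.items()
        }
    return obj


author = resource("people", 9, {"name": "Ada"})
article = resource("articles", 1, {"title": "Hello"}, {"author": ("people", 9)})

# The primary resource goes under "data"; related resources under "included".
document = {"data": article, "included": [author]}
print(json.dumps(document, indent=2))
```

The article refers to its author only by type and id, and the full author object appears once under "included", so a client can resolve the link without a second request.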