Showing 118 open source projects for "data modeling"

  • 1
    GiantMIDI-Piano

    Classical piano MIDI dataset

    ...The dataset contains thousands of piano works, spanning a large number of composers and styles, with each piece transcribed into high-precision MIDI files capturing note events, pedal usage, velocities, etc. It provides a resource for music information retrieval (MIR), symbolic music modeling, composer classification, music generation, analysis of classical piano repertoire, and data-driven research in musicology or AI-based composition. Because the dataset is machine-generated via an automated transcription pipeline, it offers consistency, scale, and accessibility that would be difficult to achieve manually — enabling researchers to work with large corpora of piano music without copyright restrictions on symbolic data.
    Downloads: 2 This Week
  • 2
    Machine Learning in Asset Management

    Machine Learning in Asset Management

    ...The project collects educational materials, code implementations, and experiments related to applying artificial intelligence methods in financial markets. It covers topics such as predictive modeling for asset prices, portfolio optimization strategies, and risk management using machine learning algorithms. The repository also includes references to academic research, tutorials, and datasets that help users understand how machine learning can enhance traditional investment strategies. Many of the experiments focus on applying supervised learning, reinforcement learning, and statistical modeling techniques to financial data.
    Downloads: 0 This Week
  • 3
    scikit-learn tips

    50 scikit-learn tips

    scikit-learn-tips is an educational repository that collects practical advice and best practices for using the scikit-learn machine learning library effectively. The project consists of short explanations and examples that highlight common patterns, pitfalls, and techniques used when building machine learning workflows in Python. Each tip typically demonstrates how specific components of scikit-learn, such as pipelines, preprocessing utilities, or model evaluation tools, should be applied in...
    Downloads: 0 This Week
  • 4
    SparrowRecSys

    A Deep Learning Recommender System

    SparrowRecSys is an open-source deep learning recommendation system framework designed to demonstrate the architecture and implementation of modern industrial-scale recommender systems. The project integrates multiple machine learning models and data processing pipelines to simulate how real-world recommendation platforms operate. It includes components for offline data processing, feature engineering, model training, real-time data updates, and online recommendation services. SparrowRecSys supports a wide range of state-of-the-art recommendation algorithms, including models for click-through rate prediction and user behavior modeling that are widely used in advertising and content recommendation systems. ...
    Downloads: 0 This Week
  • 5
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset and provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 6
    Transformer TTS

    Implementation of a Transformer-based neural network

    ...The repository ships with tooling to build datasets (especially LJSpeech) and create training data, plus scripts to train both the aligner and the TTS model, monitor training with TensorBoard, and resume or reset training runs.
    Downloads: 0 This Week
  • 7
    MLOps Course

    Learn how to design, develop, deploy and iterate on ML apps

    The MLOps Course by Goku Mohandas is an open-source curriculum that teaches how to combine machine learning with solid software engineering to build production-grade ML applications. It is structured around the full lifecycle: data pipelines, modeling, experiment tracking, deployment, testing, monitoring, and iteration. The repository itself contains configuration, code examples, and links to accompanying lessons hosted on the Made With ML site, which provide detailed narrative explanations and diagrams. Instead of focusing only on model training, the course emphasizes best practices like modular code design, CI/CD, containerization, reproducibility, and responsible ML (including monitoring and feedback loops). ...
    Downloads: 0 This Week
  • 8
    XLM (Cross-lingual Language Model)

    Original PyTorch implementation of Cross-lingual Language Model

    XLM (Cross-lingual Language Model) is a family of multilingual pretraining methods that align representations across languages to enable strong zero-shot transfer. It popularized objectives like Masked Language Modeling (MLM) across many languages and Translation Language Modeling (TLM) that jointly trains on parallel sentence pairs to tighten cross-lingual alignment. Using a shared subword vocabulary, XLM learns language-agnostic features that work well for classification and sequence...
    Downloads: 0 This Week
  • 9
    Objectron

    A dataset of short, object-centric video clips

    The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists...
    Downloads: 0 This Week
  • 10
    Pipeline for training Language Models

    Pipeline for training Language Models using PyTorch.

    A pipeline for training language models using PyTorch, inspired by the Yandex Data School NLP Course (week 03: Language Modeling). It expects a prepared text file with space-separated words on each line.
    Downloads: 0 This Week
  • 11
    OpenAI Glow

    Code for the paper "Glow: Generative Flow with Invertible 1x1 Convolutions"

    Glow is an open source generative model released by OpenAI that demonstrates flow-based generative modeling techniques. Unlike models that rely on approximate inference, Glow uses invertible transformations to directly learn the data distribution, allowing for exact likelihood computation and efficient sampling. The model is capable of producing high-quality synthetic images while maintaining interpretable latent spaces that enable meaningful manipulation of generated outputs. ...
    Downloads: 0 This Week
  • 12
    Image GPT

    Large-scale autoregressive pixel model for image generation by OpenAI

    Image-GPT is the official research code and models from OpenAI’s paper Generative Pretraining from Pixels. The project adapts GPT-2 to the image domain, showing that the same transformer architecture can model sequences of pixels without altering its fundamental structure. It provides scripts to download pretrained checkpoints of different model sizes (small, medium, large) trained on large-scale datasets and includes utilities for handling color quantization with a 9-bit palette....
    Downloads: 6 This Week
  • 13
    RoboSat

    Semantic segmentation on aerial and satellite imagery

    RoboSat is an end-to-end pipeline written in Python 3 for feature extraction from aerial and satellite imagery. Features can be anything visually distinguishable in the imagery, for example buildings, parking lots, roads, or cars.
    Downloads: 0 This Week
  • 14
    Machine Learning Mindmap

    A mindmap summarising Machine Learning concepts

    ...The project organizes a wide range of machine learning topics into an interconnected diagram that helps learners understand how concepts relate to one another across the broader field of artificial intelligence. The mind map covers fundamental areas such as data preprocessing, statistical analysis, supervised learning, unsupervised learning, reinforcement learning, and deep learning architectures. By arranging these concepts visually, the repository allows students and practitioners to quickly explore the relationships between algorithms, techniques, and modeling approaches used in modern machine learning workflows. ...
    Downloads: 0 This Week
  • 15
    TensorSpace.js

    Neural network 3D visualization framework

    TensorSpace is a neural network 3D visualization framework built using TensorFlow.js, Three.js and Tween.js. TensorSpace provides Keras-like APIs to build deep learning layers, load pre-trained models, and generate a 3D visualization in the browser. From TensorSpace, it is intuitive to learn what the model structure is, how the model is trained and how the model predicts the results based on the intermediate information. After preprocessing the model, TensorSpace supports the visualization...
    Downloads: 0 This Week
  • 16
    DS-Take-Home

    Solutions to the book "A Collection of Data Science Take-Home Challenges"

    ...The problems cover a broad set of applied data science topics including conversion rate analysis, fraud detection, employee retention modeling, marketing campaign evaluation, and recommendation-style problems.
    Downloads: 0 This Week
  • 17
    Learn_Data_Science_in_3_Months

    This is the Curriculum for "Learn Data Science in 3 Months"

    This project lays out a 12-week plan to go from basics to a portfolio-ready understanding of data science. It breaks the journey into clear stages: Python fundamentals, data wrangling, visualization, statistics, machine learning, and end-to-end projects. The schedule mixes learning and doing, encouraging you to build small deliverables each week—like notebooks, dashboards, and model demos—to reinforce skills. It also includes suggestions for datasets and problem domains so you aren’t stuck...
    Downloads: 0 This Week
  • 18
    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    ...Its core goal is to give researchers a flexible, modular framework for building and training encoder–decoder architectures while fully leveraging distributed and mixed-precision training. The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as sentiment analysis. It supports multi-GPU and multi-node data-parallel training, and integrates with Horovod to scale out across large GPU clusters. Mixed-precision support (float16) is optimized for NVIDIA Volta and Turing GPUs, allowing significant speedups and memory savings without sacrificing model quality. ...
    Downloads: 0 This Week
  • 19
    MatlabFunc

    Matlab codes for feature learning

    MatlabFunc is a collection of MATLAB functions developed by the ZJULearning group to support various tasks in computer vision, machine learning, and numerical computation. The repository brings together a wide range of utility scripts, algorithms, and implementations that serve as building blocks for research and development. These functions cover areas such as matrix operations, optimization, data processing, and visualization, making them broadly applicable across different research...
    Downloads: 2 This Week
  • 20
    Convolutional Recurrent Neural Network

    Convolutional Recurrent Neural Network (CRNN) for image-based sequence

    Convolutional Recurrent Neural Network provides an implementation of the Convolutional Recurrent Neural Network (CRNN) architecture, a deep learning model designed for image-based sequence recognition tasks such as optical character recognition and scene text recognition. The architecture combines convolutional neural networks for extracting visual features from images with recurrent neural networks that model sequential dependencies in the extracted features. This hybrid approach allows the...
    Downloads: 0 This Week
  • 21
    Edward

    A probabilistic programming language in TensorFlow

    Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming. ...
    Downloads: 0 This Week
  • 22
    Seq2seq Chatbot for Keras

    This repository contains a new generative model of chatbot

    This repository contains a new generative chatbot model based on seq2seq modeling. The trained model available here used a small dataset of ~8K pairs, each consisting of a context (the last two utterances of the dialogue up to the current point) and its response. The data were collected from dialogues in online English courses. The trained model can be fine-tuned on a closed-domain dataset for real-world applications.
    Downloads: 0 This Week
  • 23
    Retrieval-Based Conversational Model

    Dual LSTM Encoder for Dialog Response Generation

    Retrieval-Based Conversational Model in Tensorflow is a project implementing a retrieval-based conversational model using a dual LSTM encoder architecture in TensorFlow, illustrating how neural networks can be trained to select appropriate responses from a fixed set of candidate replies rather than generate them from scratch. The core idea is to embed both the conversation context and potential replies into vector representations, then score how well each candidate fits the current dialogue,...
    Downloads: 0 This Week
  • 24
    Twitter Research Data Collector
    Collects tweets through the Twitter Streaming API according to different search criteria and saves them in CSV and ARFF (WEKA) file formats.
    Downloads: 0 This Week
  • 25
    Apache PredictionIO

    Machine learning server for developers and ML engineers

    Apache PredictionIO® is an open source machine learning server built on top of a state-of-the-art open source stack, for developers and data scientists to create predictive engines for any machine learning task. It lets you quickly build and deploy an engine as a web service in production with customizable templates; respond to dynamic queries in real time once deployed as a web service; evaluate and tune multiple engine variants systematically; unify data from multiple platforms in batch or in real time for comprehensive predictive analytics; speed up machine learning modeling with systematic processes and pre-built evaluation measures; support machine learning and data processing libraries such as Spark MLlib and OpenNLP; implement your own machine learning models and seamlessly incorporate them into your engine; and simplify data infrastructure management.
    Downloads: 0 This Week
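
The Retrieval-Based Conversational Model entry (item 23) describes embedding the conversation context and each candidate reply as vectors, then scoring how well each candidate fits. A minimal sketch of that select-by-score idea is given here. It is not the project's code: the encoder is a toy bag-of-words counter standing in for the dual LSTM encoders, the score is plain cosine similarity, and the names `embed`, `cosine`, and `rank_candidates` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an encoder: a bag-of-words count vector (NOT an LSTM)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(context, candidates):
    """Score every candidate reply against the context, best first."""
    ctx_vec = embed(context)
    scored = [(cosine(ctx_vec, embed(reply)), reply) for reply in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative data only; a real system would use a fixed response pool.
context = "what time does the library open"
candidates = [
    "the library opens at nine",
    "i like pizza",
    "the train leaves at noon",
]

ranking = rank_candidates(context, candidates)
best_score, best_reply = ranking[0]
print(best_reply)
```

The point of the sketch is the shape of the computation: both sides are mapped into the same vector space, and the reply is chosen by ranking rather than generated. A trained dual encoder would replace `embed` with learned networks and would typically learn the scoring function as well.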