Showing 412 open source projects for "python data analysis"

  • 1
    MLPACK is a C++ machine learning library with emphasis on scalability, speed, and ease of use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and flexibility for expert users. More info and downloads: https://mlpack.org; Git repo: https://github.com/mlpack/mlpack
    Downloads: 0 This Week
  • 2
    OGB

    Benchmark datasets, data loaders, and evaluators for graph machine

    The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner. OGB is a community-driven initiative in active development. We expect the benchmark datasets to evolve. OGB provides a diverse set of challenging and realistic benchmark datasets that...
    Downloads: 0 This Week
  • 3
    DIG

    A library for graph deep learning research

    The key difference with current graph deep learning libraries, such as PyTorch Geometric (PyG) and Deep Graph Library (DGL), is that, while PyG and DGL support basic graph deep learning operations, DIG provides a unified testbed for higher level, research-oriented graph deep learning tasks, such as graph generation, self-supervised learning, explainability, 3D graphs, and graph out-of-distribution. If you are working or plan to work on research in graph deep learning, DIG enables you to...
    Downloads: 0 This Week
  • 4
    FFCV

    Fast Forward Computer Vision (and other ML workloads!)

    ffcv is a drop-in data loading system that dramatically increases data throughput in model training. From gridding to benchmarking to fast research iteration, there are many reasons to want faster model training. Below we present premade codebases for training on ImageNet and CIFAR, including both (a) extensible codebases and (b) numerous premade training configurations.
    Downloads: 2 This Week
  • 5
    T81 558

    Applications of Deep Neural Networks

    Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks that can handle tabular data, images, text, and audio as both input and output. Deep learning allows a neural network to learn hierarchies of information in a way that is like the function of the human brain. This course will introduce the student to classic neural network...
    Downloads: 0 This Week
  • 6
    Merlion

    A Machine Learning Framework for Time Series Intelligence

    Merlion is a Python library for time series intelligence. It provides an end-to-end machine learning framework that includes loading and transforming data, building and training models, post-processing model outputs, and evaluating model performance. It supports various time series learning tasks, including forecasting, anomaly detection, and change point detection for both univariate and multivariate time series.
    Downloads: 5 This Week
  • 7
    Sockeye

    Sequence-to-sequence framework, focused on Neural Machine Translation

    Sockeye is an open-source sequence-to-sequence framework for Neural Machine Translation built on PyTorch. It implements distributed training and optimized inference for state-of-the-art models, powering Amazon Translate and other MT applications. For a quickstart guide to training a standard NMT model on any size of data, see the WMT 2014 English-German tutorial. If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions...
    Downloads: 0 This Week
  • 8
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    Mars is a distributed computing framework designed to scale scientific computing and data science workloads across large clusters while preserving the familiar programming interfaces of common Python libraries. The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments.
    Downloads: 3 This Week
  • 9

    audioFlux

    A library for audio and music analysis and feature extraction.

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. The extracted features can be fed to deep learning networks for training, and the library is used to study various tasks in the audio field, such as classification, separation, music information retrieval (MIR), and ASR.
    Downloads: 4 This Week
  • 10
    UnionML

    Build and deploy machine learning microservices

    Creating ML apps should be simple and frictionless. UnionML is an open-source Python framework built on top of Flyte™, unifying the complex ecosystem of ML tools into a single interface. Combine the tools that you love using a simple, standardized API so you can stop writing so much boilerplate and focus on what matters: the data and the models that learn from them. Fit the rich ecosystem of tools and frameworks into a common protocol for machine learning.
    Downloads: 0 This Week
  • 11
    d2l-zh

    Chinese-language edition of Dive into Deep Learning

    d2l-zh is the Chinese-language edition of Dive into Deep Learning, an interactive, open-source deep learning textbook that combines code, math, and explanatory text. It features runnable Jupyter notebooks compatible with multiple frameworks (e.g., PyTorch, MXNet, TensorFlow), comprehensive theoretical analysis, and exercises. It has been adopted in over 70 countries and is used by more than 500 universities for teaching deep learning.
    Downloads: 0 This Week
  • 12
    Karate Club

    An API Oriented Open-source Python Framework for Unsupervised Learning

    Karate Club is an unsupervised machine learning extension library for NetworkX. It consists of state-of-the-art methods for unsupervised learning on graph-structured data. Put simply, it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. Implemented methods cover a wide range of network science...
    Downloads: 0 This Week
  • 13
    LSTMs for Human Activity Recognition

    Human Activity Recognition example using TensorFlow on smartphone

    LSTM-Human-Activity-Recognition is a machine learning project that demonstrates how recurrent neural networks can be used to recognize human activities from sensor data. The repository implements a deep learning model based on Long Short-Term Memory (LSTM) networks to classify physical activities using time-series data collected from wearable sensors. The project uses the well-known Human Activity Recognition dataset derived from smartphone accelerometer and gyroscope signals. Through the...
    Downloads: 1 This Week
  • 14
    DeepCTR

    Package of deep-learning based CTR models

    DeepCTR is an easy-to-use, modular, and extensible package of deep-learning-based CTR models, along with many core component layers that can be used to easily build custom models. You can use any complex model with model.fit() and model.predict(). It provides a tf.keras.Model-like interface for quick experiments and a TensorFlow Estimator interface for large-scale data and distributed training. It is compatible with both tf 1.x and tf 2.x. With the great success of deep learning, DNN-based...
    Downloads: 0 This Week
  • 15
    Python ML Jupyter Notebooks

    Practice and tutorial-style notebooks

    Python ML Jupyter Notebooks is an educational repository that demonstrates how to implement machine learning algorithms and data science workflows using Python. The project provides numerous examples and tutorials covering classical machine learning techniques such as regression, classification, clustering, and dimensionality reduction.
    Downloads: 0 This Week
  • 16
    Pattern

    Web mining module for Python, with tools for scraping

    Pattern is an open-source Python library that provides tools for web mining, natural language processing, machine learning, and network analysis. The project integrates multiple capabilities into a single framework that allows developers to collect, process, and analyze textual data from the web. It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases. ...
    Downloads: 0 This Week
  • 17
    Data Science Collected Resources

    Carefully curated resource links for data science in one place

    ...Its goal is to provide learners and practitioners with easy access to high-quality resources related to data science tools, programming languages, cloud platforms, and machine learning techniques. The repository includes links to materials discussing topics such as artificial intelligence research, AWS infrastructure, machine learning algorithms, and data analysis tools. It also contains supplementary documents like cheat sheets and machine learning notes that help readers review important concepts quickly.
    Downloads: 0 This Week
  • 18
    Machine Learning Git Codebook

    For extensive instructor-led learning

    Machine Learning Git Codebook is an educational repository that provides a structured introduction to data science and machine learning concepts through a series of interactive notebooks and practical examples. The project is designed as a self-paced learning resource that walks learners through the full data science workflow, including data preprocessing, exploratory analysis, feature engineering, and model development. It covers a wide range of machine learning techniques such as decision trees, clustering methods, nearest neighbor algorithms, anomaly detection, and probabilistic classifiers. ...
    Downloads: 0 This Week
  • 19
    DeepCTR-Torch

    Easy-to-use, modular, and extensible package of deep-learning models

    DeepCTR-Torch is an easy-to-use, modular, and extensible package of deep-learning-based CTR models, along with many core component layers that can be used to build your own custom model easily. It is compatible with PyTorch. You can use any complex model with model.fit() and model.predict(). With the great success of deep learning, DNN-based techniques have been widely used in CTR estimation tasks. The data in the CTR estimation task usually includes highly sparse, high-cardinality categorical...
    Downloads: 0 This Week
  • 20
    DialoGPT

    Large-scale pretraining for dialogue

    DialoGPT is an open-source conversational language model developed by Microsoft Research for generating natural dialogue responses using large-scale transformer architectures. The system is built on the GPT-2 architecture and is designed specifically for multi-turn conversation tasks, enabling machines to produce coherent responses during interactive dialogue. The model was trained on a massive dataset of approximately 147 million conversational exchanges extracted from Reddit discussion...
    Downloads: 0 This Week
  • 21
    SGX-Full-OrderBook-Tick-Data-Trading

    Providing the solutions for high-frequency trading (HFT) strategies

    SGX-Full-OrderBook-Tick-Data-Trading-Strategy is an open-source research project focused on modeling high-frequency financial market behavior using machine learning techniques. The repository analyzes tick-level order book data from the Singapore Exchange and attempts to capture the dynamics of limit order book movements. By extracting features such as order depth ratios and price movement indicators, the system trains machine learning models to predict short-term market changes. ...
    Downloads: 0 This Week
  • 22
    OmicSelector

    Feature selection and deep learning modeling for omic biomarker study

    OmicSelector is an environment, Docker-based web application, and R package for biomarker signature selection (feature selection) from high-throughput experiments and others. It was initially developed for miRNA-seq (small RNA, smRNA-seq; hence the original name, miRNAselector), RNA-seq, and qPCR, but can be applied to any problem where numeric features should be selected to counteract overfitting of the models. Using our tool, you can choose features, like miRNAs, with the most significant...
    Downloads: 0 This Week
  • 23
    Python Machine Learning 3rd Ed.

    The "Python Machine Learning (3rd edition)" book code repository

    Python Machine Learning 3rd Ed. repository contains the complete source code that accompanies the book Python Machine Learning by Sebastian Raschka and collaborators. The project provides implementations of machine learning algorithms and data science workflows described in the book, enabling readers to experiment with real code while studying theoretical concepts.
    Downloads: 0 This Week
  • 24
    Elephas

    Distributed Deep learning with Keras & Spark

    Elephas is an extension of Keras, which allows you to run distributed deep learning models at scale with Spark. Elephas currently supports a number of applications. Elephas brings deep learning with Keras to Spark. Elephas intends to keep the simplicity and high usability of Keras, thereby allowing for fast prototyping of distributed models, which can be run on massive data sets. Elephas implements a class of data-parallel algorithms on top of Keras, using Spark's RDDs and data frames. Keras...
    Downloads: 0 This Week
  • 25
    mlr

    Machine Learning in R

    R does not define a standardized interface for its machine-learning algorithms. Therefore, for any non-trivial experiments, you need to write lengthy, tedious, and error-prone wrappers to call the different algorithms and unify their respective output. {mlr} provides this infrastructure so that you can focus on your experiments! The framework provides supervised methods like classification, regression, and survival analysis along with their corresponding evaluation and optimization methods,...
    Downloads: 0 This Week