Showing 330 open source projects for "python data analysis"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • MongoDB 8.0 on Atlas | Run anywhere Icon
    MongoDB 8.0 on Atlas | Run anywhere

    Now available in even more cloud regions across AWS, Azure, and Google Cloud.

    MongoDB 8.0 brings enhanced performance and flexibility to Atlas—with expanded availability across 125+ regions globally. Build modern apps anywhere your users are, with the power of a modern database behind you.
    Learn More
  • 1
    openTSNE

    openTSNE

    Extensible, parallel implementations of t-SNE

    openTSNE is a modular Python implementation of t-Distributed Stochasitc Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5], enabling t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Hamilton DAGWorks

    Hamilton DAGWorks

    Helps scientists define testable, modular, self-documenting dataflow

    Hamilton is a lightweight Python library for directed acyclic graphs (DAGs) of data transformations. Your DAG is portable; it runs anywhere Python runs, whether it's a script, notebook, Airflow pipeline, FastAPI server, etc. Your DAG is expressive; Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution). To create a DAG, write regular Python functions that specify their dependencies with their parameters...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Linfa

    Linfa

    A Rust machine learning framework

    linfa aims to provide a comprehensive toolkit to build Machine Learning applications with Rust. Kin in spirit to Python's scikit-learn, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    fklearn

    fklearn

    Functional Machine Learning

    fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Sales CRM and Pipeline Management Software | Pipedrive Icon
    Sales CRM and Pipeline Management Software | Pipedrive

    The easy and effective CRM for closing deals

    Pipedrive’s simple interface empowers salespeople to streamline workflows and unite sales tasks in one workspace. Unlock instant sales insights with Pipedrive’s visual sales pipeline and fine-tune your strategy with robust reporting features and a personalized AI Sales Assistant.
    Try it for free
  • 5
    pmdarima

    pmdarima

    Statistical library designed to fill the void in Python's time series

    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    pycm

    pycm

    Multi-class confusion matrix library in Python

    PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix, and a proper tool for post-classification model evaluation that supports most classes and overall statistics parameters. PyCM is the swiss-army knife of confusion matrices, targeted mainly at data scientists that need a broad array of metrics for predictive models and an accurate evaluation of large variety of classifiers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    POT

    POT

    Python Optimal Transport

    This open source Python library provides several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    TextAttack

    TextAttack

    Python framework for adversarial attacks, and data augmentation

    Generating adversarial examples for NLP models. TextAttack is a Python framework for adversarial attacks, data augmentation, and model training in NLP.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    River ML

    River ML

    Online machine learning in Python

    River is a Python library for online machine learning. It aims to be the most user-friendly library for doing machine learning on streaming data. River is the result of a merger between creme and scikit-multiflow.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Crowdtesting That Delivers | Testeum Icon
    Crowdtesting That Delivers | Testeum

    Unfixed bugs delaying your launch? Test with real users globally – check it out for free, results in days.

    Testeum connects your software, app, or website to a worldwide network of testers, delivering detailed feedback in under 48 hours. Ensure functionality and refine UX on real devices, all at a fraction of traditional costs. Trusted by startups and enterprises alike, our platform streamlines quality assurance with actionable insights. Click to perfect your product now.
    Click to perfect your product now.
  • 10
    AutoKeras

    AutoKeras

    AutoML library for deep learning

    AutoKeras: An AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras is to make machine learning accessible to everyone. AutoKeras only support Python 3. If you followed previous steps to use virtualenv to install tensorflow, you can just activate the virtualenv. Currently, AutoKeras is only compatible with Python >= 3.7 and TensorFlow >= 2.8.0. AutoKeras supports several tasks with extremely simple interface. AutoKeras would search for the best...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    gensim

    gensim

    Topic Modelling for Humans

    Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Featuretools

    Featuretools

    An open source python library for automated feature engineering

    An open source Python framework for automated feature engineering. Featuretools automatically creates features from temporal and relational datasets. Featuretools uses DFS for automated feature engineering. You can combine your raw data with what you know about your data to build meaningful features for machine learning and predictive modeling. Featuretools provides APIs to ensure only valid data is used for calculations, keeping your feature vectors safe from common label leakage problems. You...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    fugue

    fugue

    A unified interface for distributed computing

    Fugue is a unified interface for distributed computing that lets users execute Python, Pandas, and SQL code on Spark, Dask, and Ray with minimal rewrites.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Alibi Detect

    Alibi Detect

    Algorithms for outlier, adversarial and drift detection

    Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline detectors for tabular data, text, images and time series. Both TensorFlow and PyTorch backends are supported for drift detection.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Arize Phoenix

    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    flair

    flair

    A very simple framework for state-of-the-art NLP

    A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages. A text embedding library. Flair has simple...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    fastai

    fastai

    Deep learning library

    ... of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. It is built on top of a hierarchy of lower-level APIs which provide composable building blocks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    snorkel

    snorkel

    A system for quickly generating training data with weak supervision

    The Snorkel team is now focusing their efforts on Snorkel Flow, an end-to-end AI application development platform based on the core ideas behind Snorkel. The Snorkel project started at Stanford in 2016 with a simple technical bet: that it would increasingly be the training data, not the models, algorithms, or infrastructure, that decided whether a machine learning project succeeded or failed. Given this premise, we set out to explore the radical idea that you could bring mathematical...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    DALI

    DALI

    A GPU-accelerated library containing highly optimized building blocks

    The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    SageMaker Training Toolkit

    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Chronos Forecasting

    Chronos Forecasting

    Pretrained (Language) Models for Probabilistic Time Series Forecasting

    Chronos is a family of pretrained time series forecasting models based on language model architectures. A time series is transformed into a sequence of tokens via scaling and quantization, and a language model is trained on these tokens using the cross-entropy loss. Once trained, probabilistic forecasts are obtained by sampling multiple future trajectories given the historical context. Chronos models have been trained on a large corpus of publicly available time series data, as well...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    TorchAudio

    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    High-Level Training Utilities Pytorch

    High-Level Training Utilities Pytorch

    High-level training, data augmentation, and utilities for Pytorch

    Contains significant improvements, bug fixes, and additional support. Get it from the releases, or pull the master branch. This package provides a few things. A high-level module for Keras-like training with callbacks, constraints, and regularizers. Comprehensive data augmentation, transforms, sampling, and loading. Utility tensor and variable functions so you don't need numpy as often. Have any feature requests? Submit an issue! I'll make it happen. Specifically, any data augmentation, data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    TorchRec

    TorchRec

    Pytorch domain library for recommendation systems

    TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems (RecSys). It allows authors to train models with large embedding tables sharded across many GPUs. Parallelism primitives that enable easy authoring of large, performant multi-device/multi-node models using hybrid data-parallelism/model-parallelism. The TorchRec sharder can shard embedding tables with different sharding strategies including data-parallel...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Raster Vision

    Raster Vision

    Open source framework for deep learning satellite and aerial imagery

    Raster Vision is an open source framework for Python developers building computer vision models on satellite, aerial, and other large imagery sets (including oblique drone imagery). There is built-in support for chip classification, object detection, and semantic segmentation using PyTorch. Raster Vision allows engineers to quickly and repeatably configure pipelines that go through core components of a machine learning workflow: analyzing training data, creating training chips, training models...
    Downloads: 0 This Week
    Last Update:
    See Project
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.