Showing 136 open source projects for "spreadsheet machine learning"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    .NET for Apache Spark

    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    .NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular Dataframe and SparkSQL aspects of Apache Spark, for working with structured data, and Spark Structured Streaming, for working with streaming data. .NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    GraphNeuralNetworks.jl

    GraphNeuralNetworks.jl

    Graph Neural Networks in Julia

    GraphNeuralNetworks.jl is a graph neural network library written in Julia and based on the deep learning framework Flux.jl.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    FinMind

    FinMind

    Open Data, more than 50 financial data

    ...Regardless of the program, you can download data through the api provided by FinMind, or you can download data directly from the website. After data is available, statistical analysis, regression analysis, time series analysis, machine learning, and deep learning can be performed. For individual stocks, provide visual analysis of technical, fundamental, and chip levels. According to different strategies, back-test analysis is performed to provide performance, profit and loss, and stock selection targets of different strategy investment portfolios.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    marimo

    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python, reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are reproducible, extremely interactive, designed for collaboration (git-friendly!), deployable as scripts or apps, and fit for modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 5
    JDF.jl

    JDF.jl

    Julia DataFrames serialization format

    JDF is a DataFrames serialization format with the following goals, fast save and load times, compressed storage on disk, enabled disk-based data manipulation (not yet achieved), and support for machine learning workloads, e.g. mini-batch, sampling (not yet achieved). JDF stores a DataFrame in a folder with each column stored as a separate file. There is also a metadata.jls file that stores metadata about the original DataFrame. Collectively, the column files, the metadata file, and the folder is called a JDF "file". JDF.jl is a pure-Julia solution and there are a lot of ways to do nifty things like compression and encapsulating the underlying struture of the arrays that's hard to do in R and Python. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    LossFunctions.jl

    LossFunctions.jl

    Julia package of loss functions for machine learning

    ...As such, it is a part of the JuliaML ecosystem. The sole purpose of this package is to provide an efficient and extensible implementation of various loss functions used throughout Machine Learning (ML). It is thus intended to serve as a special purpose back-end for other ML libraries that require losses to accomplish their tasks. To that end we provide a considerable amount of carefully implemented loss functions, as well as an API to query their properties (e.g. convexity). Furthermore, we expose methods to compute their values, derivatives, and second derivatives for single observations as well as arbitrarily sized arrays of observations. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    Zingg

    Zingg

    Scalable master data management and identity resolution

    Zingg is an open-source entity resolution and master data management platform for finding duplicate, related, or matching records across large datasets. It uses machine learning to learn how records should be compared, reducing the need for brittle hand-written matching rules. The project is designed for data engineering and analytics teams working on customer 360, supplier 360, deduplication, fuzzy matching, data quality, and golden record workflows. Zingg runs on Apache Spark and can scale to large data lake, warehouse, and cloud platform environments. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    AI Data Science Team

    AI Data Science Team

    An AI-powered data science team of agents

    AI Data Science Team is a Python library and agent ecosystem designed to accelerate and automate common data science workflows by modeling them as specialized AI “agents” that can be orchestrated to perform tasks like data cleaning, transformation, analysis, visualization, and machine learning. It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    PySR

    PySR

    High-Performance Symbolic Regression in Python and Julia

    PySR is an open-source tool for Symbolic Regression: a machine learning task where the goal is to find an interpretable symbolic expression that optimizes some objective. Over a period of several years, PySR has been engineered from the ground up to be (1) as high-performance as possible, (2) as configurable as possible, and (3) easy to use. PySR is developed alongside the Julia library SymbolicRegression.jl, which forms the powerful search engine of PySR.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10
    Automated Tool for Optimized Modelling

    Automated Tool for Optimized Modelling

    Automated Tool for Optimized Modelling

    ...How many times have you copy-and-pasted code from an old repository to re-use it in a new use case? ATOM is here to help solve these common issues. The package acts as a wrapper of the whole machine learning pipeline, helping the data scientist to rapidly find a good model for his problem.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ReservoirComputing.jl

    ReservoirComputing.jl

    Reservoir computing utilities for scientific machine learning (SciML)

    ReservoirComputing.jl provides an efficient, modular and easy-to-use implementation of Reservoir Computing models such as Echo State Networks (ESNs). For information on using this package please refer to the stable documentation. Use the in-development documentation to take a look at not-yet-released features.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    PySyft

    PySyft

    Data science on data without acquiring a copy

    Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DiffEqBayes.jl

    DiffEqBayes.jl

    Extension functionality which uses Stan.jl, DynamicHMC.jl

    This repository is a set of extension functionality for estimating the parameters of differential equations using Bayesian methods. It allows the choice of using CmdStan.jl, Turing.jl, DynamicHMC.jl and ApproxBayes.jl to perform a Bayesian estimation of a differential equation problem specified via the DifferentialEquations.jl interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    CleanVision

    CleanVision

    Automatically find issues in image datasets

    CleanVision automatically detects potential issues in image datasets like images that are: blurry, under/over-exposed, (near) duplicates, etc. This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset, which you want to address before applying machine learning. CleanVision is super simple -- run the same couple lines of Python code to audit any image dataset! The quality of machine learning models hinges on the quality of the data used to train them, but it is hard to manually identify all of the low-quality data in a big dataset. CleanVision helps you automatically identify common types of data issues lurking in image datasets. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Orange Data Mining

    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections.
    Downloads: 77 This Week
    Last Update:
    See Project
  • 17
    Kibana

    Kibana

    Your window into the Elastic Stack

    Kibana is a analytics and search dashboard for Elasticsearch that allows you to visualize Elasticsearch data and efficiently navigate the Elastic Stack. With Kibana you can visualize and shape your data simply and intuitively, share visualizations for greater collaboration, organize dashboards and visualizations, and so much more.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    Cleanlab

    Cleanlab

    The standard data-centric AI package for data quality and ML

    cleanlab helps you clean data and labels by automatically detecting issues in a ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Dask

    Dask

    Parallel computing with task scheduling

    ...It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    PDMats.jl

    PDMats.jl

    Uniform Interface for positive definite matrices of various structures

    Uniform interface for positive definite matrices of various structures. Positive definite matrices are widely used in machine learning and probabilistic modeling, especially in applications related to graph analysis and Gaussian models. It is not uncommon that positive definite matrices used in practice have special structures (e.g. diagonal), which can be exploited to accelerate computation. PDMats.jl supports efficient computation on positive definite matrices of various structures. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 21
    Best-of Python

    Best-of Python

    A ranked list of awesome Python open-source libraries

    This curated list contains 390 awesome open-source projects with a total of 1.4M stars grouped into 28 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome python libraries for web...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 22
    sparklyr

    sparklyr

    R interface for Apache Spark

    sparklyr is an R package that provides seamless interfacing with Apache Spark clusters—either local or remote—while letting users write code in familiar R paradigms. It supplies a dplyr-compatible backend, Spark machine learning pipelines, SQL integration, and I/O utilities to manipulate and analyze large datasets distributed across cluster environments.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    SciMLBase.jl

    SciMLBase.jl

    The Base interface of the SciML ecosystem

    ...The SciML common interface ties together the numerical solvers of the Julia package ecosystem into a single unified interface. It is designed for maximal efficiency and parallelism, while incorporating essential features for large-scale scientific machine learning such as differentiability, composability, and sparsity.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Surrogates.jl

    Surrogates.jl

    Surrogate modeling and optimization for scientific machine learning

    A surrogate model is an approximation method that mimics the behavior of a computationally expensive simulation. In more mathematical terms: suppose we are attempting to optimize a function f(p), but each calculation of f is very expensive. It may be the case we need to solve a PDE for each point or use advanced numerical linear algebra machinery, which is usually costly. The idea is then to develop a surrogate model g which approximates f by training on previous data collected from...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Elasticsearch

    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 5 This Week
    Last Update:
    See Project
Auth0 Logo