Showing 3715 open source projects for "data science"

  • 1
    AWESOME DATA SCIENCE

    Awesome Data Science repository to learn and apply to real-world problems

    An open source Data Science repository to learn and apply towards solving real-world problems. This is a shortcut path to start studying Data Science. Just follow the steps to answer the questions, "What is Data Science and what should I study to learn Data Science?" Data Science is one of the hottest topics in computing and on the Internet nowadays. People have been gathering data from applications and systems for years, and now is the time to analyze it. The next steps are producing...
    Downloads: 0 This Week
  • 2
    Cookiecutter Data Science

    Project structure for doing and sharing data science work

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! And we're not talking...
    Downloads: 0 This Week
  • 3
    Julia Data Science

    Book on Julia for Data Science

    This is an open source and open access book on how to do Data Science using Julia. Our target audience is researchers from all fields of applied sciences. Of course, we hope to be useful for industry too. You can navigate through the pages of the ebook by using the arrow keys (left/right) on your keyboard.
    Downloads: 0 This Week
  • 4
    ggplot2

    An implementation of the Grammar of Graphics in R

    ... for plotting. In most cases, using ggplot2 starts with supplying a dataset and aesthetic mapping (with aes()); adding on layers (like geom_point() or geom_histogram()), scales (like scale_colour_brewer()), and faceting specifications (like facet_wrap()); and finally, coordinate systems. ggplot2 has a rich ecosystem of community-maintained extensions for those looking for more innovation. ggplot2 is a part of the tidyverse, an ecosystem of R packages designed for data science.
    Downloads: 16 This Week
  • 5
    Lua

    The Lua development repository, as seen by the Lua team

    Lua is a powerful, efficient, lightweight, embeddable scripting language. It supports procedural programming, object-oriented programming, functional programming, data-driven programming, and data description. Lua combines simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode with a register-based virtual machine, and has automatic memory management with incremental garbage...
    Downloads: 18 This Week
  • 6
    AWS Step Functions Data Science SDK

    For building machine learning (ML) workflows and pipelines on AWS

    The AWS Step Functions Data Science SDK is an open-source library that allows data scientists to easily create workflows that process and publish machine learning models using Amazon SageMaker and AWS Step Functions. You can create machine learning workflows in Python that orchestrate AWS infrastructure at scale, without having to provision and integrate the AWS services separately. The best way to quickly review how the AWS Step Functions Data Science SDK works is to review the related example...
    Downloads: 0 This Week
  • 7
    ClearML

    Streamline your ML workflow

    ClearML is an open source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world. It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python package is for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments...
    Downloads: 13 This Week
  • 8
    Gradio

    Create UIs for your machine learning model in Python in 3 minutes

    ... them interact with the model on your computer remotely from their own devices. Once you've created an interface, you can permanently host it on Hugging Face. Hugging Face Spaces will host the interface on its servers and provide you with a link you can share. One of the best ways to share your machine learning model, API, or data science workflow with others is to create an interactive demo that allows your users or colleagues to try out the demo in their browsers.
    Downloads: 14 This Week
  • 9
    Nuclio

    High-Performance Serverless event and data processing platform

    Nuclio is an open source and managed serverless platform used to minimize development and maintenance overhead and automate the deployment of data-science-based applications. Real-time performance running up to 400,000 function invocations per second. Portable across laptops, edge, on-prem and multi-cloud deployments. The first serverless platform supporting GPUs for optimized utilization and sharing. Automated deployment to production in a few clicks from Jupyter notebook. Deploy one...
    Downloads: 7 This Week
  • 10
    OpenRefine

    A free, open source, powerful tool for working with messy data

    ..., then that is the only time the data will be shared outside of your computer. OpenRefine is available in over 15 languages, is cross-platform, and is part of Code for Science & Society.
    Downloads: 5 This Week
  • 11
    AllenNLP

    An open-source NLP research library, built on PyTorch

    AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, along with the infrastructure to easily run them in the cloud or on your laptop. AllenNLP includes reference implementations of high quality models for both core NLP problems (e.g. semantic role labeling) and NLP applications (e.g. textual entailment). AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional...
    Downloads: 7 This Week
  • 12
    Milvus

    Vector database for scalable similarity search and AI applications

    ... vector datasets. Rich APIs designed for data science workflows. Consistent user experience across laptop, local cluster, and cloud. Embed real-time search and analytics into virtually any application. Milvus’ built-in replication and failover/failback features ensure data and applications can maintain business continuity in the event of a disruption. Component-level scalability makes it possible to scale up and down on demand.
    Downloads: 3 This Week
  • 13
    XGBoost

    Scalable and Flexible Gradient Boosting

    XGBoost is an optimized distributed gradient boosting library, designed to be scalable, flexible, portable and highly efficient. It supports regression, classification, ranking and user defined objectives, and runs on all major operating systems and cloud platforms. XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems. XGBoost...
    Downloads: 2 This Week
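
The gradient boosting idea named in the XGBoost entry can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions (one numeric feature, squared-error loss, depth-one trees); it is not XGBoost's implementation or API, and `Stump`, `fit_stump`, `fit_boosted_stumps`, and `predict` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A depth-one regression tree on a single numeric feature (hypothetical helper)."""
    threshold: float
    left_value: float
    right_value: float

    def predict(self, x: float) -> float:
        return self.left_value if x <= self.threshold else self.right_value


def fit_stump(xs, residuals):
    """Pick the split that minimizes squared error against the residuals."""
    best_error, best_stump = None, None
    # Every distinct value except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best_error is None or error < best_error:
            best_error, best_stump = error, Stump(threshold, left_mean, right_mean)
    return best_stump


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump.predict(x)
                       for x, p in zip(xs, predictions)]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(s.predict(x) for s in stumps)


# Usage: a step-shaped target that a single mean cannot fit.
model = fit_boosted_stumps([1, 2, 3, 4, 5, 6], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
```

Each round corrects a fraction of the remaining error, which is why the learning rate and the number of rounds are usually tuned together.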
  • 14
    PSLab Android App

    Repository for the PSLab Android App for performing experiments with the Pocket Science Lab open-hardware platform. PSLab is a tiny pocket science lab that provides an array of equipment for doing science and engineering experiments. It can function as an oscilloscope, waveform generator, frequency counter, programmable voltage and current source, and data logger...
    Downloads: 2 This Week
  • 15
    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost the productivity of data scientists who work on a wide variety of projects, from classical statistics to state-of-the-art deep learning.
    Downloads: 1 This Week
  • 16
    Great Expectations

    Always know what to expect from your data

    Great Expectations helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations...
    Downloads: 1 This Week
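
The statement that expectations are assertions for data can be illustrated with a small plain-Python sketch. It does not use the Great Expectations library and does not reproduce its API; `ExpectationResult`, `expect_values_not_null`, and `expect_values_between` are hypothetical names, and the sample rows are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ExpectationResult:
    """Outcome of one check: what was expected, and which rows violated it."""
    expectation: str
    success: bool
    unexpected_rows: list = field(default_factory=list)


def expect_values_not_null(rows, column):
    """Every row must have a non-None value in `column`."""
    unexpected = [i for i, row in enumerate(rows) if row.get(column) is None]
    return ExpectationResult(f"{column} is not null", not unexpected, unexpected)


def expect_values_between(rows, column, low, high):
    """Every non-null value in `column` must lie in [low, high]."""
    unexpected = [
        i for i, row in enumerate(rows)
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]
    return ExpectationResult(
        f"{column} is between {low} and {high}", not unexpected, unexpected
    )


# Usage: made-up records with one missing email and one implausible age.
rows = [
    {"age": 31, "email": "a@example.com"},
    {"age": 210, "email": None},
    {"age": 45, "email": "c@example.com"},
]
results = [
    expect_values_not_null(rows, "email"),
    expect_values_between(rows, "age", 0, 120),
    expect_values_not_null(rows, "age"),
]
```

Returning a result object rather than raising immediately lets a pipeline collect every failed check in one pass and report the offending row indices together.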
  • 17
    cuDF

    GPU DataFrame Library

    ... with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel. cuDF is supported only on Linux, and with Python versions 3.7 and later. The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 1 This Week
  • 18
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit...
    Downloads: 1 This Week
  • 19
    gdbgui

    Browser-based frontend to gdb (GNU debugger)

    gdbgui is a browser-based frontend to gdb, the GNU debugger. Add breakpoints, view the stack, visualize data structures, and more in C, C++, Go, Rust, and Fortran. It's perfect for beginners and experts. Simply run gdbgui from the terminal to start the gdbgui server, and a new tab will open in your...
    Downloads: 2 This Week
  • 20
    DataSophon

    The next-generation cloud-native big data management expert

    DataSophon aims to quickly deploy, manage, monitor, and automate the operation and maintenance of Big Data service components and nodes, helping you quickly build stable, efficient Big Data cluster services. The Three-Body Problem, winner of the Hugo Award, one of science fiction's highest honors, is known for its stunning "hard science fiction" style, and its author Liu Cixin is credited with "single-handedly raising Chinese science fiction to a world-class level". As a very...
    Downloads: 1 This Week
  • 21
    TPOT

    A Python Automated Machine Learning tool that optimizes ML

    Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT stands for Tree-based Pipeline Optimization Tool.
    Downloads: 1 This Week
  • 22
    compromise

    Modest natural-language processing

    Language is complicated and there are a gazillion words. Compromise is a JavaScript library that interprets and pre-parses text and makes some reasonable decisions so things are way easier. Compromise tries its best to parse text. It is small, quick, and often good enough. It is not as smart as you'd think. Conjugate and negate verbs in any tense. Play between plural, singular and possessive forms. Interpret plain-text numbers. Handle implicit terms. Use it on the client-side or as an...
    Downloads: 1 This Week
  • 23
    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python...
    Downloads: 0 This Week
  • 24
    Kedro

    A Python framework for creating reproducible, maintainable code

    Kedro is an open source Python framework for creating maintainable and modular data science code. It provides the scaffolding to build more complex data and machine-learning pipelines. In addition, there's a focus on spending less time on the tedious "plumbing" required to maintain data science code; this means that you have more time to solve new problems. It also standardises team workflows; the modular structure of Kedro facilitates a higher level of collaboration when teams solve problems together...
    Downloads: 0 This Week
  • 25
    Orchest

    Build data pipelines, the easy way

    Code, run and monitor your data pipelines all from your browser! From idea to scheduled pipeline in hours, not days. Interactively build your data science pipelines in our visual pipeline editor. Versioned as a JSON file. Run scripts or Jupyter notebooks as steps in a pipeline. Python, R, Julia, JavaScript, and Bash are supported. Parameterize your pipelines and run them periodically on a cron schedule. Easily install language or system packages. Built on top of regular Docker container images...
    Downloads: 0 This Week