Search Results for "python data analysis" - Page 5

Showing 3311 open source projects for "python data analysis"

  • 1
    hosts

    Consolidate and extend hosts files from several well-curated sources

    ... Data for extensions is stored in the extensions folder. You manage extensions by curating this folder tree, which holds the fakenews, social, gambling, and porn extension data that we maintain and provide for you. Create an optional blacklist file. The contents of this file (a listing of additional domains in hosts file format) are appended to the unified hosts file during the update process. A sample blacklist is included and may be modified as you need.
    Downloads: 8 This Week
  • 2
    leafmap

    A Python package for interactive mapping and geospatial analysis

    Leafmap is a Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment. It is a spin-off project of the geemap Python package, which was designed specifically to work with Google Earth Engine (GEE). However, not everyone in the geospatial community has access to the GEE cloud computing platform. Leafmap is designed to fill this gap for non-GEE users...
    Downloads: 2 This Week
  • 3
    BambooAI

    A Python library powered by large language models (LLMs)

    BambooAI is a Python library powered by large language models (LLMs) for conversational data discovery and analysis, allowing users to interact with data through natural language.
    Downloads: 1 This Week
  • 4
    Volatility

    An advanced memory forensics framework

    Volatility is a widely used open-source framework for analyzing memory captures (RAM dumps) from Windows, Linux, and macOS systems. It enables investigators and malware analysts to extract process lists, network connections, DLLs, strings, artifacts, and more. Volatility supports many plugins for detecting hidden processes, malware, rootkits, and event tracing. It’s essential in digital forensics and incident response workflows.
    Downloads: 6 This Week
  • 5
    truffleHog

    Searches through git repositories for high entropy strings and secrets

    truffleHog searches through git repositories for high entropy strings and secrets, digging deep into commit history. TruffleHog runs behind the scenes to scan your environment for secrets like private keys and credentials, so you can protect your data before a breach occurs. Secrets can be found anywhere, so TruffleHog scans more than just code repositories, including SaaS and internally hosted software. With support for custom integrations and new integrations added all the time, you can...
    Downloads: 8 This Week
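    The "high entropy strings" idea in the truffleHog entry can be illustrated with Shannon entropy: tokens whose characters are close to uniformly distributed look random and may be secrets. This is a minimal sketch of that general technique, not truffleHog's actual implementation; the function names, the 4.0 bits-per-character threshold, and the 20-character minimum length are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(line: str, threshold: float = 4.0, min_length: int = 20):
    """Yield whitespace-separated tokens that are long and random-looking.

    `threshold` and `min_length` are illustrative values, not settings
    taken from any real scanner.
    """
    for token in line.split():
        if len(token) >= min_length and shannon_entropy(token) > threshold:
            yield token


# Usage: a 32-character token with all-distinct characters is flagged,
# while a long run of one repeated character is not.
candidate = "abcdefghijklmnopqrstuvwxyz012345"
flagged = list(high_entropy_tokens("key = " + candidate))
ignored = list(high_entropy_tokens("key = " + "a" * 32))
```

    A real scanner would also walk commit history and apply pattern-based detectors; entropy alone produces false positives on hashes and encoded data.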
  • 6
    FiftyOne

    The open-source tool for building high-quality datasets

    ... to boost the performance of your model. FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more! Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way.
    Downloads: 3 This Week
  • 7
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 7 This Week
  • 8
    MarkItDown

    Python tool for converting files and office documents to Markdown

    MarkItDown is a lightweight Python utility developed by Microsoft for converting various files and office documents to Markdown format. It is particularly useful for preparing documents for use with large language models and related text analysis pipelines.
    Downloads: 4 This Week
  • 9
    Prophet

    Tool for producing high quality forecasts for time series data

    Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Prophet is used in many applications across Facebook for producing reliable forecasts for planning and goal setting...
    Downloads: 6 This Week
  • 10
    Trame

    Weave various components and technologies into a Web App

    ... under Apache License Version 2.0 which allows users to create open source or commercial applications without any licensing worries. By relying simply on Python and HTML, trame focuses on one's data and associated analysis and visualizations while hiding the complications of web development.
    Downloads: 3 This Week
  • 11
    CellTypist

    A tool for semi-automatic cell type classification, harmonization

    ... and accurate prediction. Scalable and flexible. Python-based implementation is easy to integrate into existing pipelines. A community-driven encyclopedia for cell types.
    Downloads: 5 This Week
  • 12
    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data...
    Downloads: 2 This Week
  • 13
    Pyodide

    Pyodide is a Python distribution for the browser and Node.js

    Pyodide brings the Python runtime to the browser by compiling Python and its scientific libraries to WebAssembly. It allows developers to run Python code directly in web browsers without a server, supporting packages like NumPy, Pandas, and Matplotlib. Pyodide opens up new possibilities for interactive data analysis, scientific computing, and educational tools in web environments, all while integrating seamlessly with JavaScript.
    Downloads: 2 This Week
  • 14
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust multi-tenant...
    Downloads: 5 This Week
  • 15
    TensorFlow

    TensorFlow is an open source library for machine learning

    Originally developed by Google for internal use, TensorFlow is an open source platform for machine learning. Available across all common operating systems (desktop, server and mobile), TensorFlow provides stable APIs for Python and C, plus APIs for a variety of other languages that are either third-party or not guaranteed to be backwards compatible. The platform can be easily deployed on multiple CPUs, GPUs and Google's proprietary chip, the tensor processing unit (TPU). TensorFlow...
    Downloads: 7 This Week
  • 16
    tslearn

    The machine learning toolkit for time series analysis in Python

    tslearn expects a time series dataset to be formatted as a 3D NumPy array. The three dimensions correspond to the number of time series, the number of measurements per time series, and the number of dimensions, respectively: (n_ts, max_sz, d). In order to get the data in the right format...
    Downloads: 1 This Week
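    The (n_ts, max_sz, d) layout described in the tslearn entry can be sketched with plain NumPy. The helper name `to_3d_dataset` is hypothetical and is not part of tslearn's API, and padding shorter series with NaN is an assumption made here for illustration.

```python
import numpy as np


def to_3d_dataset(series_list):
    """Stack variable-length series into a (n_ts, max_sz, d) array.

    Hypothetical helper: shorter series are padded with NaN, which is an
    illustrative choice rather than a documented requirement.
    """
    arrays = []
    for series in series_list:
        arr = np.asarray(series, dtype=float)
        if arr.ndim == 1:
            # A univariate series becomes a single feature column (d = 1).
            arr = arr.reshape(-1, 1)
        arrays.append(arr)

    n_ts = len(arrays)                          # number of time series
    max_sz = max(a.shape[0] for a in arrays)    # longest series length
    d = arrays[0].shape[1]                      # dimensions per measurement

    dataset = np.full((n_ts, max_sz, d), np.nan)
    for i, arr in enumerate(arrays):
        dataset[i, : arr.shape[0], :] = arr
    return dataset


# Usage: two univariate series of lengths 4 and 2.
dataset = to_3d_dataset([[1, 2, 3, 4], [5, 6]])
print(dataset.shape)  # (2, 4, 1)
```

    Here n_ts = 2, max_sz = 4, and d = 1; the second series occupies the first two time steps and the remaining two are NaN.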
  • 17
    PandasAI

    PandasAI is a Python library that integrates generative AI

    PandasAI is a Python library that adds Generative AI capabilities to pandas, the popular data analysis and manipulation tool. It is designed to be used in conjunction with pandas, and is not a replacement for it. PandasAI makes pandas (and all the most used data analysis libraries) conversational, allowing you to ask questions of your data in natural language. For example, you can ask PandasAI to find all the rows in a DataFrame where the value of a column is greater than 5, and it will return...
    Downloads: 2 This Week
  • 18
    DVC

    Data Version Control | Git for Data & Models

    DVC is built to make ML models shareable and reproducible. It is designed to handle large files, data sets, machine learning models, and metrics as well as code. Version control machine learning models, data sets, and intermediate files. DVC connects them with code and uses Amazon S3, Microsoft Azure Blob Storage, Google Drive, Google Cloud Storage, Aliyun OSS, SSH/SFTP, HDFS, HTTP, network-attached storage, or disk to store file contents...
    Downloads: 6 This Week
  • 19
    Airborne Data Processing and Analysis

    Software to process and analyze airborne measurements.

    The Airborne Data Processing and Analysis (ADPAA) package is an open-source software package containing a collection of programs and scripts to process and analyze data from in-situ instruments deployed on airborne platforms. The ADPAA package was started to process data on the North Dakota Citation Research Aircraft but has been used to process data on many airborne platforms. The software methodology used in ADPAA is provided in the peer-reviewed publication: Delene, D. J., Airborne Data...
    Downloads: 5 This Week
  • 20
    LlamaIndex

    Central interface to connect your LLMs with external data

    LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data. LlamaIndex is a simple, flexible interface between your external data and LLMs. It provides the following tools in an easy-to-use fashion.
    Downloads: 5 This Week
  • 21
    JumpServer

    Manage assets on different clouds at the same time

    ... and reuse. Prevent internal misuse and permission abuse. Management of people and assets. Retrospective safeguards and basis for accident analysis.
    Downloads: 5 This Week
  • 22
    Logfire MCP

    The Logfire MCP Server is here

    The Logfire MCP Server is a Model Context Protocol server that allows AI applications to access OpenTelemetry traces and metrics sent to Logfire. It enables retrieval and analysis of telemetry data, enhancing debugging and observability workflows.
    Downloads: 1 This Week
  • 23
    TorchRL

    A modular, primitive-first, python-first PyTorch library

    TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch-first and Python-first, low- and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.
    Downloads: 6 This Week
  • 24
    Streamlink

    Streamlink is a CLI utility which pipes video streams

    Streamlink is a command-line utility that pipes video streams from various services into a video player, such as VLC. The main purpose of Streamlink is to avoid resource-heavy and unoptimized websites, while still allowing the user to enjoy various streamed content. There is also an API available for developers who want access to the stream data. Streamlink is built upon a plugin system that allows support for new services to be easily added. Most of the big streaming services are supported...
    Downloads: 7 This Week
  • 25
    Pacu

    The AWS exploitation framework, designed for testing security

    Pacu (named after a type of piranha in the Amazon) is a comprehensive AWS security-testing toolkit designed for offensive security practitioners. While several AWS security scanners currently serve as the proverbial “Nessus” of the cloud, Pacu is designed to be the Metasploit equivalent. Written in Python 3 with a modular architecture, Pacu has tools for every step of the pen testing process, covering the full cyber kill chain. Pacu is the aggregation of all of the exploitation experience...
    Downloads: 6 This Week