Showing 66 open source projects for "data processing"

  • 1
    WinPython

    Portable Scientific Python 2/3 32/64bit Distribution for Windows

    WinPython is a free, open-source, portable distribution of the Python programming language for Windows XP/7/8, designed for scientists and supporting both 32-bit and 64-bit versions of Python 2 and Python 3. Since September 2014, development has moved to https://winpython.github.io/
    Downloads: 4,118 This Week
  • 2

    Pytente

    A Computational Tool for Patent Analysis and Retrieval

    Pytente is an advanced solution for automating the collection, storage, and processing of bibliographic patent data. The tool was designed to simplify the collection of large volumes of data from open-access repositories. Pytente ensures structured storage of the information, as well as validation and removal of duplicate records. Among the various features the tool provides, notable ones include the customized extraction of subsets of...
    Downloads: 0 This Week
  • 3
    PI-Based Image Encoder / Converter

    Python code that converts (compresses) an image into indexes within the digits of pi (3.14, π)

    Image processing tool that encodes pixel data as indices within the first 16.7 million digits of pi (π). Features high-performance Numba-accelerated search and a signature 'film-grain' aesthetic upon reconstruction. The ZIP also includes a 16 MB file containing 16.7 million digits of pi.

    Benchmark (single-thread)

    Hardware & Environment
    Apple Silicon: Apple M2 (Mac mini/MacBook)
    x86_64 platform: Intel Core Ultra 5 225F (Arrow Lake, 10 cores)
    OS 1: Fedora 43 (GNOME)
    OS 2: Windows 11 Pro (23H2/24H2)
    Software: Python 3.14.3 + Numba JIT (latest)

    Results (lower is better)
    Platform / OS     CPU                        Time (seconds)
    macOS (native)    Apple M2                   52.151311 s (default setup)
    Fedora Linux      Intel Core Ultra 5 225F    58.536457 s (default power management: Balanced)
    Windows 11        Intel Core Ultra 5 225F    59.681427 s (important! ...
    Downloads: 0 This Week
  • 4
    PoJamas aims to provide a Python library and tools for loading, processing, and producing .cr2 and .pz3 (.crz, .pzz) files compatible with the SmithMicro (e-frontier) Poser character animation application. PoJamas is composed of a Python library, a Python Wavefront (.obj) 3D viewer based on GLFW, and a LibreOffice/Python application (to ease use of the library and the viewer). As of 2020, the project has been ported to Python 3. As of 2021, the project offers a 3D viewer for Wavefront files...
    Downloads: 1 This Week
  • 5
    towhee

    Framework dedicated to neural data processing

    ...Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a Pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making the processing of unstructured data as easy as handling tabular data.
    Downloads: 1 This Week
  • 6
    CloudI: A Cloud at the lowest level
    CloudI is an open-source private cloud computing framework for efficient, secure, and internal data processing. CloudI provides scaling for previously unscalable source code with efficient fault-tolerant execution of ATS, C/C++, Erlang/Elixir, Go, Haskell, Java, JavaScript/node.js, OCaml, Perl, PHP, Python, Ruby, or Rust services. The bare essentials for efficient fault-tolerant processing on a cloud!
    Downloads: 7 This Week
  • 7
    Horovod

    Distributed training framework for TensorFlow, Keras, PyTorch, etc.

    ...Horovod can be installed on-premise or run out-of-the-box in cloud platforms, including AWS, Azure, and Databricks. Horovod can additionally run on top of Apache Spark, making it possible to unify data processing and model training into a single pipeline. Once Horovod has been configured, the same infrastructure can be used to train models with any framework, making it easy to switch between TensorFlow, PyTorch, MXNet, and future frameworks as machine learning tech stacks continue to evolve. Start scaling your model training with just a few lines of Python code. ...
    Downloads: 5 This Week
  • 8
    The Related Values Processing Framework helps integrate Process Control Data Historian Systems.
    Downloads: 0 This Week
  • 9
    whiteboxgui

    An interactive GUI for WhiteboxTools in a Jupyter-based environment

    The whiteboxgui Python package is a Jupyter frontend for WhiteboxTools, an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. WhiteboxTools can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification.
    Downloads: 8 This Week
  • 10
    Neural Network Visualization

    Project for processing neural networks and rendering to gain insights

    nn_vis is a minimalist visualization tool for neural networks written in Python using OpenGL and Pygame. It provides an interactive, graphical representation of how data flows through neural network layers, offering a unique educational experience for those new to deep learning or looking to explain it visually. By animating input, weights, activations, and outputs, the tool demystifies neural network operations and helps users intuitively grasp complex concepts. Its lightweight codebase is...
    Downloads: 0 This Week
  • 11
    Wooey

    A Django app that creates automatic web UIs for Python scripts

    Wooey is a simple web interface to run command-line Python scripts. Think of it as an easy way to get your scripts up on the web for routine data analysis, file processing, or anything else. The project was inspired by how simply and powerfully sandman could expose users to a database and by how Gooey turns ArgumentParser-based command-line scripts into wxWidgets GUIs. Originally two separate projects (Django-based djangui by Chris Mitchell and Flask-based Wooey by Martin Fitzpatrick), they have been merged to combine our efforts. ...
    Downloads: 0 This Week
  • 12
    Whisper Library

    Whisper is a file-based time-series database format for Graphite

    Whisper is one of three components within the Graphite project. It is a fixed-size database, similar in design and purpose to RRD (round-robin database), that provides fast, reliable storage of numeric data over time. Whisper allows higher-resolution (seconds per point) recent data to degrade into lower resolutions for long-term retention of historical data. One bundled utility copies data from src to dst where it is missing; unlike whisper-merge, it does not overwrite data that's already present in the target...
    Downloads: 7 This Week
  • 13
    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    ...Because it’s crowdsourced, it reflects real issues practitioners have faced in production, not just theoretical cases. Using the list regularly helps harden applications against the fragile edges of text processing and user input.
    Downloads: 0 This Week
  • 14
    earthengine-py-notebooks

    A collection of 360+ Jupyter Python notebook examples

    earthengine-py-notebooks is a comprehensive collection of hundreds of Jupyter Python notebooks that serve as examples and tutorials for using the Google Earth Engine Python API. These notebooks are organized into thematic areas such as image processing, machine learning, visualization, filtering, and asset management, exposing users to real geospatial analysis tasks. The repository makes it easier to explore Earth Engine’s large geospatial data catalog, interactively display map layers, and generate visual insights without the need for external GIS software by leveraging interactive widgets and mapping libraries. ...
    Downloads: 0 This Week
  • 15
    NLP Architect

    A model library for exploring state-of-the-art deep learning

    ...The library contains NLP/NLU-related models per task, different neural network topologies (which are used in models), procedures for simplifying workflows in the library, pre-defined data processors and dataset loaders, and miscellaneous utilities. The library is designed to be a tool for model development: pre-processing data, building a model, training, validating, inferring, and saving or loading a model.
    Downloads: 0 This Week
  • 16
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to quickly implement NLP tasks and build complex models. A unified tabular data container simplifies data preprocessing. The built-in Loader and Pipe support multiple datasets, eliminating the need for preprocessing code. It also provides various convenient NLP tools, such as embedding loading (including ELMo and BERT), an intermediate data cache, etc.
    Downloads: 0 This Week
  • 17
    Makani

    Makani developed a commercial-scale airborne wind turbine

    Makani was an ambitious Google X project that sought to harness wind energy using airborne wind turbines — autonomous kites capable of generating power while flying in crosswind patterns. This open-source repository contains the complete software stack that powered Makani’s research and flight systems, including the flight simulator, autopilot controller, avionics firmware, visualization tools, and ground control software. The software enables simulation, control, and analysis of the Makani...
    Downloads: 0 This Week
  • 18
    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    ...Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud. The examples and best practices are provided as Python Jupyter notebooks and R Markdown files, along with a library of utility functions.
    Downloads: 0 This Week
  • 19
    Albedo

    A recommender system for discovering GitHub repos

    ...It treats repositories and developers as a graph of interactions and applies large-scale matrix factorization to model affinities, with Apache Spark providing the distributed data processing. The project focuses on implicit feedback—stars, watches, and other engagement metrics—so it can build useful recommendations without explicit ratings. A reproducible setup and Makefile-driven workflow streamline tasks like spinning up services, loading datasets, training models, and generating candidate lists. Because it’s built around Spark’s scalable primitives, Albedo can experiment on substantial snapshots of GitHub metadata rather than toy corpora. ...
    Downloads: 0 This Week
  • 20
    Twint

    An advanced Twitter scraping & OSINT tool written in Python

    Twint is an advanced open-source Twitter scraping and OSINT tool written in Python that extracts tweets, user data, followers, likes, and more—without relying on Twitter’s API—making it highly useful for researchers, analysts, and hobbyists who want to bypass rate limits and access public Twitter data.
    Downloads: 1 This Week
  • 21
    Django Celery

    Old Celery integration project for Django

    Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system. It’s a task queue with a focus on real-time processing, while also supporting task scheduling. Celery has a large and diverse community of users and contributors; you should come join us on IRC or our mailing list. Celery is open source and licensed under the BSD License. A task queue’s input is a unit of work called a...
    Downloads: 5 This Week
  • 22
    Pipelines

    An experimental programming language for data flow

    Pipelines is a language and runtime for crafting massively parallel pipelines. Unlike other languages for defining data flow, the Pipeline language requires the implementation of components to be defined separately in the Python scripting language. This allows the details of implementations to be separated from the structure of the pipeline while providing access to thousands of active libraries for machine learning, data analysis, and processing.
    Downloads: 0 This Week
  • 23
    Wally

    Distributed Stream Processing

    ...Provide high-performance & low-latency data processing. Be portable and deploy easily (i.e., run on-prem or any cloud). Manage in-memory state for the application. Allow applications to scale as needed, even when they are live and up-and-running. The primary API for Wally is written in Pony. Wally applications are written using this Pony API.
    Downloads: 8 This Week
  • 24
    cnn-text-classification-tf

    Convolutional Neural Network for Text Classification in Tensorflow

    The cnn-text-classification-tf repository by Denny Britz is a well-known educational implementation of convolutional neural networks for text classification using TensorFlow, aimed at helping developers and researchers understand how CNNs can be applied to natural language processing tasks. Based loosely on Kim’s influential paper on CNNs for sentence classification, this codebase demonstrates how to preprocess text data, convert words into learned embeddings, and apply multiple convolution filters to extract n-gram features that are then pooled and fed into a classifier. The project includes scripts for training, evaluation, and data handling, making it easy to run experiments on datasets such as movie reviews or other labeled text collections. ...
    Downloads: 0 This Week
  • 25
    Zhao

    A compilation of "The Princely Party Relationship Network"

    zhao is a repository that consolidates research, data, and insights related to Zhao, which is likely an individual’s research collection, notes, or curated resources on deep learning, AI, or computational topics (name and content context suggest specialized study). The project may include code examples, experiment results, references to academic papers, mathematical notes, and supporting scripts to explore specific ML methods, benchmarks, or theoretical findings.
    Downloads: 0 This Week