Search Results for "data processing" - Page 3

Showing 400 open source projects for "data processing"

View related business solutions
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 1
    Video-subtitle-remover (VSR)

    Video-subtitle-remover (VSR)

    AI tool that removes hardcoded subtitles and text from videos locally

    ...It allows users to define a specific subtitle region so that only text in that area is removed rather than modifying the entire frame. It can also automatically remove text throughout the whole video when a position is not specified. In addition to video processing, the project supports removing text-like watermarks from images through similar techniques. The processing runs locally without requiring any external API services, enabling offline use and greater control over the data being processed.
    Downloads: 131 This Week
    Last Update:
    See Project
  • 2
    Quantitative Trading System

    Quantitative Trading System

    A comprehensive quantitative trading system with AI-powered analysis

    Quantitative Trading System is a comprehensive quantitative trading platform that integrates artificial intelligence, financial data analysis, and automated strategy execution within a unified software system. The project is designed to provide an end-to-end infrastructure for building and operating algorithmic trading strategies in financial markets. It includes tools for collecting and processing market data from multiple sources, performing statistical and machine learning analysis, and generating trading signals based on quantitative models. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Astropy

    Astropy

    Repository for the Astropy core package

    ...It is at the core of the Astropy Project, which aims to enable the community to develop a robust ecosystem of affiliated packages covering a broad range of needs for astronomical research, data processing, and data analysis.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    Taipy

    Taipy

    Turns Data and AI algorithms into production-ready web applications

    ...Taipy enhances performance with caching control of graphical events, optimizing rendering by selectively updating graphical components only upon interaction. Effortlessly manage massive datasets with Taipy's built-in decimator for charts, intelligently reducing the number of data points to save time and memory without losing the essence of your data's shape. Struggle with sluggish performance and excessive memory usage, as every data point demands processing. Large datasets become cumbersome, complicating the user experience and data analysis. Scenarios are made easy with Taipy Studio. A powerful VS Code extension that unlocks a convenient graphical editor. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 5
    POT

    POT

    Python Optimal Transport

    This open source Python library provides several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.
    Downloads: 31 This Week
    Last Update:
    See Project
  • 6
    CogDB

    CogDB

    Micro Graph Database for Python Applications

    Cog is a lightweight, embedded graph database for Go that provides a simple interface for storing and querying graph-based data structures, making it useful for knowledge representation and graph analytics.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    txtai

    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    Robin

    Robin

    AI-powered tool for dark web OSINT search and investigation

    ...The project provides a modular architecture separating search, scraping, and AI processing components so it can be extended with new data sources.
    Downloads: 24 This Week
    Last Update:
    See Project
  • 9
    MegaParse

    MegaParse

    File Parser optimised for LLM Ingestion with no loss

    MegaParse is a file parser optimized for Large Language Model (LLM) ingestion, ensuring no loss of information. It efficiently parses various document formats, such as PDFs, DOCX, and PPTX, converting them into formats ideal for processing by LLMs. This tool is essential for applications that require accurate and comprehensive data extraction from diverse document types.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    Neuroglancer

    Neuroglancer

    WebGL-based viewer for volumetric data

    ...The viewer is built with a multi-threaded architecture, separating rendering and data processing to ensure smooth performance even with massive datasets. Extensively used in neuroscience research, Neuroglancer supports integration with tools.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    BettaFish

    BettaFish

    Public opinion analysis system

    ...Unlike simpler analytics tools, BettaFish employs agent collaboration and a “forum” style internal mechanism to combine diverse model outputs, making the analysis richer and more robust. It also integrates multimodal processing, enabling it to parse images and video alongside text.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    spider_collection

    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    ...In addition to raw data collection, some spiders include basic data processing and analysis using tools such as pandas and simple visualization with matplotlib. It also contains examples of proxy pool integration and encapsulation to support more reliable crawling when working with sites that enforce request limits.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Python API for JMComic

    Python API for JMComic

    Python crawler and API for downloading JMComic albums and images

    ...It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes. It supports both web-based and mobile API interfaces, enabling flexible interaction with the platform depending on the available endpoints. Its architecture includes components for configuration management, download orchestration, and client communication, allowing users to automate the retrieval of manga chapters or entire albums. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 14
    Bayesian Optimization

    Bayesian Optimization

    Python implementation of global optimization with gaussian processes

    This is a constrained global optimization package built upon bayesian inference and gaussian process, that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited for optimization of high cost functions, situations where the balance between exploration and exploitation is important. More detailed information, other advanced features, and tips on usage/implementation can be found in the examples folder. Follow the basic...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    WeClone

    WeClone

    One-stop solution for creating your digital avatar from chat history

    WeClone is an open source AI project designed to replicate a person’s conversational style and personality by training models on chat history data. The system analyzes message patterns, linguistic style, and contextual behavior in order to generate responses that resemble the original user’s communication style. It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data, WeClone can build a profile of an individual’s writing tone, vocabulary preferences, and conversational tendencies. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    Bespoke Curator

    Bespoke Curator

    Synthetic data curation for post-training and data extraction

    Curator is an open-source Python library designed to build synthetic data pipelines for training and evaluating machine learning models, particularly large language models. The system helps developers generate, transform, and curate high-quality datasets by combining automated generation with structured validation and filtering. It supports workflows where models are used to produce synthetic examples that can later be refined into reliable training datasets for reasoning, question...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    Youtu-Agent

    Youtu-Agent

    A simple yet powerful agent framework that delivers with models

    ...The system focuses on reducing the complexity traditionally involved in configuring large language model agents by providing a modular architecture that separates execution environments, tools, and context management. This structure allows developers to rapidly assemble agent systems capable of performing tasks such as research, file processing, and data analysis. The framework supports automated generation of agent components, enabling the system to synthesize prompts, tool interfaces, and workflow configurations automatically. Youtu-Agent also incorporates hybrid learning strategies that combine experience accumulation with reinforcement learning to improve agent performance over time. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    AIOHTTP

    AIOHTTP

    Asynchronous HTTP client/server framework for asyncio and Python

    ...A long awaited new feature is tracing client request life cycle to figure out when and why client request spends a time waiting for connection establishment, getting server response headers etc. Now it is possible by registering special signal handlers on every request processing stage. The main change is dropping yield from support and using async/await everywhere. Farewell, Python 3.4. You often want to send some sort of data in the URL’s query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. Requests allows you to provide these arguments as a dict, using the params keyword argument. aiohttp internally performs URL canonicalization before sending request.
    Downloads: 171 This Week
    Last Update:
    See Project
  • 19
    julep

    julep

    A new DSL and server for AI agents and multi-step tasks

    Julep is a platform for creating AI agents that remember past interactions and can perform complex tasks. It offers long-term memory and manages multi-step processes. Julep enables the creation of multi-step tasks incorporating decision-making, loops, parallel processing, and integration with numerous external tools and APIs. While many AI applications are limited to simple, linear chains of prompts and API calls with minimal branching, Julep is built to handle more complex scenarios.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Haystack

    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines. Perform semantic search and retrieve ranked documents according to meaning,...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 21
    airda

    airda

    airda(Air Data Agent

    airda(Air Data Agent) is a multi-smart body for data analysis, capable of understanding data development and data analysis needs, understanding data, generating data-oriented queries, data visualization, machine learning and other tasks of SQL and Python codes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    Smallpond

    Smallpond

    A lightweight data processing framework built on DuckDB and 3FS

    smallpond is a lightweight distributed data processing framework built by DeepSeek, designed to scale DuckDB workloads over clusters using their 3FS (Fire-Flyer File System) backend. The idea is to preserve DuckDB’s fast analytics engine but lift it from single-node to multi-node settings, giving you the ability to operate on large datasets (e.g. petabyte scale) without moving to a heavyweight system like Spark.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Browser Use

    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 24
    chatd

    chatd

    Chat with your documents using local AI

    chatd is an open-source desktop application that allows users to interact with their documents through a locally running large language model. The software focuses on privacy and security by ensuring that all document processing and inference occur entirely on the user’s computer without sending data to external cloud services. It includes a built-in integration with the Ollama runtime, which provides a cross-platform environment for running large language models locally. The application typically runs models such as Mistral-7B and allows users to load and analyze documents while asking questions in natural language. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 25
    Matrix

    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments: it provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous, peer-to-peer agent workflows, avoiding global synchronization bottlenecks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB