Search Results for "data processing" - Page 9

Showing 400 open source projects for "data processing"

  • 1
    File Sorter for Photographers

    Organize files and images using a CSV or XLSX file.

    A user-friendly application to efficiently sort all types of files from a source folder into a destination folder based on a list of filenames provided in an Excel or CSV file.
    Downloads: 1 This Week
  • 2

    openSkyMatch

    Matches OpenScience Observatories images with astronomical catalogs

    openSkyMatch is a collection of Linux shell and Python scripts designed for the OpenScience Observatories program. It automates the identification and matching of detected celestial objects in locally captured FITS images with entries in large-scale sky catalogs, notably Pan-STARRS1 DR2 (II/389/ps1_dr2). The toolkit supports data preprocessing, coordinate correlation, and catalog-based validation of astronomical detections. All tools are open-source and optimized for reproducibility and...
    Downloads: 0 This Week
  • 3
    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project is developed based on the commercially available large model Llama-2 released by Meta. It is the second phase of the Chinese LLaMA&Alpaca large model project. The Chinese LLaMA-2 base model and the Alpaca-2 instruction fine-tuning large model are open-sourced. These models expand and optimize the Chinese vocabulary on the basis of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve the basic semantics and command understanding of...
    Downloads: 0 This Week
  • 4
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks, and commands are created using declarative Python code. PostgreSQL serves as the data processing engine. An extensive web UI makes the web browser the main tool for inspecting, running, and debugging pipelines. GNU make semantics.
    Downloads: 0 This Week
  • 5
    Advanced Trigonometry Calculator

    Precision Trigonometry: Advanced Calculator for Complex Math

    Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. ATC Online Alpha: https://advantrigoncalc.sourceforge.io/atc/ More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by...
    Downloads: 11 This Week
  • 6
    PoJamas aims to provide a Python library and tools for loading, processing, and producing .cr2, pz3 (crz, pzz) files compatible with the SmithMicro (e-frontier) Poser character animation application. PoJamas is composed of a Python library, a Python Wavefront (.obj) 3D viewer based on GLFW, and a LibreOffice/Python application (to ease use of the library and the viewer). As of 2020, the project has been ported to Python 3. As of 2021, the project offers a 3D viewer for Wavefront files...
    Downloads: 1 This Week
  • 7
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 1 This Week
  • 8
    ddgr

    DuckDuckGo from the terminal

    ...The tool also supports options like opening a selected result in a web browser, piping results into other tools, and restricting searches to specific formats such as text-only or JSON for further processing. Because it avoids third-party tracking and ads built into many browser search experiences, ddgr appeals to users seeking greater control over data and a faster, distraction-free search flow.
    Downloads: 16 This Week
  • 9
    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly, free, open-source GUI tool for combining and splitting PDF files, converting PDF to Excel or Word and images to PDF, zipping and unzipping, and annotating. It is easy to use, supports inserting, deleting, and processing multiple files, and lets you adjust the order of files before combining.
    Downloads: 2 This Week
  • 10
    solo-learn

    Library of self-supervised methods for visual representation

    A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. We aim to provide SOTA self-supervised methods in a comparable environment while, at the same time, implementing training tricks. The library is self-contained, but it is possible to use the models outside of solo-learn.
    Downloads: 4 This Week
  • 11
    RAGs

    Build ChatGPT over your data, all with natural language

    RAGs is an open-source application designed to simplify the creation of retrieval-augmented generation pipelines through an interactive interface. Built with Streamlit and powered by the LlamaIndex ecosystem, the tool allows users to construct AI assistants that answer questions using their own data sources. Instead of requiring extensive programming knowledge, the application allows users to configure and build a RAG system using natural language instructions. The system automatically...
    Downloads: 0 This Week
  • 12
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
  • 13
    CloudI: A Cloud at the lowest level
    CloudI is an open-source private cloud computing framework for efficient, secure, and internal data processing. CloudI provides scaling for previously unscalable source code with efficient fault-tolerant execution of ATS, C/C++, Erlang/Elixir, Go, Haskell, Java, JavaScript/node.js, OCaml, Perl, PHP, Python, Ruby, or Rust services. The bare essentials for efficient fault-tolerant processing on a cloud!
    Downloads: 7 This Week
  • 14
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    ...The project is highly community-oriented, frequently updated with contributions and new resources, and it’s widely used in both academic and applied NLP research. Its value lies in providing not just tools but also curated, domain-specific data, which can be hard to find elsewhere.
    Downloads: 0 This Week
  • 15
    Lightning-Hydra-Template

    PyTorch Lightning + Hydra. A very user-friendly template

    ...A collection of best practices for efficient workflow and reproducibility. Thoroughly commented: you can use this repo as a reference and educational resource. Not suited to data engineering: the template configuration setup is not designed for building data processing pipelines that depend on each other. PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research. Think of it as a framework for organizing your PyTorch code. Hydra is a framework for elegantly configuring complex applications. ...
    Downloads: 0 This Week
  • 16
    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. https://github.com/findsimilar/find-similar - GitHub repo http://demo.findsimilar.org/ - Demo project and...
    Downloads: 0 This Week
  • 17
    Originally a reimplementation of OpenGroupware's ZideStore. While compatible with legacy ZideStore, Coils provides a sophisticated workflow system with ETL and integration capabilities and superior WebDAV/CalDAV features and compatibility. The workflow engine supports processes described in BPML and provides integration with a variety of services, including SSH, LPD, LDAP, and relational databases.
    Downloads: 0 This Week
  • 18
    Prime QA

    State-of-the-art Multilingual Question Answering research

    PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly...
    Downloads: 0 This Week
  • 19
    AI-powered enterprise search engine

    ...It enables users to search across sources such as Slack, Confluence, Jira, Google Drive, and other enterprise systems, consolidating fragmented knowledge into a single, unified search experience. By leveraging natural language processing, Gerev allows users to query information in plain English, making it easier to find answers without needing exact keywords or knowing where the data is stored. The platform indexes content from connected systems rather than relying on their native search capabilities, resulting in faster and more relevant results across large datasets. ...
    Downloads: 0 This Week
  • 20
    ED Software project contains several programs used (mostly) for processing gas-phase electron diffraction (GED) experimental data.
    Downloads: 0 This Week
  • 21
    Horovod

    Distributed training framework for TensorFlow, Keras, PyTorch, etc.

    ...Horovod can be installed on-premise or run out-of-the-box in cloud platforms, including AWS, Azure, and Databricks. Horovod can additionally run on top of Apache Spark, making it possible to unify data processing and model training into a single pipeline. Once Horovod has been configured, the same infrastructure can be used to train models with any framework, making it easy to switch between TensorFlow, PyTorch, MXNet, and future frameworks as machine learning tech stacks continue to evolve. Start scaling your model training with just a few lines of Python code. ...
    Downloads: 6 This Week
  • 22
    The Related Values Processing Framework helps the integration of Process Control Data Historian Systems.
    Downloads: 0 This Week
  • 23
    AB3DMOT

    Official Python Implementation for "3D Multi-Object Tracking

    ...The system processes detection results from 3D object detectors that analyze LiDAR point clouds and uses them to track multiple objects across consecutive frames. Its tracking pipeline relies on a combination of classical algorithms, including a Kalman filter for state estimation and the Hungarian algorithm for data association between detected objects and existing tracks. This relatively simple design allows the tracker to achieve very high processing speeds while maintaining competitive tracking accuracy. The project also introduces new evaluation metrics specifically designed for assessing performance in 3D tracking benchmarks. The framework has been evaluated on widely used datasets such as KITTI and nuScenes and demonstrates strong performance compared with more complex tracking systems.
    Downloads: 0 This Week
  • 24
    SageMaker Experiments Python SDK

    Experiment tracking and metric logging for Amazon SageMaker notebooks

    ...Each step in the workflow is described by a Trial Component. There is no relationship, such as ordering, between Trial Components. Trial Component: A description of a single step in a machine learning workflow, for example, data cleaning, feature extraction, model training, or model evaluation.
    Downloads: 0 This Week
  • 25
    OGB

    Benchmark datasets, data loaders, and evaluators for graph machine

    ...OGB fully automates dataset processing. The OGB data loaders automatically download and process graphs and provide graph objects that are fully compatible with PyTorch Geometric and DGL. OGB provides standardized dataset splits and evaluators that allow for easy and reliable comparison of different models in a unified manner.
    Downloads: 0 This Week