146 projects for "data processing" with 2 filters applied:

  • 1
    crème de la crème of AI courses

    This repository is a curated collection of links to various courses

    ...Topics covered include deep learning, natural language processing, computer vision, large language models, linear algebra, reinforcement learning, and machine learning engineering. Because the repository links to well-known educational content such as university lecture series and professional training materials, it functions as a structured roadmap for individuals who want to develop expertise in artificial intelligence.
    Downloads: 0 This Week
  • 2
    xLSTM

    Neural Network architecture based on ideas of the original LSTM

    xLSTM is an open-source machine learning architecture that reimagines the classic Long Short-Term Memory (LSTM) network for modern large-scale language modeling and sequence processing tasks. The project introduces a new recurrent neural network design that incorporates exponential gating mechanisms and enhanced memory structures to overcome limitations of traditional LSTM models. By introducing innovations such as matrix-based memory and improved normalization techniques, xLSTM improves the ability of recurrent networks to capture long-range dependencies in sequential data.
    Downloads: 0 This Week
  • 3
    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 0 This Week
  • 4
    AI_Tutorial

    A selection of learning materials, search, recommendation, advertising

    AI_Tutorial is a large curated repository that aggregates high-quality learning resources related to artificial intelligence, machine learning, deep learning, natural language processing, and data engineering. The project functions as a centralized knowledge base designed to help engineers and researchers discover tutorials, technical articles, algorithm explanations, and architecture discussions from across the AI ecosystem. Rather than focusing on a single framework or course, the repository collects materials from many sources such as open-source projects, technical blogs, research papers, and industry engineering posts. ...
    Downloads: 0 This Week
  • 5
    Live API Web Console

    A react-based starter app for using the Live API over websockets

    Live API Web Console is a React starter that demonstrates how to use Gemini’s Live API over WebSockets to build real-time, multimodal experiences. The app includes modules for streaming audio playback, recording user media from the microphone, webcam, or even screen capture, and it surfaces a unified event log so you can debug the session as it flows. Configuration lives in a simple .env file and the project boots with standard web tooling, letting you experiment quickly with models, system...
    Downloads: 0 This Week
  • 6
    SMILI

    Scientific Visualisation Made Easy

    The Simple Medical Imaging Library Interface (SMILI), pronounced 'smilie', is an open-source, lightweight and easy-to-use medical imaging viewer and library for all major operating systems. The main sMILX application offers features for viewing n-D images, vector images, DICOMs and models/surfaces, as well as anonymizing and shape analysis, all with easy drag-and-drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking etc. images and models, both with...
    Downloads: 70 This Week
  • 7

    libsombrero

    Astronomical object/structure detection from 1D and 2D data sets.

    Sombrero is a fast wavelet image processing and object detection C library for astronomical images. Sombrero is named after the "Mexican Hat" shape of the wavelet masks used in image convolution and is released under the GNU LGPL.
    Downloads: 0 This Week
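To illustrate the "Mexican Hat" wavelet mask mentioned in the libsombrero entry, here is a minimal 1D sketch in plain Python. It is not libsombrero's API or implementation: the function names, the `sigma` scale parameter, the mask half-width of 5 sigma, the mean subtraction, and the edge handling (repeating the border sample) are all assumptions made for this example.

```python
import math

def mexican_hat(t, sigma=1.0):
    """Mexican hat (Ricker) wavelet evaluated at offset t for scale sigma."""
    x = t / sigma
    norm = 2.0 / (math.sqrt(3.0 * sigma) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)

def wavelet_response(signal, sigma=2.0, half_width=None):
    """Convolve a 1D signal with a sampled Mexican hat mask.

    Assumption for this sketch: samples past either end of the signal
    are replaced by the nearest border sample.
    """
    if half_width is None:
        half_width = int(math.ceil(5 * sigma))
    mask = [mexican_hat(k, sigma) for k in range(-half_width, half_width + 1)]
    # Subtract the mean so the sampled mask sums to zero and a flat
    # background produces no response.
    mean = sum(mask) / len(mask)
    mask = [m - mean for m in mask]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, m in enumerate(mask):
            j = min(max(i + k - half_width, 0), n - 1)  # clamp to the signal
            acc += signal[j] * m
        out.append(acc)
    return out

def strongest_peak(signal, sigma=2.0):
    """Index of the largest wavelet response, i.e. the most hat-like bump."""
    response = wavelet_response(signal, sigma)
    return max(range(len(response)), key=response.__getitem__)
```

Because the mask is symmetric, convolution and correlation coincide here, so the response is largest where the signal locally resembles the hat shape at the chosen scale.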
  • 8
    Armadillo

    fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing
    * Easy to use functions and syntax, deliberately similar to Matlab / Octave
    * Uses template meta-programming techniques to increase efficiency
    * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
    * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
    * Downloads:...
    Downloads: 2,862 This Week
  • 9
    ADAMS

    ADAMS is a workflow engine for building complex knowledge workflows.

    ADAMS is a flexible workflow engine aimed at quickly building and maintaining data-driven, reactive workflows that integrate easily into business processes. Instead of placing operators on a canvas and manually connecting them, a tree structure and flow control operators determine how data is processed (sequentially or in parallel). This allows rapid development and easy maintenance of large workflows, with hundreds or thousands of operators.
    Downloads: 4 This Week
  • 10
    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project is built on Llama-2, the commercially usable large model released by Meta. It is the second phase of the Chinese LLaMA & Alpaca large model project. The Chinese LLaMA-2 base model and the Alpaca-2 instruction fine-tuned model are open-sourced. These models expand and optimize the Chinese vocabulary of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve the basic semantics and instruction understanding of...
    Downloads: 0 This Week
  • 11
    Universal Sentence Encoder

    Encoder of greater-than-word length text trained on a variety of data

    The Universal Sentence Encoder (USE) is a pre-trained deep learning model designed to encode sentences into fixed-length embeddings for use in various natural language processing (NLP) tasks. It leverages Transformer and Deep Averaging Network (DAN) architectures to generate embeddings that capture the semantic meaning of sentences. The model is designed for tasks like sentiment analysis, semantic textual similarity, and clustering, and provides high-quality sentence representations in a...
    Downloads: 0 This Week
  • 12
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 13
    RAGs

    Build ChatGPT over your data, all with natural language

    RAGs is an open-source application designed to simplify the creation of retrieval-augmented generation pipelines through an interactive interface. Built with Streamlit and powered by the LlamaIndex ecosystem, the tool allows users to construct AI assistants that answer questions using their own data sources. Instead of requiring extensive programming knowledge, the application allows users to configure and build a RAG system using natural language instructions. The system automatically...
    Downloads: 0 This Week
  • 14
    Complete Machine Learning Package

    A comprehensive machine learning repository containing 30+ notebooks

    ...The repository also includes examples related to natural language processing, computer vision, and data visualization, giving learners exposure to several subfields of machine learning. By organizing the content into modular notebooks, the project allows users to explore topics independently and experiment with the code directly.
    Downloads: 0 This Week
  • 15
    The Algorithms - C #

    Collection of various algorithms in mathematics, machine learning

    TheAlgorithms/C is an open-source repository that provides implementations of classic algorithms and data structures written in the C programming language. The project is part of the larger “The Algorithms” initiative, which aims to create educational resources by implementing algorithms in multiple programming languages. Within the C repository, contributors implement algorithms from many areas of computer science including sorting, searching, graph processing, mathematics, machine learning, and numerical methods. ...
    Downloads: 0 This Week
  • 16
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    ...The project is highly community-oriented, frequently updated with contributions and new resources, and it’s widely used in both academic and applied NLP research. Its value lies in providing not just tools but also curated, domain-specific data, which can be hard to find elsewhere.
    Downloads: 0 This Week
  • 17
    AI-powered enterprise search engine

    AI-powered enterprise search engine

    ...It enables users to search across sources such as Slack, Confluence, Jira, Google Drive, and other enterprise systems, consolidating fragmented knowledge into a single, unified search experience. By leveraging natural language processing, Gerev allows users to query information in plain English, making it easier to find answers without needing exact keywords or knowing where the data is stored. The platform indexes content from connected systems rather than relying on their native search capabilities, resulting in faster and more relevant results across large datasets. ...
    Downloads: 1 This Week
  • 18
    AB3DMOT

    Official Python Implementation for "3D Multi-Object Tracking

    ...The system processes detection results from 3D object detectors that analyze LiDAR point clouds and uses them to track multiple objects across consecutive frames. Its tracking pipeline relies on a combination of classical algorithms, including a Kalman filter for state estimation and the Hungarian algorithm for data association between detected objects and existing tracks. This relatively simple design allows the tracker to achieve very high processing speeds while maintaining competitive tracking accuracy. The project also introduces new evaluation metrics specifically designed for assessing performance in 3D tracking benchmarks. The framework has been evaluated on widely used datasets such as KITTI and nuScenes and demonstrates strong performance compared with more complex tracking systems.
    Downloads: 0 This Week
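The data-association step described in the AB3DMOT entry can be sketched as an optimal one-to-one assignment between predicted track positions and new detections. The sketch below is not AB3DMOT's code: it swaps in a brute-force search over permutations, which finds the same minimum-cost assignment the Hungarian algorithm would but is only practical for a handful of objects. The function name `associate`, the Euclidean center-distance cost, and the `max_dist` gating threshold are assumptions for illustration.

```python
import math
from itertools import permutations

def associate(tracks, detections, max_dist=2.0):
    """Match predicted track centers to detection centers.

    Returns (matches, unmatched_tracks, unmatched_detections), where
    matches is a sorted list of (track_index, detection_index) pairs.
    """
    n_t, n_d = len(tracks), len(detections)
    # Cost matrix: Euclidean distance between centers (assumed cost).
    cost = [[math.dist(t, d) for d in detections] for t in tracks]

    if n_t <= n_d:
        # Give every track a distinct detection.
        best = min(permutations(range(n_d), n_t),
                   key=lambda p: sum(cost[i][j] for i, j in enumerate(p)))
        pairs = list(enumerate(best))
    else:
        # More tracks than detections: give every detection a distinct track.
        best = min(permutations(range(n_t), n_d),
                   key=lambda p: sum(cost[i][j] for j, i in enumerate(p)))
        pairs = [(i, j) for j, i in enumerate(best)]

    # Gate: reject assigned pairs that are too far apart to be the same object.
    matches = sorted((i, j) for i, j in pairs if cost[i][j] <= max_dist)
    matched_t = {i for i, _ in matches}
    matched_d = {j for _, j in matches}
    unmatched_tracks = [i for i in range(n_t) if i not in matched_t]
    unmatched_detections = [j for j in range(n_d) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_detections
```

In a tracker of this kind, matched pairs would typically update each track's Kalman filter state, unmatched detections would start new tracks, and tracks left unmatched for several frames would be dropped.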
  • 19
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP searches resources nested within resources in any combination and to any depth, for example text within a document within a zip archive. Here you...
    Downloads: 7 This Week
  • 20
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    ...Its architecture automatically divides large computational tasks into smaller chunks that can be executed across multiple nodes in a cluster, allowing complex analytics, machine learning workflows, and data transformations to run efficiently at scale. Mars is particularly useful for workloads that exceed the memory capacity of a single machine or require high levels of parallel processing.
    Downloads: 3 This Week
  • 21
    3D-Machine-Learning

    A resource repository for 3D machine learning

    3D-Machine-Learning is an open-source repository that compiles resources related to machine learning techniques applied to three-dimensional data. The project acts as a curated research directory that includes papers, datasets, tutorials, and software tools relevant to the emerging field of 3D machine learning. This interdisciplinary domain combines ideas from computer vision, computer graphics, and deep learning to analyze and generate three-dimensional structures. The repository includes references to important research papers covering topics such as point cloud processing, 3D reconstruction, shape analysis, and scene understanding. ...
    Downloads: 0 This Week
  • 22
    godlp

    Sensitive information protection toolkit

    godlp appears to be a software project from ByteDance, but as of the most recent checks there is very little publicly available information about it. The repository exists under ByteDance's GitHub, yet its documentation, README, and metadata are minimal (or not human-readable), and the project seems to have limited community visibility compared to ByteDance's other major tools. Given that opacity, godlp is likely a specialized internal or early-stage tool, possibly related to internal optimization, data processing, or platform-specific functionality (given ByteDance's historical patterns). The minimal public footprint suggests it may be experimental, unmaintained, or only partially open-sourced, which reduces its immediate practicality for external developers.
    Downloads: 0 This Week
  • 23
    MeshCNN in PyTorch

    Convolutional Neural Network for 3D meshes in PyTorch

    MeshCNN is a deep learning framework designed specifically for processing 3D triangular mesh data using convolutional neural networks. Unlike traditional CNNs that operate on images or voxel grids, MeshCNN performs convolution operations directly on the edges of mesh structures. This design allows the model to capture geometric relationships between mesh elements while preserving the underlying topology of 3D shapes.
    Downloads: 0 This Week
  • 24
    AI Platform Training and Prediction
    AI Platform Training and Prediction is a collection of machine learning example projects that demonstrate how to train, deploy, and serve models using Google Cloud AI Platform and related services. It includes a wide variety of implementations across frameworks such as TensorFlow, PyTorch, scikit-learn, and XGBoost, allowing developers to explore different approaches to building ML solutions. The repository covers the full machine learning lifecycle, including data preprocessing, model...
    Downloads: 0 This Week
  • 25
    hora

    Efficient approximate nearest neighbor search algorithm collections

    hora is an open-source high-performance vector similarity search library designed for large-scale machine learning and information retrieval systems. The project focuses on approximate nearest neighbor search, a fundamental technique used in modern AI applications such as recommendation systems, image search, and semantic search engines. Hora implements multiple efficient indexing algorithms that allow systems to rapidly search through high-dimensional vectors produced by machine learning...
    Downloads: 0 This Week