Showing 774 open source projects for "extraction"

  • 1

    cantools

    Access and convert ASC, BLF, DBC, and MDF files

    cantools is a set of libraries and command line tools for handling ASC, BLF, CLG, VSB, MDF, and DBC files. The tools can be used to analyze and convert the data to other formats. Shared libraries for parsing and accessing these files are also provided.
    Downloads: 8 This Week
  • 2

    MORPHEUS for Fiji

    A tool for unbiased and reproducible cell morphometry in Fiji/ImageJ2

    ...Specifically, MORPHEUS works with sampling distributions to learn, in an unsupervised manner and by a non-parametric approach, how to recognize the cells suitable for subsequent analysis. Afterwards, the algorithm evaluates the most relevant cell-shape descriptors over the full set of detected cells. Optionally, nucleus features can also be extracted and a double-scale analysis of orientation performed. The whole algorithm is implemented as a one-click procedure, thus minimizing user intervention and the ensuing biases and errors of human origin. In this way, MORPHEUS is intended to be a useful tool for addressing the issue of reproducibility in bioimage analysis.
    Downloads: 1 This Week
  • 3

    DIFRATE

    DIstortion Free Relaxation Analysis TEchnique (DIFRATE) software

    ..."Reducing bias in the analysis of solution-state NMR data with dynamics detectors." (2018) Currently available: http://doi.org/10.1002/anie.201901929 Contact: alsi-nmr@users.sourceforge.net Please contact me if you have any questions. Also check out INFOS for data extraction from NMR spectra (infos.sourceforge.net).
    Downloads: 0 This Week
  • 4

    HelloNzb

    The Binary Usenet Tool

    With HelloNzb you can download (binary) files from Usenet servers via NZB index files. The software is based on Java and can thus run on many platforms (tested on Windows and Linux). Automatic archive verification via PAR2, automatic RAR archive extraction, built-in yEnc- and UU-decoding. Portable, no installation required.
    Downloads: 9 This Week
  • 5

    3DFOREST

    Tool for managing and processing TLS lidar data from forest environments

    Software tool for extracting tree attributes from point cloud data acquired by terrestrial laser scanner in forest environments. 3DForest is not only a visualizer of data but also brings new and complex data management for foresters and researchers.
    Downloads: 4 This Week
  • 6

    NeuroNER

    Named-entity recognition using neural networks

    Named-entity recognition (NER) aims at identifying entities of interest in text, such as locations, organizations, and temporal expressions. Identified entities can be used in various downstream applications such as patient note de-identification and information extraction systems. They can also be used as features for machine learning systems for other natural language processing tasks. NeuroNER leverages the state-of-the-art prediction capabilities of neural networks (a.k.a. "deep learning"). It is cross-platform, open source, freely available, and straightforward to use. It enables users to create or modify annotations for a new or existing corpus. ...
    Downloads: 0 This Week
  • 7

    MITIE

    MITIE: library and tools for information extraction

    This project provides free (even for commercial use) state-of-the-art information extraction tools. The current release includes tools for performing named entity extraction and binary relation detection as well as tools for training custom extractors and relation detectors. MITIE is built on top of dlib, a high-performance machine-learning library[1], and makes use of several state-of-the-art techniques, including distributional word embeddings[2] and Structural Support Vector Machines[3]. ...
    Downloads: 0 This Week
  • 8

    @Note2

    @Note2 - A workbench for Biomedical Text Mining

    Biomedical Text Mining (BioTM) provides valuable approaches to the automated curation of scientific literature.
    Downloads: 4 This Week
  • 9

    TextRank

    TextRank implementation for Python 3

    TextRank is an implementation of the TextRank algorithm for extractive text summarization and keyword extraction, inspired by Google’s PageRank.
    Downloads: 0 This Week
  • 10

    TBXTools

    A Python class for Terminology Extraction and Management

    TBXTools allows easy and rapid Terminology Extraction and Management. This tool implements both statistical and linguistic methods, along with several utilities to create and manage terminological databases. It is written in Python and uses NLTK (Natural Language Toolkit). The project has moved to GitHub: https://github.com/aoliverg/TBXTools
    Downloads: 0 This Week
  • 11

    StepPy

    Method of fast data extraction from structured documents

    Fast data extraction from structured documents like HTML and XML by using a phrase sequence search technique. The required data is found by searching for one or more signature phrases that precede the required data text, followed by a terminal phrase that comes after the data. No parsing is required, which results in very high-speed data extraction.
    Downloads: 0 This Week
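
    The phrase sequence search described for StepPy can be illustrated with a minimal Python sketch. This is not StepPy's own code or API: the function name `extract_between`, its parameters, and the sample HTML are all hypothetical, chosen only to show the idea of locating signature phrases in order and then cutting at a terminal phrase with plain substring search.

```python
from typing import Optional, Sequence


def extract_between(text: str, signatures: Sequence[str], terminal: str) -> Optional[str]:
    """Hypothetical helper: return the text that follows the signature
    phrases (matched in order) and precedes the terminal phrase."""
    pos = 0
    for phrase in signatures:
        idx = text.find(phrase, pos)   # plain substring search, no parsing
        if idx == -1:
            return None                # a signature phrase is missing
        pos = idx + len(phrase)        # continue searching after this phrase
    end = text.find(terminal, pos)
    if end == -1:
        return None                    # terminal phrase never appears
    return text[pos:end]


# Made-up sample document, for illustration only.
html = '<div class="item"><span class="price">19.99</span></div>'
price = extract_between(html, ['class="item"', 'class="price">'], "</span>")
print(price)
```

    Because the search only uses `str.find`, no document tree is built, which is the property the description credits for the speed.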
  • 12

    Simd

    High performance image processing library in C++

    The Simd Library is a free open source image processing library, designed for C and C++ programmers. It provides many useful high-performance algorithms for image processing, such as pixel format conversion, image scaling and filtration, extraction of statistical information from images, motion detection, object detection (HAAR and LBP classifier cascades) and classification, and neural networks. The algorithms are optimized using various SIMD CPU extensions. In particular, the library supports the following CPU extensions: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX-512 for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM. ...
    Downloads: 32 This Week
  • 13

    Spatial Media

    Specifications and tools for 360° video and spatial audio

    spatial-media provides tools for working with spherical video and spatial audio metadata so players and platforms can correctly render immersive media. The utilities inject, inspect, and extract metadata in common container formats (MP4/WebM) to signal 360° projection type, stereoscopy mode, and spatial audio layout. Creators use it to prepare 360/VR180 assets for upload so services know whether a video is monoscopic, top-bottom stereo, or side-by-side, and whether ambisonic audio is...
    Downloads: 26 This Week
  • 14

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 156 This Week
  • 15

    PySptools

    Hyperspectral algorithms for Python

    ...The functions and classes are organized by topic:
      * abundance maps: FCLS, NNLS, UCLS
      * classification: AbundanceClassification, NormXCorr, KMeans, SAM, SID, SVC
      * detection: ACE, CEM, GLRT, MatchedFilter, OSP
      * distance: chebychev, NormXCorr, SAM, SID
      * endmembers extraction: ATGP, FIPPI, NFINDR, PPI
      * material count: HfcVd, HySime
      * noise: Savitzky Golay, MNF, whiten
      * sigproc: bilateral
      * sklearn: HyperEstimatorCrossVal, HyperSVC and others
      * spectro: convex hull quotient, features extraction (tetracorder style), USGS06 lib interface
      * util: load_ENVI_file, load_ENVI_spec_lib, corr, cov and others
    The library makes extensive use of the numpy numeric library and can achieve good speed. ...
    Downloads: 1 This Week
  • 16

    Easy File Compression

    Client-side solution for PC and the Firefox browser to create zip archives

    A fully client-side solution for PC and the Firefox browser that creates zip archives and extracts zip and rar archive files using the HTML5 File System API. == Browser Extensions == Firefox add-on: http://mzl.la/1Kd7fiD OS requirements: Windows 7 and later are supported; older operating systems are not supported (and do not work). Both x86 and amd64 (x64) binaries are provided for Windows. Please note, the ARM version of Windows is not supported for now. == Installation and Activation == 1....
    Downloads: 1 This Week
  • 17

    Convolutional Recurrent Neural Network

    Convolutional Recurrent Neural Network (CRNN) for image-based sequence

    Convolutional Recurrent Neural Network provides an implementation of the Convolutional Recurrent Neural Network (CRNN) architecture, a deep learning model designed for image-based sequence recognition tasks such as optical character recognition and scene text recognition. The architecture combines convolutional neural networks for extracting visual features from images with recurrent neural networks that model sequential dependencies in the extracted features. This hybrid approach allows the...
    Downloads: 0 This Week
  • 18

    Toolbox

    Piotr's Image & Video Matlab Toolbox

    Piotr’s Image & Video MATLAB Toolbox is a general-purpose MATLAB toolbox for image and video processing and vision tasks, offering utilities, filters, detection, feature extraction, and algorithm building blocks. It augments MATLAB’s native capabilities (not replacing the Image Processing Toolbox) by providing efficient, reusable wrappers and optimized routines. It includes example and demo scripts for usage (e.g. acfReadme, detector readmes), along with support for compilation / mex (for speed) and cross-platform compatibility.
    Downloads: 0 This Week
  • 19

    pyhanlp

    Chinese word segmentation

    ...The project focuses on making HanLP’s capabilities accessible through a Python-friendly API surface, so you can integrate NLP steps into data pipelines, notebooks, and downstream ML or information-extraction code. In practice, it serves as a bridge layer: Python calls are translated into the corresponding HanLP operations, so you can keep your application logic in Python while relying on HanLP’s implementations. It is especially useful when you need a pragmatic “get results quickly” NLP layer for segmentation, tagging, entity extraction, parsing, or keyword-style tasks rather than experimenting with model training from scratch.
    Downloads: 0 This Week
  • 20

    gain

    Asyncio-based Python framework for building fast web crawling spiders

    ...It provides a structured framework for creating spiders that can navigate websites, extract structured data, and process the collected results. Developers define crawlers using components such as spiders, parsers, and items, allowing them to organize crawling logic and data extraction rules clearly. Gain supports CSS selectors and XPath expressions for parsing page content and extracting specific elements. Gain also allows developers to configure headers, concurrency levels, and proxy settings to control how crawlers interact with target websites. Because it uses asynchronous programming, Gain can handle multiple requests efficiently while minimizing blocking operations.
    Downloads: 2 This Week
  • 21
    DigiExtractor
    DigiExtractor is a tool to allow extraction of video recordings from the DigiCorder series of DVB receivers manufactured by TechniSat.
    Downloads: 1 This Week
  • 22

    Information Extraction from Arabic Text

    Java-based framework for extracting information from Arabic text

    This project presents a model for extracting information from Arabic text. The project executables include three Java-based modules that can be used to implement a rule-based information extraction process from Arabic text. These modules are: (1) a module for annotating a selected Arabic text file using a custom morpho-syntactic Part-of-Speech tagging scheme; (2) a module that can be used along with Protégé for establishing Web Ontology Language (OWL) based ontologies from the concepts and relations in a text file; (3) a module for automatically extracting information from annotated Arabic text files. ...
    Downloads: 0 This Week
  • 23

    pdi-jira

    JIRA plugin for Pentaho Data Integration

    Using this PDI plugin, you can connect to any JIRA service, even over an SSL connection, and perform JSON data extraction on the results. JQL is used to obtain data from the remote JIRA service.
    Downloads: 0 This Week
  • 24
    Downloads: 0 This Week
  • 25

    Deeplearning-papernotes

    Summaries and notes on Deep Learning research papers

    Deeplearning-papernotes is an implementation of Convolutional Neural Networks for sentence and text classification in TensorFlow, based on a well-known research paper that applies CNN architectures to natural language processing tasks with strong performance in sentiment analysis and similar classification problems. The repository provides the complete network definition, including an embedding layer to convert words into dense representations, convolution and max-pooling layers to extract...
    Downloads: 0 This Week