Showing 774 open source projects for "extraction"

  • 1
    jieba

    jieba

    Jieba ("stutter") Chinese word segmentation

    "Jieba" Chinese word segmentation aims to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence most precisely and is suitable for text analysis. Full mode scans out every word that can be formed in the sentence; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode by splitting long words again to improve recall, which is suitable...
    Downloads: 8 This Week
    Last Update:
    See Project
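The "full mode" described for jieba above can be illustrated with a minimal sketch. This is not jieba's implementation or API: the function name `full_mode` and the toy dictionary are assumptions made for illustration only. It simply reports every substring of the sentence that appears in the dictionary, which is why overlapping candidates are all returned and ambiguity is left unresolved.

```python
def full_mode(sentence, dictionary):
    """Return every dictionary word found in the sentence, in scan order.

    Hypothetical helper for illustration; it does not reproduce jieba's
    data structures or its statistical model.
    """
    words = []
    for start in range(len(sentence)):
        for end in range(start + 1, len(sentence) + 1):
            candidate = sentence[start:end]
            if candidate in dictionary:
                words.append(candidate)
    return words


# Toy dictionary (an assumption, not jieba's real word list).
TOY_DICTIONARY = {"我", "来到", "北京", "清华", "清华大学", "华大", "大学"}

if __name__ == "__main__":
    print("/".join(full_mode("我来到北京清华大学", TOY_DICTIONARY)))
```

Overlapping words such as 清华, 清华大学, and 华大 all appear in the result. A precise mode would instead have to pick one consistent segmentation.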
  • 2
    cocoNLP

    cocoNLP

    A Chinese information extraction tool

    cocoNLP is a lightweight natural-language processing toolkit geared toward practical information extraction from raw text, especially for Chinese and mixed Chinese–English content. Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. ...
    Downloads: 0 This Week
    Last Update:
    See Project
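The pattern-based side of the extraction described for cocoNLP above can be sketched with two regular expressions. The patterns and the function name `extract_contacts` are assumptions for illustration and are not taken from cocoNLP itself. The phone pattern covers only 11-digit mainland-China mobile numbers, and the email pattern is deliberately simple.

```python
import re

# Simplified email pattern (an assumption; real-world email syntax is broader).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# 11-digit mobile number starting with 1 and a second digit of 3-9,
# not embedded in a longer run of digits (an illustrative assumption).
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def extract_contacts(text):
    """Return the emails and phone numbers found in mixed Chinese-English text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


if __name__ == "__main__":
    sample = "请联系 li.wei@example.com 或拨打 13812345678,谢谢"
    print(extract_contacts(sample))
```

Names, places, and organizations generally need more than regular expressions, which is where the NLP heuristics mentioned in the description would come in.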
  • 3
    AhoCorasickDoubleArrayTrie

    AhoCorasickDoubleArrayTrie

    An extremely fast implementation of the Aho–Corasick algorithm

    AhoCorasickDoubleArrayTrie is a Java implementation of the Aho–Corasick multi-pattern matching algorithm that is optimized using a Double-Array Trie data structure. It is designed for fast keyword scanning across large texts, where you want to search for many patterns simultaneously and efficiently. The core idea is to build an automaton from a dictionary of patterns, then stream through input text to emit matches with minimal overhead. By using a double-array trie representation, the...
    Downloads: 0 This Week
    Last Update:
    See Project
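The core idea described for AhoCorasickDoubleArrayTrie above (build an automaton from a dictionary of patterns, then stream through the text) can be sketched in Python. This is a plain Aho–Corasick automaton that uses dictionaries for transitions; it does not reproduce the double-array trie layout or the Java library's API, and the names `build_automaton` and `find_all` are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto/fail/output tables for a set of non-empty patterns."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # fail[node] -> node for the longest proper suffix
    out = [[]]    # out[node] -> patterns that end at this node
    for pattern in patterns:
        node = 0
        for ch in pattern:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pattern)

    # Breadth-first pass: a child's fail link is derived from its parent's.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_all(text, automaton):
    """Stream through text once, returning (start_index, pattern) pairs."""
    goto, fail, out = automaton
    node = 0
    matches = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pattern in out[node]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(find_all("ushers", automaton))
```

Each input character is consumed once, and failure links let the scan fall back to the longest matching suffix instead of restarting, so all patterns are searched for in a single pass.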
  • 4
    PyTracking

    PyTracking

    Visual tracking library based on PyTorch

    A general Python framework for visual object tracking and video object segmentation, based on PyTorch. Official implementation of the RTS (ECCV 2022), ToMP (CVPR 2022), KeepTrack (ICCV 2021), LWL (ECCV 2020), KYS (ECCV 2020), PrDiMP (CVPR 2020), DiMP (ICCV 2019), and ATOM (CVPR 2019) trackers, including complete training code and trained models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 5
    Snips NLU

    Snips NLU

    Snips Python library to extract meaning from text

    Snips NLU is a Natural Language Understanding Python library that lets you parse sentences written in natural language and extract structured information. It is the library that powers the NLU engine used in the Snips Console, which you can use to create awesome and private-by-design voice assistants. The exact output is a bit richer; the point here is to give a glimpse of what kind of information can be extracted. Behind every chatbot and voice assistant lies a common piece of technology:...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Image Super-Resolution (ISR)

    Image Super-Resolution (ISR)

    Super-scale your images and run experiments with Residual Dense

    The goal of this project is to upscale and improve the quality of low-resolution images. This project contains Keras implementations of different Residual Dense Networks for Single Image Super-Resolution (ISR) as well as scripts to train these networks using content and adversarial loss components. Docker scripts and Google Colab notebooks are available to carry out training and prediction. Also, we provide scripts to facilitate training on the cloud with AWS and Nvidia-docker with only a few...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Orange

    Orange

    OpenResty/Nginx Gateway for API monitoring and management

    A gateway based on OpenResty (Nginx + Lua) for API monitoring and management. We recommend that you use luarocks to install Orange to reduce problems caused by dependency extensions in different operating system releases. System dependencies (openresty, resty-CLI, luarocks, etc.) are necessary to install Orange on different operating systems. By default, a Dashboard is provided to manage all Orange plugin data. All of Orange's plugins have open APIs that can be used to achieve more personalized...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Phenalysis

    Phenalysis

    Analyze agronomic plant research plots in aerial orthomosaic images.

    A graphical user interface to import, analyze and export plots from orthomosaic images of agronomic trials. Please cite the following reference in your work if you use Phenalysis: Khan Z and Miklavcic SJ (2019) An Automatic Field Plot Extraction Method From Aerial Orthomosaic Images. Front. Plant Sci. 10:683. doi: https://doi.org/10.3389/fpls.2019.00683 This tool is being developed through the sponsorship of the Australian Research Council's Industrial Transformation Research Hub on Wheat in a Hot and Dry Climate. https://www.wheathub.com.au/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    bert-event-extraction

    bert-event-extraction

    Pytorch Solution of Event Extraction Task using BERT on ACE 2005

    The detailed instructions are in the readme file within the zip file. Github: https://github.com/nlpcl-lab/bert-event-extraction
    Downloads: 0 This Week
    Last Update:
    See Project
  • 10
    MCNPydE

    MCNPydE

    MCNP data extraction and display software library

    MCNPydE is a Python library for extracting data from MCNP output files. It requires Python, Matplotlib, and NumPy. It is a data reduction tool for MCNP output that makes results easier to analyze and view.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    JavoueJapasse_gl02

    JavoueJapasse_gl02

    EMAIL ANALYSIS SOFTWARE

    Looking for a program that lets you analyze all your emails simply and efficiently? Welcome to JavoueJapasse! The consulting firm UIConsult wants to equip itself with a tool to help analyze communications and expertise within its teams of collaborators. The goal is for sector managers to be able to produce an analysis report on collaborators' email exchanges, called "RCom". These reports are produced at the scale...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12

    AutoBench

    A utility program for extracting data from benchmark sites

    This program extracts the latest CPU, GPU, drive, and RAM performance scores and rankings from benchmark sites. The output data is saved as CSV, XLSX, and XLS files. For each CPU, GPU, drive, and RAM entry, the model name and score are recorded.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    ace2005-preprocessing

    ace2005-preprocessing

    ACE 2005 corpus preprocessing for Event Extraction task

    This is simple code for preprocessing the ACE 2005 corpus for the Event Extraction task. The existing methods were complicated for me to use, so I made this project. Github: https://github.com/nlpcl-lab/ace2005-preprocessing
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Activity Recognition

    Activity Recognition

    Resources about activity recognition

    This repository is a curated collection of resources, papers, code, and summaries relating to human activity recognition/behavior recognition. It is not a single integrated software package but rather a knowledge base organizing feature extraction methods, deep learning approaches, transfer learning strategies, datasets, and representative research in behavior recognition. The repository includes links to code in MATLAB, Python, summaries of algorithms, datasets, and relevant research papers. Feature extraction method summaries (e.g. motion, sensor, vision). Deep learning for activity recognition references.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    RoboSat

    RoboSat

    Semantic segmentation on aerial and satellite imagery

    RoboSat is an end-to-end pipeline written in Python 3 for feature extraction from aerial and satellite imagery. Features can be anything visually distinguishable in the imagery, for example buildings, parking lots, roads, or cars.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    chords-malware-analyzer

    Chords is a binary file strings extractor, with many formats supported.

    Chords is strings on steroids. It can extract strings from files just like strings, but it also supports Windows wide strings, base64 and hexadecimal strings (with decoding support), and automatic recognition of Indicators of Compromise (IOCs). It has been developed to support the malware analysis process, but is a general-purpose tool.
    Downloads: 0 This Week
    Last Update:
    See Project
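The basic extraction that Chords is described as performing above, meaning printable ASCII runs plus Windows wide strings, can be sketched with two byte-level regular expressions. This is not Chords' code. The function name `extract_strings`, the minimum length of 4, and the tuple layout are assumptions, and base64/hexadecimal decoding and IOC recognition are left out.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return (offset, kind, text) tuples for printable strings in data.

    "ascii" entries are runs of printable single-byte characters;
    "wide" entries are runs of printable characters each followed by a
    NUL byte (UTF-16LE, as used for Windows wide strings).
    """
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    results = []
    for match in ascii_re.finditer(data):
        results.append((match.start(), "ascii", match.group().decode("ascii")))
    for match in wide_re.finditer(data):
        results.append((match.start(), "wide", match.group().decode("utf-16-le")))
    return sorted(results)


if __name__ == "__main__":
    blob = b"\x01\x02hello world\x00\xff" + "wide".encode("utf-16-le") + b"\x00\x00ab"
    for offset, kind, text in extract_strings(blob):
        print(offset, kind, text)
```

Runs shorter than `min_len` (such as the trailing `ab` in the sample) are dropped, which is the usual way to cut down on noise from random byte sequences.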
  • 17
    ECommerceCrawlers

    ECommerceCrawlers

    Collection of Python ecommerce and website crawler example projects

    ...These examples demonstrate how to build and operate web scrapers capable of collecting structured information such as product listings, news content, job postings, social media data, and other publicly available web data. It aims to help developers understand the full workflow of web scraping, including request simulation, data extraction, storage, and handling anti-scraping techniques. It includes crawlers for platforms such as ecommerce marketplaces, blogging platforms, recruitment sites, and social networks, providing real-world practice scenarios. Developers can study the individual project documentation to understand the analysis process.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    TIES

    TIES

    A smart search engine for medical documents

    TIES (Text Information Extraction System) is a clinical text search engine that uses Natural Language Processing techniques to extract medical concepts from free-text clinical reports. It provides secure, de-identified access to this information and has built-in collaboration tools and honest broker functionality. It is licensed for academic use under the BSD license.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Enlive

    Enlive

    Selector-based templating and transformation system for Clojure

    Enlive is a Clojure library for HTML templating, transformation, and scraping, supporting composable manipulation of HTML/XML in a functional style. It allows selecting, transforming, and generating HTML fragments using CSS selectors, and supports server-side template composition, dynamic pages, and content rewriting. By default selector-transformation pairs are run sequentially. When you know that several transformations are independent, you can now specify (as an optimization) to process...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Duckling (Old)

    Duckling (Old)

    Clojure library that parses text into structured data

    Duckling (the “old” archived version) is a natural language processing library (in Clojure) for parsing text to structured data — specifically, recognizing quantities such as dates, times, durations, measurements, currencies, etc., from free-form text. To use Duckling in your project, you just need two functions: load! to load the default configuration, and parse to parse a string. Duckling is a Clojure library that parses text into structured data. See our blog post announcement for more...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    cdrecord

    Highly portable CD/DVD/BluRay command line recording software

    cdrecord: A CD/DVD/BD recording program
    readcd: A program to read CD/DVD/BD media with CD-clone features
    cdda2wav: The most evolved CD-audio extraction program with paranoia support
    mkisofs: A program to create hybrid ISO9660/Joliet/HFS filesystems with optional Rock Ridge attributes
    isodebug: A program to print mkisofs debug information from media
    isodump: A program to dump ISO-9660 media
    isoinfo: A program to analyse/verify ISO/9660/Joliet/Rock-Ridge Filesystems
    isovfy: A program to verify the ISO-9660 structures
    rscsi: A Remote SCSI enabling daemon
    Leader badge
    Downloads: 8,872 This Week
    Last Update:
    See Project
  • 22

    xMSanalyzer

    An R package for metabolomics data extraction and quality assessment

    xMSanalyzer comprises utilities that can be classified into four main modules: 1) merging apLCMS or XCMS sample processing results from multiple sets of parameter settings, 2) evaluation of sample quality, feature consistency, and batch effect, 3) feature matching, and 4) characterization of m/z using KEGG REST. It also provides 5) batch-effect correction using ComBat.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Photon

    Photon

    Incredibly fast crawler designed for OSINT

    ...Its Python implementation makes it accessible for customization and integration into larger automation frameworks. Despite its speed focus, the tool still provides useful filtering and extraction capabilities for analysts who need structured results. Overall, Photon functions as a lightweight yet powerful reconnaissance spider for web intelligence gathering.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 24
    qlImageSize

    qlImageSize

    QuickLook and Spotlight plugins to display the dimensions of images

    qlImageSize is a QuickLook plugin for macOS that displays image dimensions and file size in the QuickLook preview panel. It provides an efficient way to inspect image metadata without opening additional applications.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 25
    MultiPathNet

    MultiPathNet

    A Torch implementation of the object detection network

    MultiPathNet is a Torch-7 implementation of the “A MultiPath Network for Object Detection” paper (BMVC 2016), developed by Facebook AI Research. It extends the Fast R-CNN framework by introducing multiple network “paths” to enhance feature extraction and object recognition robustness. The MultiPath architecture incorporates skip connections and multi-scale processing to capture both fine-grained details and high-level context within a single detection pipeline. This results in improved detection accuracy across various object sizes and categories compared to standard single-path architectures. ...
    Downloads: 0 This Week
    Last Update:
    See Project