Showing 779 open source projects for "extraction"

  • 1
    nligaStruct

    Isogeometric structural analysis with Bézier extraction

    ...Citation: Xiaoxiao Du, Gang Zhao, Ran Zhang, Wei Wang, Jiaming Yang. Numerical implementation for isogeometric analysis of thin-walled structures based on a Bézier extraction framework: nligaStruct. Thin-Walled Structures. 2022.
    Downloads: 0 This Week
  • 2
    Tranalyzer

    Tranalyzer flow generator packet analyzer moved to: tranalyzer.com

    ...A packet-based "tshark mode" for detailed header and content inspection is improved for troubleshooting and security purposes. New features in 0.8.14 include flow-based and packet-based content inspection and extraction, better reporting, geo and organisation labeling, forensics support, and encapsulation support such as ethip, teredo, anything in anything, SCTP, etc. Check out the tutorials: https://www.tranalyzer.com/tutorials
    Downloads: 0 This Week
  • 3
    7-Zip-JBinding

    Java wrapper for 7z archiver engine

    Native (JNI) cross-platform library to extract (password protected, multi-part) 7z Zip Rar Tar Split Lzma Iso HFS GZip Cpio BZip2 Z Arj Chm Lhz Cab Nsis Deb Rpm Wim Udf archives and create 7z, Zip, Tar, GZip & BZip2 from Java.
    Downloads: 49 This Week
  • 4
    Tensorflow Transformers

    State of the art faster Transformer with Tensorflow 2.0

    Imagine auto-regressive generation to be 90x faster. tf-transformers (Tensorflow Transformers) is designed to harness the full power of TensorFlow 2 and is built specifically for Transformer-based architectures. These models can be applied to text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages; to images, for tasks like image classification, object detection, and segmentation; and to audio, for tasks like speech recognition and audio classification. Features include faster autoregressive decoding, TFLite support, simple creation of TFRecords, and auto-batching of tf.data.Dataset or tf.ragged tensors. ...
    Downloads: 0 This Week
  • 5
    Scylla

    Intelligent proxy pool for collecting and managing public proxies

    Scylla is an open source proxy pool system designed to collect, validate, and manage large numbers of public proxy servers for use in web scraping and data extraction workflows. It automatically crawls the internet to discover proxy IP addresses and evaluates their availability and reliability before adding them to a usable pool. It includes a JSON API that allows developers and applications to retrieve proxy information programmatically, making it easier to integrate proxy rotation into scraping tools or automation scripts. ...
    Downloads: 10 This Week
  • 6
    Specter

    Clojure(Script)'s missing piece

    Specter is a powerful Clojure (and ClojureScript) library that revolutionizes navigation and manipulation of deeply nested and recursive data structures through a flexible, high-performance API beyond what vanilla Clojure offers. Specter has an extremely simple core, just a single abstraction called "navigator". Queries and transforms are done by composing navigators into a "path" precisely targeting what you want to retrieve or change. Navigators can be composed with any other navigators,...
    Downloads: 0 This Week
  • 7

    pdf-to-text-fragments

    PDF text extractor for Firefox extensions

    ...The list is then converted into strings of the form 'X value, Y value, text\n', which are concatenated and stored in a text file named 'Page n', where n is the page number. Data extraction becomes much easier because parsing can be based on both text value and text position. This is very useful for data sources which are available only in PDF form and are updated regularly. For example, insider trading data from SEDI.ca.
    Downloads: 0 This Week
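
The position-based parsing described for pdf-to-text-fragments can be sketched roughly as follows. This is a minimal illustration, not the project's own code: it assumes each line of a 'Page n' file looks like 'X value, Y value, text', that the coordinates are plain decimal numbers, and that the text itself may contain commas. The names `Fragment`, `parse_fragments`, and `fragments_in_band`, as well as the sample data, are hypothetical.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    x: float
    y: float
    text: str


def parse_fragments(page_text: str) -> List[Fragment]:
    """Parse lines assumed to look like 'X value, Y value, text'."""
    fragments = []
    for line in page_text.splitlines():
        # Split on the first two commas only, so commas inside the text survive.
        parts = line.split(",", 2)
        if len(parts) != 3:
            continue  # skip blank lines and lines that do not match the assumed layout
        x_str, y_str, text = parts
        try:
            fragments.append(Fragment(float(x_str), float(y_str), text.strip()))
        except ValueError:
            continue  # skip lines whose coordinates are not numeric
    return fragments


def fragments_in_band(fragments: List[Fragment], y_min: float, y_max: float) -> List[Fragment]:
    """Return fragments whose Y lies in [y_min, y_max], ordered left to right."""
    return sorted((f for f in fragments if y_min <= f.y <= y_max), key=lambda f: f.x)


# Invented sample data, only to show the idea of selecting one visual row.
sample = "300.0, 700.5, Doe, John\n72.0, 700.5, Name\n72.0, 680.0, Shares\n"
row = fragments_in_band(parse_fragments(sample), 700.0, 701.0)
print([f.text for f in row])
```

Selecting by a Y band and then sorting by X reconstructs one visual row of the page, which is the kind of position-aware lookup the description has in mind.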
  • 8

    Aseryla2

    Aseryla2 code repositories

    This project describes a model of how human semantic memory represents information about the objects of the world, expressed in text format. It provides a system and a GUI application capable of extracting and managing concepts and relations from English texts. https://aseryla2.sourceforge.io/
    Downloads: 0 This Week
  • 9
    VSGAN

    VapourSynth Single Image Super-Resolution Generative Adversarial

    ...The Network will be applied in quadrants of the image to reduce up-front VRAM usage. You can use any RGB video input, including float32 (e.g., RGBS) inputs. Using VapourSynth you can pass a Video directly to VSGAN, without any frame extraction needed. Any edit you make in the VapourSynth script with or without VSGAN can be re-used for any other video. VSGAN is released under the MIT License, ensuring it will stay free, with the ability to be used commercially.
    Downloads: 0 This Week
  • 10
    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    Optimize Hugging Face Transformer models and deploy them in production with a single command line. At Lefebvre Dalloz we run semantic search engines in production in the legal domain; in non-marketing language, it's a re-ranker, and ours is based on Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI....
    Downloads: 1 This Week
  • 11

    Linguistic Analyzer

    The Linguistic Analyzer is a tool for corpus analysis and comparison

    The Linguistic Analyzer (Almuhalil Alloghawy) is a free tool designed by a team from Al-Imam Muhammad bin Saud Islamic University. It can be used for corpus analysis and comparison in terms of several linguistic characteristics, such as frequency list generation, concordances, collocation extraction, the difference between two words, and keyword identification.
    Downloads: 4 This Week
  • 12
    KoNLPy

    Python package for Korean natural language processing

    KoNLPy is a natural language processing (NLP) library for the Korean language, offering tokenization, morphological analysis, and named entity recognition.
    Downloads: 0 This Week
  • 13
    Bandwidth

    Monitor monthly internet Transmit and Receive bandwidth usage - Linux

    Keep track of bandwidth usage. Allows Linux users to monitor their Transmit and Receive bandwidth usage with a simple text-based menu, via a browser, or from the command line. Some of us are unable to get "unlimited", "all that you can eat" internet packages and are left trying to stay within our Download/Upload limits, whilst paying dearly for the "privilege". Equally, we didn't have the foresight or the money to purchase an SNMP-managed router, so we are unable to strip the traffic...
    Downloads: 1 This Week
  • 14
    printpdf

    Rust / WASM library for reading, writing and rendering PDF

    printpdf is a Rust library for creating, reading, writing, and rendering PDF documents, providing developers with fine-grained control over document generation and layout. It supports a wide range of PDF features, including pages, layers, annotations, vector graphics, images, and embedded fonts, allowing the creation of complex and professional documents. The library emphasizes manual positioning of elements, giving developers precise control over layout and rendering rather than relying on...
    Downloads: 0 This Week
  • 15
    Ferret

    Ferret is a web scraping system

    Ferret is a web scraping system. It aims to simplify data extraction from the web for UI testing, machine learning, analytics and more. Ferret allows users to focus on the data. It abstracts away the technical details and complexity of underlying technologies using its own declarative language. It is extremely portable, extensible, and fast.
    Downloads: 0 This Week
  • 16

    pak

    PAK file editor for Quake engine games

    A utility for manipulating .PAK files used by Quake and Quake 2 engine games. Allows for creation of .PAK data files from directories, extraction, individual file/directory insertion and extraction, and file/directory deletion.
    Downloads: 12 This Week
  • 17
    PDFLayoutTextStripper

    Converts a pdf file into a text file while keeping the layout

    Converts a PDF file into a text file while keeping the layout of the original PDF. Useful to extract the content from a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of PDFTextStripper class (from the Apache PDFBox library).
    Downloads: 0 This Week
  • 18
    VRN

    Code for "Large Pose 3D Face Reconstruction

    The VRN (Volumetric Regression Network) repository implements the “Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression” method. Instead of explicitly fitting a 3D model via landmark estimation and deformation, VRN treats the reconstruction task as volumetric segmentation: it learns a CNN to regress a 3D volume aligned to the input image, and then extracts a mesh via isosurface from that volume. The network is unguided (no 2D landmarks as intermediate)....
    Downloads: 0 This Week
  • 19
    Abot

    Fast and flexible C# framework for building customizable web crawlers

    Abot is an open source C# web crawler framework designed to help developers efficiently crawl and process web content. It focuses on speed, flexibility, and extensibility while handling the complex low-level tasks involved in web crawling. It manages essential components such as multithreading, HTTP requests, scheduling, and link parsing so developers can focus on processing the collected data. Abot follows a modular architecture that allows developers to customize nearly every stage of the...
    Downloads: 1 This Week
  • 20
    aseryla

    Aseryla code repositories

    This project describes a model of how human semantic memory represents information about the objects of the world, expressed in text format. It provides a system and a GUI application capable of extracting and managing concepts and relations from English texts. https://aseryla2.sourceforge.io/
    Downloads: 0 This Week
  • 21
    Exifr

    The fastest and most versatile JS EXIF reading library

    Exifr is a fast and very versatile JavaScript EXIF reading library that works everywhere, parses everything and handles just about anything you throw at it. It can handle any input: buffers, url, <img> tag and more; .jpg, .tif, and .heic files; and TIFF (EXIF, GPS, etc.), XMP, ICC, IPTC, JFIF segments. It skips parsing tags you don’t need, and reads only the first few bytes. There’s no need to read the whole file to see if there’s an EXIF file in it, or extract all the data when you just...
    Downloads: 3 This Week
  • 22

    Honkai Impact 3rd Subtitle Extract Tool

    Subtitle extraction tool for Honkai Impact 3rd PC Client

    ...Click Repair. Once the process is complete, it will display a dialog with information on how many SLTs were extracted. The log window will show the results of the extraction process. Disclaimer: This is an application that I threw together rather quickly. I posted it here for anyone who wants to use it.
    Downloads: 0 This Week
  • 23
    VAD

    Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM

    This repository is a voice activity detection (VAD) toolkit that implements multiple models (DNN, bDNN, LSTM, ACAM) for detecting speech versus non-speech in audio. It also provides a dataset recorded in varied real-world settings (e.g. bus stop, construction site, park, room) with manually annotated ground-truth labels. Features include acoustic feature extraction (multi-resolution cochleagram, MRCG) and post-processing modules (e.g. smoothing, thresholds). The toolkit supports both MATLAB and Python/TensorFlow components (for feature extraction, classification, and post-processing).
    Downloads: 1 This Week
  • 24
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset and provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 25
    Chatette

    A powerful dataset generator for Rasa NLU, inspired by Chatito

    Chatette is a Python-based tool for generating training datasets for Natural Language Understanding (NLU) models, particularly those used with Rasa NLU. It employs a domain-specific language to define templates, enabling the creation of diverse and extensive training examples for intent classification and entity recognition.
    Downloads: 0 This Week