Showing 20 open source projects for "extraction"

  • 1
    Zotero

    Tool to help you collect, organize, annotate, cite, and share research

    Zotero is a powerful, free, open-source research management application designed to help students, academics, and professionals collect, organize, annotate, cite, and share research sources and materials for papers, projects, or books. It can save web pages, PDFs, books, articles, and more with metadata, automatically extract bibliographic information, and organize items into collections and tag systems, while supporting notes and annotations directly alongside references. Zotero’s interface...
    Downloads: 6 This Week
  • 2
    ktrain

    ktrain is a Python library that makes deep learning and AI more accessible

    ktrain is a Python library that makes deep learning and AI more accessible and easier to apply for both newcomers and experienced practitioners. It is a lightweight wrapper for the deep learning library TensorFlow Keras (and other libraries) that helps build, train, and deploy neural networks and other machine learning models, inspired by ML framework extensions like fastai and ludwig. With only a few lines...
    Downloads: 5 This Week
  • 3
    CERCA

    CERCA – Citation Extraction & Reference Checking Assistant

    CERCA is an open-source research tool that supports the verification of bibliographic references in scientific manuscripts. It extracts references from PDF files and checks their existence and consistency against authoritative metadata sources, producing explainable diagnostics, audit logs, and reproducible reports. It is intended for researchers performing final manuscript checks, reviewers assessing reference consistency, editors supporting editorial quality control, ...
    Downloads: 10 This Week
  • 4
    Digital Forensics Guide

    Learn all about Digital Forensics and Computer Forensics

    The Digital Forensics Guide repository is a comprehensive, structured reference for investigators, analysts, students, and cybersecurity professionals interested in digital forensics principles, tools, methodologies, and workflows. It organizes foundational topics such as evidence acquisition, disk and memory analysis, file system structures, network forensics, artifact extraction, timeline generation, and reporting into digestible modules that help build core competency. Alongside conceptual explanations, the guide includes practical examples with widely used tools (like Autopsy, Volatility, Sleuth Kit, and network analysis suites), illustrating how investigations proceed from initial data capture to final analysis. ...
    Downloads: 7 This Week
  • 5
    PyTorch GAN Zoo

    A mix of GAN implementations including progressive growing

    PyTorch GAN Zoo is a comprehensive open research toolbox designed for experimenting with and developing Generative Adversarial Networks (GANs) using PyTorch. The project provides modular implementations of popular GAN architectures, including Progressive Growing of GANs (PGAN), DCGAN, and an experimental StyleGAN version. It is built to support both researchers and developers who want to train, evaluate, and extend GANs efficiently across diverse datasets such as CelebA-HQ, FashionGen, DTD,...
    Downloads: 0 This Week
  • 6
    jieba

    Stuttering Chinese word segmentation

    "Jieba" Chinese word segmentation aims to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence most precisely and is suitable for text analysis. Full mode scans out all the words that can be formed in the sentence; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode by splitting long words again to improve recall, which is suitable...
    Downloads: 7 This Week
  • 7
    GUAJE FUZZY

    Free software for generating understandable and accurate fuzzy systems

    ...Thus, it is a free software tool (licensed under GPL-v3) that supports the design of interpretable and accurate fuzzy systems by combining several preexisting open source tools and taking advantage of the main strengths of each. It is a user-friendly, portable tool designed to ease knowledge extraction and representation for fuzzy systems, paying special attention to interpretability issues. GUAJE lets the user define expert variables and rules, but also provides supervised and fully automatic learning capabilities. Both types of knowledge, expert and induced, are integrated under expert supervision, ensuring interpretability, simplicity and consistency of the knowledge base throughout the whole process. ...
    Downloads: 0 This Week
  • 8

    SyntheticWSI

    Tools to generate and visualize artificial whole slide images

    ...Collection of tools to help generate artificial Whole Slide Images (WSIs). A WSI is stored as a ZIP archive of JPG tiles, and this software contains a tool to visualize this format. SVS files can be used directly for texture extraction (thanks to the included Bio-Formats library). Main source files in package fr.unistra.wsi.synthetic.
    Downloads: 0 This Week
  • 9
    musicinformationretrieval.com

    Instructional notebooks on music information retrieval

    musicinformationretrieval.com is a collection of instructional materials for music information retrieval (MIR). These materials contain a mix of casual conversation, technical discussion, and Python code. These pages, including the one you're reading, are authored using Colab notebooks.
    Downloads: 0 This Week
  • 10
    cde4php - Cross Database Engine for PHP

    Uniform Database Abstraction for PHP Development

    Debby has replaced CDE in the Tina4Stack; you may want to check it out at http://tina4.com. CDE is a PHP class which implements the general database functions in PHP and provides a common SQL platform for PHP development, so developers can change their databases but not their code. Supports Firebird, MySQL, Oracle, SQLite, MSSQL (both drivers), CUBRID, and ODBC. CDE now supports date uniformity, parameter passing, and BLOB handling across all supported databases. CDE is not a replacement for PDO,...
    Downloads: 2 This Week
  • 11
    FALCON - Text Search Java Project

    JSON based text search Java Project

    ...It also handles jumbled word order within queries and spelling mistakes. Techniques commonly used in this project are Natural Language Processing, Information Extraction and Question-Answering Architecture. Details of the latest version can be found on the project website, http://geekdadaji.com. Creator: Swapnil A Jadhav (saj1919). Email: dadajibudhau@gmail.com. License: CC BY-NC 4.0.
    Downloads: 0 This Week
  • 12

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 13
    Java examples for information retrieval covering themes like indexing, search, ranking, information extraction, regular expressions, and crawling, based on libraries such as Lucene. It provides support for learning information retrieval.
    Downloads: 0 This Week
  • 14

    Large Text File converter

    Java-based heavy-duty utility to process large delimited text files

    ...Another strength of this tool is its configurability: its design allows it to generate as many output files as required from one input file, and validation, extraction, and conversion can be applied to every row of the input file. Example use case: a legacy system is to be replaced with a new, advanced system with a different DB schema, and the data is provided as 100GB of delimited text that must be inserted into 10 different tables of the new system's DB after validation, date format conversion, rearrangement, and MD5 hashing.
    Downloads: 0 This Week
  • 15
    Utility for extraction of data from files in UNIMARC format (ISO-2709)
    Downloads: 0 This Week
  • 16
    DASSGUI
    DASS-GUI is a graphical user interface for pattern search in non-sequential data.
    Downloads: 0 This Week
  • 17
    C4 is a C++ class library for analyzing sound files, particularly spoken and sung phonations. C4 provides features such as frequency analysis, pitch extraction, or calculation of voice quality parameters (e.g. alpha ratio, HNR, jitter, etc.).
    Downloads: 0 This Week
  • 18
    Cairo (Complex Archive Ingest for Repository Objects) is a tool for processing digital archives prior to submitting them to archival storage for long-term preservation; among other features, this includes format identification and metadata extraction.
    Downloads: 0 This Week
  • 19
    The Cornell Web Lab Collaboration Server is a suite of tools and services for GUI-based extraction, analysis and sharing of archived web data. See http://weblab.infosci.cornell.edu/ and http://www.cs.cornell.edu/~weigel for details about the project.
    Downloads: 0 This Week
  • 20
    Web Textual eXtraction Tools: C++ parallel web crawler, noun phrase identification, multi-lingual part-of-speech tagging, Tarjan's algorithm, Co-RelationShip Mappings...
    Downloads: 0 This Week