Showing 17 open source projects for "b-tree"

View related business solutions
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 1
    CycleGAN and pix2pix in PyTorch

    CycleGAN and pix2pix in PyTorch

    Image-to-Image Translation in PyTorch

    CycleGAN and pix2pix in PyTorch repository is a PyTorch implementation of two influential image-to-image translation frameworks: CycleGAN (for unpaired translation) and pix2pix (for paired translation). This repo gives developers and researchers a convenient, modern (PyTorch-based) platform to train and test these methods — supporting both paired datasets (input to output) and unpaired datasets (domain-to-domain) with minimal changes. The code supports standard training and inference...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2

    Tokenized Text Aligner

    Aligns tokens in two versions of a text with differing tokenization.

    ...It is intended for use in the preparation of annotated linguistic corpora, where differences in tokenization may arise (i) following corrections or modifications to the source text or (ii) through the creation of different layers of annotation (part-of-speech, treebank) requiring different tokenization. In its default implementation, it produces a human-readable CSV table associating tokens in text A with tokens in text B, and can also inject token-level annotation from text B to text A. The Aligner class on which the default implementation is based can be incorporated into more complex workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions: a. Frequency list for single word and N-Grams b. Concordance c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient) d. Lexical patterns search e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient f. Accept Windows and UTF-8 character encoding g. Accept TXT, DOC, DOCX, RTF and HTML formats h. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Welsh Natural Language Toolkit
    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words. The modules are written in JAVA and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 5

    bnf2xml

    simple BNF parser makes xml markup of matches

    ...EXAMPLE: $ echo "hi" | bnf2xml patternfile <word><alph>h</alph><alph>i</alph></word> or <gas>hydrogen iodide</gas> patternfile says how to find needle in haystack and what to show, ie: <alph> ::= a | b | c | d ... <word> ::= <alph>+ bnf2xml is a top down recursive parser. Unlike buttom up parsers like gcc(1) or some top downs, bnf2xml is completely unambiguous / resolves ALL conflicts. Slower on ave. for parsing C or than sed(1) for simple searches. Far easier than using flex/C to create a parser. caveate: I do not suggest it's worth while to make a new gcc(1) using bnf2xml. bnf2xml an nth BETA release, but no complains yet.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Welsh Natural Language Toolkit

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words. The modules are written in JAVA and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Virastyar

    Virastyar

    Virastyar is an spell checker for low-resource languages

    ...It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering, 19(2), 259-284. Rasooli, M. S., Kahefi, O., & Minaei-Bidgoli, B. (2011). Effect of adaptive spell checking in Persian. In NLP-KE Contributors: Omid Kashefi Azadeh Zamanifar Masoumeh Mashaiekhi Meisam Pourafzal Reza Refaei Mohammad Hedayati Kamiar Kanani Mehrdad Senobari Sina Iravanin Mohammad Sadegh Rasooli Mohsen Hoseinalizadeh Mitra Nasri Alireza Dehlaghi Fatemeh Ahmadi Neda PourMorteza
    Downloads: 16 This Week
    Last Update:
    See Project
  • 8

    Automatic Compound Processing (AuCoPro)

    Automatic compound splitting and semantic analysis of compounds

    ...Specifically, we will explore the possibility to create new knowledge about closely-related languages, and efficiently develop additional, more advanced resources for (a) compound segmentation; and (b) the semantic analysis of compounds; as such, the project will be divided into two interrelated subprojects, to be executed simultaneously. The focus in this project will be on Afrikaans (with Dutch as the closely-related, well-sourced language), which will lay grounds for future work on other closely-related language pairs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9

    t2t-pipe

    automatic alignment pipeline for parallel treebanks

    The *Tree-to-Tree (t2t) Alignment Pipe* is a collection of python scripts, co-ordinating the process of automatic alignment of parallel treebanks from plain text files with a single call from a unix command line. Supported Languages: DE, FR, EN
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 10
    Unsupervised TXT classifier

    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ...In a way, this is similar to clustering but not really a clustering algorithm since there is some training involved. The summarizer from Classifier4J has been adjusted to accept two inputs (lets call them A and B). Then, the summarizer gets trained with A to summarize a document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids the over-training) which are then compared using the Vector-Space analysis to give a range of belonging of one document to another (and thus avoids the shortage of information). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    TF-IDF.jar is a Java Archive file to measure TF-IDF of each document in a document collection (corpus). The jar can be used to (a) get all the terms in the corpus (b) get the document frequency (DF) and inverse document frequency (IDF) of all the terms in the corpus (c) get the TF-IDF of each document in the corpus (d) get each term with their frequency (no. of presence), term frequency (TF) and TF-IDF in every document
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    SweetOnionCCG2PTBConverter

    SweetOnionCCG2PTBConverter

    A tool that converts CCGBank to PTB

    Conversion between different grammar frameworks is of great importance to comparative performance analysis of the parsers developed on them. This tool can convert CCG derivations to PTB trees by using Max Entropy models as well as visualizing the tree graphs. The main technical innovation presented here is the effective conversion method which achieves a F score over 95%.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Vtgrep stands for Visual Tree Grep and is a GUI to tgrep. It allows the user to build graphical representations of tree structures and then translates them into the tgrep syntax. provides search functionality, as well as search and result logging.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Sanchay
    Sanchay is a collection of tools and APIs for language researchers. It has some implementations of NLP algorithms, some flexible APIs, several user friendly annotation interfaces and Sanchay Query Language for language resources.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    A collection of Metasyntaxes like EBNF for .Net including a definition file parser and an expression tree.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    Supertagger

    Software for assigning supertags.

    Supertagging is a process of statistical lexical disambiguation, preprocessing step to parsing, which assigns LTAG tree categories to the lexical items present in the input sentence. Thus, if the input sentence is in the form of a dependency tree, the task of the supertagger is to assign the most probable TAG family to each node and edge in the dependency tree.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB