Showing 83 open source projects for "tagging"

  • 1
    TFKit

    Handling multiple NLP tasks in one pipeline

    TFKit is a toolkit mainly for language generation. It applies transformers to many tasks with different models in one all-in-one framework; switching tasks requires only a small config change. You can train and evaluate models with tfkit-train and tfkit-eval. The key to combining different tasks is to give them all the same data format. All data is in CSV format: tfkit uses CSV for every task, normally with two columns, where the first column is the...
    Downloads: 0 This Week
  • 2
    NLP-progress

    Repository to track the progress in Natural Language Processing (NLP)

    ...This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there.
    Downloads: 0 This Week
  • 3
    jieba

    Stuttering Chinese word segmentation

    ...Paddle mode performs word segmentation with a sequence labeling (bidirectional GRU) network model trained using the PaddlePaddle deep learning framework. It also supports part-of-speech tagging. To use paddle mode, install paddlepaddle-tiny: pip install paddlepaddle-tiny==1.6.1. Paddle mode currently requires jieba v0.40 or above. For versions below jieba v0.40, upgrade jieba: pip install jieba --upgrade.
    Downloads: 0 This Week
  • 4

    KSUCCA Corpus

    A 50-million-token corpus of Classical Arabic.

    King Saud University Corpus of Classical Arabic (KSUCCA) is a pioneering 50-million-token annotated corpus of Classical Arabic texts from the pre-Islamic era until the fourth Hijri century (equivalent to the seventh until the early eleventh century CE), which is the period of pure Classical Arabic. The main aim of this corpus is to support the study of the distributional lexical semantics of the words of the Quran. However, it can be used for other research purposes, such...
    Downloads: 7 This Week
  • 5
    anaGo

    Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition

    anaGo is a Python library for sequence labeling (NER, POS tagging, ...), implemented in Keras. anaGo can solve sequence labeling tasks such as named entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL) and so on. Unlike traditional sequence labeling solvers, anaGo does not require any language-dependent features to be defined, so it can easily be used for any language.
    Downloads: 1 This Week
  • 6
    Image classification models for Keras

    Keras code and weights files for popular deep learning models

    ...Pre-trained weights can be automatically loaded upon instantiation (weights='imagenet' argument in the model constructor for all image models, weights='msd' for the music tagging model). Weights are automatically downloaded if necessary, and cached locally in ~/.keras/models/. This repository contains code for the following Keras models: VGG16, VGG19, ResNet50, Inception v3, and CRNN for music tagging.
    Downloads: 2 This Week
  • 7
    kcws

    Deep Learning Chinese Word Segment

    Deep learning Chinese word segmentation. Install the Bazel build tool and TensorFlow (this project currently requires tf 1.0.0alpha or above). Switch to the code directory of this project and run ./configure. Compile the background service. Follow the "waiting for words" public account and reply with kcws to get the corpus download address. Extract the corpus to a directory. Change to the code directory. After installing TensorFlow, switch to the kcws code...
    Downloads: 0 This Week
  • 8

    RDRPOSTagger

    A Rule-based Part-of-Speech and Morphological Tagging Toolkit

    RDRPOSTagger is a robust, easy-to-use and language-independent rule-based toolkit for Part-of-Speech (POS) and morphological tagging. RDRPOSTagger is fast in both the learning and the tagging process, and its accuracy is very competitive with state-of-the-art results. RDRPOSTagger now supports pre-trained POS and morphological tagging models for Bulgarian, Czech, Dutch, English, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, Thai and Vietnamese. ...
    Downloads: 0 This Week
  • 9
    ...Multiple narratives can be listed in the text file, where narratives are separated using a # symbol. The text upload process entails an initial part-of-speech (POS) tagging of the uploaded text using the Stanford POS tagger. The user can later modify and extend the initial tagging. The resulting annotations are stored in the supporting database. These results can be exported to Excel or text files for further processing.
    Downloads: 0 This Week
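The "#"-separated input file described in this entry can be illustrated with a minimal Python sketch. This is not the tool's own parser, which the listing does not show. The function name `split_narratives` is hypothetical, and the sketch assumes that any `#` character acts as a separator and that empty segments are ignored.

```python
from typing import List


def split_narratives(text: str) -> List[str]:
    """Split an uploaded text into narratives on '#'.

    Assumption: every '#' is a separator; the real tool may require
    the symbol to sit on its own line or handle escaping differently.
    """
    parts = (segment.strip() for segment in text.split("#"))
    # Drop empty segments produced by leading, trailing, or doubled separators.
    return [segment for segment in parts if segment]


sample = "First narrative text.\n#\nSecond narrative text.\n#"
for number, narrative in enumerate(split_narratives(sample), start=1):
    print(number, narrative)
```

Each returned narrative could then be passed on to a POS tagger separately.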
  • 10
    ...Universal language support (depending on the availability of training data), with language-specific features for Chinese and English. Currently supports word segmentation, POS tagging, and dependency and phrase-structure parsing.
    Downloads: 0 This Week
  • 11
    GT NLP Class

    Course materials for Georgia Tech CS 4650 and 7650

    This repository contains lecture notes, slides, assignments, and code for a university-level Natural Language Processing course. It spans core NLP topics such as language modeling, sequence tagging, parsing, semantics, and discourse, alongside modern machine learning methods used to solve them. Students work through programming exercises and problem sets that build intuition for both classical algorithms (like HMMs and CRFs) and neural approaches (like word embeddings and sequence models). The materials emphasize theory grounded in practical experimentation, often via Python notebooks or scripts that visualize results and encourage ablation studies. ...
    Downloads: 0 This Week
  • 12
    Ansj Chinese word segmentation

    Ansj word segmentation

    A true Java implementation of ict. Word segmentation is faster than in the open source version of ict. Features include Chinese word segmentation, name recognition, part-of-speech tagging, and a user-defined dictionary. This is a Java implementation of Chinese word segmentation based on n-Gram+CRF+HMM. The word segmentation speed reaches about 2 million words per second (tested on a Mac Air), and the accuracy rate can exceed 96%. It currently implements Chinese word segmentation, Chinese name recognition, a user-defined dictionary, keyword extraction, automatic summarization, and keyword tagging. ...
    Downloads: 0 This Week
  • 13
    Phrasal

    Statistical phrase-based machine translation system

    ...Our work ranges from basic research in computational linguistics to key applications in human language technology, and covers areas such as sentence understanding, automatic question answering, machine translation, syntactic parsing and tagging, and sentiment analysis.
    Downloads: 0 This Week
  • 14

    VnDP

    A Vietnamese dependency parsing toolkit

    VnDP is a Vietnamese dependency parsing toolkit which integrates a pre-trained parsing model and a pre-trained POS tagging model. The parsing model was trained on our VnDT Vietnamese dependency Treebank which was automatically converted from the Vietnamese constituent Treebank. See more details in VnDP's website at http://vndp.sourceforge.net/
    Downloads: 0 This Week
  • 15
    Part-of-speech tagging is the task of assigning symbols from a particular set to words in a natural language text. ACOPOST implements and extends well-known machine learning techniques and provides a uniform environment for testing.
    Downloads: 0 This Week
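To make the task definition in this entry concrete, here is a minimal Python sketch of a most-frequent-tag baseline, which assigns each word the tag it carried most often in training data. This is not one of ACOPOST's techniques as far as the listing shows. The function names, the toy tag set, and the training sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Build a word -> most frequent tag lexicon plus a fallback tag."""
    per_word = defaultdict(Counter)
    overall = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            per_word[word.lower()][tag] += 1
            overall[tag] += 1
    lexicon = {word: counts.most_common(1)[0][0] for word, counts in per_word.items()}
    # Unknown words fall back to the most frequent tag overall.
    default_tag = overall.most_common(1)[0][0]
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Assign one tag from the training tag set to every word."""
    return [(word, lexicon.get(word.lower(), default_tag)) for word in words]


# Toy training data, invented for illustration only.
TRAINING = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("chases", "VERB"), ("cats", "NOUN")],
]

lexicon, default_tag = train(TRAINING)
print(tag(["The", "dog", "sleeps", "birds"], lexicon, default_tag))
```

The unseen word "birds" receives the fallback tag. Statistical taggers improve on this baseline mainly by using the surrounding context.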
  • 16

    Drug Extraction

    Drug name extraction

    ...accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67. Using GATE Corpus Benchmark: Strict: P: 0.65, R: 0.73, F1: 0.69; Lenient: P: 0.74, R: 0.84, F1: 0.78. For details on how to reproduce the evaluation, see the README. To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.
    Downloads: 0 This Week
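The FB1 and F1 figures in this entry are consistent with the standard balanced F-measure, the harmonic mean of precision and recall. The Python sketch below recomputes them from the listed values. The helper name `f1_score` is hypothetical and is not part of the project's evaluation scripts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Works on either percentages or fractions, as long as both
    arguments use the same scale.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# First set of figures in the entry (percentages).
print(round(f1_score(85.70, 76.20), 2))

# GATE Corpus Benchmark, strict matching (fractions).
print(round(f1_score(0.65, 0.73), 2))
```

Recomputing from published values can differ by one in the last digit, because the listed precision and recall are themselves already rounded.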
  • 17
    Cotovía

    Text-to-Speech System for Galician and Spanish

    Cotovía is a unit-selection text-to-speech system for Galician and Spanish. Cotovía is distributed under the GPL3.0+ license, but each of the available speaker voices has its own license. The speakers available at SourceForge are free for commercial and non-commercial uses. Another speaker, free for non-commercial uses, is available through external links (see the Blog section). Cotovía has been developed by the University of Vigo and the center 'Ramón Piñeiro' for Research in Humanities,...
    Downloads: 5 This Week
  • 18

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.
    Downloads: 0 This Week
  • 19
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
    Downloads: 0 This Week
  • 20
    DBpedia Spotlight
    DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in natural language text. The source code is now hosted on GitHub: https://github.com/dbpedia-spotlight
    Downloads: 0 This Week
  • 21
    The Rudify tools are a collection of tools for ontology tagging.
    Downloads: 0 This Week
  • 22
    Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part-of-speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Tools Library.
    Downloads: 0 This Week
  • 23
    SemNotes

    Semantic Note-taking tool for KDE

    SemNotes is a semantic note-taking tool for KDE4, built on top of Nepomuk-KDE. The tool is still under development, but it is already usable, provided that KDE4 is installed and Nepomuk is running.
    Downloads: 0 This Week
  • 24
    This Java project creates a testing environment application that analyzes an image's low-level features and suggests tags to classify it, using an ontology search based on the tags of similar images.
    Downloads: 0 This Week
  • 25
    JTextPro: A Java-based Text Processing tool that includes sentence boundary detection (using maximum entropy classifier), word tokenization (following Penn conventions), part-of-speech tagging (using CRFTagger), and phrase chunking (using CRFChunker).
    Downloads: 0 This Week