Showing 85 open source projects for "tagging"

  • 1
    Claude Cognitive

    Persistent context and multi-instance coordination

    Claude Cognitive is a memory and context-management extension designed to address the stateless limitations of Claude Code by giving the model a form of persistent “working memory” and multi-instance coordination. It introduces an attention-based context router that prioritizes files and content relevant to the current development discussion, tagging them as HOT, WARM, or COLD based on recency and keyword activation, so Claude Code doesn’t waste token budget rereading irrelevant code. This context routing reduces redundant token usage and speeds up work on large codebases by focusing on what matters to the current task. Additionally, Claude Cognitive includes a pool coordinator that shares state across multiple Claude Code instances, preserving what has been learned or completed and preventing repetitive debugging or redundant exploration.
    Downloads: 0 This Week
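    The HOT/WARM/COLD tagging described above can be illustrated with a minimal sketch. This is not Claude Cognitive's actual code: the `ContextRouter` class, the decay factor, and both thresholds are hypothetical values chosen only to show how recency and keyword activation could combine into a tier.

```python
from dataclasses import dataclass, field

# All constants below are assumptions made for illustration only.
DECAY = 0.5            # a file's score is halved on every turn without a keyword hit
HOT_THRESHOLD = 0.8    # scores at or above this count as HOT
WARM_THRESHOLD = 0.25  # scores at or above this (but below HOT) count as WARM


@dataclass
class ContextRouter:
    """Hypothetical router: maps each file to trigger keywords and a score."""
    keywords: dict[str, set[str]]
    scores: dict[str, float] = field(default_factory=dict)

    def observe(self, message: str) -> None:
        """Update every file's score from one message of the conversation."""
        words = set(message.lower().split())
        for path, triggers in self.keywords.items():
            if words & triggers:
                # Keyword activation: the file becomes fully relevant again.
                self.scores[path] = 1.0
            else:
                # Recency: relevance fades while the file goes unmentioned.
                self.scores[path] = self.scores.get(path, 0.0) * DECAY

    def tier(self, path: str) -> str:
        """Translate the current score into a HOT, WARM, or COLD tag."""
        score = self.scores.get(path, 0.0)
        if score >= HOT_THRESHOLD:
            return "HOT"
        if score >= WARM_THRESHOLD:
            return "WARM"
        return "COLD"
```

    With these assumed values, a file whose keyword was just mentioned is HOT, stays WARM for the next two unrelated messages, and turns COLD after the third.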
  • 2
    VoxCPM2

    Tokenizer-Free TTS for Multilingual Speech Generation

    VoxCPM2 is an advanced open-source text-to-speech system that redefines speech synthesis by eliminating traditional tokenization and instead generating continuous speech representations through a diffusion-based autoregressive architecture. Built on top of the MiniCPM model family, it enables highly natural, expressive, and context-aware speech generation that adapts tone, emotion, and pacing directly from input text. The system is trained on massive multilingual datasets, enabling support...
    Downloads: 0 This Week
  • 3
    torchtext

    Data loaders and abstractions for text and NLP

    We recommend Anaconda as a Python package management system. Please refer to pytorch.org for the details of PyTorch installation. LTS versions are distributed through a different channel than the other versioned releases. Alternatively, you might want to use the Moses tokenizer port in SacreMoses (split from NLTK). You have to install SacreMoses. To build torchtext from source, you need git, CMake, and a C++11 compiler such as g++. When building from source, make sure that you have the same C++...
    Downloads: 0 This Week
  • 4
    Exile is a Python-based image collection manager application. Easily add metadata to photos, including Caption, People, Event, Location and Tags. No external database: stores metadata in Exif/IPTC/XMP tags. Three-level categorization for easy photo sorting/management. Clone GPS data between files.
    Downloads: 0 This Week
  • 5
    Nostalgic Photo DataBase (platform)

    Active repository of JPEG photos with tags suitable for personal needs

    ...This versatile system allows users to organize and search through their collection using customizable tags, catering to images of any vintage. One of NPDB's key features is its flexible tagging system, which allows users to categorize their images using an arbitrary set of tags tailored to their preferences. This intuitive approach streamlines the organization process, making it easier than ever to locate specific images amidst a vast collection. Powered by an embedded SQL database, NPDB delivers lightning-fast search results, ensuring that users can access the images they need almost instantaneously. ...
    Downloads: 0 This Week
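    As a rough illustration of tag-based search backed by an embedded SQL database, the sketch below uses Python's built-in `sqlite3` module. NPDB's real schema and database engine are not stated in this listing, so the table names (`photo`, `photo_tag`) and the helper functions are assumptions made only for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create a hypothetical two-table schema: photos and their tags."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS photo (
            id INTEGER PRIMARY KEY,
            filename TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS photo_tag (
            photo_id INTEGER NOT NULL REFERENCES photo(id),
            tag TEXT NOT NULL,
            PRIMARY KEY (photo_id, tag)
        );
        CREATE INDEX IF NOT EXISTS idx_photo_tag_tag ON photo_tag(tag);
    """)
    return conn


def add_photo(conn, filename, tags):
    """Register a photo together with an arbitrary set of tags."""
    cur = conn.execute("INSERT INTO photo (filename) VALUES (?)", (filename,))
    conn.executemany(
        "INSERT INTO photo_tag (photo_id, tag) VALUES (?, ?)",
        [(cur.lastrowid, tag) for tag in set(tags)],
    )


def find_by_tags(conn, tags):
    """Return filenames of photos that carry every one of the given tags."""
    tags = sorted(set(tags))
    if not tags:
        return []
    placeholders = ",".join("?" * len(tags))
    rows = conn.execute(
        f"""SELECT p.filename
            FROM photo p JOIN photo_tag t ON t.photo_id = p.id
            WHERE t.tag IN ({placeholders})
            GROUP BY p.id
            HAVING COUNT(DISTINCT t.tag) = ?
            ORDER BY p.filename""",
        (*tags, len(tags)),
    ).fetchall()
    return [row[0] for row in rows]
```

    The index on `tag` is what keeps lookups quick as the collection grows; the `HAVING` clause keeps only photos that match all requested tags rather than any one of them.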
  • 6
    OpenNMT-tf

    Neural machine translation and sequence learning using TensorFlow

    ...OpenNMT-tf is a general-purpose sequence learning toolkit using TensorFlow 2. While neural machine translation is the main target task, it has been designed to more generally support sequence-to-sequence mapping, sequence tagging, sequence classification, and language modeling. Models are described with code to allow training custom architectures and overriding default behavior. For example, the following instance defines a sequence-to-sequence model with 2 concatenated input features, a self-attentional encoder, and an attentional RNN decoder sharing its input and output embeddings. ...
    Downloads: 0 This Week
  • 7
    textacy

    NLP, before and after spaCy

    textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.
    Downloads: 0 This Week
  • 8
    Pattern

    Web mining module for Python, with tools for scraping

    ...It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases. In addition to data mining features, the library offers natural language processing functionality including part-of-speech tagging, sentiment analysis, and n-gram extraction. The framework also includes machine learning algorithms that support classification, clustering, and vector space modeling for text analysis tasks. Another component of the library provides tools for analyzing and visualizing networks, making it useful for studying relationships between entities in large datasets.
    Downloads: 11 This Week
  • 9
    DeepDanbooru

    AI-based multi-label girl image classification system

    ...Because the Danbooru dataset contains millions of images with extensive annotations, it provides a valuable training resource for machine learning models specializing in illustration analysis. Such datasets have been widely used for tasks including automatic image tagging, anime face detection, and generative modeling research.
    Downloads: 11 This Week
  • 10
    e-Dokyumento

    e-Dokyumento is a web-based Document Management System (DMS)

    e-Dokyumento is an open-source, web-based Document Management System (DMS) that automates the basic office document workflow, such as receiving, filing, routing, and approving, through capturing (scanning), digitizing (OCR), storing, tagging, and electronically routing and approving (e-signature) electronic documents. Demo: https://e-dokyumento.herokuapp.com/ https://edokyu.seillig.com/ (refer to Readme.md for the accounts) Docker Hub: https://hub.docker.com/r/nelsonmaligro/edokyumento Install using the ISO: 1. Download: https://sourceforge.net/projects/e-dokyumento/files/Releases/e-DokyuV3.iso/download 2. ...
    Downloads: 2 This Week
  • 11
    KoNLPy

    Python package for Korean natural language processing

    KoNLPy is a natural language processing (NLP) library for the Korean language, offering tokenization, morphological analysis, and named entity recognition.
    Downloads: 3 This Week
  • 12
    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process.
    Downloads: 5 This Week
  • 13
    Reminiscence

    Self-Hosted Bookmark And Archive Manager

    ...Bookmarking links to PDF, JPG, etc. via the web interface will automatically save those files on the server. Supports archival of media elements of a web page using third-party download managers. Directory-based categorization of bookmarks. Automatic tagging of HTML links. Automatic summarization of HTML content. Special readability mode. Search bookmarks according to URL, title, tags or summary. Supports multiple user accounts. Supports public and group directories for every user. Upload any file from the web interface for archiving. Easy-to-use admin interface for managing multiple users. ...
    Downloads: 0 This Week
  • 14
    Kashgari

    Kashgari is a production-level NLP transfer learning framework

    Kashgari is a simple and powerful NLP transfer learning framework for building state-of-the-art models in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.
    Downloads: 0 This Week
  • 15
    TFKit

    Handling multiple NLP tasks in one pipeline

    TFKit is a toolkit mainly for language generation. It applies transformers to many tasks with different models in one all-in-one framework. All you need is a small change of config. You can use tfkit for model training and evaluation with tfkit-train and tfkit-eval. The key to combining different tasks is to give every task the same data format. All data will be in CSV format: tfkit uses CSV for all tasks, normally with two columns, where the first column is the...
    Downloads: 0 This Week
  • 16
    SageMaker Chainer Containers

    Docker container for running Chainer scripts to train and host Chainer

    ...The "base" Dockerfile encompasses the installation of the framework and all of the dependencies needed. All "final" Dockerfiles build images using base images that use the tagging scheme.
    Downloads: 0 This Week
  • 17
    NLP-progress

    Repository to track the progress in Natural Language Processing (NLP)

    ...This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there.
    Downloads: 0 This Week
  • 18
    jieba

    Stuttering Chinese word segmentation

    ...The paddle mode uses the PaddlePaddle deep learning framework to train the sequence labeling (bidirectional GRU) network model to achieve word segmentation. Also supports part-of-speech tagging. To use paddle mode, install paddlepaddle-tiny with pip install paddlepaddle-tiny==1.6.1. Paddle mode currently requires jieba v0.40 or above. For versions below jieba v0.40, upgrade jieba with pip install jieba --upgrade.
    Downloads: 8 This Week
  • 19
    anaGo

    Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition

    anaGo is a Python library for sequence labeling (NER, PoS tagging, ...), implemented in Keras. anaGo can solve sequence labeling tasks such as named entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL) and so on. Unlike traditional sequence labeling solvers, anaGo doesn't need any language-dependent features to be defined, so it can easily be used for any language.
    Downloads: 0 This Week
  • 20
    Tasks

    Program to manage tasks

    Tasks aims to make task management easy. It supports more advanced features such as tagging tasks with custom categories, importance, subtasks and dates, while keeping tasks easy to create. All a task needs is a title!
    Downloads: 0 This Week
  • 21
    pyhanlp

    Chinese word segmentation

    ...In practice, it serves as a bridge layer: Python calls are translated into the corresponding HanLP operations, so you can keep your application logic in Python while relying on HanLP’s implementations. It is especially useful when you need a pragmatic “get results quickly” NLP layer for segmentation, tagging, entity extraction, parsing, or keyword-style tasks rather than experimenting with model training from scratch.
    Downloads: 0 This Week
  • 22
    Image classification models for Keras

    Keras code and weights files for popular deep learning models

    ...Pre-trained weights can be automatically loaded upon instantiation (weights='imagenet' argument in model constructor for all image models, weights='msd' for the music tagging model). Weights are automatically downloaded if necessary, and cached locally in ~/.keras/models/. This repository contains code for the following Keras models: VGG16, VGG19, ResNet50, Inception v3, and CRNN for music tagging.
    Downloads: 4 This Week
  • 23
    RDRPOSTagger

    A Rule-based Part-of-Speech and Morphological Tagging Toolkit

    RDRPOSTagger is a robust, easy-to-use, language-independent rule-based toolkit for part-of-speech (POS) and morphological tagging. It is fast in both the learning and tagging processes and achieves accuracy that is very competitive with state-of-the-art results. RDRPOSTagger now supports pre-trained POS and morphological tagging models for Bulgarian, Czech, Dutch, English, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, Thai and Vietnamese. ...
    Downloads: 0 This Week
  • 24
    Inferact

    File-tagging tool coded in Python.

    Tag, search (e.g., by similarity), and manage data collections of your choosing. Currently optimized for music and image data. Uses SHA-1 hashes for identification. For media handling, please use additional programs such as IrfanView (http://www.irfanview.com) and MPC (https://mpc-hc.org), along with a console with Unicode support, like ConEmu (https://conemu.github.io).
    Downloads: 0 This Week
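    Identifying files by SHA-1 hash means a tag follows the file's content rather than its name or location. The sketch below shows that idea with Python's standard `hashlib`; the `file_id` and `tag_file` helpers and the in-memory `index` dictionary are hypothetical and are not taken from Inferact's code.

```python
import hashlib


def file_id(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tag_file(index, path, tags):
    """Attach tags to a file's content hash in a hypothetical {hash: tags} index."""
    key = file_id(path)
    index.setdefault(key, set()).update(tags)
    return key
```

    Because the key is derived from the bytes alone, renaming or moving a file keeps its tags, and two byte-identical copies share one entry.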
  • 25
    GT NLP Class

    Course materials for Georgia Tech CS 4650 and 7650

    This repository contains lecture notes, slides, assignments, and code for a university-level Natural Language Processing course. It spans core NLP topics such as language modeling, sequence tagging, parsing, semantics, and discourse, alongside modern machine learning methods used to solve them. Students work through programming exercises and problem sets that build intuition for both classical algorithms (like HMMs and CRFs) and neural approaches (like word embeddings and sequence models). The materials emphasize theory grounded in practical experimentation, often via Python notebooks or scripts that visualize results and encourage ablation studies. ...
    Downloads: 2 This Week