Search Results for "text processing" - Page 7

718 projects for "text processing" with 1 filter applied:

  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 1
    NLP-Models-Tensorflow

    NLP-Models-Tensorflow

    Gathers machine learning and Tensorflow deep learning models for NLP

    NLP-Models-Tensorflow is a collection of natural language processing model implementations built using the TensorFlow deep learning framework. The repository provides numerous examples of neural network architectures used in modern NLP research and applications, including text classification, language modeling, machine translation, and sentiment analysis. Each model implementation is designed to illustrate how common NLP architectures operate, such as recurrent neural networks, convolutional models for text processing, and transformer-style attention mechanisms. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Leseratte is a Java parser for German written language. Currently, it contains a German lexicon (based on the Wiktionary), inflexion rules, a grammar and a parser. (Semantics component planned.) Usable as a Java library, also provides a graphical UI.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    PyCBC

    PyCBC

    Learn how to use PyCBC to analyze gravitational-wave data

    PyCBC is a software developed by a collaboration of LIGO, Virgo, and independent scientists. It is open source and freely available. We use PyCBC in the detection of gravitational waves from binary mergers such as GW150914. These examples explore how to analyze gravitational wave data, how we find potential signals and learn about them. Many of these tutorials will require you to make edits to config files as part of their exercises. At the moment this isn't easy to do on services like...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Forms

    Forms

    An easy way to create, parse and validate forms in node.js

    ...It allows developers to define form schemas using a declarative API, specifying fields, validation rules, and data transformations in a consistent format. The library supports a wide range of input types, including text, numbers, dates, and custom fields, making it adaptable to different application needs. It includes built-in validation mechanisms that ensure user input meets specified constraints before being processed. Forms also provides utilities for rendering HTML form elements, reducing the need for manual template coding. It integrates with popular Node.js web frameworks, enabling seamless handling of form submissions and data processing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 5
    BRIC

    BRIC

    BRIC is a powerful tool for batch image processing.

    Bric is a cross-platform batch image processor. You can convert, resize, rotate and add watermark to your images. Multiple file types are supported for input and output. The project started back in 2011 and was maintained for a couple of years. In 2020 BRIC is again in active development, so some of the features written below might be outdated. Please be patient, until everything is reviewed and rewritten.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Programa para generar documentos HTML con expresiones matemáticas incrustadas, procesadas con Maxima (maxima.sourceforge.net).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    AhoCorasickDoubleArrayTrie

    AhoCorasickDoubleArrayTrie

    An extremely fast implementation of Aho Corasick algorithm

    ...It is designed for fast keyword scanning across large texts, where you want to search for many patterns simultaneously and efficiently. The core idea is to build an automaton from a dictionary of patterns, then stream through input text to emit matches with minimal overhead. By using a double-array trie representation, the project emphasizes performance and memory efficiency compared to simpler pointer-heavy trie structures, which can matter a lot for large dictionaries or latency-sensitive services. This makes it a strong fit for tasks like content filtering, entity/term spotting, dictionary-based annotation, or high-throughput log/text processing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    cocoNLP

    cocoNLP

    A Chinese information extraction tool

    cocoNLP is a lightweight natural-language processing toolkit geared toward practical information extraction from raw text, especially for Chinese and mixed Chinese–English content. Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Anaphraseus
    Anaphraseus is a CAT (Computer Aided Translation) tool, OpenOffice.org 2-3 macro set for OpenOffice/LibreOffice Writer similar to famous Wordfast. It works with the Wordfast Translation Memory format (*.TXT), and supports text segmentation.
    Downloads: 21 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 10
    XSLT syntax highlighting

    XSLT syntax highlighting

    Java based XSLT Processor extension for syntax highlighting

    Please note that project moved to GitHub: https://github.com/xmlark/xslthl This is an implementation of syntax highlighting as an extension module for XSLT processors (Xalan, Saxon), so if you have e.g. article about programming written in DocBook, code examples can be automatically syntax highlighted during the XSLT processing phase.
    Leader badge
    Downloads: 72 This Week
    Last Update:
    See Project
  • 11
    Jed Modes Repository
    A collection of S-Lang extension scripts (modes) for the Jed text editor, contributed by Jed users. Browse the repository at http://jedmodes.sf.net/
    Leader badge
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12

    AhoTTS Multilingual, a Multilingual TTS

    Text-to-Speech TTS for Basque, Spanish, Catalan, Galician and English

    Text-to-Speech conversor for Basque, Spanish, Catalan, Galician and English. It includes linguistic processing and built voices for all the languages aforementioned. Its acoustic engine is based on hts_engine and it uses a high quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    YouTubeCrawler

    YouTubeCrawler

    Go-based automation utility that downloads YouTube videos

    This tool is a Go-based automation utility that downloads YouTube videos and permanently embeds or “hard-codes” their subtitles (typically English) into MP4 output files. The workflow involves specifying one or more URLs (via a simple “url” text file in each folder) and the program uses youtube-dl to fetch video and subtitle, then ffmpeg to overlay the subtitles onto the video track. The architecture follows a command-pattern setup: tasks implement a common interface and are scheduled and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Pgn2ltx is a simple converter that reads chess games in PGN format and outputs them to a LaTeX input file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    ChessTask is a frontend for easily creating chess tasks with LaTeX. The tasks are stored in a list which can be exported, either to a LaTeX input file, or to HTML files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Fen2eps is a small console program that converts FEN (Forsyth Edwards Notation) strings to EPS (Encapsulated Postscript) files containing the chess board diagram.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Duckling (Old)

    Duckling (Old)

    Clojure library that parses text into structured data

    Duckling (the “old” archived version) is a natural language processing library (in Clojure) for parsing text to structured data — specifically, recognizing quantities such as dates, times, durations, measurements, currencies, etc., from free-form text. To use Duckling in your project, you just need two functions: load! to load the default configuration, and parse to parse a string. Duckling is a Clojure library that parses text into structured data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Diffuse
    Diffuse is a graphical tool for comparing and merging text files. It can retrieve files for comparison from Bazaar, CVS, Darcs, Git, Mercurial, Monotone, RCS, Subversion, and SVK repositories.
    Leader badge
    Downloads: 133 This Week
    Last Update:
    See Project
  • 19
    Vrapper

    Vrapper

    Vim-like editing in Eclipse

    Vrapper is an eclipse plugin which acts as a wrapper for existing eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Eclipse Update Site: http://vrapper.sourceforge.net/update-site/stable
    Downloads: 11 This Week
    Last Update:
    See Project
  • 20
    Mavscript

    Mavscript

    Calculations in a text document

    Mavscript allows the user to do calculations in a text document. Plain text, LaTeX and OpenOffice Writer files (.odt) are supported. The calculation is done by the algebra system Yacas (default), Jasymca or by the Java interpreter BeanShell.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    cnn-text-classification-tf

    cnn-text-classification-tf

    Convolutional Neural Network for Text Classification in Tensorflow

    The cnn-text-classification-tf repository by Denny Britz is a well-known educational implementation of convolutional neural networks for text classification using TensorFlow, aimed at helping developers and researchers understand how CNNs can be applied to natural language processing tasks. Based loosely on Kim’s influential paper on CNNs for sentence classification, this codebase demonstrates how to preprocess text data, convert words into learned embeddings, and apply multiple convolution filters to extract n-gram features that are then pooled and fed into a classifier. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    iText®, a JAVA PDF library

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Leader badge
    Downloads: 200 This Week
    Last Update:
    See Project
  • 23
    TeXML is an XML vocabulary for TeX. The processor transforms the TeXML markup into the TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who automatically generate [La]TeX or ConTeXt files.
    Leader badge
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data sources (as database). Can generate complete homepages (tree of HTML-s, images, etc.)
    Leader badge
    Downloads: 108 This Week
    Last Update:
    See Project
  • 25
    unfluff

    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB