Search Results for "text processing" - Page 5

Showing 1744 open source projects for "text processing"

  • 1
    Sygil WebUI

    Stable Diffusion web UI

    Sygil WebUI is a browser-based interface for running Stable Diffusion image generation locally or on a server, wrapping common text-to-image and image-to-image workflows into a practical UI. It provides multiple UI modes (including a legacy Gradio interface) and focuses on making iterative prompting, parameter tuning, and post-processing accessible without writing code. The UI exposes core generation controls like resolution, CFG guidance, sampling steps, samplers, seeds, and batch generation so users can reproduce results and refine outputs systematically. ...
    Downloads: 1 This Week
  • 2
    pangu.js

    Opinionated paranoid text spacing in JavaScript

    pangu.js is a lightweight JavaScript library that automatically inserts proper spacing between Chinese, Japanese, or Korean (CJK) characters and Latin letters, numbers, or symbols. The goal is typographic polish: mixed-script text often becomes cramped or ambiguous without the correct spacing rules, and pangu.js fixes that reliably. It uses rule-based detection to walk text and add spaces where needed without altering the characters themselves. The library runs in browsers or Node.js, and...
    Downloads: 0 This Week
  • 3
    DocETL

    A system for agentic LLM-powered data processing and ETL

    DocETL is an open-source system designed to build and execute data processing pipelines powered by large language models, particularly for analyzing complex collections of documents and unstructured datasets. The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data.
    Downloads: 0 This Week
  • 4
    NLP

    Open source NLP guide with models, methods, and real use cases

    NLP is an open source introductory resource for natural language processing, presented as a continuously updated book hosted on GitHub. It explains how machines process and understand human language, combining theory with practical examples. It covers core NLP concepts such as text representation, feature extraction, and model evaluation, alongside hands-on implementations using tools like Word2Vec, TF-IDF, and FastText.
    Downloads: 0 This Week
  • 5
    KrillinAI

    Video translation and dubbing tool powered by LLMs

    ...It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed audio tracks. KrillinAI supports both landscape and portrait videos, which makes it suitable for a wide range of platforms — from YouTube to TikTok or other vertical-video sites — and ensures correct formatting and layout for the final video. The tool offers “one-click” workflows and desktop versions, lowering the barrier for users who may not be familiar with video editing or audio processing pipelines.
    Downloads: 7 This Week
  • 6
    SetFit

    Efficient few-shot learning with Sentence Transformers

    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.
    Downloads: 0 This Week
  • 7
    SALMONN family

    A suite of advanced multi-modal LLMs

    SALMONN is a family of advanced multi-modal large language models (LLMs) developed by ByteDance — designed to handle and integrate multiple data modalities (e.g. text, audio, video) rather than just plain text. The repository bundles different branches targeting specialized tasks (e.g. video-SALMONN, speech-quality assessment, general multimodal tasks), suggesting that the project is modular and extensible across domains. SALMONN aims to push the frontier of multi-modal AI by allowing models...
    Downloads: 0 This Week
  • 8
    PostgresML

    The GPU-powered AI application database

    PostgresML is a complete platform in a PostgreSQL extension. Build simpler, faster, and more scalable models right inside your database. Explore the SDK and test open source models in our hosted database. Combine and automate the entire workflow from embedding generation to indexing and querying for the simplest (and fastest) knowledge-based chatbot implementation. Leverage multiple types of natural language processing and machine learning models such as vector search and personalization...
    Downloads: 0 This Week
  • 9
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 0 This Week
  • 10
    Xiyan MCP Server

    A Model Context Protocol (MCP) server

    The XiYan MCP Server is a Model Context Protocol (MCP) server that enables natural language queries to databases, powered by XiYan-SQL, a state-of-the-art text-to-SQL model. It allows users to interact with databases using conversational language, simplifying data retrieval processes.
    Downloads: 0 This Week
  • 11
    TextWorld

    TextWorld is a sandbox learning environment for the training

    TextWorld is a learning environment designed to train reinforcement learning agents to play text-based games, where actions and observations are entirely in natural language. Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.
    Downloads: 0 This Week
  • 12
    Dream Textures

    Stable Diffusion built-in to Blender

    Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service.
    Downloads: 4 This Week
  • 13

    joy of text

    Editor with scripting language, security features & system interfaces.

    Jot was developed as a general-purpose editor for large CAD files. Its command-driven UI requires no mode switching and hence requires fewer keystrokes to get a typical job done. It is particularly useful for checking and cross-referencing between several source, intermediate and output files - a common requirement for CAD work. But jot's usefulness doesn't stop there. Its sophisticated search features can, for example, be used for interactive data mining or automating the extraction of...
    Downloads: 0 This Week
  • 14
    OpenRecall

    OpenRecall is a fully open-source, privacy-first alternative

    OpenRecall is an open-source, privacy-first system designed to capture, index, and make searchable a user’s entire digital activity history, effectively acting as a personal memory layer for computing environments. It works by taking periodic screenshots of a user’s screen and applying local AI processing, including OCR and semantic analysis, to extract and structure information from both text and images. This data is then indexed into a searchable database, allowing users to retrieve past information quickly using natural language queries. Unlike proprietary alternatives, OpenRecall operates entirely locally, ensuring that all captured data remains on the user’s device and is never transmitted to external servers. ...
    Downloads: 2 This Week
  • 15
    Perl 5

    The Perl programming language

    This repository contains the reference implementation of the Perl 5 programming language, including the interpreter, core modules, build system, and an extensive test suite. Perl 5 is a multi-paradigm language renowned for powerful text processing, rich regular expressions, and pragmatic glue code across systems. The core distribution is highly portable, building on Unix, Linux, Windows, and many other platforms, with stable release cycles and careful back-compatibility. A C API (XS) and embedding APIs allow tight integration with native libraries and host applications, while the CPAN ecosystem supplies hundreds of thousands of reusable modules. ...
    Downloads: 5 This Week
  • 16
    Goverlay

    Goverlay is an easy graphical interface to configure MangoHud

    Goverlay is a graphical configuration tool designed to simplify and centralize the management of performance and visual enhancement utilities for Linux gaming environments. It provides an intuitive user interface that allows users to configure tools such as MangoHud for real-time performance monitoring, vkBasalt for post-processing effects, and OptiScaler for advanced upscaling techniques. By abstracting complex configuration files into a visual interface, it makes advanced system tuning accessible even to users who are not comfortable editing text-based configs. The software is particularly useful for gamers seeking to optimize frame rates, visualize system metrics, or enhance graphical fidelity without manually managing multiple tools. ...
    Downloads: 4 This Week
  • 17
    markdown-it

    Markdown parser, done right. 100% CommonMark support, extensions

    markdown-it is a fast and extensible JavaScript-based Markdown parser designed to convert Markdown text into HTML while maintaining strict compliance with the CommonMark specification and offering additional syntax enhancements. It is widely used in web applications, documentation tools, and content platforms due to its high performance and flexibility. The library is built with a rule-based parsing system that allows developers to customize or replace syntax rules, making it adaptable to a...
    Downloads: 0 This Week
  • 18
    TADA

    Open Source Speech Language Model

    TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...
    Downloads: 0 This Week
  • 19
    AI App Lab

    Implementing large models into scenario-based applications

    ...The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and multimodal capabilities such as text, image, and voice processing. The repository also contains a large collection of prototype applications that demonstrate how AI can be applied to scenarios such as customer service, education, content generation, and mobile automation. These examples allow developers to quickly replicate and customize solutions for their own business needs.
    Downloads: 0 This Week
  • 20
    SAG

    SQL-Driven RAG Engine

    SAG is an open-source SQL-driven retrieval-augmented generation engine that dynamically constructs knowledge graphs during query processing. Instead of relying on a static knowledge graph prepared in advance, the system automatically builds relational structures between entities while processing user queries. Documents are first decomposed into atomic semantic events, which are then represented using multidimensional natural language vectors. These vectors allow the system to identify relationships between concepts and construct a graph representation of knowledge at runtime. ...
    Downloads: 0 This Week
  • 21
    OCRBase

    MD/.JSON Document OCR and structured data extraction API

    ...It includes real-time job progress updates via WebSockets, which makes it easier to integrate into UIs, dashboards, or ingestion systems where users need feedback on long-running document processing.
    Downloads: 0 This Week
  • 22
    Botonic

    Build chatbots and conversational experiences using React

    Botonic is a full-stack JavaScript framework to create chatbots and modern conversational apps that work on multiple platforms: web, mobile, and messaging apps (Messenger, WhatsApp, Telegram, etc.). Building modern applications on top of messaging apps like WhatsApp or Messenger is much more than creating simple text-based chatbots. Botonic is a full-stack serverless framework that combines the power of React and TensorFlow.js to create amazing experiences at the intersection of text and...
    Downloads: 0 This Week
  • 23
    StoryGen Atelier

    AI-assisted storyboard and video generation tool

    StoryGen Atelier is an advanced creative tool that blends AI with visual storytelling, making it possible to generate fully structured storyboards and stitched videos from text prompts without requiring manual art or animation skills. Users begin with natural language descriptions of their story or scene, and the system uses state-of-the-art large models to generate both the script and corresponding frames. Once individual frames are created, a second AI model generates transition clips that smoothly link the frames into a coherent short video sequence, and the tool then assembles everything into a finished video using standard video processing tools. ...
    Downloads: 3 This Week
  • 24
    Pipecat

    Framework for building real-time voice and multimodal AI agents

    ...Developers can create a wide range of interactive systems including voice assistants, customer service agents, interactive storytelling applications, and multimodal interfaces that combine voice, video, images, and text. Its modular architecture allows components to be composed into pipelines that process audio, text, and video streams in real time.
    Downloads: 2 This Week
  • 25
    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 0 This Week