Search Results for "text processing" - Page 8

Showing 1569 open source projects for "text processing"

  • 1
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 0 This Week
  • 2
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...
    Downloads: 1 This Week
  • 3
    ChatGPT Exporter

    Export and Share your ChatGPT conversation history

    ChatGPT Exporter is a browser-based userscript tool designed to export ChatGPT conversations into multiple structured and shareable formats, enabling users to preserve, analyze, and reuse AI-generated content outside the ChatGPT interface. It integrates directly into the ChatGPT web environment, typically via tools like Tampermonkey, and adds export functionality without requiring backend services or complex setup. The tool supports a wide range of output formats including plain text, HTML,...
    Downloads: 3 This Week
  • 4
    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package.
    Downloads: 1 This Week
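The idea of a text progress bar described in the Python Progressbar entry can be sketched in a few lines. This is a minimal illustration that uses only the Python standard library and does not reproduce the progressbar2 API; the names `render_bar` and `run_with_progress` are hypothetical and chosen for this example only.

```python
import sys
import time


def render_bar(done: int, total: int, width: int = 30) -> str:
    """Return one line of a text progress bar (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))  # clamp so the bar never overflows
    filled = width * done // total   # number of '#' cells to draw
    percent = 100 * done // total
    return "[" + "#" * filled + "-" * (width - filled) + f"] {percent:3d}%"


def run_with_progress(items, work, stream=sys.stderr):
    """Apply work() to each item, redrawing the bar in place after each one."""
    items = list(items)
    results = []
    for index, item in enumerate(items, start=1):
        results.append(work(item))
        # '\r' moves the cursor back to the line start so the bar overwrites itself
        stream.write("\r" + render_bar(index, len(items)))
        stream.flush()
    stream.write("\n")
    return results


if __name__ == "__main__":
    # Simulate a long-running operation of 20 short steps.
    run_with_progress(range(20), lambda _: time.sleep(0.05))
```

Writing to standard error keeps the bar separate from any data the program prints to standard output, which is a common choice for this kind of visual cue.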
  • 5
    Scriberr

    Self-hosted AI audio transcription

    Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. ...
    Downloads: 2 This Week
  • 6
    Matter AI

    Matter AI is an open-source AI code reviewer agent

    Matter AI is an AI-powered platform designed to enhance productivity through automated content generation, data analysis, and decision support. It leverages machine learning models to process text, analyze patterns, and generate insights, making it suitable for businesses looking to optimize data-driven decision-making. Matter AI integrates with various data sources and provides customizable AI workflows tailored to different industries.
    Downloads: 0 This Week
  • 7
    SuperSocket

    Extensible socket server application framework for .NET

    ...The framework supports multiple protocols, including TCP, UDP, and WebSocket, and gives developers a modular architecture for protocol parsing, session handling, command processing, and server hosting. It is suitable for chat servers, game servers, IoT gateways, telemetry services, industrial systems, and other real-time network applications. SuperSocket is designed to be flexible enough for custom binary or text protocols while still offering reusable abstractions for common server patterns. It is most useful for .NET teams that need robust networking infrastructure with room for domain-specific protocol logic.
    Downloads: 1 This Week
  • 8
    Windrecorder

    Windrecorder is a memory search app that records everything

    Windrecorder is an open-source personal memory search engine that continuously records on-screen activity in a highly optimized and storage-efficient format. It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage...
    Downloads: 1 This Week
  • 9
    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    The Self-Operating Computer Framework is an innovative system that enables multimodal models to autonomously operate a computer by interpreting the screen and executing mouse and keyboard actions to achieve specified objectives. This framework is compatible with various multimodal models and currently integrates with GPT-4o, o1, Gemini Pro Vision, Claude 3, and LLaVa. Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen....
    Downloads: 8 This Week
  • 10
    Raycast Ollama

    Raycast extension for Ollama

    Raycast Ollama is an extension for Raycast that integrates Ollama-based large language models directly into the macOS productivity launcher environment. It allows users to interact with local AI models through Raycast commands, enabling quick access to chat, text generation, and other AI-powered tasks without leaving their workflow. The extension is designed to be lightweight and fast, aligning with Raycast’s philosophy of keyboard-driven productivity. It provides a seamless interface for...
    Downloads: 0 This Week
  • 11
    RuoYi AI

    Enterprise AI platform for building, deploying, and managing apps

    RuoYi AI is a full-stack enterprise-oriented AI development platform designed to help developers rapidly build, deploy, and manage intelligent applications using modern large language models and AI ecosystems. It provides a unified framework for integrating multiple AI models from different providers, allowing teams to switch or combine models through a consistent interface without vendor lock-in. RuoYi AI includes built-in support for retrieval-augmented generation, enabling organizations...
    Downloads: 3 This Week
  • 12
    LibPDF

    A modern PDF library for TypeScript

    ...The library offers full read and write manipulation, including support for encryption with RC4 and modern AES cipher suites, form filling and flattening, digital signature creation and verification, page merging/splitting, rich text extraction with layout information, and font embedding with subsetting.
    Downloads: 0 This Week
  • 13
    Model Zoo

    Please do not feed the models

    FluxML Model Zoo is a collection of demonstration models built with the Flux machine learning library in Julia. The repository provides ready-to-run implementations across multiple domains, including computer vision, natural language processing, and reinforcement learning. Each model is organized into its own project folder with pinned package versions, ensuring reproducibility and stability. The examples serve both as educational tools for learning Flux and as practical starting points for...
    Downloads: 0 This Week
  • 14
    Biome

    A toolchain for web projects, aimed to provide functionalities

    Biome formats and lints your code in a fraction of a second. Biome supports JavaScript, TypeScript, JSON, and CSS. It aims to support all main languages of modern web development. Biome has sane defaults and requires minimal configuration. Biome helps you as much as possible by displaying detailed and contextualized diagnostics. Biome unifies functionality that has previously been separate tools. Building upon a shared base allows us to provide a cohesive experience for processing code,...
    Downloads: 0 This Week
  • 15
    openwechat

    golang WeChat SDK

    ...Bypasses login restrictions, so there is no need to scan the QR code repeatedly to log in. Supports logging in to multiple WeChat accounts at the same time. Replies to messages and sends text, pictures, files, emoji, and other messages to designated targets (friends, groups). Also offers hot login (no repeated QR code scanning), custom message processing, file download, and protection against message recall. Retrieves contact information, sets friend notes, adds friends to groups, etc.
    Downloads: 0 This Week
  • 16
    XML Copy Editor
    XML Copy Editor is a fast, free, validating XML editor.
    Downloads: 712 This Week
  • 17
    syslog-ng

    Log management solution that improves the performance of SIEM

    syslog-ng is the log management solution that improves the performance of your SIEM solution by reducing the amount and improving the quality of data feeding your SIEM. With syslog-ng Store Box, you can find the answer. Search billions of logs in seconds using full text queries with Boolean operators to pinpoint critical logs. syslog-ng Store Box provides secure, tamper-proof storage and custom reporting to demonstrate compliance. syslog-ng can deliver data from a wide variety of sources to...
    Downloads: 1 This Week
  • 18
    Step-Audio 2

    Multi-modal large language model designed for audio understanding

    ...Moreover, Step-Audio2 supports tool-calling and retrieval-augmented generation (RAG), allowing it to access external knowledge sources or audio/text databases, thus reducing hallucinations and improving coherence in complex dialogues.
    Downloads: 0 This Week
  • 19
    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and...
    Downloads: 0 This Week
  • 20
    llms-from-scratch-cn

    Build a large language model from scratch with only a Python foundation

    llms-from-scratch-cn is an educational open-source project designed to teach developers how to build large language models step by step using practical code and conceptual explanations. The repository provides a hands-on learning path that begins with the fundamentals of natural language processing and gradually progresses toward implementing full GPT-style architectures from the ground up. Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding...
    Downloads: 1 This Week
  • 21
    Data Formulator

    Create rich visualizations with AI

    ...However, these systems do not work well for iterative visualization authoring, because they often require analysts to provide, in a single turn, a text-only prompt that fully describes the complex visualization task to be performed, which is unrealistic to both users and models in many cases. In this paper, we present Data Formulator 2, an LLM-powered visualization system to address these challenges.
    Downloads: 1 This Week
  • 22
    jsonrepair

    Repair invalid JSON documents

    ...It is especially useful in workflows involving AI-generated content, manually edited configuration files, or unreliable external APIs where malformed JSON frequently occurs. jsonrepair supports both browser and Node.js environments, making it suitable for client-side validation tools and backend processing pipelines alike. The project focuses on automation and fault tolerance, reducing the need for manual cleanup of corrupted JSON data. Its lightweight architecture and practical functionality have made it valuable for modern applications that process unpredictable structured text.
    Downloads: 0 This Week
  • 23
    RAG Web UI

    RAG Web UI is an intelligent dialogue system based on RAG

    ...It combines document retrieval with large language models to provide accurate, context-aware responses based on indexed data rather than generic model knowledge. The platform supports ingestion of multiple document formats, including PDFs, Word files, Markdown, and plain text, automatically processing and vectorizing them for efficient retrieval. It features a multi-turn conversational interface that maintains context across interactions, allowing users to engage in more natural and continuous dialogues with their data. The system is designed with a scalable architecture that separates frontend and backend components, enabling distributed deployment and efficient handling of large datasets. ...
    Downloads: 0 This Week
  • 24
    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Downloads: 20,583 This Week
  • 25
    Bowtie, an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers. Please cite: Langmead B, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
    Downloads: 382 This Week