Search Results for "text processing" - Page 8


Showing 1569 open source projects for "text processing"



1

LangExtract

A Python library for extracting structured information

LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...

Downloads: 0 This Week

Last Update: 1 day ago
2

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...

Downloads: 1 This Week

Last Update: 2026-01-27
3

ChatGPT Exporter

Export and Share your ChatGPT conversation history

ChatGPT Exporter is a browser-based userscript tool designed to export ChatGPT conversations into multiple structured and shareable formats, enabling users to preserve, analyze, and reuse AI-generated content outside the ChatGPT interface. It integrates directly into the ChatGPT web environment, typically via tools like Tampermonkey, and adds export functionality without requiring backend services or complex setup. The tool supports a wide range of output formats including plain text, HTML,...

Downloads: 3 This Week

Last Update: 2026-05-12
4

Python Progressbar

Progressbar 2 - A progress bar for Python 2 and Python 3

A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package.

Downloads: 1 This Week

Last Update: 2024-08-28
5

Scriberr

Self-hosted AI audio transcription

Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. ...

Downloads: 2 This Week

Last Update: 2026-03-19
6

Matter AI

Matter AI is an open-source AI code reviewer agent

Matter AI is an AI-powered platform designed to enhance productivity through automated content generation, data analysis, and decision support. It leverages machine learning models to process text, analyze patterns, and generate insights, making it suitable for businesses looking to optimize data-driven decision-making. Matter AI integrates with various data sources and provides customizable AI workflows tailored to different industries.

Downloads: 0 This Week

Last Update: 2025-06-29
7

SuperSocket

Extensible socket server application framework for .NET

...The framework supports multiple protocols, including TCP, UDP, and WebSocket, and gives developers a modular architecture for protocol parsing, session handling, command processing, and server hosting. It is suitable for chat servers, game servers, IoT gateways, telemetry services, industrial systems, and other real-time network applications. SuperSocket is designed to be flexible enough for custom binary or text protocols while still offering reusable abstractions for common server patterns. It is most useful for .NET teams that need robust networking infrastructure with room for domain-specific protocol logic.

Downloads: 1 This Week

Last Update: 3 days ago
8

Windrecorder

Windrecorder is a memory search app that records everything

Windrecorder is an open-source personal memory search engine that continuously records on-screen activity in a highly optimized and storage-efficient format. It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage...

Downloads: 1 This Week

Last Update: 2026-04-24
9

Self-Operating Computer

A framework to enable multimodal models to operate a computer

The Self-Operating Computer Framework is an innovative system that enables multimodal models to autonomously operate a computer by interpreting the screen and executing mouse and keyboard actions to achieve specified objectives. This framework is compatible with various multimodal models and currently integrates with GPT-4o, o1, Gemini Pro Vision, Claude 3, and LLaVa. Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen....

1 Review

Downloads: 8 This Week

Last Update: 2025-02-28
10

Raycast Ollama

Raycast extension for Ollama

Raycast Ollama is an extension for Raycast that integrates Ollama-based large language models directly into the macOS productivity launcher environment. It allows users to interact with local AI models through Raycast commands, enabling quick access to chat, text generation, and other AI-powered tasks without leaving their workflow. The extension is designed to be lightweight and fast, aligning with Raycast’s philosophy of keyboard-driven productivity. It provides a seamless interface for...

Downloads: 0 This Week

Last Update: 3 days ago
11

RuoYi AI

Enterprise AI platform for building, deploying, and managing apps

RuoYi AI is a full-stack enterprise-oriented AI development platform designed to help developers rapidly build, deploy, and manage intelligent applications using modern large language models and AI ecosystems. It provides a unified framework for integrating multiple AI models from different providers, allowing teams to switch or combine models through a consistent interface without vendor lock-in. RuoYi AI includes built-in support for retrieval-augmented generation, enabling organizations...

Downloads: 3 This Week

Last Update: 2026-04-13
12

LibPDF

A modern PDF library for TypeScript

...The library offers full read and write manipulation, including support for encryption with RC4 and modern AES cipher suites, form filling and flattening, digital signature creation and verification, page merging/splitting, rich text extraction with layout information, and font embedding with subsetting.

Downloads: 0 This Week

Last Update: 3 days ago
13

Model Zoo

Please do not feed the models

FluxML Model Zoo is a collection of demonstration models built with the Flux machine learning library in Julia. The repository provides ready-to-run implementations across multiple domains, including computer vision, natural language processing, and reinforcement learning. Each model is organized into its own project folder with pinned package versions, ensuring reproducibility and stability. The examples serve both as educational tools for learning Flux and as practical starting points for...

Downloads: 0 This Week

Last Update: 3 days ago
14

Biome

A toolchain for web projects, aimed to provide functionalities

Biome formats and lints your code in a fraction of a second. Biome supports JavaScript, TypeScript, JSON, and CSS. It aims to support all main languages of modern web development. Biome has sane defaults and requires minimal configuration. Biome helps you as much as possible by displaying detailed and contextualized diagnostics. Biome unifies functionality that has previously been separate tools. Building upon a shared base allows us to provide a cohesive experience for processing code,...

Downloads: 0 This Week

Last Update: 2026-05-09
15

openwechat

golang WeChat SDK

...Bypasses login restrictions, so there is no need to scan the QR code repeatedly to log in. Supports logging in to multiple WeChat accounts at the same time. Replies to messages and sends text, pictures, files, emoji, and other messages to designated contacts (friends, groups). Also supports hot login (no repeated QR-code scanning), custom message processing, file download, and message recall prevention. Can obtain contact information, set friend notes, add friends to groups, and more.

Downloads: 0 This Week

Last Update: 2024-12-21
16

XML Copy Editor

XML editor

XML Copy Editor is a fast, free, validating XML editor.

13 Reviews

Downloads: 712 This Week

Last Update: 2025-10-05
17

syslog-ng

Log management solution that improves the performance of SIEM

syslog-ng is the log management solution that improves the performance of your SIEM solution by reducing the amount and improving the quality of data feeding your SIEM. With syslog-ng Store Box, you can find the answer. Search billions of logs in seconds using full text queries with Boolean operators to pinpoint critical logs. syslog-ng Store Box provides secure, tamper-proof storage and custom reporting to demonstrate compliance. syslog-ng can deliver data from a wide variety of sources to...

Downloads: 1 This Week

Last Update: 2026-02-24
18

Step-Audio 2

Multi-modal large language model designed for audio understanding

...Moreover, Step-Audio2 supports tool-calling and retrieval-augmented generation (RAG), allowing it to access external knowledge sources or audio/text databases, thus reducing hallucinations and improving coherence in complex dialogues.

Downloads: 0 This Week

Last Update: 2026-03-16
19

Vidi2

Large Multimodal Models for Video Understanding and Editing

Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and...

Downloads: 0 This Week

Last Update: 2026-03-04
20

llms-from-scratch-cn

Build a large language model from scratch with only a Python foundation

llms-from-scratch-cn is an educational open-source project designed to teach developers how to build large language models step by step using practical code and conceptual explanations. The repository provides a hands-on learning path that begins with the fundamentals of natural language processing and gradually progresses toward implementing full GPT-style architectures from the ground up. Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding...

Downloads: 1 This Week

Last Update: 2026-03-26
21

Data Formulator

Create rich visualizations with AI

...However, these systems do not work well for iterative visualization authoring, because they often require analysts to provide, in a single turn, a text-only prompt that fully describes the complex visualization task to be performed, which is unrealistic to both users and models in many cases. In this paper, we present Data Formulator 2, an LLM-powered visualization system to address these challenges.

Downloads: 1 This Week

Last Update: 2026-05-12
22

jsonrepair

Repair invalid JSON documents

...It is especially useful in workflows involving AI-generated content, manually edited configuration files, or unreliable external APIs where malformed JSON frequently occurs. jsonrepair supports both browser and Node.js environments, making it suitable for client-side validation tools and backend processing pipelines alike. The project focuses on automation and fault tolerance, reducing the need for manual cleanup of corrupted JSON data. Its lightweight architecture and practical functionality have made it valuable for modern applications that process unpredictable structured text.

Downloads: 0 This Week

Last Update: 2026-05-10
23

RAG Web UI

RAG Web UI is an intelligent dialogue system based on RAG

...It combines document retrieval with large language models to provide accurate, context-aware responses based on indexed data rather than generic model knowledge. The platform supports ingestion of multiple document formats, including PDFs, Word files, Markdown, and plain text, automatically processing and vectorizing them for efficient retrieval. It features a multi-turn conversational interface that maintains context across interactions, allowing users to engage in more natural and continuous dialogues with their data. The system is designed with a scalable architecture that separates frontend and backend components, enabling distributed deployment and efficient handling of large datasets. ...

Downloads: 0 This Week

Last Update: 2026-04-06
24

Scribus

Powerful desktop publishing software

Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.

144 Reviews

Downloads: 20,583 This Week

Last Update: 6 days ago
25

Bowtie

Bowtie is an ultrafast, memory-efficient aligner for short DNA sequences (reads) from next-gen sequencers. Please cite: Langmead B, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.

27 Reviews

Downloads: 382 This Week

Last Update: 2026-03-07