Search Results for "structured text" - Page 4

Showing 332 open source projects for "structured text"

  • 1
    Chroma

    A general purpose syntax highlighter in pure Go

    As Chroma has just been released, its API is still in flux. That said, the high-level interface should not change significantly. Chroma takes source code and other structured text and converts it into syntax-highlighted HTML, ANSI-coloured text, etc. Chroma is based heavily on Pygments and includes translators for Pygments lexers and styles. Supported languages include ABAP, ABNF, ActionScript, ActionScript 3, Ada, Angular2, ANTLR, ApacheConf, APL, AppleScript, Arduino, Awk, ... PacmanConf, Perl, PHP, PHTML, Pig, PkgConfig, PL/pgSQL, plaintext, Pony, PostgreSQL SQL dialect, PostScript, POVRay, PowerShell, Prolog, PromQL, Properties, Protocol Buffer, PSL, Puppet, Python 2, Python, ...
    Downloads: 3 This Week
  • 2
    Kor

    LLM

    This is a half-baked prototype that “helps” you extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples. Kor will generate a prompt, send it to the specified LLM and parse out the output. You might even get results back.
    Downloads: 0 This Week
  • 3
    JSON Editor

    A web-based tool to view, edit, format, and validate JSON

    JSON Editor is a web-based JSON editing and visualization tool designed for viewing, editing, formatting, validating, and transforming JSON documents in multiple interactive modes. The project provides several editing interfaces including tree view, code editor, form-based editing, and plain text modes, allowing users to work with structured data in the format most suitable for their workflow. It can be embedded directly into web applications as a reusable component and supports large JSON documents with schema validation and formatting capabilities. JSON Editor emphasizes usability by combining developer-focused functionality with accessible visual editing tools for non-technical users. ...
    Downloads: 0 This Week
  • 4
    Deep Chat

    Customizable AI chat component for websites with API support

    Deep Chat is a highly customizable web component designed to simplify the integration of AI-powered chat interfaces into websites. It allows developers to embed a fully functional chatbot using minimal setup, while still offering extensive control over behavior, appearance, and integrations. Deep Chat supports connections to a wide range of AI services as well as custom backends, enabling flexible deployment for different use cases. It is built as a framework-agnostic solution, meaning it...
    Downloads: 0 This Week
  • 5
    Databend

    Cloud-native open source data warehouse for analytics and AI queries

    Databend is an open source cloud-native data warehouse designed for large-scale analytics and modern data workloads. Built in Rust, the system focuses on high performance, scalability, and efficient data processing for analytical queries. It is designed with a separation of compute and storage, allowing compute nodes to scale independently while storing data in object storage systems. This architecture enables cost-efficient storage and elastic scaling for workloads that involve large...
    Downloads: 0 This Week
  • 6
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite. But beyond manual editing, it also offers a programmable layer so developers can write scripts to batch process documents, generate templated reports, or extract structured data from PDFs for integration in workflows. ...
    Downloads: 16 This Week
  • 7
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. The model’s multimodal capabilities allow it to reason across image and text content holistically, capturing structured and unstructured information from pages that include dense tables, seals, code snippets, and varied document graphics. ...
    Downloads: 4 This Week
  • 8
    cloudflare-speed-cli

    CLI for internet speed tests via Cloudflare

    ...In addition to TUI mode, it supports headless text or JSON output for pipelines and monitoring systems.
    Downloads: 4 This Week
  • 9
    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered Python library for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets and to make data analysis, monitoring, and sensitive data detection easy. With a single command, the library automatically formats and loads files into a DataFrame. When profiling the data, it identifies the schema, statistics, entities (PII / NPI), and...
    Downloads: 0 This Week
  • 10
    Hollama

    A minimal LLM chat app that runs entirely in your browser

    ...Because the application runs as a static web interface, it does not require complex backend infrastructure and can be easily deployed or self-hosted. Hollama supports both text-based and multimodal interactions, allowing users to work with models that process images as well as text. The interface includes features for editing prompts, retrying responses, copying generated code snippets, and storing conversation history locally within the browser. Mathematical expressions can be rendered using KaTeX, and Markdown formatting allows code blocks and structured outputs to appear clearly within conversations.
    Downloads: 3 This Week
  • 11
    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is...
    Downloads: 2 This Week
  • 12
    GraphRAG

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    The GraphRAG project is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.
    Downloads: 0 This Week
  • 13
    PPT Builder Skill

    AI-friendly PPT builder skill: 17 hand-polished Chinese PPTX templates

    PPT Builder Skill is an AI-friendly PowerPoint builder skill designed to create editable native PPTX presentations from structured content. It includes polished Chinese presentation templates and uses python-pptx-based tools to preserve layout while applying controlled text edits. The skill supports workflows where an agent selects a template, writes an edits file, and produces a real PowerPoint file with the original design intact. It can also work with user-provided templates by inspecting slide images and shape structures before applying non-destructive changes. ...
    Downloads: 2 This Week
  • 14
    news-please

    Python tool for crawling and extracting structured data from news sites

    news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a site. It combines several established technologies and libraries to perform web crawling and content extraction, enabling reliable processing across a wide range of news sources. ...
    Downloads: 0 This Week
  • 15
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    ...It achieves state-of-the-art performance on document parsing benchmarks while maintaining a relatively compact model size, demonstrating efficiency without sacrificing accuracy. Beyond standard OCR tasks, it extends its capabilities to parse complex visual elements such as charts, diagrams, and web interfaces, converting them into structured outputs like SVG code.
    Downloads: 1 This Week
  • 16
    Mirascope

    LLM abstractions that aren't obstructions

    Mirascope is a powerful, flexible, and user-friendly library that simplifies the process of working with LLMs through a unified interface that works across various supported providers, including OpenAI, Anthropic, Mistral, Gemini, Groq, Cohere, LiteLLM, Azure AI, Vertex AI, and Bedrock. Whether you're generating text, extracting structured information, or developing complex AI-driven agent systems, Mirascope provides the tools you need to streamline your development process and create powerful, robust applications.
    Downloads: 0 This Week
  • 17
    Dnote

    A simple command line notebook

    Dnote is a personal knowledge management tool focused on easily capturing, organizing, and retrieving technical notes such as code snippets, terminal commands, and short documentation, all while keeping content accessible and searchable. It is designed around fast, distraction-free workflows that let users jot down notes quickly from the terminal or a web interface, ensuring that insights and solutions are captured at the moment they occur. With structured tagging and hierarchical...
    Downloads: 0 This Week
  • 18
    files-to-prompt

    Concatenate a directory full of files into a single prompt

    ...It includes rich filtering controls, letting you limit by extension, include or skip hidden files, and ignore paths that match glob patterns or .gitignore rules. The output format is flexible: you can emit plain text, Markdown with fenced code blocks, or a Claude-XML style format designed for structured multi-file prompts. It can read file paths from stdin (including NUL-separated paths), which makes it easy to combine with find, rg, or other shell tools.
    Downloads: 1 This Week
  • 19
    AudioNotes

    Extract audio and video content and organize it into a Markdown note

    AudioNotes is an application (or proof-of-concept) that likely combines audio recording or playback with note-taking or annotation functionality — enabling users to record voice or audio and attach textual or timestamped notes, making it ideal for lectures, interviews, meetings, or personal memos. Such a tool offers a more expressive and flexible way to capture and revisit information: instead of just typed notes or raw audio, users get both audio context and structured notes. As an...
    Downloads: 0 This Week
  • 20
    Whisper-WebUI

    A web UI for easy subtitle generation using the Whisper model

    ...Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.
    Downloads: 3 This Week
  • 21
    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling multimodal content in ways that preserve cross-modal relationships and semantic context, often treating content elements as interconnected knowledge entities rather than separate data silos. ...
    Downloads: 0 This Week
  • 22
    QMD

    Mini CLI search engine for your docs, knowledge bases, etc.

    QMD is a powerful and lightweight command-line tool that acts as an on-device search engine for your personal knowledge base, allowing you to index and search files like Markdown notes, meeting transcripts, technical documentation, and other text collections without depending on cloud services. Designed to keep all search activity local, it combines classic full-text search techniques with modern semantic features such as vector similarity and hybrid ranking so that queries return not just...
    Downloads: 0 This Week
  • 23
    Evidently

    Evaluate and monitor ML models from validation to production

    Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular data, text data, and embeddings.
    Downloads: 0 This Week
  • 24
    visual-explainer

    Agent skill + prompt templates that generate rich HTML pages

    ...The project includes prompt templates and automation logic that enable coding agents to generate visual summaries such as diff reviews, architecture overviews, plan audits, and structured data tables. Its primary goal is to bridge the readability gap between raw machine output and stakeholder-friendly documentation. By producing styled web pages instead of plain text logs, visual-explainer improves communication in engineering and AI workflows where clarity is critical. The tool is particularly useful in environments that rely on autonomous agents or CI pipelines that generate dense technical output.
    Downloads: 12 This Week
  • 25
    DocETL

    A system for agentic LLM-powered data processing and ETL

    DocETL is an open-source system designed to build and execute data processing pipelines powered by large language models, particularly for analyzing complex collections of documents and unstructured datasets. The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data. Instead of relying on single prompts or ad-hoc scripts, DocETL provides a declarative pipeline framework that breaks complex document analysis tasks into manageable operations that can be optimized and orchestrated automatically. ...
    Downloads: 1 This Week