Showing 153 open source projects for "structured text"

  • 1
    DFlash

    Block Diffusion for Ultra-Fast Speculative Decoding

    DFlash is an open-source framework for ultra-fast speculative decoding using a lightweight block diffusion model to draft text in parallel with a target large language model, dramatically improving inference speed without sacrificing generation quality. It acts as a “drafter” that proposes likely continuations which the main model then verifies, enabling significant throughput gains compared to traditional autoregressive decoding methods that generate token by token.
    Downloads: 0 This Week
  • 2
    A2UI

    A Protocol for Agent-Driven Interfaces

    ...A key design principle of A2UI is security, as it avoids executing arbitrary code generated by models and instead restricts output to structured data that maps to a predefined catalog of trusted UI components. The system also supports incremental updates, allowing agents to progressively modify the interface as a conversation evolves.
    Downloads: 0 This Week
  • 3
    yek

    Serialize repositories into LLM-ready context w/ smart prioritization

    Yek is a Rust-based CLI tool designed to serialize text-based files from a repository or directory into a single structured output for large language model use. It scans projects using .gitignore rules to exclude irrelevant files and automatically filters out binary or oversized content. Yek prioritizes files based on Git history, placing more important content later in the output to align with how language models process context.
    Downloads: 0 This Week
  • 4
    myGPTReader

    AI Slack bot for reading, summarizing, and chatting with content

    ...It enables users to quickly understand web pages, documents, and even video content by transforming them into interactive discussions rather than static reading experiences. myGPTReader supports a wide range of file formats, including eBooks, PDFs, and text-based documents, making it flexible for both casual and professional use cases. It also integrates voice interaction capabilities, allowing users to communicate with the system verbally and even use it as a language practice assistant. In addition to content reading, myGPTReader includes built-in prompt templates that enhance conversations and help users get more structured and relevant responses. ...
    Downloads: 0 This Week
  • 5
    Youtu-GraphRAG

    Vertically Unified Agents for Graph Retrieval-Augmented Reasoning

    ...The system combines knowledge graphs, retrieval mechanisms, and agent-based reasoning into a unified architecture designed to handle knowledge-intensive tasks. Instead of relying solely on text retrieval, the framework organizes information into structured graph schemas that represent entities, relationships, and attributes. These structures allow the system to perform multi-hop reasoning by decomposing complex questions into smaller queries that can be executed across different parts of the graph. The framework also incorporates hierarchical community detection algorithms that organize knowledge into clusters, improving both retrieval efficiency and reasoning performance. ...
    Downloads: 0 This Week
  • 6
    llms-from-scratch-cn

    Build a large language model from scratch with only a Python foundation

    llms-from-scratch-cn is an educational open-source project designed to teach developers how to build large language models step by step using practical code and conceptual explanations. The repository provides a hands-on learning path that begins with the fundamentals of natural language processing and gradually progresses toward implementing full GPT-style architectures from the ground up. Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding...
    Downloads: 0 This Week
  • 7
    Banana Slides

    A native AI PPT generation application based on nano banana pro

    Banana Slides is an open-source application designed to automatically generate presentation slides using artificial intelligence. Built on top of the Nano Banana Pro framework, the software enables users to transform simple prompts or outlines into complete slide decks without manually formatting content. Instead of relying on traditional slide editing workflows, the system allows users to describe the desired presentation in natural language and have the AI generate structured slides,...
    Downloads: 0 This Week
  • 8
    Dash Data Agent

    Self-learning data agent that grounds its answers in layers of context

    Dash is a self-learning data agent built by the Agno AI community that generates grounded answers to English queries over structured data by synthesizing SQL and reasoning based on six layers of context, improving automatically with each run. It sidesteps common limitations of simple text-to-SQL agents by incorporating multiple context layers — including schema structure, human annotations, known query patterns, institutional knowledge from docs, machine-discovered error patterns, and live runtime context — to generate SQL queries that are both technically correct and semantically meaningful. ...
    Downloads: 0 This Week
  • 9
    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    ...With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
    Downloads: 2 This Week
  • 10
    MCP UI

    SDK for building interactive UI components over MCP for AI tools

    ...It enables developers to create rich, dynamic UI components that can be delivered from an MCP server and rendered seamlessly by a compatible client. Instead of returning only text responses, tools can provide structured UI resources such as HTML or remote-rendered components, allowing more engaging and functional interactions. mcp-ui introduces a standardized approach where tools and their associated interfaces are linked through metadata, enabling clients to automatically discover and display the correct UI. ...
    Downloads: 0 This Week
  • 11
    Acontext

    Context data platform for building observable, self-learning AI agents

    Acontext is a cloud-native context data platform designed to support the development and operation of advanced AI agents. It provides a unified system to store and manage contexts, multimodal messages, artifacts, and task workflows, enabling developers to engineer context effectively for their agent products. The platform observes agent tasks and user feedback in real time, offering robust observability into workflows and helping teams understand how agents perform over time. Acontext also...
    Downloads: 0 This Week
  • 12
    Starter Applets

    Google AI Studio Starter Apps

    starter-applets is a collection of minimal, sandboxed example “applets” that demonstrate how to compose Gemini-powered microapps (chat widgets, image generation, workflows) that can be embedded in other applications or used standalone. The applets are structured with a focus on simplicity: each presents a prompt input, minimal UI logic, and inline display of the resulting output or widget (e.g. generated text, images). They are built to illustrate best practices (e.g. safety guards, prompt templates, streaming UI updates) rather than production feature sets. The repo supplies a CLI or script to scaffold new applet templates, letting developers spin up small Gemini-powered components quickly. ...
    Downloads: 0 This Week
  • 13
    Trae Agent

    LLM-based agent for general purpose software engineering tasks

    Trae Agent is an open-source, LLM-based agent system developed by ByteDance, focused primarily on automating software engineering workflows. It provides a command-line interface (CLI) that accepts natural-language instructions (e.g. “refactor this module,” “write a unit test,” “generate a REST API skeleton”) and then orchestrates tool-based workflows, such as file editing, shell/batch commands, code generation, and code formatting or refactoring, to carry out complex engineering tasks....
    Downloads: 1 This Week
  • 14
    Qwen2.5

    Open source large language model by Alibaba

    ...The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. The models are open-source under the Apache 2.0 license, with resources and documentation available on platforms like Hugging Face and ModelScope. ...
    Downloads: 33 This Week
  • 15
    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text. It is intentionally small and readable so developers can understand each stage of BPE, including the mechanics of pair counting, merge application, and vocabulary growth. ...
    Downloads: 0 This Week
  • 16
    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents. Built using FastAPI and the LangChain framework, the application exposes a REST API that can process documents and return structured outputs that match user-defined JSON schemas. ...
    Downloads: 0 This Week
  • 17
    Deep-Learning-Interview-Book

    Interview guide for machine learning, mathematics, and deep learning

    ...Many entries connect theory to implementation details, including how choices in activation, initialization, or normalization affect convergence and stability. The content is organized for fast review before an interview loop but is also deep enough for systematic study over weeks. Because it’s text-first and modular, it works equally well as a quick refresher or a backbone for a full study plan.
    Downloads: 0 This Week
  • 18
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 1 This Week
  • 19
    ModelFusion

    The TypeScript library for building AI applications

    ...The framework allows developers to integrate large language models and other generative systems into JavaScript and TypeScript applications through a consistent and standardized API. Instead of writing separate integration logic for each provider, developers can use ModelFusion to handle common operations such as text generation, structured object generation, streaming responses, and tool calls. The library supports a wide range of model types, including text generation models, vision models, text-to-speech engines, speech-to-text systems, and embedding models. It also includes built-in production features such as observability hooks, logging, automatic retries, and error handling mechanisms that improve reliability when deploying AI systems in real-world environments.
    Downloads: 0 This Week
  • 20
    PyTextRank

    Python implementation of TextRank algorithms

    PyTextRank is a Python implementation of TextRank as a spaCy pipeline extension, for graph-based natural language work -- and related knowledge graph practices.
    Downloads: 0 This Week
  • 21
    LexiFinder

    AI-powered semantic indexing: automating the creation of book indexes

    ...The index can be exported as plain text, JSON, CSV, or HTML.
    Downloads: 0 This Week
  • 22
    Wikipedia2Vec

    A tool for learning vector representations of words and entities

    Wikipedia2Vec is an embedding learning tool that creates word and entity vector representations from Wikipedia, enabling NLP models to leverage structured and contextual knowledge.
    Downloads: 0 This Week
  • 23
    Ailice

    AIlice is a fully autonomous, general-purpose AI agent

    AIlice is an open-source autonomous AI agent framework built to function as a general-purpose assistant that can plan, decompose, and execute complex tasks through a structured multi-agent architecture. The project presents itself as a standalone assistant powered by open-source language models, with an internal design that treats user requests almost like executable programs rather than simple chat prompts. Its core IACT architecture allows the system to break large goals into smaller...
    Downloads: 0 This Week
  • 24
    DB-GPT-Hub

    A repository that contains models, datasets, and fine-tuning

    DB-GPT-Hub is an open-source repository that provides datasets, models, and training tools designed to improve large language models for database interaction tasks, particularly Text-to-SQL. The project serves as a specialized extension of the broader DB-GPT ecosystem, focusing on the preparation and evaluation of models capable of translating natural language questions into structured database queries. It offers a modular framework that supports data preparation, model fine-tuning, benchmarking, and inference for Text-to-SQL systems. ...
    Downloads: 0 This Week
  • 25
    Promptify

    Use GPT or other prompt-based models to get structured output

    Promptify is an open-source Python library designed to simplify prompt engineering and the development of natural language processing pipelines using large language models. The project provides tools that help developers generate structured prompts for different NLP tasks and apply them across multiple generative AI systems. Instead of manually crafting prompts for each task, Promptify introduces a unified architecture that combines prompt templates, language model interfaces, and processing pipelines into a single framework. This approach allows developers to perform tasks such as text classification, named entity recognition, question answering, and information extraction using consistent prompt templates. ...
    Downloads: 0 This Week
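
The minbpe entry above describes byte-level BPE in terms of pair counting, merge application, and vocabulary growth. Those stages can be sketched roughly as follows. This is an illustrative sketch only, not minbpe's actual code: the function names (count_pairs, merge, train, encode, decode) and the tie-breaking behavior are assumptions chosen for the example.

```python
from collections import Counter


def count_pairs(ids):
    """Count how often each adjacent pair of token IDs occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}                                  # (id, id) -> new token ID
    vocab = {i: bytes([i]) for i in range(256)}  # base vocabulary: raw bytes
    for step in range(num_merges):
        pairs = count_pairs(ids)
        if not pairs:
            break  # nothing left to merge
        pair = max(pairs, key=pairs.get)         # most frequent adjacent pair
        new_id = 256 + step                      # vocabulary grows by one
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
        vocab[new_id] = vocab[pair[0]] + vocab[pair[1]]
    return merges, vocab


def encode(text, merges):
    """Encode text into token IDs by applying merges in the order learned."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        pairs = count_pairs(ids)
        # Choose the present pair whose merge rule was learned earliest.
        pair = min(pairs, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break  # no learned merge applies any more
        ids = merge(ids, pair, merges[pair])
    return ids


def decode(ids, vocab):
    """Map token IDs back to bytes and decode them as UTF-8 text."""
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = "aaabdaaabac"
    merges, vocab = train(sample, 3)
    tokens = encode(sample, merges)
    print(tokens, decode(tokens, vocab))
```

Because the base vocabulary is the 256 possible byte values, any input string can be encoded without an unknown-token fallback; each learned merge only adds one new ID on top of that base.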