Showing 289 open source projects for "text user interface"

View related business solutions
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 1
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    VideoChat is a real-time voice-interactive “digital human” system that combines automatic speech recognition, large language models, text-to-speech, and talking-head generation into a single conversational pipeline. It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    GenMedia Creative Studio

    GenMedia Creative Studio

    AI generative media user experience highlighting use of APIs

    GenMedia Creative Studio is a Google Cloud reference application for experimenting with generative media workflows on Vertex AI. It provides a user experience for working with models and APIs such as Gemini, Veo, Imagen, Gemini Image, Gemini TTS, Chirp 3, and Lyria. The project is built to showcase multimodal creation across text, image, video, speech, and music from one deployable interface. It is useful for creators, marketers, developers, and technical teams that want to prototype media-generation experiences using Google Cloud services. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Lingvo

    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    MAI-UI

    MAI-UI

    Real-World Centric Foundation GUI Agents

    MAI-UI is a cutting-edge open-source project that implements a family of foundation GUI (Graphical User Interface) agent models capable of interpreting natural language and performing real-world GUI navigation and control tasks across mobile and desktop environments. Developed by Tongyi-MAI (Alibaba’s research initiative), the MAI-UI models are multimodal agents trained to understand user instructions and corresponding screenshots, grounding those instructions to on-screen elements and generating sequences of GUI actions such as taps, swipes, text input, and system commands. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 5
    NeMo Curator

    NeMo Curator

    Scalable data pre processing and curation toolkit for LLMs

    NeMo Curator is a Python library specifically designed for fast and scalable dataset preparation and curation for large language model (LLM) use-cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT) and paramter-efficient fine-tuning (PEFT). It greatly accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings. The library provides a customizable and modular interface, simplifying pipeline...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    vim-ai

    vim-ai

    AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim

    ...The repository also highlights support for custom roles, vision features such as image-to-text, and an emerging provider-plugin model for extending compatibility further. A notable design point is that it only sends content the user explicitly selects or includes in prompts, which helps users control what is shared with the external model.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    AI Runner

    AI Runner

    Offline inference engine for art, real-time voice conversations

    AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Stanza

    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    MCP Server OpenDAL

    MCP Server OpenDAL

    Model Context Protocol Server for Apache OpenDAL™

    Model Context Protocol Server for Apache OpenDAL™ is an MCP server implementation that provides access to various storage services via Apache OpenDAL. It enables seamless interactions with multiple storage backends through a unified interface. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 10
    Hunyuan3D-1

    Hunyuan3D-1

    A Unified Framework for Text-to-3D and Image-to-3D Generation

    ...Community and ecosystem support (e.g. usage via Blender addon for geometry/texture). Integration into user-friendly tools/platforms.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    AG-UI

    AG-UI

    The Agent-User Interaction Protocol

    AG-UI is an open, lightweight protocol for connecting AI agents to user-facing applications through a standardized event-based interface. It is designed to make agent behavior visible, interactive, and controllable inside real-time front-end experiences. Instead of treating an AI agent as a black-box chat endpoint, AG-UI defines structured events for messages, tool calls, state changes, lifecycle updates, and user interactions.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    BettaFish

    BettaFish

    Public opinion analysis system

    BettaFish is an open-source, multi-agent public opinion analysis system built to automate the collection, deep analysis, and reporting of social media data at scale through conversational queries. It uses a modular architecture of specialized agents that collaborate to crawl mainstream platforms, extract multimodal content like text and short video, and synthesize insights through both statistical and large language model techniques. With a design that lets users pose questions in natural...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    LibrePhotos

    LibrePhotos

    A self-hosted open source photo management service

    LibrePhotos is an open-source self-hosted photo management platform designed to organize, browse, and analyze personal media libraries while preserving user privacy. The system allows individuals to store and manage their photos and videos locally rather than relying on commercial cloud services. It provides features similar to services like Google Photos but runs on a private server controlled by the user. The application includes AI-powered tools that automatically analyze images to detect faces, objects, and locations, allowing photos to be grouped and searched more efficiently. ...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 14
    AppAgent

    AppAgent

    Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

    ...AppAgent combines vision capabilities with language reasoning to understand interface elements and determine which actions are required to accomplish a task. The system also includes mechanisms for exploration and learning, allowing the agent to analyze user interface layouts and build structured knowledge about how different apps function.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    Vedana

    Vedana

    Open source multi-agent RAG over a knowledge graph

    Vedana is an open-source multi-agent RAG system built around a typed knowledge graph. It is designed for questions that require structure, completeness, and traceability instead of simple text similarity. The system lets agents navigate data step by step through Cypher queries, vector search, document lookup, and source verification. Its architecture combines a knowledge graph, pgvector-based embeddings, incremental ETL, and a backoffice interface for chat, metrics, prompt tuning, and data loading. It also includes JIMS, a framework for persistent conversational agents with typed events and pluggable pipelines. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    Search with Lepton

    Search with Lepton

    Lightweight demo to build a conversational AI search engine quickly

    Search with Lepton is an open source demonstration project that shows how to build a conversational search engine using the Lepton AI framework. It combines traditional web search with large language models to provide natural language answers to user queries. It retrieves information from supported search engines and uses that context to generate responses through a retrieval-augmented generation approach. The implementation is intentionally minimal, containing fewer than 500 lines of code while still providing a complete working example of an AI-powered search system. It includes both a backend service written in Python and a web interface that allows users to interact with the search engine in a conversational format. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Cactus

    Cactus

    Low-latency AI inference engine optimized for mobile devices

    ...Cactus emphasizes efficient memory usage through techniques such as zero-copy computation graphs and quantized model formats, allowing large models to run within the constraints of mobile hardware. It supports a wide range of AI tasks including text generation, speech-to-text, vision processing, and retrieval-augmented workflows through a unified API interface. A notable feature of Cactus is its hybrid execution model, which can dynamically route tasks between on-device processing and cloud services when additional compute is required.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Gitingest

    Gitingest

    Create prompt-friendly codebase digests from any Git repository URL

    Gitingest is a developer utility that converts an entire Git repository into a structured, prompt-friendly text digest suitable for use with large language models. It analyzes a repository and produces a consolidated textual representation that includes the file structure and code content in an organized format. This makes it easier to provide meaningful code context when working with AI systems that require compact, readable inputs. Developers can generate these digests from either a local...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    PaperBanana

    PaperBanana

    Extension of Google Research’s PaperBanana

    PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Pixeltable

    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    ...Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    OmAgent

    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker coordination, and node optimization behind the scenes. Its architecture uses a graph-based workflow engine where tasks are represented as nodes in a directed workflow, enabling modular composition of complex reasoning pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    LLM TLDR

    LLM TLDR

    95% token savings. 155x faster queries. 16 languages

    LLM TLDR is a tool that leverages large language models (LLMs) to generate concise, coherent summaries (TL;DRs) of long documents, articles, or text files, helping users quickly understand large amounts of content without reading every word. It integrates with LLM APIs to handle input texts of varying lengths and complexity, applying techniques like chunking, context management, and multi-pass summarization to preserve accuracy even when the source is very large. The system supports both...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Argilla

    Argilla

    The open-source data curation platform for LLMs

    Argilla is a production-ready framework for building and improving datasets for NLP projects. Deploy your own Argilla Server on Spaces with a few clicks. Use embeddings to find the most similar records with the UI. This feature uses vector search combined with traditional search (keyword and filter based). Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can use and combine your preferred...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    FAY

    FAY

    Framework for building AI-powered interactive digital humans and agent

    Fay is an open source framework designed to build and deploy interactive digital humans powered by large language models. It acts as a middleware layer that connects digital character technologies with conversational AI systems and business applications. Fay supports various types of digital humans, including 2.5D and 3D avatars, and can be integrated with applications running on mobile devices, PCs, web platforms, and embedded systems. Its architecture allows developers to combine different...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Clay Foundation Model

    Clay Foundation Model

    The Clay Foundation Model - An open source AI model and interface

    The Clay Foundation Model is an open-source AI model and interface designed to provide comprehensive data and insights about Earth. It aims to serve as a foundational tool for environmental monitoring, research, and decision-making by integrating various data sources and offering an accessible platform for analysis.
    Downloads: 1 This Week
    Last Update:
    See Project
Auth0 Logo