Showing 27 open source projects for "gemini"

  • 1
    Gemini-API

    Reverse-engineered Python API for Google Gemini web app

    Gemini-API is a community-created asynchronous Python wrapper for the web interface of Google’s Gemini models (formerly Bard). It is the result of reverse-engineering the Gemini web app and exposing its functionality through a programmatic API. This enables developers to incorporate Gemini into Python applications, scripts, bots, or tools without relying solely on official SDKs.
    Downloads: 13 This Week
  • 2
    Gemini Fullstack LangGraph Quickstart

    Get started with building Fullstack Agents using Gemini 2.5 & LangGraph

    gemini-fullstack-langgraph-quickstart is a fullstack reference application from Google DeepMind’s Gemini team that demonstrates how to build a research-augmented conversational AI system using LangGraph and Google Gemini models. The project features a React (Vite) frontend and a LangGraph/FastAPI backend designed to work together seamlessly for real-time research and reasoning tasks.
    Downloads: 1 This Week
  • 3
    Claude Code Bridge

    Real-time multi-AI collaboration: Claude, Codex & Gemini

    Claude Code Bridge is an open-source command-line tool designed to enable real-time collaboration between multiple AI coding assistants within a unified development environment. The system allows developers to coordinate interactions between models such as Claude, Codex, and Gemini so that they can work together on programming tasks. By maintaining persistent shared context between these models, the tool reduces redundant prompts and minimizes token usage while allowing each AI system to contribute specialized capabilities. The architecture functions as a unified launcher that manages communication between multiple AI providers and coordinates their responses within the same development session. ...
    Downloads: 3 This Week
  • 4
    Portia SDK Python

    Portia Labs Python SDK for building agentic workflows

    ...It supports tool-backed agents capable of real-world interactions—like web browsing, API access, and human-in-the-loop clarifications—while maintaining transparency and auditability through structured plans and execution hooks. Designed for production environments, the SDK integrates with local or cloud LLMs (e.g. OpenAI, Anthropic, Mistral, Gemini) and supports extensive tool registries, session handling, and memory management.
    Downloads: 0 This Week
  • 5
    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    The Self-Operating Computer Framework is an innovative system that enables multimodal models to autonomously operate a computer by interpreting the screen and executing mouse and keyboard actions to achieve specified objectives. This framework is compatible with various multimodal models and currently integrates with GPT-4o, o1, Gemini Pro Vision, Claude 3, and LLaVA. Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen. The framework supports features like Optical Character Recognition (OCR) and Set-of-Mark (SoM) prompting to enhance visual grounding capabilities. It is designed to be compatible with macOS, Windows, and Linux (with X server installed), and is released under the MIT license.
    Downloads: 7 This Week
  • 6
    Agent Stack

    Deploy and share agents with open infrastructure

    ...The platform supports agents built in frameworks like LangChain, CrewAI, etc., enabling them to be hosted, managed and shared through a unified interface. It also offers multi-model, multi-provider support (OpenAI, Anthropic, Gemini, IBM WatsonX, Ollama etc.), letting users compare performance and cost across models. For developers and organizations building AI-agent products or automations, Agent Stack gives a scaffold that handles the “plumbing”, so they can focus on logic and domain.
    Downloads: 3 This Week
  • 7
    Generative AI

    Sample code and notebooks for Generative AI on Google Cloud

    Generative AI is a comprehensive collection of code samples, notebooks, and demo applications designed to help developers build generative-AI workflows on the Vertex AI platform. It spans multiple modalities—text, image, audio, search (RAG/grounding) and more—showing how to integrate foundation models like the Gemini family into cloud projects. The README emphasises getting started with prompts, datasets, environments and sample apps, making it ideal for both experimentation and production-ready usage. The repository architecture is organised into folders like gemini/, search/, vision/, audio/, and rag-grounding/, which helps developers locate use cases by modality. ...
    Downloads: 1 This Week
  • 8
    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio and video latency while processing frames and generating responses. Developers work with an agent abstraction that connects video edge providers, LLMs, and processors into pipelines, making it easier to orchestrate tasks like object detection, pose estimation, and conversational guidance. ...
    Downloads: 2 This Week
  • 9
    PaperBanana

    Extension of Google Research’s PaperBanana

    ...PaperBanana integrates modern multimodal AI models capable of interpreting instructions and producing graphics that follow academic conventions. The framework supports multiple AI providers including OpenAI, Azure OpenAI services, and Google Gemini, allowing users to run the system with different model backends.
    Downloads: 0 This Week
  • 10
    GPT4Free

    The official gpt4free repository

    gpt4free is an open-source project offering free, unrestricted access to GPT‑4–style language models without requiring an API key. The repository includes scripts and server implementations designed to replicate OpenAI’s GPT‑4 API behavior by leveraging publicly available or self-hosted models. It’s licensed under GPL‑v3.
    Downloads: 7 This Week
  • 11
    Agent S

    Agent S: an open agentic framework that uses computers like a human

    ...Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.
    Downloads: 4 This Week
  • 12
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    ...Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, and Elasticsearch and more. Deploy via Docker or Kubernetes with full data sovereignty. Build embeddable chat and search widgets, automate multi-step workflows with AI agents, and integrate via Slack, Telegram, Discord, or REST API. Enterprise features include RBAC, 99.9% uptime SLA, and dedicated support. ...
    Downloads: 4 This Week
  • 13
    Loki Mode

    Multi-agent autonomous startup system for Claude Code

    ...It orchestrates dozens of agent types across swarms that handle designated roles — such as architecture, coding, QA, deployment, and business workflows — running in parallel to cover both engineering and operational tasks without continuous human intervention. By supporting multiple AI providers (like Claude Code, OpenAI Codex CLI, and Google Gemini CLI), loki-mode dynamically selects and spawns only the needed agents for a given project, optimizing computational resources and task throughput. Its Reason-Act-Reflect-Verify (RARV) cycle with self-verification loops emphasizes quality and resilience, automating end-to-end development lifecycles.
    Downloads: 7 This Week
  • 14
    Klavis AI

    MCP integration platforms for AI agents to use tools at any scale

    ...The flagship product Strata solves tool overload through progressive discovery, achieving +13% higher accuracy and 83%+ success on complex workflows. Developers can integrate via Python/TypeScript SDKs or REST API, with support for OpenAI, Claude, Gemini, LangChain, LlamaIndex, and CrewAI. Features include built-in authentication, multi-tenancy, hosted servers, Docker support, and enterprise security guardrails. Licensed under Apache 2.0, Klavis simplifies AI development by eliminating complex authentication management and enabling seamless workflow automation across multiple applications.
    Downloads: 0 This Week
  • 15
    Mirascope

    LLM abstractions that aren't obstructions

    Mirascope is a powerful, flexible, and user-friendly library that simplifies the process of working with LLMs through a unified interface that works across various supported providers, including OpenAI, Anthropic, Mistral, Gemini, Groq, Cohere, LiteLLM, Azure AI, Vertex AI, and Bedrock. Whether you're generating text, extracting structured information, or developing complex AI-driven agent systems, Mirascope provides the tools you need to streamline your development process and create powerful, robust applications.
    Downloads: 0 This Week
  • 16
    Open Interface

    Control Any Computer Using LLMs

    Open Interface is a cross-platform application that allows users to control their computers using large language models (LLMs). By sending user requests to an LLM backend, it determines the necessary steps and executes them by simulating keyboard and mouse inputs. The system can adjust its actions based on real-time feedback, providing a self-driving computer experience.
    Downloads: 0 This Week
  • 17
    TalkingHeads

    A library to communicate with ChatGPT, Claude, Copilot, Gemini

    TalkingHeads is a Python library designed to facilitate communication with various AI chat agents, including ChatGPT, Claude, Copilot, Gemini, HuggingChat, and Pi. It provides a unified interface for interacting with these platforms, simplifying the integration of conversational AI capabilities into applications. TalkingHeads supports browser automation and offers tools to manage sessions, handle prompts, and process responses effectively.
    Downloads: 0 This Week
  • 18
    macai

    All-in-one native macOS AI chat application

    ...It is built specifically for macOS using Swift and SwiftUI, delivering a lightweight and responsive experience that integrates seamlessly with the operating system. The app supports a wide range of providers, including OpenAI, Anthropic, Google Gemini, xAI, Perplexity, and Ollama, allowing users to switch between local and cloud-based models without changing tools. It includes advanced features such as multimodal capabilities, image generation, search integration, and reasoning workflows, making it more than just a simple chat client. The application also emphasizes privacy by avoiding telemetry and offering optional iCloud synchronization for cross-device continuity. ...
    Downloads: 0 This Week
  • 19
    Hugging Face Skills

    Definitions for AI/ML tasks like dataset creation

    ...Each skill is a self-contained folder with structured metadata and guidance that tells an agent how to execute tasks such as dataset creation, model training, evaluation, or Hub operations. The project is designed to be interoperable across major agent ecosystems, including Claude Code, OpenAI Codex, Gemini CLI, and Cursor, making it a cross-platform building block for agent automation. By formalizing best practices and workflows, Skills helps transform general-purpose coding agents into domain-aware assistants that can execute complex ML pipelines with less manual prompting. The repository also includes ready-to-use skills for common Hugging Face operations and encourages teams to extend them with custom domain logic.
    Downloads: 0 This Week
  • 20
    Gemma

    Gemma open-weight LLM library, from Google DeepMind

    Gemma, developed by Google DeepMind, is a family of open-weights large language models (LLMs) built upon the research and technology behind Gemini. This repository provides the official implementation of the Gemma PyPI package, a JAX-based library that enables users to load, interact with, and fine-tune Gemma models. The framework supports both text and multi-modal input, allowing natural language conversations that incorporate visual content such as images. It includes APIs for conversational sampling, parameter management, and integration with fine-tuning methods like LoRA. ...
    Downloads: 1 This Week
  • 21
    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    ...It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.
    Downloads: 4 This Week
  • 22
    Browser Use

    Make websites accessible for AI agents

    ...It enables developers and AI systems to perform complex online tasks such as form filling, data extraction, and navigation through natural language instructions. Built with Python and compatible with modern LLMs, it integrates seamlessly with tools like ChatBrowserUse, Google Gemini, and Anthropic models. The platform supports both open-source deployment and a fully hosted cloud version for enhanced scalability and performance. Its cloud offering includes advanced capabilities like stealth browsing, CAPTCHA solving, and proxy rotation for reliable automation. Overall, Browser Use transforms web interaction into an intelligent, programmable workflow driven by AI agents.
    Downloads: 0 This Week
  • 23
    GNNPCSAFT Chat

    Chatbot with GNNPCSAFT

    GNNPCSAFT Chat is an implementation of our project, which uses Graph Neural Networks (GNNs) to estimate the pure-component parameters of the PC-SAFT equation of state. We developed this app so the scientific community can access the model's results easily. In the app, you can chat with LLMs (via Gemini or Ollama) equipped with GNNPCSAFT tools, allowing you to ask questions about the PC-SAFT parameters of various compounds, predict thermodynamic properties, and get insights into GNNPCSAFT's performance.
    Downloads: 3 This Week
  • 24
    CodinIT.dev

    Free, local, open-source AI app builder

    ...Deep Supabase integration means you can create UI and backend logic in one cohesive environment, while the model-agnostic architecture lets you connect to any AI, whether cloud-based (Gemini 3 Pro, GPT-5, Claude Sonnet 4.5) or local via Ollama, so you’re never locked in. All source code remains on your device and integrates seamlessly with your preferred IDE. A natural-language API enables powerful data queries and updates, automating tasks without leaving the chat interface. By running entirely locally, CodinIT.dev delivers maximum privacy, minimal latency, and smooth developer experiences free from cloud-based inconsistencies.
    Downloads: 1 This Week
  • 25
    Bot on Anything

    Large model-based chatbot builder that can quickly integrate AI models

    Bot on Anything is a versatile open-source AI chatbot builder that lets developers connect large language models such as ChatGPT, Claude, and Gemini to virtually any messaging platform, website, or interface with minimal configuration. At its heart, the project abstracts away the glue logic between AI model APIs and disparate application “channels,” enabling the same bot logic to run in Slack, Telegram, Gmail, enterprise tools, web UIs, or command-line terminals. Configuration is handled simply through a central JSON file where you define which model and which application channel you want to glue together, so developers can create sophisticated AI assistants without rewriting integration code from scratch. ...
    Downloads: 0 This Week