Showing 25 open source projects for "data base"

  • 1
    DeiT (Data-efficient Image Transformers)
    ...The project provides compact ViT variants (Tiny/Small/Base) that achieve excellent accuracy–throughput trade-offs, making transformers practical beyond massive pretraining regimes. Training involves carefully tuned augmentations, regularization, and optimization schedules to stabilize learning and improve sample efficiency. The repo offers pretrained checkpoints, reference scripts, and ablation studies that clarify which ingredients matter most for data-efficient ViT training.
    Downloads: 0 This Week
  • 2
    Airweave

    Airweave lets agents search any app

    Airweave is an open-source platform that enables agents to semantically search across various applications, databases, and APIs. By transforming disparate data sources into a unified, searchable knowledge base, Airweave facilitates intelligent information retrieval through REST APIs or the MCP protocol. It's particularly useful for building AI agents that require access to structured and unstructured data across multiple platforms.
    Downloads: 5 This Week
  • 3
    Airtable MCP

    Airtable integration for AI-powered applications

    Airtable MCP is an integration tool that enables AI-powered applications to access and manipulate Airtable databases directly from the IDE using Anthropic's Model Context Protocol (MCP). It allows querying, creating, updating, and deleting records using natural language, facilitating seamless data management.
    Downloads: 1 This Week
  • 4
    ChatGLM2-6B

    ChatGLM2-6B: An Open Bilingual Chat LLM

    ChatGLM2-6B is the second-gen Chinese-English conversational LLM from ZhipuAI/Tsinghua. It upgrades the base model with GLM’s hybrid pretraining objective, 1.4T bilingual tokens, and preference alignment, delivering big gains on MMLU, C-Eval, GSM8K, and BBH. The context window extends up to 32K (FlashAttention), and Multi-Query Attention improves speed and memory use. The repo includes Python APIs, CLI and web demos, OpenAI-style/FastAPI servers, and quantized checkpoints for lightweight local deployment on GPUs or CPU/MPS.
    Downloads: 5 This Week
  • 5
    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 2 This Week
  • 6
    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. ...
    Downloads: 7 This Week
  • 7
    ChatTTS

    A generative speech model for daily dialogue

    ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.
    Downloads: 4 This Week
  • 8
    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation across base and target domains to measure how well the model retains its general knowledge while specializing as needed. It includes utilities to fine-tune vision-language embeddings, compute prompt or adapter updates, and benchmark across transfer and retention metrics. ...
    Downloads: 0 This Week
  • 9
    OpenDAN

    OpenDAN is an open source Personal AI OS

    OpenDAN is an open-source Personal AI OS that consolidates various AI modules in one place for your personal use. The goal of OpenDAN (Open and Do Anything Now with AI) is to create a Personal AI OS that provides a runtime environment for various AI modules as well as protocols for interoperability between them. With OpenDAN, users can securely collaborate with various AI modules using their private data to create powerful personal AI agents, such as butlers, lawyers, doctors, teachers,...
    Downloads: 4 This Week
  • 10
    AutoKeras

    AutoML library for deep learning

    ...AutoKeras would search for the best detailed configuration for you. Moreover, you can override the base classes to create your own block.
    Downloads: 0 This Week
  • 11
    Mistral Finetune

    Memory-efficient and performant finetuning of Mistral's models

    ...It builds on techniques like LoRA (Low-Rank Adaptation) to allow customizing models without full parameter updates, which reduces GPU memory footprint and training cost. The repo includes utilities for data preprocessing (e.g. reformat_data.py), validation scripts, and example YAML configs for training variants like 7B base or instruct models. It supports function-calling style datasets (via "messages" keys) as well as plain text formats, with guidelines on formatting, tokenization, and vocabulary extension (e.g. extending vocab to 32768 for some models) before finetuning. ...
    Downloads: 0 This Week
  • 12
    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as...
    Downloads: 5 This Week
  • 13
    MetaVoice-1B

    Foundational model for human-like, expressive TTS

    MetaVoice — in the form of its source repository “metavoice-src” — is a large-scale text-to-speech (TTS) model. Specifically, the base model (MetaVoice-1B) uses around 1.2 billion parameters and has been trained on a massive dataset — reportedly around 100,000 hours of speech data. The goal is to provide human-like, expressive, and flexible TTS: able to generate natural-sounding speech that can handle diverse inputs and likely generalize over voice styles, intonation, prosody, and perhaps multiple languages or accents. ...
    Downloads: 0 This Week
  • 14
    LLM Foundry

    LLM training code for MosaicML foundation models

    Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Large language models (LLMs) are changing the world, but for those outside well-resourced industry labs, it can be extremely difficult to train and deploy...
    Downloads: 3 This Week
  • 15
    'Lightweight' GAN

    Implementation of 'lightweight' GAN, proposed in ICLR 2021

    Implementation of 'lightweight' GAN, proposed in ICLR 2021, in PyTorch. The main contribution of the paper is a skip-layer excitation in the generator, paired with autoencoding self-supervised learning in the discriminator. Quoting the one-line summary "converge on single gpu with few hours' training, on 1024 resolution sub-hundred images". Augmentation is essential for Lightweight GAN to work effectively in a low-data setting. You can test and see how your images will be augmented before...
    Downloads: 0 This Week
  • 16
    Stanza

    Stanford NLP Python library for many human languages

    ...Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation with your own annotated data.
    Downloads: 1 This Week
  • 17
    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...
    Downloads: 0 This Week
  • 18
    shuyuan

    Reading book source

    shuyuan is a project oriented around reading and knowledge consumption, especially targeting large-scale text content such as books, articles, or educational material. The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some...
    Downloads: 0 This Week
  • 19
    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project builds on Llama-2, the commercially usable large model released by Meta, and is the second phase of the Chinese LLaMA & Alpaca large model project. It open-sources the Chinese LLaMA-2 base model and the Alpaca-2 instruction-tuned model. These models extend and optimize the Chinese vocabulary of the original Llama-2 and use large-scale Chinese data for incremental pre-training, further improving basic Chinese semantic and instruction understanding. Performance improvements. ...
    Downloads: 0 This Week
  • 20
    Langdesk

    Windows application to search multiple PDFs and chat with them

    Langdesk is a desktop application for Windows that lets the user assemble a knowledge base from multiple PDF documents, retrieve information from them, and chat with the retrieved content. It is currently in beta. Feel free to reach us with any request at info@tecnoesis.gr. We are currently seeking user scenarios and are also open to customizations, additions, and cooperation.
    Downloads: 0 This Week
  • 21
    lora-svc

    Singing voice conversion based on Whisper, with LoRA for singing voice cloning

    Singing voice conversion based on Whisper, with LoRA for singing voice cloning. You will feel the beauty of the code from this project. The Uni-SVC main branch is for singing voice cloning based on Whisper with a speaker encoder and speaker adapter. The main target of Uni-SVC is to develop LoRA for SVC. With LoRA, cloning a singer may need only 10 sentences and 10 minutes of training. Each singer is a plug-in of the base model.
    Downloads: 0 This Week
  • 22
    Neuro-comma

    Punctuation restoration production-ready model for Russian language

    This library was developed to help us create punctuation restoration models and store trained parameters, data, training visualizations, etc. The library doesn't use any high-level frameworks, such as PyTorch Lightning or Keras, to lower the entry threshold. Feel free to fork this repo and edit the model or dataset classes for your purposes. Our team always uses the latest version and features of Python. We started with Python 3.9, but realized that there is no FastAPI...
    Downloads: 0 This Week
  • 23
    Pytorch Points 3D

    PyTorch framework for doing deep learning on point clouds

    ...Task-driven implementation with dynamic model and dataset resolution from arguments. Core implementation of common components for point cloud deep learning, greatly simplifying the creation of new models. Four base convolution classes simplify the implementation of new convolutions; each base class supports a different data format.
    Downloads: 0 This Week
  • 24

    ProximityForest

    Efficient Approximate Nearest Neighbors for General Metric Spaces

    A proximity forest is a data structure that allows for efficient computation of approximate nearest neighbors of arbitrary data elements in a metric space. See: O'Hara and Draper, "Are You Using the Right Approximate Nearest Neighbor Algorithm?", WACV 2013 (best student paper award). One application of a ProximityForest is given in the following CVPR publication: Stephen O'Hara and Bruce A. Draper, "Scalable Action Recognition with a Subspace Forest," IEEE Conference on Computer...
    Downloads: 0 This Week
  • 25
    nBoost is a suite of boosting algorithms designed to solve binary classification problems on data that is not linearly separable by a convex combination of base hypotheses, i.e. noisy data. WARNING: Active development. Underlying algorithm is unstable.
    Downloads: 0 This Week