Showing 151 open source projects for "Databases Open Source & DevTools"

  • 1
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout AI-driven data processing tool written in C++20, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 1 This Week
  • 2
    PromptTools

    Open-source tools for prompt testing and experimentation

    Welcome to prompttools created by Hegel AI! This repo offers a set of open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts. The core idea is to enable developers to evaluate using familiar interfaces like code, notebooks, and a local playground.
    Downloads: 0 This Week
  • 3
    autollm

    Ship RAG based LLM web apps in seconds

    autollm is an open-source Python framework designed to make it much faster to build retrieval-augmented generation applications and expose them as usable services with minimal setup. The project focuses on simplifying the usual stack of model selection, document ingestion, vector storage, querying, and API deployment into a more unified developer experience.
    Downloads: 0 This Week
  • 4
    RAGxplorer

    Open-source tool to visualise your RAG

    RAGxplorer is an open-source visualization tool designed to help developers analyze and understand Retrieval-Augmented Generation (RAG) pipelines. Retrieval-augmented generation combines language models with external document retrieval systems in order to produce more accurate and grounded responses. However, RAG systems can be complex because they involve multiple components such as embedding models, vector databases, and retrieval algorithms.
    Downloads: 0 This Week
  • 5
    ADAMS

    ADAMS is a workflow engine for building complex knowledge workflows.

    ADAMS is a flexible workflow engine aimed at quickly building and maintaining data-driven, reactive workflows that are easily integrated into business processes. Instead of placing operators on a canvas and manually connecting them, a tree structure and flow control operators determine how data is processed (sequentially or in parallel). This allows rapid development and easy maintenance of large workflows with hundreds or thousands of operators. Operators include machine learning (WEKA, MOA, MEKA)...
    Downloads: 1 This Week
  • 6
    LLM Applications

    A comprehensive guide to building RAG-based LLM applications

    LLM Applications is a practical reference repository that demonstrates how to build production-grade applications powered by large language models. The project focuses particularly on Retrieval-Augmented Generation architectures, which combine language models with external knowledge sources to improve accuracy and reliability. It provides step-by-step guidance for constructing systems that ingest documents, split them into chunks, generate embeddings, index them in vector databases, and...
    Downloads: 0 This Week
  • 7
    ChatGPT UI

    A ChatGPT web client that supports multiple users and databases

    A ChatGPT web client that supports multiple users, multiple database connections for persistent data storage, and i18n. It provides Docker images and quick deployment scripts. It supports the gpt-4 model, which you can select in the "Model Parameters" of the front-end. The GPT-4 model requires whitelist access from OpenAI. A web search capability has been added to generate more relevant and up-to-date answers from ChatGPT. This feature is off by default; you can turn it on in `Chat->Settings` in the...
    Downloads: 0 This Week
  • 8
    Clarity AI

    A Perplexity clone

    ...The codebase (TypeScript) leverages LLMs / embeddings to process user queries, retrieve relevant data or context, and respond conversationally; this makes it useful as a personal knowledge assistant, research helper, or Q&A front end over arbitrary datasets or web-available info. Because Clarity AI is open-source, developers can adapt the backend or retrieval logic, integrate their own data sources (databases, documents, APIs), and build custom assistants or knowledge bots tailored to their needs. It can serve as a starting platform for building AI-powered internal tools, knowledge bases, or public-facing “smart search” features.
    Downloads: 0 This Week
  • 9
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP is a command-line tool that searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It matches names and content in database tables, plain files, MS Office documents, PDFs, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP searches resources nested within other resources in any combination and to any depth, for example text within a document within a zip archive. Here you...
    Downloads: 1 This Week
  • 10
    SQLFlow

    SQL compiler bridging databases and machine learning workflows

    SQLFlow is an open source project designed to bridge the gap between traditional SQL-based data processing and modern machine learning workflows by extending SQL syntax with AI capabilities. It acts as a compiler that translates SQL programs into executable workflows, enabling users to train, evaluate, and deploy machine learning models directly from SQL statements. It integrates with multiple database engines such as MySQL, Hive, and MaxCompute, while also supporting machine learning...
    Downloads: 0 This Week
  • 11
    WikiSQL

    A large annotated semantic parsing corpus for developing NL interfaces

    A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. Regarding tokenization and Stanza: when WikiSQL was written 3 years ago, it relied on Stanza, a CoreNLP Python wrapper that has since been deprecated. If you'd still like to use the tokenizer, please use the Docker image. We do not anticipate switching...
    Downloads: 0 This Week
  • 12

    My ZZZ Knowledge Micro Web Base

    This is a base which stores knowledge in the form of nested sets

    Introducing something new: ZZZ Knowledge Base. This is a base which stores knowledge in the form of nested sets; each set can contain a virtually unlimited number of elements and, in turn, each element can contain multiple elements. The access time for an element does not depend on the number of items in it and is practically instantaneous. Access to the knowledge base is provided through a "server" to which you can connect an unlimited number of "clients" as one computer...
    Downloads: 0 This Week
  • 13
    DocCO

    Non-disjoint grouping of documents based on a word sequence approach

    This is a GUI for learning non-disjoint groups of documents, based on the Weka machine learning framework. It supports non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases, or from a specified URL. All the preprocessing techniques implemented in WEKA can be applied before learning.
    Downloads: 0 This Week
  • 14
    Bayesian Network tools in Java (BNJ) is an open-source suite of software tools for research and development using graphical models of probability. It is published by the Kansas State University Laboratory for Knowledge Discovery in Databases (KDD).
    Downloads: 0 This Week
  • 15
    A development environment for abstract board games. The user declares game rules in the IDE using a high-level visual language; Abstromatic plays the game locally or via the Internet, and computes databases, openings, heuristics, variants, strategies and much more!
    Downloads: 0 This Week
  • 16

    PyVocabularyTree

    A vocabulary tree for image classification using OpenCV

    ...The design provides training and optimization parameters that have been characterized using several detectors and descriptors for several input datasets. Evaluation tests performed on public image databases allow the results to be compared with previously published literature. All the tools and resources used in this project are Open Source licensed.
    Downloads: 0 This Week
  • 17
    WQuery is a domain-specific query language designed to process WordNet-like lexical databases. It may be used as a standalone application or as an API to a lexical database in Java-based systems.
    Downloads: 0 This Week
  • 18
    SpatialML is a markup language for representing spatial expressions in natural language documents. The goal is to allow for better integration of text collections with resources such as databases that provide spatial information about a domain.
    Downloads: 0 This Week
  • 19
    Wikipedia Concept Association Map (WCAM) is a new approach to textual knowledge representation and understanding. All concepts and associations are stored in a graph database for better performance and easy distribution.
    Downloads: 0 This Week
  • 20
    Moara is a biological text mining tool. It consists of a Java library and some auxiliary MySQL databases for training on and extracting gene/protein mentions, and for the further normalization and disambiguation of those mentions.
    Downloads: 0 This Week
  • 21
    sexyalice is a chatterbot for IRC networks. It does not have any internal databases, but instead connects to alicebot's website to get answers. It keeps sessions for each user, and has many other nice features. Check code for details.
    Downloads: 0 This Week
  • 22
    openEAR is the Munich Open-Source Emotion and Affect Recognition Toolkit developed at the Technische Universität München (TUM). It provides efficient (audio) feature extraction algorithms implemented in C++, classifiers, and pre-trained models on well-known emotion databases. It is now maintained and supported by audEERING. Updates will follow soon.
    Downloads: 3 This Week
  • 23
    IotaBot is an experimental, open robot framework written in Java, with cognitive features that draw on research in databases and machine learning.
    Downloads: 0 This Week
  • 24
    The goal of this project is to develop a content-based search methodology and associated interfaces for medical imaging databases.
    Downloads: 0 This Week
  • 25
    INDUS is a project for knowledge acquisition and data integration from heterogeneous distributed data, particularly from bio-informatics databases.
    Downloads: 0 This Week