Search Results for "data.6bin"

Showing 24 open source projects for "data.6bin"

1

Memvid

Video-based AI memory library. Store millions of text chunks in MP4

Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.

Downloads: 79 This Week

Last Update: 2026-03-13
2

pgvector

Open-source vector similarity search for Postgres

...It has better query performance than IVFFlat (in terms of speed-recall tradeoff), but has slower build times and uses more memory. Also, an index can be created without any data in the table since there isn’t a training step like IVFFlat.

Downloads: 72 This Week

Last Update: 2026-02-25
3

Open Semantic Search

Open source semantic search and text analytics for large document sets

...It provides an integrated search server combined with a document processing pipeline that supports crawling, text extraction, and automated analysis of content from many different sources. Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files. It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.

Downloads: 7 This Week

Last Update: 6 days ago
4

Pixeltable

Data Infrastructure providing an approach to multimodal AI workloads

...Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.

Downloads: 0 This Week

Last Update: 3 days ago
5

CocoIndex

ETL framework to index data for AI, such as RAG

...CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search data, distinguishing itself from closed, black-box systems. The tool is suitable for developers working on personal knowledge bases, AI search interfaces, or private LLM applications.

Downloads: 0 This Week

Last Update: 4 days ago
6

OpenViking

Context database designed specifically for AI Agents

OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is implemented with performance in mind, often leveraging optimized data structures that balance fast reads and writes with minimal resource consumption. ...

Downloads: 2 This Week

Last Update: 2 days ago
7

txtai

Build AI-powered semantic search applications

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings).
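The idea of ranking by meaning rather than by keywords can be illustrated with a minimal sketch. This is not txtai's API: the three-dimensional vectors below are made-up stand-ins for real model embeddings, and `cosine_similarity` and `rank` are hypothetical helper names chosen for illustration. The sketch only shows how a query vector is compared against stored vectors and how results are ordered by similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank(query_vec, corpus):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values, not produced by any real model).
corpus = {
    "feline care tips": [0.9, 0.1, 0.0],
    "stock market news": [0.0, 0.2, 0.9],
    "dog training guide": [0.7, 0.3, 0.1],
}

# Pretend this is the embedding of "how to look after a cat".
query = [0.8, 0.2, 0.0]

results = rank(query, corpus)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The query shares no keywords with "feline care tips", yet that entry ranks first because its vector points in nearly the same direction as the query vector. In a real system the vectors would come from an embedding model and the linear scan would usually be replaced by an approximate nearest neighbor index.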

Downloads: 1 This Week

Last Update: 2026-03-17
8

LEANN

Local RAG engine for private multimodal knowledge search on devices

...By recomputing embeddings during queries and using compact graph-based indexing structures, LEANN can maintain high search accuracy while minimizing disk usage. It aims to act as a unified personal knowledge layer that connects different types of data such as documents, code, images, and other local files into a searchable context for language models.

Downloads: 0 This Week

Last Update: 2026-03-13
9

Kernel Memory

Research project. A Memory solution for users, teams, and applications

...Kernel Memory can ingest documents in multiple formats, process them into embeddings, and store them in searchable indexes. Applications can then query these indexed data sources to retrieve relevant information and include it as context for AI responses.

Downloads: 1 This Week

Last Update: 2026-03-06
10

MCP Server Qdrant

An official Qdrant Model Context Protocol (MCP) server implementation

The Qdrant MCP Server is an official Model Context Protocol server that integrates with the Qdrant vector search engine. It acts as a semantic memory layer, allowing for the storage and retrieval of vector-based data, enhancing the capabilities of AI applications requiring semantic search functionalities.

Downloads: 0 This Week

Last Update: 2025-12-10
11

KnowNote

A local-first AI knowledge base & NotebookLM alternative

...It lets users build an intelligent, searchable knowledge base from uploaded documents such as PDFs, Word files, PowerPoints, and web pages, and then interact with that content using LLM-powered chat, summarization, and reasoning tools. Unlike many NotebookLM alternatives that rely on Docker or cloud deployments, KnowNote runs natively on desktop platforms without complex setup, meaning all data stays local unless the user opts to integrate with self-managed or private LLM APIs. Its retrieval-augmented generation (RAG) system offers semantic search and traceable source references, and it supports multiple LLM providers through a flexible plugin-style provider architecture.

Downloads: 6 This Week

Last Update: 2026-01-30
12

Haystack

Haystack is an open source NLP framework to interact with your data

Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines.

Downloads: 2 This Week

Last Update: 4 days ago
13

SemTools

Semantic search and document parsing tools for the command line

SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. SemTools can parse documents,...

Downloads: 3 This Week

Last Update: 2026-03-13
14

MyScaleDB

A ClickHouse fork that supports high-performance vector search

...The system is built on top of the ClickHouse database engine and extends it with specialized indexing and search capabilities optimized for vector embeddings. This design allows developers to store structured data, unstructured text, and high-dimensional vector embeddings within a single database platform. MyScaleDB enables developers to perform vector similarity searches using standard SQL syntax, eliminating the need to learn specialized vector database query languages. The database is optimized for high performance and scalability, allowing it to handle extremely large datasets and high query loads typical of production AI applications.

Downloads: 0 This Week

Last Update: 2026-03-10
15

ChatGPT Retrieval Plugin

The ChatGPT Retrieval Plugin lets you easily find personal documents

The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...

Downloads: 3 This Week

Last Update: 2025-10-02
16

pgai

A suite of tools to develop RAG, semantic search, and other AI apps

pgai is a suite of PostgreSQL extensions developed by Timescale to empower developers in building AI applications directly within their databases. It integrates tools for vector storage, advanced indexing, and AI model interactions, facilitating the development of applications like semantic search and Retrieval-Augmented Generation (RAG) without leaving the SQL environment.

Downloads: 0 This Week

Last Update: 2025-10-14
17

Weaviate

Weaviate is a cloud-native, modular, real-time vector search engine

Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. ...

Downloads: 2 This Week

Last Update: 2 days ago
18

rag-search

RAG Search API

...Its architecture is modular, separating handlers, services, and utilities to support customization and extension. Overall, rag-search serves as a practical starter backend for teams building AI search or question-answering applications on their own data.

Downloads: 0 This Week

Last Update: 2026-03-03
19

kg-gen

Knowledge Graph Generation from Any Text

kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...

Downloads: 0 This Week

Last Update: 2026-03-09
20

Microsoft Learn MCP Server

Official Microsoft Learn MCP Server, powering LLMs and AI agents

Microsoft Learn MCP Server is the official GitHub repository for the Microsoft Learn MCP (Model Context Protocol) Server, a service that implements the Model Context Protocol to provide AI assistants and tools with reliable, real-time access to Microsoft’s official documentation. Rather than relying on training data that may be outdated or incomplete, MCP servers let agents like GitHub Copilot, Claude, or other LLM-based tools search and pull context directly from up-to-date Microsoft Learn content, including Azure, .NET, and other tech docs. By connecting to the MCP endpoint, coding agents can answer questions, retrieve code examples, and offer best practices grounded in authoritative sources without requiring API keys or manual browser searches. ...

Downloads: 0 This Week

Last Update: 4 days ago
21

QTE Technologies-Industrial-Scientific

1M+ Industrial & Scientific MRO Metadata for AI and Research

...Verification & Authority: Managed via DVC on DagsHub. Archived on Zenodo, Harvard Dataverse, and Figshare. Linked Data via Wikidata (Q138411149). Built for engineers, data scientists, and procurement professionals worldwide.

Downloads: 0 This Week

Last Update: 2026-03-09
22

hora

Efficient approximate nearest neighbor search algorithm collections

...Hora implements multiple efficient indexing algorithms that allow systems to rapidly search through high-dimensional vectors produced by machine learning models. These vectors are commonly generated by neural networks to represent images, text, audio, or other data types in a mathematical embedding space. The library is written in Rust and emphasizes performance, safety, and efficient memory management, making it suitable for production-grade applications requiring low latency and high throughput.

Downloads: 0 This Week

Last Update: 2026-03-11
23

Vector AI

A platform for building vector based applications

...Create, store, manipulate, search, and analyze vectors alongside JSON documents to power applications such as neural search, semantic search, and personalized recommendations. Any data can be turned into vectors through machine learning (Image2Vec, Audio2Vec, etc.). Store your vectors alongside documents without having to do a DB lookup for metadata about the vectors. Enable searching of vectors and rich multimedia with vector similarity search, the backbone of many popular AI use cases such as reverse image search, recommendations, and personalization. ...

Downloads: 0 This Week

Last Update: 2023-04-10
24

eagle-i

eagle-i is an ontology-driven, RDF-based distributed platform for creating, storing and searching semantically rich data. eagle-i is built around semantic web technologies and adheres to linked open data principles.

Downloads: 0 This Week

Last Update: 2014-01-27

© 2026 Slashdot Media. All Rights Reserved.
