Best Open Source Linux Vector Search Engines 2026

Vector Search Engines for Linux

Vector Search Engines Linux Clear Filters

Browse free open source Vector Search Engines and projects for Linux below. Use the toggles on the left to filter open source Vector Search Engines by OS, license, language, programming language, and project status.

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.

Start Free
1

pgvector

Open-source vector similarity search for Postgres

pgvector is an open-source PostgreSQL extension that equips PostgreSQL databases with vector data storage, indexing, and similarity search capabilities—ideal for embeddings-based applications like semantic search and recommendations. You can add an index to use approximate nearest neighbor search, which trades some recall for speed. Unlike typical indexes, you will see different results for queries after adding an approximate index. An HNSW index creates a multilayer graph. It has better query performance than IVFFlat (in terms of speed-recall tradeoff), but has slower build times and uses more memory. Also, an index can be created without any data in the table since there isn’t a training step like IVFFlat.

Downloads: 89 This Week

Last Update: 2026-02-25
See Project
2

Qdrant

Vector Database for the next generation of AI applications

Qdrant is a vector similarity engine & vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more! Provides the OpenAPI v3 specification to generate a client library in almost any programming language. Alternatively, utilize ready-made client for Python or other programming languages with additional functionality. Implement a unique custom modification of the HNSW algorithm for the Approximate Nearest Neighbor Search. Search with a State-of-the-Art speed and apply search filters without compromising on results. Support additional payload associated with vectors. Not only stores payload but also allows filter results based on payload values. Unlike Elasticsearch post-filtering, Qdrant guarantees all relevant vectors are retrieved.

Downloads: 59 This Week

Last Update: 2026-03-27
See Project
3

OceanBase seekdb

The AI-Native Search Database

seekdb is an AI-native search database from OceanBase that unifies vector, full-text, relational, JSON, and GIS data into a single query engine. The system is designed to support hybrid search workloads and in-database AI workflows without requiring multiple specialized databases. It enables developers to perform semantic search, keyword search, and structured SQL queries within the same platform, simplifying modern AI application stacks. seekdb also embeds AI capabilities directly in the database layer, including embedding generation, reranking, and LLM inference for end-to-end RAG pipelines. Built on the OceanBase engine, it maintains ACID compliance and MySQL compatibility while delivering real-time analytical performance. Overall, seekdb positions itself as a unified data foundation for next-generation AI applications that require both transactional and semantic retrieval capabilities.

Downloads: 17 This Week

Last Update: 2026-03-25
See Project
4

Vespa

The open big data serving engine

Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do. Recommendation, personalization and targeting involves evaluating recommender models over content items to select the best ones. Vespa lets you build applications which does this online, typically combining fast vector search and filtering with evaluation of machine-learned models over the items. This makes it possible to make recommendations specifically for each user or situation, using completely up to date information.

Downloads: 10 This Week

Last Update: 2026-03-31
See Project
Earn up to 16% annual interest with Nexo.
Access competitive interest rates on your digital assets.

Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.

Get started with Nexo.
5

Weaviate

Weaviate is a cloud-native, modular, real-time vector search engine

Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance of a cloud-native database, all accessible through GraphQL, REST, and various language clients.

Downloads: 6 This Week

Last Update: 2026-04-03
See Project
6

marqo

Tensor search for humans

A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and text-to-image search and analytics. Marqo adapts and stores your data in a fully schemaless manner. It combines tensor search with a query DSL that provides efficient pre-filtering. Tensor search allows you to go beyond keyword matching and search based on the meaning of text, images and other unstructured data. Be a part of the tribe and help us revolutionize the future of search. Whether you are a contributor, a user, or simply have questions about Marqo, we got your back.

Downloads: 5 This Week

Last Update: 2026-04-02
See Project
7

txtai

Build AI-powered semantic search applications

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid pace, models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. The following applications are powered by txtai.

Downloads: 5 This Week

Last Update: 2026-03-17
See Project
8

Vector AI

A platform for building vector based applications

Vector AI is a framework designed to make the process of building production-grade vector-based applications as quick and easily as possible. Create, store, manipulate, search and analyze vectors alongside json documents to power applications such as neural search, semantic search, personalized recommendations etc. Image2Vec, Audio2Vec, etc (Any data can be turned into vectors through machine learning). Store your vectors alongside documents without having to do a db lookup for metadata about the vectors. Enable searching of vectors and rich multimedia with vector similarity search. The backbone of many popular A.I use cases like reverse image search, recommendations, personalization, etc. There are scenarios where vector search is not as effective as traditional search, e.g. searching for skus. Vector AI lets you combine vector search with all the features of traditional search such as filtering, fuzzy search, and keyword matching to create an even more powerful search.

Downloads: 3 This Week

Last Update: 2023-04-10
See Project
9

Cherche

Neural Search

Cherche allows the creation of efficient neural search pipelines using retrievers and pre-trained language models as rankers. Cherche's main strength is its ability to build diverse and end-to-end pipelines from lexical matching, semantic matching, and collaborative filtering-based models. Cherche provides modules dedicated to summarization and question answering. These modules are compatible with Hugging Face's pre-trained models and fully integrated into neural search pipelines. Search is fully compatible with the collaborative filtering library Implicit. It is advantageous if you have a history associated with users and you want to retrieve / re-rank documents based on user preferences.

Downloads: 2 This Week

Last Update: 2024-06-01
See Project
Try Google Cloud Risk-Free With $300 in Credit
No hidden charges. No surprise bills. Cancel anytime.

Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.

Start Free
10

Milvus

Vector database for scalable similarity search and AI applications

Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility. Average latency measured in milliseconds on trillion vector datasets. Rich APIs designed for data science workflows. Consistent user experience across laptop, local cluster, and cloud. Embed real-time search and analytics into virtually any application. Milvus’ built-in replication and failover/failback features ensure data and applications can maintain business continuity in the event of a disruption. Component-level scalability makes it possible to scale up and down on demand.

Downloads: 2 This Week

Last Update: 4 days ago
See Project
11

Vearch

A distributed system for embedding-based vector retrieval

Vearch is the vector search infrastructure for deep learning and AI applications. Vearch is a distributed vector storage and retrieval system which can be easily extended to billions scale. Vearch implements a high-performance, lockless real-time vector indexing subsystem that utilizes various optimization techniques to support millisecond vector update and retrieval. End-to-end one-click deployment. Through the module of the plugin, a complete default visual search system can be deployed just with one click. Otherwise, you can easily customize your own image, video, or text feature extraction algorithm plugin. This GIF provides a clear demonstration of the project vearch usage and its internal structure. The use of vearch is mainly divided into three steps. Firstly, create DB and Space, then import your data, and finally, you can search on your own dataset.

Downloads: 2 This Week

Last Update: 2026-02-04
See Project
12

Gamma

Real time vector search engine

Gamma is the core vector search engine of Vearch. It is a high-performance, concurrent vector search engine, and supports real-time indexing vectors and scalars without lock. Differently from the general vector search engine, Gamma can store and index a document containing scalars and vectors, providing the ability to quickly index and provides the ability of quickly indexing and filter by numeric scalar fields. The work of design and implementation of real-time indexing has been published in our Middleware paper. As for the part of similarity search of vectors in Gamma, it is mainly implemented based on faiss which is an open-source library developed by Facebook AI Research. Besides faiss, it can easily support other approximate nearest neighbor search(ANN) algorithms or libraries.

Downloads: 1 This Week

Last Update: 2023-04-10
See Project
13

AnnLite

A fast embedded library for approximate nearest neighbor search

AnnLite is a lightweight and embeddable library for fast and filterable approximate nearest neighbor search (ANNS). It allows to search for nearest neighbors in a dataset of millions of points with a Pythonic API. A simple API is designed to be used with Python. It is easy to use and intuitive to set up to production. The library uses a highly optimized approximate nearest neighbor search algorithm (HNSW) to search for nearest neighbors. The library allows you to search for nearest neighbors within a subset of the dataset. Smooth integration with neural search ecosystem including Jina and DocArray, so that users can easily expose search API with gRPC and/or HTTP. The library is easy to install and use. It is designed to be used with Python. To support search with filters, the annlite must be created with colums parameter, which is a series of fields you want to filter by.

Downloads: 0 This Week

Last Update: 2023-04-19
See Project
14

Aquila X

Easy build your personal search engine with Aquila Network

Easy build your personal search engine with Aquila Network. Aquila X is the gateway to Aquila Network and it's applications. AquilaX is a smart bookmarking tool. You can keep your bookmarks and search through it's contents. Choose to keep all your data in a local server or in the cloud. This is an open source software and thus is auditable.

Downloads: 0 This Week

Last Update: 2023-04-10
See Project
15

DocArray

The data structure for multimodal data

DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to multimodal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt etc. Data science powerhouse: greatly accelerate data scientists’ work on embedding, k-NN matching, querying, visualizing, evaluating via Torch/TensorFlow/ONNX/PaddlePaddle on CPU/GPU. Data in transit: optimized for network communication, ready-to-wire at anytime with fast and compressed serialization in Protobuf, bytes, base64, JSON, CSV, DataFrame. Perfect for streaming and out-of-memory data. One-stop k-NN: Unified and consistent API for mainstream vector databases.

Downloads: 0 This Week

Last Update: 2025-03-21
See Project
16

Jina

Build cross-modal and multimodal applications on the cloud

Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. Fast deployment to Kubernetes, Docker Compose and Jina Cloud. Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.

Downloads: 0 This Week

Last Update: 2024-11-12
See Project
17

NOW

No-code tool for creating a neural search solution in minutes

One line to host them all. Bootstrap your multimodal search case in minutes. NOW gives the world access to multimodal neural search with just one command. NOW supports various formats for uploading your dataset to your search application. You may either choose a demo dataset hosted by NOW, or use your own custom dataset, to build an application. NOW can support your custom data in the form of a DocumentArray, as a path to a local folder, or S3 bucket. You can choose a demo dataset to get started quickly. The demo datasets are hosted by NOW which can be easily used to build a search application. There is a large variety of datasets, including images, text, and audio. Perhaps your data is stored in an S3 bucket, which is an option NOW also supports. In this case, NOW asks for the URI to the S3 bucket, as well as the credentials and region thereof. A final step in loading your data is to choose the fields of your data that you would like to use for search and filter respectively.

Downloads: 0 This Week

Last Update: 2023-04-10
See Project
18

UForm

Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

UForm is a Multi-Modal Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents, into a shared vector space! It comes with a set of homonymous pre-trained networks available on HuggingFace portal and extends the transfromers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Due to independent encoding late-fusion models are good at capturing coarse-grained features but often neglect fine-grained ones. This type of models is well-suited for retrieval in large collections. The most famous example of such models is CLIP by OpenAI. Early-fusion models encode both modalities jointly so they can take into account fine-grained features. Usually, these models are used for re-ranking relatively small retrieval results. Mid-fusion models are the golden midpoint between the previous two types. Mid-fusion models consist of two parts – unimodal and multimodal.

Downloads: 0 This Week

Last Update: 2025-10-30
See Project
19

Vald

Vald. A Highly Scalable Distributed Vector Search Engine

Vald is a highly scalable distributed fast approximate nearest neighbor dense vector search engine. Vald is designed and implemented based on the Cloud-Native architecture. It uses the fastest ANN Algorithm NGT to search for neighbors. Vald has automatic vector indexing and index backup, and horizontal scaling which is made for searching from billions of feature vector data. Vald is easy to use, feature-rich and highly customizable as you needed. Usually, the graph requires locking during indexing, which causes stop-the-world. But Vald uses distributed index graphs so it continues to work during indexing. Vald implements it's own highly customizable Ingress/Egress filter. Which can be configured to fit the gRPC interface. Horizontal scalable on memory and cpu for your demand. Vald supports to auto backup feature using Object Storage or Persistent Volume which enables disaster recovery.

Downloads: 0 This Week

Last Update: 2025-07-04
See Project
20

VectorDB

A Python vector database you just need, no more, no less

vectordb is a Pythonic vector database offers a comprehensive suite of CRUD (Create, Read, Update, Delete) operations and robust scalability options, including sharding and replication. It's readily deployable in a variety of environments, from local to on-premise and cloud. vectordb delivers exactly what you need - no more, no less. It's a testament to effective Pythonic design without over-engineering, making it a lean yet powerful solution for all your needs. vectordb capitalizes on the powerful retrieval prowess of DocArray and the scalability, reliability, and serving capabilities of Jina. Here's the magic: DocArray serves as the engine driving vector search logic, while Jina guarantees efficient and scalable index serving. This synergy culminates in a robust, yet user-friendly vector database experience, that's vectordb for you.

Downloads: 0 This Week

Last Update: 2024-03-04
See Project
21

alvd

alvd = A Lightweight Vald. A lightweight distributed vector search

A lightweight distributed vector search engine based on Vald codebase. Vald is an awesome highly scalable distributed vector search engine works on Kubernetes. It has great features such as file-based backup, and metrics-based ordering of Agents. Also, Vald is highly configurable using YAML files. It works without Kubernetes, single binary (less than 30MB), easy to run (can be configured by command-line options), and consists of Agent and Server. alvd has almost the same features as Vald's gateway-lb + discoverer and agent-ngt. alvd depends on Vald codebase, the files came from Vald (such as internal, pkg/vald. They are downloaded when running make command.) are excluded from my license and ownership.

Downloads: 0 This Week

Last Update: 2023-04-10
See Project