Python Neural Search Software


Browse free open source Python Neural Search Software and projects below. Use the toggles on the left to filter open source Python Neural Search Software by OS, license, language, programming language, and project status.

  • 1
    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines. Perform semantic search and retrieve ranked documents according to meaning, not just keywords! Make use of and compare the latest pre-trained transformer-based language models like OpenAI’s GPT-3, BERT, RoBERTa, DPR, and more. Pick any Transformer model from Hugging Face's Model Hub, experiment, and find the one that works. Use Haystack NLP components on top of Elasticsearch, OpenSearch, or plain SQL. Boost search performance with Pinecone, Milvus, FAISS, or Weaviate vector databases, and dense passage retrieval.
    Downloads: 3 This Week
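    A minimal sketch of the kind of extractive QA pipeline this description refers to, assuming a Haystack 1.x-style API (class names, parameters, and the model are assumptions and may differ between versions):

        # Hedged sketch: Haystack 1.x-style extractive QA pipeline
        from haystack.document_stores import InMemoryDocumentStore
        from haystack.nodes import BM25Retriever, FARMReader
        from haystack.pipelines import ExtractiveQAPipeline

        # Index a few documents in an in-memory store
        document_store = InMemoryDocumentStore(use_bm25=True)
        document_store.write_documents([
            {"content": "Haystack is an open source NLP framework."},
            {"content": "Pipelines combine retrievers, readers, and rankers."},
        ])

        # The retriever narrows the candidate set; the reader extracts granular answers
        retriever = BM25Retriever(document_store=document_store)
        reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
        pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)

        result = pipeline.run(
            query="What is Haystack?",
            params={"Retriever": {"top_k": 5}, "Reader": {"top_k": 1}},
        )
        print(result["answers"][0].answer)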
  • 2
    Aquila DB

    An easy to use Neural Search Engine

    Aquila DB is a neural search engine. In other words, it is a database that indexes latent vectors generated by ML models, along with JSON metadata, to perform k-NN retrieval. It is dead simple to set up, language-agnostic, and a drop-in addition to your machine learning applications. With its current feature set, Aquila DB is a ready-made solution for machine learning engineers and data scientists to build neural information retrieval applications out of the box with minimal dependencies.
    Downloads: 1 This Week
  • 3
    Cherche

    Neural Search

    Cherche allows the creation of efficient neural search pipelines using retrievers and pre-trained language models as rankers. Cherche's main strength is its ability to build diverse, end-to-end pipelines from lexical matching, semantic matching, and collaborative filtering-based models. Cherche provides modules dedicated to summarization and question answering. These modules are compatible with Hugging Face's pre-trained models and fully integrated into neural search pipelines. Cherche is also fully compatible with the collaborative filtering library Implicit, which is useful when you have a history associated with users and want to retrieve or re-rank documents based on their preferences.
    Downloads: 1 This Week
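    A hedged sketch of a retrieve-then-rank pipeline along the lines described above; module and parameter names follow Cherche's documented style but should be treated as assumptions:

        # Hedged sketch: lexical retriever + pre-trained encoder as ranker
        from cherche import rank, retrieve
        from sentence_transformers import SentenceTransformer

        documents = [
            {"id": 0, "title": "Paris", "article": "Paris is the capital of France."},
            {"id": 1, "title": "Madrid", "article": "Madrid is the capital of Spain."},
        ]

        # Lexical matching narrows candidates, semantic matching re-ranks them
        retriever = retrieve.TfIdf(key="id", on=["title", "article"], documents=documents, k=10)
        ranker = rank.Encoder(
            key="id",
            on=["title", "article"],
            encoder=SentenceTransformer("sentence-transformers/all-mpnet-base-v2").encode,
            k=3,
        )

        search = (retriever + ranker).add(documents)
        print(search("capital of France"))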
  • 4
    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB). MTEB spans 8 embedding tasks covering a total of 58 datasets and 112 languages. Through the benchmarking of 33 models on MTEB, we establish the most comprehensive benchmark of text embeddings to date. We find that no particular text embedding method dominates across all tasks. This suggests that the field has yet to converge on a universal text embedding method and scale it up sufficiently to provide state-of-the-art results on all embedding tasks.
    Downloads: 1 This Week
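    A short, hedged example of how an embedding model might be benchmarked with MTEB; the task and model names here are illustrative assumptions:

        # Hedged sketch: evaluate a sentence-transformers model on one MTEB task
        from mteb import MTEB
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
        evaluation = MTEB(tasks=["Banking77Classification"])  # any subset of MTEB tasks
        results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
        print(results)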
  • 5
    VectorDB

    A Python vector database you just need, no more, no less

    vectordb is a Pythonic vector database that offers a comprehensive suite of CRUD (Create, Read, Update, Delete) operations and robust scalability options, including sharding and replication. It's readily deployable in a variety of environments, from local to on-premise and cloud. vectordb delivers exactly what you need, no more, no less: a testament to effective Pythonic design without over-engineering, making it a lean yet powerful solution. vectordb capitalizes on the powerful retrieval prowess of DocArray and the scalability, reliability, and serving capabilities of Jina. Here's the magic: DocArray serves as the engine driving the vector search logic, while Jina guarantees efficient and scalable index serving. This synergy culminates in a robust yet user-friendly vector database experience; that's vectordb for you.
    Downloads: 1 This Week
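    A hedged sketch of the kind of usage the description implies, assuming a DocArray-schema-based API; the class and parameter names are assumptions:

        # Hedged sketch: define a document schema, index vectors, run a k-NN search
        import numpy as np
        from docarray import BaseDoc, DocList
        from docarray.typing import NdArray
        from vectordb import InMemoryExactNNVectorDB

        class ToyDoc(BaseDoc):
            text: str = ""
            embedding: NdArray[128]

        db = InMemoryExactNNVectorDB[ToyDoc](workspace="./workspace")
        db.index(inputs=DocList[ToyDoc](
            [ToyDoc(text=f"doc {i}", embedding=np.random.rand(128)) for i in range(100)]
        ))

        query = ToyDoc(text="query", embedding=np.random.rand(128))
        results = db.search(inputs=DocList[ToyDoc]([query]), limit=5)
        print(results[0].matches)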
  • 6
    AnnLite

    A fast embedded library for approximate nearest neighbor search

    AnnLite is a lightweight and embeddable library for fast and filterable approximate nearest neighbor search (ANNS). It lets you search for nearest neighbors in a dataset of millions of points with a Pythonic API that is easy to use and intuitive to take to production. The library uses a highly optimized approximate nearest neighbor search algorithm (HNSW) and lets you restrict the search to a subset of the dataset. It integrates smoothly with the neural search ecosystem, including Jina and DocArray, so users can easily expose a search API over gRPC and/or HTTP. To support search with filters, the AnnLite index must be created with the columns parameter, a list of the fields you want to filter by.
    Downloads: 0 This Week
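    A hedged sketch of filterable vector search as described above; the constructor and search arguments are assumptions based on the description:

        # Hedged sketch: HNSW index with a filterable "price" column
        import numpy as np
        from annlite import AnnLite
        from docarray import Document, DocumentArray

        ann = AnnLite(n_dim=128, metric="cosine", columns=[("price", float)], data_path="./workspace")

        docs = DocumentArray([
            Document(id=str(i), embedding=np.random.rand(128), tags={"price": float(i)})
            for i in range(1000)
        ])
        ann.index(docs)

        # Restrict the nearest-neighbor search to a subset via a filter on the column
        query = DocumentArray([Document(embedding=np.random.rand(128))])
        ann.search(query, filter={"price": {"$lte": 50.0}}, limit=10)
        for match in query[0].matches:
            print(match.id, match.scores["cosine"].value)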
  • 7
    CLIP-as-service

    Embed images and sentences into fixed-length vectors

    CLIP-as-service is a low-latency, high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions. Serve CLIP models with TensorRT, ONNX Runtime and PyTorch w/o JIT at 800 QPS. Non-blocking duplex streaming on requests and responses, designed for large data and long-running tasks. Horizontally scale up and down multiple CLIP models on a single GPU, with automatic load balancing. Easy to use, with no learning curve and a minimalist design on client and server. Intuitive and consistent API for image and sentence embedding. Async client support. Easily switch between gRPC, HTTP and WebSocket protocols with TLS and compression. Smooth integration with the neural search ecosystem, including Jina and DocArray. Build cross-modal and multi-modal solutions in no time.
    Downloads: 0 This Week
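    A hedged sketch of the client side; the server address and the example inputs are assumptions:

        # Hedged sketch: embed text and images into fixed-length vectors
        # (start a server separately, e.g. `python -m clip_server`, before running this)
        from clip_client import Client

        c = Client("grpc://0.0.0.0:51000")
        vectors = c.encode([
            "a photo of a cat",     # plain sentences are embedded as text
            "path/to/a_photo.jpg",  # local paths or URLs are embedded as images (hypothetical path)
        ])
        print(vectors.shape)  # one fixed-length vector per input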
  • 8
    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to the multimodal world: a super-expressive data structure for representing complicated/mixed/nested text, image, video, audio and 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt, etc. Data science powerhouse: greatly accelerate data scientists' work on embedding, k-NN matching, querying, visualizing and evaluating via Torch/TensorFlow/ONNX/PaddlePaddle on CPU/GPU. Data in transit: optimized for network communication, ready to wire at any time with fast and compressed serialization in Protobuf, bytes, base64, JSON, CSV, DataFrame. Perfect for streaming and out-of-memory data. One-stop k-NN: unified and consistent API for mainstream vector databases.
    Downloads: 0 This Week
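    A hedged sketch of the k-NN matching workflow the description mentions, assuming the pre-0.30 Document/DocumentArray API; random embeddings stand in for a real model:

        # Hedged sketch: nested documents + one-stop k-NN matching
        import numpy as np
        from docarray import Document, DocumentArray

        index = DocumentArray(
            [Document(text=f"doc {i}", embedding=np.random.rand(128)) for i in range(100)]
        )
        query = DocumentArray([Document(text="my query", embedding=np.random.rand(128))])

        # Match the query against the index by embedding similarity
        query.match(index, metric="cosine", limit=3)
        for m in query[0].matches:
            print(m.text, m.scores["cosine"].value)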
  • 9
    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. Fast deployment to Kubernetes, Docker Compose and Jina Cloud. Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.
    Downloads: 0 This Week
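    A hedged sketch of a tiny Jina service in the Jina 3.x style; the Executor, its endpoint, and the port are purely illustrative:

        # Hedged sketch: one Executor wired into a Flow and served as a microservice
        from docarray import Document, DocumentArray
        from jina import Executor, Flow, requests

        class Upcase(Executor):
            @requests
            def upcase(self, docs: DocumentArray, **kwargs):
                # A toy "microservice" step: upper-case the text of every document
                for doc in docs:
                    doc.text = doc.text.upper()

        f = Flow(port=12345).add(uses=Upcase)
        with f:
            result = f.post("/", inputs=DocumentArray([Document(text="hello jina")]))
            print(result.texts)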
  • 10
    NOW

    No-code tool for creating a neural search solution in minutes

    One line to host them all. Bootstrap your multimodal search case in minutes. NOW gives the world access to multimodal neural search with just one command. NOW supports various formats for uploading your dataset to your search application: you may either choose a demo dataset hosted by NOW, or use your own custom dataset to build an application. Custom data can be provided as a DocumentArray, as a path to a local folder, or as an S3 bucket. The demo datasets, hosted by NOW, cover a large variety of modalities, including images, text, and audio, and are an easy way to get started quickly. If your data is stored in an S3 bucket, NOW asks for the URI of the bucket, along with its credentials and region. The final step in loading your data is to choose which fields of your data should be used for search and which for filtering.
    Downloads: 0 This Week
  • 11
    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP is a natural language processing development library for PaddlePaddle, with three major features: an easy-to-use text-domain API, example applications for multiple scenarios, and high-performance distributed training. It aims to improve developers' efficiency in the text domain and provides rich examples of NLP applications. It offers rich, industry-grade, pre-built task capabilities through Taskflow and a full-pipeline text-domain API: a Dataset API for loading rich Chinese datasets, a Data API for flexible and efficient data preprocessing, an Embedding API with 60+ pre-trained word vectors, and a Transformer API providing 100+ pre-trained models, all of which greatly improve the efficiency of NLP task modeling.
    Downloads: 0 This Week
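    A hedged sketch of the Taskflow and Transformer APIs mentioned above; the task and model names are assumptions and may vary by version:

        # Hedged sketch: out-of-the-box Taskflow plus a pre-trained Transformer
        from paddlenlp import Taskflow
        from paddlenlp.transformers import AutoModel, AutoTokenizer

        # One-line, industry-grade task capability (Chinese sentiment analysis)
        sentiment = Taskflow("sentiment_analysis")
        print(sentiment("这个产品用起来真的很流畅"))

        # 100+ pre-trained models behind a Transformer-style API
        tokenizer = AutoTokenizer.from_pretrained("ernie-3.0-medium-zh")
        model = AutoModel.from_pretrained("ernie-3.0-medium-zh")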
  • 12
    Prime QA

    State-of-the-art Multilingual Question Answering research

    PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly downloadable.
    Downloads: 0 This Week
  • 13
    Vector AI

    A platform for building vector based applications

    Vector AI is a framework designed to make the process of building production-grade vector-based applications as quick and easy as possible. Create, store, manipulate, search and analyze vectors alongside JSON documents to power applications such as neural search, semantic search, personalized recommendations, etc. Image2Vec, Audio2Vec, and more: any data can be turned into vectors through machine learning. Store your vectors alongside documents without having to do a separate DB lookup for metadata about the vectors. Enable searching of vectors and rich multimedia with vector similarity search, the backbone of many popular AI use cases like reverse image search, recommendations, and personalization. There are scenarios where vector search is not as effective as traditional search, e.g. searching for SKUs. Vector AI lets you combine vector search with all the features of traditional search, such as filtering, fuzzy search, and keyword matching, to create an even more powerful search.
    Downloads: 0 This Week
  • 14
    finetuner

    Task-oriented finetuning for better embeddings on neural search

    Fine-tuning is an effective way to improve performance on neural search tasks. However, setting up and performing fine-tuning can be very time-consuming and resource-intensive. Jina AI’s Finetuner makes fine-tuning easier and faster by streamlining the workflow and handling all the complexity and infrastructure in the cloud. With Finetuner, you can easily enhance the performance of pre-trained models, making them production-ready without extensive labeling or expensive hardware. Create high-quality embeddings for semantic search, visual similarity search, cross-modal text image search, recommendation systems, clustering, duplication detection, anomaly detection, or other uses. Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.
    Downloads: 0 This Week
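    A hedged sketch of what a cloud fine-tuning run might look like; the backbone name, dataset name, and hyperparameters are all assumptions:

        # Hedged sketch: submit a fine-tuning run to Jina AI's cloud Finetuner
        import finetuner

        finetuner.login()  # Finetuner runs in the cloud, so an account is required
        run = finetuner.fit(
            model="bert-base-en",        # assumed backbone name from Finetuner's model list
            train_data="my-train-data",  # assumed name of a pushed, labeled DocumentArray
            epochs=5,
            learning_rate=1e-5,
        )
        print(run.status())
        # After completion, the tuned model artifact can be pulled down for inference,
        # e.g. run.save_artifact("my-finetuned-model")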
  • 15
    refinery

    Open-source choice to scale, assess and maintain natural language data

    The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact. refinery helps you build better NLP models in a data-centric approach: semi-automate your labeling, find low-quality subsets in your training data, and monitor your data in one place. refinery doesn't get rid of manual labeling, but it makes sure that your valuable time is spent well. The makers of refinery are also working on integrations with other labeling tools, so that you can easily switch between different choices. refinery is a multi-repository project; all integrated services are described in the project's architecture documentation. The app builds on top of Hugging Face and spaCy to leverage pre-built language models for your NLP tasks, as well as Qdrant for neural search.
    Downloads: 0 This Week
  • 16
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid pace, models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. The following applications are powered by txtai.
    Downloads: 0 This Week
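    A minimal, hedged sketch of semantic search with txtai; the model name is an assumption:

        # Hedged sketch: build an embeddings index and search by meaning
        from txtai.embeddings import Embeddings

        embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
        data = [
            "US tops 5 million confirmed virus cases",
            "Canada's last fully intact ice shelf has suddenly collapsed",
            "Maine man wins $1M from $25 lottery ticket",
        ]
        embeddings.index([(i, text, None) for i, text in enumerate(data)])

        # Results are ranked by meaning, not keyword overlap
        print(embeddings.search("public health story", 1))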