Showing 204 open source projects for "search engines"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    Search with Lepton

    Search with Lepton

    Lightweight demo to build a conversational AI search engine quickly

    Search with Lepton is an open source demonstration project that shows how to build a conversational search engine using the Lepton AI framework. It combines traditional web search with large language models to provide natural language answers to user queries. It retrieves information from supported search engines and uses that context to generate responses through a retrieval-augmented generation approach.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    SearXNG

    SearXNG

    Free internet metasearch engine which aggregates

    SearXNG is a free and open-source metasearch engine designed to aggregate results from multiple search engines while prioritizing user privacy and anonymity. Instead of maintaining its own index, it queries numerous external search providers and merges the results into a single interface, increasing coverage and diversity of information. One of its core principles is privacy, as it does not track users, store personal data, or create search profiles, making it a strong alternative to traditional search engines. . ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    SiteDorks

    SiteDorks

    Automate search engine dorking across hundreds of websites

    SiteDorks is a command line tool designed to automate advanced search queries across multiple search engines and websites. It allows users to perform search engine “dork” queries against a large set of predefined domains, making it easier to discover publicly available information across different platforms. SiteDorks supports several major search engines including Google, Bing, Brave, Ecosia, DuckDuckGo, Yahoo, and Yandex.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    Cherche

    Cherche

    Neural Search

    Cherche allows the creation of efficient neural search pipelines using retrievers and pre-trained language models as rankers. Cherche's main strength is its ability to build diverse and end-to-end pipelines from lexical matching, semantic matching, and collaborative filtering-based models. Cherche provides modules dedicated to summarization and question answering. These modules are compatible with Hugging Face's pre-trained models and fully integrated into neural search pipelines. Search is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Application Monitoring That Won't Slow Your App Down Icon
    Application Monitoring That Won't Slow Your App Down

    AppSignal's Rust-based agent is lightweight and stable. Already running in thousands of production apps.

    Full APM with errors, performance, logs, and uptime monitoring. 99.999% uptime SLA on the platform itself.
    Start Free
  • 5
    Anna’s Archive

    Anna’s Archive

    Comprehensive search engine for books, papers, comics, magazines

    Anna’s Archive is a large-scale open-source search engine and data aggregation platform designed to index and provide access to a vast collection of books, academic papers, comics, magazines, and other digital texts through a unified interface. The project includes all the infrastructure required to run a full instance locally or in production, combining web servers, databases, and search indexing systems into a scalable architecture. It relies heavily on technologies such as Elasticsearch...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    marqo

    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    txtai

    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Swirl

    Swirl

    Swirl queries any number of data sources with APIs

    Swirl queries any number of data sources with APIs and uses spaCy and NLTK to re-rank the unified results without extracting and indexing anything! Includes zero-code configs for Apache Solr, ChatGPT, Elastic Search, OpenSearch, PostgreSQL, Google BigQuery, RequestsGet, Google PSE, NLResearch.com, Miro & more! SWIRL adapts and distributes queries to anything with a search API - search engines, databases, noSQL engines, cloud/SaaS services etc - and uses AI (Large Language Models) to re-rank the unified results without extracting and indexing anything. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Tribler

    Tribler

    Privacy enhanced BitTorrent client with P2P content discovery

    ...It introduces built-in anonymity using a Tor-like onion routing network and integrates its own blockchain for economic incentives and trust management. Tribler supports standard torrenting features along with distributed search, self-contained channels, and peer reputation. Its goal is to provide a fully autonomous file-sharing network without relying on external servers, search engines, or trackers.
    Downloads: 44 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 10
    Robin

    Robin

    AI-powered tool for dark web OSINT search and investigation

    Robin is an AI-powered open source tool designed to assist investigators and researchers in conducting dark web OSINT (Open Source Intelligence) investigations. It combines automated dark web search capabilities with large language models (LLMs) to analyze and summarize information discovered across hidden services and Tor-based search engines. The tool helps refine investigative queries, collect results from multiple dark web sources, and filter relevant intelligence using AI-driven processing. Robin also performs scraping of discovered pages through Tor sessions, allowing users to gather additional context from dark web sites while maintaining the required network routing. ...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 11
    owllook

    owllook

    Vertical novel search engine with unified reading and tracking tools

    ...It aggregates results from multiple search engines and applies parsing rules to extract novel metadata, chapters, and content in a consistent format. Owllook also includes functionality for tracking reading history, displaying rankings based on search activity, and recommending books using a similarity-based approach. Owllook is built using asynchronous technologies to support efficient data retrieval and responsive interactions while reading or searching.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    OWASP Maryam

    OWASP Maryam

    Modular OSINT framework for automated open-source intelligence gatheri

    Maryam is an open source intelligence (OSINT) framework designed to automate the process of gathering and analyzing publicly available information from the internet. It provides a modular environment that enables users to collect data from search engines, open data sources, and various online services for reconnaissance and investigative purposes. Written in Python, Maryam is built to provide a flexible and extensible framework for harvesting information quickly and efficiently from open sources. Maryam helps security researchers and analysts streamline routine data-gathering tasks that typically involve searching multiple sources such as Google, Bing, or other online platforms. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    UForm

    UForm

    Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

    UForm is a Multi-Modal Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents, into a shared vector space! It comes with a set of homonymous pre-trained networks available on HuggingFace portal and extends the transfromers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Due to independent encoding late-fusion models are good at capturing coarse-grained...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DocArray

    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to multimodal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt etc. Data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    OmniBox

    OmniBox

    Collect, organize, use, and share, all in OmniBox

    ...Tools like Omnibox typically emphasize extensibility, allowing developers to add plugins or integrations that connect the interface to other systems such as APIs, search engines, or automation tools.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 16
    Snoop Project

    Snoop Project

    This is the most powerful software taking into account CIS location

    ...Snoop project is developed without taking into account the opinions of the NSA and their friends, that is, it is available to the average user. Snoop is a research work (own database / closed bugbounty) in the field of searching and processing public data on the Internet. In terms of specialized search, Snoop is able to compete with traditional search engines.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 17
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    VectorDB

    VectorDB

    A Python vector database you just need, no more, no less

    vectordb is a Pythonic vector database offers a comprehensive suite of CRUD (Create, Read, Update, Delete) operations and robust scalability options, including sharding and replication. It's readily deployable in a variety of environments, from local to on-premise and cloud. vectordb delivers exactly what you need - no more, no less. It's a testament to effective Pythonic design without over-engineering, making it a lean yet powerful solution for all your needs. vectordb capitalizes on the...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    PaSa

    PaSa

    An advanced paper search agent powered by large language models

    PaSa is an open-source “paper search agent” built around large language models (LLMs), designed to automate the process of academic literature retrieval with human-like decision making. Instead of simply translating a query into keywords and returning a flat list of matching papers, PaSa uses a dual-agent architecture (Crawler + Selector) that can iteratively search, read, analyze, and filter academic publications — simulating how a researcher might dig through citation networks, expand...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    autocrawler

    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    ReCall

    ReCall

    Learning to Reason with Search for LLMs via Reinforcement Learning

    ReCall is an open-source framework designed to train and evaluate language models that can reason through complex problems by interacting with external tools. The project builds on earlier work focused on teaching models how to search for information during reasoning tasks and extends that idea to a broader system where models can call a variety of external tools such as APIs, databases, or computation engines. Instead of relying purely on static knowledge stored inside the model, ReCall allows the language model to dynamically decide when it should retrieve information or invoke external capabilities during the reasoning process. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Sunfish

    Sunfish

    Sunfish: a Python Chess Engine in 111 lines of code

    ...It implements classic chess engine techniques such as alpha-beta pruning and efficient board representation while maintaining readability and simplicity. The project is often used as an educational tool for understanding game AI, search algorithms, and evaluation functions without the complexity of larger engines. It includes a simple UCI-compatible interface, allowing it to be integrated with graphical chess interfaces or used in terminal-based gameplay. The codebase is intentionally minimal, making it ideal for experimentation, modification, and learning purposes.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 24
    nano-graphrag

    nano-graphrag

    A simple, easy-to-hack GraphRAG implementation

    nano-graphrag is a lightweight implementation of the GraphRAG approach designed to simplify experimentation with graph-based retrieval-augmented generation systems. GraphRAG expands traditional RAG pipelines by constructing knowledge graphs from documents and using relationships between entities to improve the quality and reasoning of AI responses. The nano-GraphRAG project focuses on reducing complexity by providing a compact and readable codebase that preserves the core functionality of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    OnionSearch

    OnionSearch

    Search multiple Tor .onion engines at once and collect hidden links.

    OnionSearch is a Python-based command-line tool designed to collect and aggregate links from multiple search engines on the Tor network. The script works by scraping results from a variety of .onion search services, allowing users to perform a single query while gathering results from many sources at once. This approach helps researchers and investigators locate hidden services more efficiently without manually querying each individual search engine. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB