Showing 30 open source projects for "text search"

  • 1
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). ...
    Downloads: 8 This Week
  • 2
    CineCLI

    CineCLI is a cross-platform command-line movie browser

    ...CineCLI also supports paginated results and filters so users can navigate large search outputs without overwhelming their screens. Because it runs entirely from the command line, it’s ideal for developers, movie enthusiasts in headless environments, or anyone who prefers text-based tools over web browsers.
    Downloads: 2 This Week
  • 3
    SentenceTransformers

    Multilingual sentence & image embeddings with BERT

    SentenceTransformers is a Python framework for state-of-the-art sentence, text and image embeddings. The initial work is described in our paper Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. You can use this framework to compute sentence / text embeddings for more than 100 languages. These embeddings can then be compared, e.g. with cosine similarity, to find sentences with a similar meaning. This can be useful for semantic textual similarity, semantic search, or paraphrase mining. ...
    Downloads: 7 This Week
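To make the cosine-similarity comparison mentioned in the SentenceTransformers entry concrete, here is a minimal pure-Python sketch. It does not use the SentenceTransformers library; the three-dimensional vectors and their labels are hypothetical stand-ins for real embeddings, which a model would produce with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (assumption: invented for illustration only).
query = [0.9, 0.1, 0.0]
candidates = {
    "close in meaning": [0.8, 0.2, 0.1],
    "unrelated": [0.0, 0.1, 0.9],
}

# Rank candidates from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
```

Ranking by cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing embeddings.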
  • 4
    Papermerge

    Open Source Document Management System for Digital Archives

    ...OCR technology is a vital part of Papermerge. It extracts text from scanned documents and from PDF, JPEG, and TIFF files.
    Downloads: 20 This Week
  • 5
    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to multimodal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt etc. ...
    Downloads: 0 This Week
  • 6
    Toot

    toot - Mastodon CLI & TUI

    Toot is a CLI and TUI tool for interacting with Mastodon instances from the command line.
    Downloads: 4 This Week
  • 7
    PageIndex

    Document Index for Vectorless, Reasoning-based RAG

    PageIndex is an innovative open-source framework that reimagines retrieval-augmented generation (RAG) by eliminating conventional vector similarity search and instead building hierarchical semantic indexes that mirror a document’s natural structure. Rather than chunking text and embedding it into a vector database, PageIndex constructs a tree-structured index — similar to a detailed, AI-enhanced table of contents — that a large language model can traverse to locate the most relevant sections of long documents. ...
    Downloads: 0 This Week
  • 8
    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP,...
    Downloads: 0 This Week
  • 9
    sqlite-utils

    Python CLI utility and library for manipulating SQLite databases

    ...As a CLI, it lets you build databases from structured data in one line, run queries against local files or in-memory databases, output results as JSON, CSV, or pretty tables, and configure full-text search. As a library, it exposes high-level APIs for inserting records, creating or transforming tables, normalizing schemas, and running migrations that SQLite’s limited ALTER TABLE cannot handle directly. The project also embraces an ecosystem of plugins, so you can add custom SQL functions, extra commands, or UIs (including a terminal UI) via separate packages. ...
    Downloads: 0 This Week
  • 10
    repren

    Rename anything

    ...Because it’s script-friendly, it slots well into project maintenance, codebase migrations, or release engineering tasks. The goal is to give you a reliable, repeatable alternative to ad-hoc shell loops when large-scale text and filename changes are needed.
    Downloads: 0 This Week
  • 11
    Wapiti

    Wapiti is a web-application vulnerability scanner

    Wapiti is a vulnerability scanner for web applications. It currently searches for vulnerabilities such as XSS, SQL and XPath injections, file inclusions, command execution, XXE injections, CRLF injections, Server Side Request Forgery, Open Redirects... It uses the Python 3 programming language.
    Downloads: 129 This Week
  • 12
    dirsearch

    Web path scanner

    An advanced command-line tool designed to brute-force directories and files on web servers, also known as a web path scanner. A wordlist is a text file in which each line is a path. Unlike other tools, dirsearch only replaces the %EXT% keyword with the extensions from the -e flag. For wordlists without %EXT% (like SecLists), the -f | --force-extensions switch is required to append extensions to every word in the wordlist, as well as the /. To use multiple wordlists, you can separate your wordlists with commas....
    Downloads: 7 This Week
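The %EXT% behaviour described in the dirsearch entry can be sketched roughly as follows. This is an illustration of the description only, not dirsearch's own code: the function name `expand_wordlist` and its parameters are hypothetical, and keeping the bare word in force mode is an assumption.

```python
def expand_wordlist(lines, extensions, force_extensions=False):
    """Expand wordlist lines into candidate paths (hypothetical helper)."""
    paths = []
    for raw in lines:
        word = raw.strip()
        if not word:
            continue  # skip blank lines
        if "%EXT%" in word:
            # Replace the keyword once per extension given via -e.
            paths.extend(word.replace("%EXT%", ext) for ext in extensions)
        elif force_extensions:
            # Assumption: the bare word is kept alongside the forced variants.
            paths.append(word)
            if not word.endswith("/"):
                paths.extend(f"{word}.{ext}" for ext in extensions)
                paths.append(word + "/")
        else:
            # Without %EXT% and without forcing, the word is used as-is.
            paths.append(word)
    return paths


default_paths = expand_wordlist(["index.%EXT%", "admin"], ["php", "html"])
forced_paths = expand_wordlist(
    ["index.%EXT%", "admin"], ["php", "html"], force_extensions=True
)
```

With the default settings only `index.%EXT%` is expanded; with forcing enabled, `admin` also yields `admin.php`, `admin.html`, and `admin/`.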
  • 13
    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...
    Downloads: 1 This Week
  • 14
    Model Search

    Framework that implements AutoML algorithms

    Model Search is an AutoML research system for discovering neural network architectures with minimal human intervention. Instead of hand-crafting models, you define a search space and objectives, then the system explores candidate architectures using controllers and population-based strategies. It supports multiple tasks (such as vision or text) by letting you express reusable building blocks—layers, cells, and topologies—that the search can recombine. ...
    Downloads: 0 This Week
  • 15
    Rank-BM25

    A Collection of BM25 Algorithms in Python

    A collection of algorithms for querying a set of documents and returning the ones most relevant to the query. The most common use case for these algorithms is, as you might have guessed, to create search engines.
    Downloads: 1 This Week
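For readers unfamiliar with BM25, the following is a minimal sketch of textbook Okapi BM25 scoring in pure Python. It is not the Rank-BM25 package's implementation; the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each tokenized document in `corpus` against tokenized `query`."""
    if not corpus:
        return []
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (assumption: the "+1 inside the log" variant,
            # which never goes negative).
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, tokenized by whitespace (invented for this example).
corpus = [
    "full text search with sqlite".split(),
    "semantic search with embeddings".split(),
    "mastodon command line client".split(),
]
scores = bm25_scores("text search".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
```

Here the first document matches both query terms and scores highest, the second matches only "search", and the third scores zero.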
  • 16
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
  • 17
    CodeBeagle

    A tool to search source code based on a full text index

    CodeBeagle allows you to quickly find all occurrences of a search term inside source code files. It can handle large projects with thousands of files with very good performance. To do so, it creates a full-text index of the desired source files. Because it is tolerant of whitespace, its search syntax works well for searching source code. The search results are displayed in a source viewer with customizable syntax highlighting.
    Downloads: 0 This Week
  • 18
    Invenio

    Invenio digital library framework

    Invenio is a highly customizable open-source framework for building large-scale digital repositories and research data platforms. Developed by CERN, it is designed to manage, index, and provide access to metadata-rich content such as publications, datasets, and multimedia files. Invenio provides a modular architecture, making it suitable for libraries, archives, and research institutions.
    Downloads: 3 This Week
  • 19
    COAR-DMS

    DMS for Linux: C++ library, server, web UI, SOAP

    COAR-DMS is a document management system for 32/64-bit Linux. It acts as a library, a server, and a set of tools.

    Library features:
    - storage management, free pages recycling
    - transaction log
    - indexing: full text, tags, metadata, document attributes
    - inverted index
    - versioning, collaboration
    - document trees, tree versioning
    - folders
    - plugins for auth (PAM, LDAP), db, file types
    - tags
    - metadata (key-value pairs)
    - object-level security, folder and document ACLs
    - Unix-like security (rwx), special authorities
    - from thousands to tens of billions of documents
    - dashboard (working copies, new documents)
    - electronic signatures
    - search statements with SQL-like syntax
    - multithreaded, multiprocess library

    Servers:
    - native HTTP server (libmicrohttp)
    - SOAP server
    - WebDAV (planned)
    - Indexer Python API WebUI GWT, JSP, SOAP-API
    Downloads: 0 This Week
  • 20

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation...
    Downloads: 0 This Week
  • 21
    A Python module that provides algorithms for advanced search: basically all you need to build a search engine.
    Downloads: 1 This Week
  • 22
    openPLM - open source PLM
    Open source PLM system: a product structure management (BOM management) system and an electronic document management or Enterprise Content Management (ECM) system.
    Downloads: 9 This Week
  • 23
    A Zope product that provides an interface for tracking the product updates that need to be run on Zope sites: which updates have been run, when, by whom, and with what outcome.
    Downloads: 0 This Week
  • 24
    Anamnesis is a clipboard manager. It stores all clipboard history and offers an easy interface to do a full-text search on the items of its history.
    Downloads: 6 This Week
  • 25
    Java exception extractor. This utility parses all files (either plain text or bzipped) and searches them for various exceptions. It then tries to match the exceptions against grouping rules (regexps). It can also group unrecognised exceptions.
    Downloads: 0 This Week