Showing 35 open source projects for "similarity text"

  • 1
    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL...
    Downloads: 24 This Week
  • 2
    Elastiknn

    Elasticsearch plugin for nearest neighbor search

    Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity searches using exact and approximate algorithms. Methods like word2vec and convolutional neural nets can convert many data modalities (text, images, users, items, etc.) into numerical vectors, such that pairwise distance computations on the vectors correspond to semantic similarity of the original data. Elasticsearch is a ubiquitous search solution, but its support for vectors is limited. This plugin fills...
    Downloads: 0 This Week
  • 3
    SentenceTransformers

    Multilingual sentence & image embeddings with BERT

    SentenceTransformers is a Python framework for state-of-the-art sentence, text and image embeddings. The initial work is described in our paper Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. You can use this framework to compute sentence / text embeddings for more than 100 languages. These embeddings can then be compared e.g. with cosine-similarity to find sentences with a similar meaning. This can be useful for semantic textual similarity, semantic search, or paraphrase mining...
    Downloads: 2 This Week
  • 4
    PHP Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic...
    Downloads: 1 This Week
  • 5
    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models...
    Downloads: 1 This Week
  • 6
    LangKit

    An open-source toolkit for monitoring Large Language Models (LLMs)

    LangKit is an open-source text metrics toolkit for monitoring language models. It offers an array of methods for extracting relevant signals from the input and/or output text, which are compatible with the open-source data logging library whylogs. Productionizing language models, including LLMs, comes with a range of risks due to the infinite amount of input combinations, which can elicit an infinite amount of outputs. The unstructured nature of text poses a challenge in the ML observability...
    Downloads: 0 This Week
  • 7
    Reor Project

    Private & local AI personal knowledge management app

    Reor is an AI-powered desktop note-taking app: it automatically links related notes, answers questions on your notes, provides semantic search and can generate AI flashcards. Everything is stored locally and you can edit your notes with an Obsidian-like markdown editor. The hypothesis of the project is that AI tools for thought should run models locally by default. Reor stands on the shoulders of the giants Ollama, Transformers.js & LanceDB to enable both LLMs and embedding models to run locally.
    Downloads: 0 This Week
  • 8
    txtai

    Build AI-powered semantic search applications

    ..., models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. The following applications are powered by txtai.
    Downloads: 0 This Week
  • 9
    natural

    General natural language facilities for node

    "Natural" is a general natural language facility for nodejs. It offers a broad range of functionalities for natural language processing. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. It’s still in the early stages, so we’re very interested in bug reports, contributions and the like. Note that many algorithms from Rob Ellis’s node-nltools are being merged into this project and will be maintained from here onward...
    Downloads: 0 This Week
  • 10
    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...
    Downloads: 0 This Week
  • 11
    Finetuner

    Task-oriented finetuning for better embeddings on neural search

    ...-quality embeddings for semantic search, visual similarity search, cross-modal text image search, recommendation systems, clustering, duplication detection, anomaly detection, or other uses. Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.
    Downloads: 0 This Week
  • 12
    Node.js Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models

    This is the Node.js client (with TypeScript types) for the NLP Cloud API. NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, question answering, automatic speech recognition...
    Downloads: 0 This Week
  • 13
    Python Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic...
    Downloads: 0 This Week
  • 14
    Essential Excel Add-In
    Essential Excel Add-In is a Microsoft Excel add-in, written in VBA, that contains useful User Defined Functions (UDFs) and macros to perform a number of tasks that Excel does not provide, such as regular expressions (RegEx) and an improved VLookUp.
    Downloads: 23 This Week
  • 15
    TEXminer

    Text Mining Classification for Texts in ASCII, Unicode and PDF Format.

    TEXminer uses generic Text Mining Methods to analyze Unicode Files as plain Text or PDF. The Text Database can be saved in XML where the original Text, the Sentence and Word Lists and additional Parameters (e.g. Abbreviations) are stored. TEXminer allows Language Detection by Letter Frequency Analysis, finding important Words by Cooccurrence Analysis, Determination of Central Expressions, Thematic Text Classification (also Semantic Groups) and Fingerprint Comparison. Because TEXminer...
    Downloads: 8 This Week
  • 16
    PyTTI-Notebook

    Recent advances in machine learning have created opportunities for “AI” technologies to assist unlocking creativity in powerful ways. PyTTI is a toolkit that facilitates image generation, animation, and manipulation using processes that could be thought of as a human artist collaborating with AI assistants. The underlying technology is complex, but you don’t need to be a deep learning expert or even know coding of any kind to use these tools. Understanding the underlying technology can be...
    Downloads: 0 This Week
  • 17
    Desktop Plagiarism Checker

    Free plagiarism software for .NET (Windows 7, 8.1, 10, 11)

    Plagiarisma tools are a valuable resource for anyone who wants to ensure the originality and accuracy of their written work. The plagiarism checker is a useful tool for detecting any instances of copied content in a document. It compares the text with a vast database of sources to determine the level of similarity and highlights any potential areas of concern. The paraphraser tool allows writers to rephrase their sentences to avoid plagiarism, while the summarizer provides a concise overview...
    Downloads: 0 This Week
  • 18
    JoBimText

    Linking Language to Knowledge with Distributional Semantics

    JoBimText is a software solution for automatic text expansion using contextualized distributional similarity. It provides text analysis tools for large corpora and has capabilities to create distributional semantic models (JoBimText models) and multi-word expressions.
    Downloads: 0 This Week
  • 19
    Synonyms

    Chinese synonyms, chat robot, intelligent question and answer toolkit

    Chinese Synonyms for natural language processing and understanding. Better Chinese synonyms, chatbot, intelligent question and answer toolkit. Synonyms can be used for many tasks in natural language understanding, text alignment, recommendation algorithms, similarity calculation, semantic shifting, keyword extraction, concept extraction, automatic summarization, search engines, etc. Print synonyms in a friendly way for easy debugging. "Synonyms Cilin" was compiled by Mei Jiaju and others in 1983...
    Downloads: 0 This Week
  • 20
    Turi Create

    Simplifies the development of custom machine learning models

    Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app. If you want your app to recognize specific objects in images, you can build your own model with just a few lines of code. Turi Create supports macOS 10.12+, Linux (with glibc 2.10+), Windows 10 (via WSL). Turi Create requires Python 2.7, 3.5, 3.6, 3.7, 3.8...
    Downloads: 2 This Week
  • 21
    Frontend Regression Validator (FRED)

    Visual regression tool used to compare baseline and updated instances

    Visual regression tool used to compare baseline and updated instances of a website in a deployment pipeline. FRED is an open-source visual regression tool used to compare two instances of a website. FRED is responsible for automatic visual regression testing, with the purpose of ensuring that functionality is not broken by comparing a current (baseline) and an updated version of a website. The visual analysis computes the Normalized Mean Squared Error and the Structural Similarity Index...
    Downloads: 0 This Week
  • 22
    This project will produce a set of machine measures of text document similarity. A measure of document similarity quantifies the degree to which two text documents are related.
    Downloads: 0 This Week
  • 23
    QtCreator Similarity Analyser Plugin

    Tool for code duplication detection in QtCreator projects.

    This plugin integrates the Simian (Similarity Analysis) tool into the QtCreator IDE. It provides fast, customizable checking of source code for duplicated fragments. Double-clicking a similarity record opens the source code file and highlights the text fragment. See the wiki page for more information about settings and advanced usage.
    Downloads: 0 This Week
  • 24
    Prodar

    Prodar searches the PDB for candidate protein structural alignments

    Prodar is a search application that queries the PDB for candidate structural alignments. The input to the search is a protein backbone structure read from a standard (text) PDB file, and the results returned are based solely on structural similarity of the backbone without any regard to sequence information. Searches are extremely fast, searching the PDB (included in app) in less than a minute typically. Prodar identifies partial matches, such that a relatively small section of the query...
    Downloads: 0 This Week
  • 25
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week