Showing 17 open source projects for "tf idf"

  • 1
    BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics

    ...Instead, we can visualize the topics that were generated in a way very similar to LDAvis. By default, the main steps for topic modeling with BERTopic are sentence-transformers, UMAP, HDBSCAN, and c-TF-IDF run in sequence.
    Downloads: 1 This Week
  • 2
    tidytext

    Text mining using tidy tools

    tidytext brings tidy data principles to text mining by converting text into a tidy data frame format. It provides tools for tokenization, sentiment analysis, n‑gram creation, and term‑document matrices, enabling interoperability with dplyr, ggplot2, and other tidyverse workflows.
    Downloads: 1 This Week
  • 3
    bleve

    A modern text indexing library for Go

    ...By indexing your data with bleve you gain the ability to compose query types such as Term, Phrase, Match, Match Phrase, Prefix, Conjunction, Disjunction, Boolean, Numeric and Date Ranges, as well as Query String. Industry standard tf-idf scoring with query time boosting. Includes support for highlighting matching text within document fragments.
    Downloads: 0 This Week
  • 4
    natural

    General natural language facilities for node

    "Natural" is a general natural language facility for nodejs. It offers a broad range of functionalities for natural language processing. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. It’s still in the early stages, so we’re very interested in bug reports, contributions and the like. Note that many algorithms from Rob Ellis’s node-nltools are being merged into this project and will be maintained from here onward. While most of the algorithms are English-specific, contributors have implemented support for other languages. ...
    Downloads: 0 This Week
  • 5
    TextGen

    textgen, Text Generation models

    ...EDA, a simple data augmentation technique: similar-word and synonym replacement, random word insertion, deletion, and replacement. The project draws on Google's UDA (non-core word replacement) algorithm and the EDA algorithm: it uses TF-IDF to identify unimportant words in a sentence and replaces them with synonyms, or applies random word insertion, deletion, or replacement, to generate new text for augmentation. It also implements back translation based on the Baidu translation API, first translating Chinese sentences into English and then translating the English into new Chinese. ...
    Downloads: 0 This Week
  • 6
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset and provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 7
    DrQA

    Reading Wikipedia to Answer Open-Domain Questions

    ...It follows a two-stage pipeline: a fast document retriever first narrows down candidate articles, and a neural machine reader then predicts the exact answer span from those passages. The retriever relies on classic IR features (like TF-IDF and n-gram statistics) to remain lightweight and scalable to millions of documents. The reader is a neural model trained on supervised QA data to estimate start and end positions within a paragraph, and it can be adapted to new domains through fine-tuning or distant supervision. The repository includes scripts to build the Wikipedia index, train the reader, and evaluate end-to-end performance. ...
    Downloads: 0 This Week
  • 8
    Finds topic sentences using TF-IDF. Download and start the self-contained server, then issue a POST request with a single parameter called documents, containing the paragraph content, to find the top three topic sentences. The results are returned as a JSON array. The settings.json file is required for the standalone server to function correctly.
    Downloads: 0 This Week
  • 9
    CakeChat

    CakeChat: Emotional Generative Dialog System

    CakeChat is a backend for chatbots that can express emotions in conversation. The code is flexible and allows the model's responses to be conditioned on an arbitrary categorical variable. For example, you can train your own persona-based neural conversational model or create an emotional chatting machine. It uses a Hierarchical Recurrent Encoder-Decoder (HRED) architecture for handling deep dialog context and a multilayer RNN with GRU cells. The first layer of the utterance-level encoder is always...
    Downloads: 1 This Week
  • 10
    TF-IDF.jar is a Java Archive file for measuring the TF-IDF of each document in a document collection (corpus). The jar can be used to (a) get all the terms in the corpus, (b) get the document frequency (DF) and inverse document frequency (IDF) of all the terms in the corpus, (c) get the TF-IDF of each document in the corpus, and (d) get each term with its frequency (number of occurrences), term frequency (TF), and TF-IDF in every document.
    Downloads: 0 This Week
  • 11

    TextualModelGenerator

    Generator for textual models by applying different techniques

    This project is created and supported by Angel Castellanos, Juan Cigarrán Recuero, and Ana García Serrano. It models textual content by applying different techniques: TF-IDF, KLD, Mutual Information, and Chi^2. With this application, users can extract the most representative terminology of a textual collection. The application is Java-based, so it can run on several platforms and operating systems (Windows, Linux, MacOS).
    Downloads: 0 This Week
  • 12

    Texalyzer

    Text analyzer

    Analyzes a text document using TF-IDF and, optionally, a stopword list, and extracts important keywords.
    Downloads: 0 This Week
  • 13

    Simple Similarity

    A simple tool to calculate the classical tf-idf/cosine similarity.

    This is a simple tool to calculate the similarity between a document and a set of documents by using the classical tf-idf/cosine algorithm.
    Downloads: 0 This Week
  • 14
    Murasaki
    Whole-genome scale multiple genome local alignment search program. Supports unlimited-length gapped-seed patterns, parallelization through distributed hashing, and a unique TF-IDF-based repeat filtering method.
    Downloads: 1 This Week
  • 15
    NLP4J library is a toolset written in Java for Natural Language Processing. This version is oriented to Document Classification and uses Naive Bayes, TF-IDF, etc. There are also pre-processing tools.
    Downloads: 0 This Week
  • 16
    Samudra Manthan uses C and MPI to find interesting n-grams (terms) in a large corpus of data. We use the GigaWord corpus to find the top m interesting n-grams using the TF*IDF measure.
    Downloads: 0 This Week
  • 17
    IDEAL means Information DEALer. A system which provides the news and articles that the user wants. It uses Tomcat, Struts, Java, MySQL, an AgentSystem, clustering, TF/IDF, and a document parser, and it supports multiple users.
    Downloads: 2 This Week
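Several of the projects listed above describe the classical tf-idf/cosine approach to comparing documents. The block below is a minimal sketch of that general technique in plain Python, not the implementation used by any listed project. The function names (tokenize, tf_idf_vectors, cosine), the whitespace tokenizer, and the unsmoothed log(N / df) form of IDF are all assumptions chosen for illustration; real libraries typically add smoothing, stopword handling, and better tokenization.

```python
import math
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Unsmoothed IDF (an assumption); a term found in every document gets 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # TF is the raw count normalized by document length.
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An all-zero vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    corpus = ["the cat sat", "the cat ran", "dogs bark loudly"]
    vecs = tf_idf_vectors(corpus)
    # Compare the first document against every document in the corpus.
    for doc, vec in zip(corpus, vecs):
        print(doc, cosine(vecs[0], vec))
```

Storing each vector as a dict keeps only the terms that occur in a document, which is usually how sparse text vectors are handled; documents that share no terms score exactly zero.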