Showing 149 open source projects for "sentence"

  • 1
    SentenceTransformers

    Multilingual sentence & image embeddings with BERT

    SentenceTransformers is a Python framework for state-of-the-art sentence, text and image embeddings. The initial work is described in our paper Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. You can use this framework to compute sentence / text embeddings for more than 100 languages. These embeddings can then be compared, e.g. with cosine similarity, to find sentences with a similar meaning.
    Downloads: 0 This Week
  • 2
    SetFit

    Efficient few-shot learning with Sentence Transformers

    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.
    Downloads: 0 This Week
  • 3
    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 0 This Week
  • 4
    Zotero PDF Translate

    Translate PDF, EPub, webpage, metadata, annotations, notes

    ...It also extends translation functionality to annotations, notes, titles, and abstracts, enabling comprehensive multilingual research management. Advanced features such as sentence-by-sentence translation, dictionary lookup, and multi-service comparison further enhance usability for academic work. The tool is highly customizable, allowing users to adjust interface behavior, translation settings, and shortcuts to match their workflow.
    Downloads: 15 This Week
  • 5
    BudouX

    Standalone, small, language-neutral

    Standalone. Small. Language-neutral. BudouX is the successor to Budou, the machine learning-powered line break organizer tool. It is standalone: it works with no dependency on third-party word segmenters such as the Google Cloud Natural Language API. It is small: it takes only around 15 KB including its machine learning model, so it is reasonable to use even on the client side. It is language-neutral. You can train a model for any language by feeding a dataset to BudouX’s training...
    Downloads: 0 This Week
  • 6
    Papers We Love

    Papers from the computer science community to read and discuss

    Papers We Love (PWL) is a global open source community dedicated to reading, discussing, and sharing influential computer science research papers. The repository serves as a curated directory of academic papers that have shaped the field of computing, providing a centralized location for documents that were previously scattered across various online sources. While licensing restrictions prevent hosting all papers directly, PWL offers links to their original sources and clearly marks hosted...
    Downloads: 1 This Week
  • 7
    Hazm

    Persian NLP Toolkit

    Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.
    Downloads: 4 This Week
  • 8
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one-stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. ...
    Downloads: 2 This Week
  • 9
    Universal Sentence Encoder

    Encoder of greater-than-word length text trained on a variety of data

    The Universal Sentence Encoder (USE) is a pre-trained deep learning model designed to encode sentences into fixed-length embeddings for use in various natural language processing (NLP) tasks. It leverages Transformer and Deep Averaging Network (DAN) architectures to generate embeddings that capture the semantic meaning of sentences. The model is designed for tasks like sentiment analysis, semantic textual similarity, and clustering, and provides high-quality sentence representations in a computationally efficient manner.
    Downloads: 1 This Week
  • 10
    trieve

    All-in-one infrastructure for search, recommendations, RAG

    Trieve is an all-in-one infrastructure for building hybrid vector search, recommendations, and RAG.
    Downloads: 0 This Week
  • 11
    Text2Code for Jupyter notebook

    A proof-of-concept Jupyter extension which converts English queries

    ...The system uses natural language processing techniques to identify the intent of the query, extract relevant variables, and map the request to predefined code templates. Technologies such as sentence embeddings and named entity recognition are used to interpret user instructions and construct appropriate code outputs.
    Downloads: 0 This Week
  • 12
    CopyTranslator

    Foreign language reading and translation assistant

    CopyTranslator is a copy-and-translate assistant for reading and translating foreign-language text. Copy text to the clipboard and the translation appears within about a second. It fixes the garbling caused by redundant sentence breaks and line breaks in copied text, so the translation reads more naturally. It aims to come as close as possible to an open source, system-level translation tool: drag to select, copy, and read the translation. As CopyTranslator is updated, its features keep growing, and the differences between versions are becoming significant. ...
    Downloads: 47 This Week
  • 13
    gTTS

    Python library and CLI tool to interface with Google Translate

    ...It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. ...
    Downloads: 0 This Week
  • 14
    Wink-NLP

    Developer friendly Natural Language Processing

    Wink-NLP is a lightweight and fast natural language processing library for JavaScript, optimized for browser and Node.js environments.
    Downloads: 0 This Week
  • 15
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 9 This Week
  • 16
    espanso

    Cross-platform Text Expander written in Rust

    Discover the incredible power of a full-blown text expander. No more copying and pasting, create templates once and let Espanso do the rest for you. Customer support replies, sales pitches, medical reports, you name it. Espanso has got you covered. Just press ALT+Space and Espanso’s search bar will open, letting you search for the perfect snippet. Don’t wrap your head around dates. Espanso makes it easy to use them, both past and future ones. Extend Espanso’s capabilities with packages, or...
    Downloads: 5 This Week
  • 17
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models...
    Downloads: 6 This Week
  • 18
    ContextGem

    ContextGem: Effortless LLM extraction from documents

    ContextGem is an open-source framework designed to simplify the extraction of structured data and insights from documents using large language models (LLMs). It provides a flexible, intuitive API that minimizes boilerplate code, enabling developers to build complex extraction workflows efficiently. ContextGem supports various document formats and integrates with multiple LLM providers, making it a versatile tool for tasks like contract analysis, anomaly detection, and information retrieval.
    Downloads: 0 This Week
  • 19
    Typed.js

    A JavaScript typing animation library

    Typed.js is a library that types. Enter in any string, and watch it type at the speed you've set, backspace what it's typed, and begin a new sentence for however many strings you've set. Rather than using the strings array to insert strings, you can place an HTML div on the page and read from it. This allows bots and search engines, as well as users with JavaScript disabled, to see your text on the page. You can pause in the middle of a string for a given amount of time by including an escape character. ...
    Downloads: 0 This Week
  • 20
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it has been designed for real-world applications: for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 4 This Week
  • 21
    model2Vec

    Fast State-of-the-Art Static Embeddings

    model2vec is an innovative embedding framework that converts large sentence transformer models into compact, high-speed static embedding models while preserving much of their semantic performance. The project focuses on dramatically reducing the computational cost of generating embeddings, achieving significant improvements in speed and model size without requiring large datasets for retraining. By using a distillation-based approach, it can produce lightweight models that run efficiently on CPUs, making it suitable for edge applications and large-scale processing pipelines. ...
    Downloads: 0 This Week
  • 22
    latexindent.pl

    Perl script to add indentation to LaTeX files

    Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customizable through its YAML interface. latexindent.exe is a standalone executable file that does not require a Perl installation. A nice way to test the script is to navigate to the test-cases...
    Downloads: 0 This Week
  • 23
    Tokenizers

    Fast State-of-the-Art Tokenizers optimized for Research and Production

    ...Easy to use, but also extremely versatile. Designed for both research and production. Full alignment tracking: even with destructive normalization, it’s always possible to get the part of the original sentence that corresponds to any token. Does all the pre-processing: truncation, padding, and adding the special tokens your model needs.
    Downloads: 0 This Week
  • 24
    Venom

    Venom is the most complete JavaScript library for WhatsApp

    Venom is a high-performance system written in JavaScript for building WhatsApp bots. It supports any kind of interaction, such as customer service, media sending, and AI-based sentence recognition, as well as all types of design architecture for WhatsApp. It is a high-performance alternative API to WhatsApp: you can send text messages, files, images, videos and more. The API was developed on RESTful web services, providing interoperability between computer systems on the Internet. ...
    Downloads: 0 This Week
  • 25
    AutoTrain Advanced

    Faster and easier training and deployments

    ...The project provides a no-code and low-code interface that allows users to train models using custom datasets without needing extensive expertise in machine learning engineering. It supports a wide range of tasks including text classification, sequence-to-sequence modeling, token classification, sentence embedding training, and large language model fine-tuning. The system integrates closely with the Hugging Face ecosystem and allows developers to train models using datasets hosted on the Hugging Face Hub. AutoTrain Advanced can run locally or in cloud environments, making it adaptable to different computational setups. By automating tasks such as model configuration, hyperparameter selection, and training pipelines, the project significantly reduces the technical barrier to building AI systems.
    Downloads: 0 This Week