Showing 216 open source projects for "word"

  • 1
    Walrus

    Lightweight Python utilities for working with Redis

    ...Supports secondary indexes to allow filtering on equality, inequality, ranges, less/greater-than, and a basic full-text search index. The full-text search features a boolean search query parser, porter stemmer, stop-word filtering, and optional double-metaphone implementation.
    Downloads: 2 This Week
  • 2
    JumbleGame

    A word puzzle game with a set of words scrambled.

    A word puzzle game with a set of words, each of which is “jumbled” or scrambled. A solver answers the scrambled word with the correct word and can also request a hint.
    Downloads: 1 This Week
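
The jumble, answer, and hint steps described for JumbleGame can be sketched roughly as follows. This is only an illustration under assumptions: the function names `jumble`, `is_correct`, and `hint` are hypothetical and are not taken from the project's code.

```python
import random


def jumble(word, rng):
    """Shuffle the letters of word, avoiding the original order when possible."""
    letters = list(word)
    if len(set(letters)) < 2:
        # A word made of one repeated letter cannot be scrambled.
        return word
    while True:
        rng.shuffle(letters)
        scrambled = "".join(letters)
        if scrambled != word:
            return scrambled


def is_correct(answer, word):
    """Compare the solver's answer with the word, ignoring case and outer spaces."""
    return answer.strip().lower() == word.lower()


def hint(word, revealed=1):
    """Reveal the first `revealed` letters and mask the rest (hypothetical hint style)."""
    return word[:revealed] + "_" * (len(word) - revealed)


rng = random.Random(1)  # seeded only so the demo is repeatable
puzzle = jumble("puzzle", rng)
print(puzzle, hint("puzzle"), is_correct("Puzzle", "puzzle"))
```

Passing the random generator in explicitly keeps the scrambling repeatable in tests; a real game would more likely use an unseeded generator.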
  • 3
    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    ...Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read or write to PDFs or Word documents, you’ll need to do more than simply pass their filenames to open().
    Downloads: 0 This Week
  • 4
    OOoPy is a library in Python for inspecting, creating or modifying OpenOffice.org documents. It uses the existing ElementTree XML library by Fredrik Lundh for manipulation of the OOo XML.
    Downloads: 1 This Week
  • 5
    Grade School Math

    8.5K high quality grade school math problems

    ...The problems are written by human authors (not automatically generated) to ensure linguistic variety and realism. The repository maintains strict formatting (e.g. JSONL) for problem + answer pairs, and is used broadly in research to benchmark model performance under “word problem” settings. Issues are tracked (people report incorrect problems, ambiguous statements), and contributions are possible for cleaning or expanding the set.
    Downloads: 0 This Week
  • 6

    HashApp

    Python software for cracking and creating SHA and MD5 hashes

    ...If you would like to change the online dictionary URL used for hash cracking, use CTRL+F in your text editor and search for this line: LIST_OF_WORDS = str(urlopen('https://raw.githubusercontent.com/dwyl/english-words/master/words.txt').read(), 'utf-8') Change the URL in the urlopen call, then replace all instances of the line with your version. A txt dictionary stored on your computer is not supported directly: unless the word search fails with the online dictionary, you can't use a local txt dictionary. But don't worry; you just have to store your txt on GitHub (like the default dictionary above).
    Downloads: 0 This Week
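
The dictionary lookup that HashApp's description implies (hash each candidate word and compare it with the target digest) can be sketched as below. This is a minimal sketch, not the project's code: `crack_hash` and `sample_words` are hypothetical names, and a small in-memory list stands in for the downloaded word list so the example runs offline.

```python
import hashlib


def crack_hash(target_hex, words, algorithm="md5"):
    """Return the first word whose digest equals target_hex, or None."""
    target_hex = target_hex.lower()
    for word in words:
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        if digest == target_hex:
            return word
    return None


# Hypothetical word list standing in for the online dictionary.
sample_words = ["apple", "banana", "cherry"]

target = hashlib.sha256(b"banana").hexdigest()
print(crack_hash(target, sample_words, algorithm="sha256"))  # banana
```

A hash is recovered here by search rather than by decryption, so the lookup only succeeds when the original word is present in the list.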
  • 7
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
  • 8
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 9
    HSKinter

    Chinese Words Study (HSK 1–5) on Desktop and Phone

    ...Flashcards, practice of hanzi meaning, pinyin and tones, and accuracy stats. Optional pronunciation via gTTS (Google). Compatible with Pydroid 3 (runs on Android). How often a word shows up depends on its retention level and the time since the last answer (age).
    Downloads: 0 This Week
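
One way to read the scheduling rule in the HSKinter description is as weighted random selection, where older and less well retained words are drawn more often. The sketch below assumes a made-up weight formula; `review_weight` and `pick_card` are hypothetical names, and the project's actual formula is not given in the listing.

```python
import random


def review_weight(retention_level, age_seconds):
    """Hypothetical weight: grows with age, shrinks as retention improves."""
    return (1.0 + age_seconds) / (1.0 + retention_level)


def pick_card(cards, rng=random):
    """Pick one word from (word, retention_level, age_seconds) tuples."""
    weights = [review_weight(level, age) for _, level, age in cards]
    return rng.choices(cards, weights=weights, k=1)[0][0]


cards = [
    ("你好", 5, 60.0),     # well retained, answered a minute ago
    ("谢谢", 0, 3600.0),   # not yet retained, answered an hour ago
]
print(pick_card(cards, random.Random(0)))
```

With these example values the second card carries a far larger weight, so it is drawn much more often than the first.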
  • 10
    AliceMind

    ALIbaba's Collection of Encoder-decoders from MinD

    ...Specifically, we pre-train StructBERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively. Pre-trained models for natural language generation (NLG). We propose a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus, specifically designed for generating new text conditioned on context. It achieves new SOTA results in several downstream tasks.
    Downloads: 0 This Week
  • 11
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the...
    Downloads: 14 This Week
  • 12
    PORORO

    Platform of neural models for natural language processing

    pororo performs Natural Language Processing and speech-related tasks. Various subtasks in natural language and speech processing can be solved by simply passing the task name. Speech recognition transcribes spoken sentences using the trained model; English, Korean and Chinese are currently supported. Get vectors or find similar words and entities from a model pretrained on Wikipedia.
    Downloads: 0 This Week
  • 13
    Keras TCN

    Keras Temporal Convolutional Network

    TCNs exhibit longer memory than recurrent architectures with the same capacity. Performs better than LSTM/GRU on a vast range of tasks (Seq. MNIST, Adding Problem, Copy Memory, Word-level PTB...). Parallelism (convolutional layers), flexible receptive field size (possible to specify how far the model can see), stable gradients (backpropagation through time, vanishing gradients). The usual way is to import the TCN layer and use it inside a Keras model. The receptive field is defined as the maximum number of steps back in time from current sample at time T, that a filter from (block, layer, stack, TCN) can hit (effective history) + 1. ...
    Downloads: 0 This Week
  • 14
    Synonyms

    Chinese synonyms, chat robot, intelligent question and answer toolkit

    ..."Synonyms Cilin" was compiled by Mei Jiaju and others in 1983; the version now in wide use is the "Synonyms Cilin Extended Edition" maintained by the Social Computing and Information Retrieval Research Center of Harbin Institute of Technology. It organizes words into classes and subclasses to sort out the relationships between them. The extended edition contains more than 70,000 words, of which more than 30,000 are shared as open data.
    Downloads: 0 This Week
  • 15
    wordle

    Create a wordcloud for a Git repository

    Create a wordcloud for a Git repository. Can also create wordclouds from directories of source files or a single source file. wordle uses tox to automate testing and packaging, and pre-commit to maintain code quality. Tests are run with tox and pytest. To run tests for a specific Python version, such as Python 3.6. The documentation is powered by Sphinx. A local copy of the documentation can be built with tox. Type annotations are checked using mypy. Run mypy using tox.
    Downloads: 0 This Week
  • 16
    NetEase-MusicBox

    NetEase cloud music command line version

    ...Vim-style shortcut keys make operation silky smooth. Numeric shortcut keys can be used, as can custom global shortcut keys. Local fuzzy search on the current playlist. Shortcut keys marked with "num +" can be modified with numbers: enter the number first, then press the key that follows "num +".
    Downloads: 0 This Week
  • 17
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    ...Various convenient NLP tools, such as embedding loading (including ELMo and BERT), intermediate data caching, etc. Provides a variety of neural network components and reproduced models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, coreference resolution, summarization, etc.). Trainer provides a variety of built-in Callback functions to facilitate experiment recording, exception capture, etc. Automatic download of some datasets and pre-trained models.
    Downloads: 0 This Week
  • 18
    DocBook to LaTeX Publishing transforms your SGML/XML DocBook documents to DVI, PostScript or PDF by first translating them into pure LaTeX. MathML 2.0 markup is supported too. It started as a clone of DB2LaTeX.
    Downloads: 89 This Week
  • 19
    yabasta

    Yet Another BAsic Scraper and Text Analysis

    YA BASTA! is a Python/R application for lyrics web scraping and text analysis. Web scraping is implemented in Python; text analysis is done in R, run as Python subprocesses. YA BASTA! has only been tested on Windows. To run YA BASTA!, type at the Windows command prompt: python.exe yabasta.py
    Downloads: 0 This Week
  • 20
    SentEval

    A python tool for evaluating the quality of sentence embeddings

    ...It defines a simple interface—provide an encoder function from sentences to vectors—and then runs consistent training/evaluation loops for tasks like sentiment, entailment, paraphrase, and semantic textual similarity. The suite also contains linguistic probing tasks that illuminate what properties embeddings capture, such as tense, word order, or syntactic structure. Datasets are wrapped with unified preprocessing and metrics so results are comparable across papers and implementations. Because the interface is minimal, researchers can plug in encoders from any framework or language model and obtain a broad evaluation with little glue code. SentEval helped establish common baselines and reporting conventions in the sentence-representation community, reducing friction when comparing new methods.
    Downloads: 0 This Week
  • 21
    GluonNLP

    NLP made easy

    ...To facilitate both the engineers and researchers, we provide command-line-toolkits for downloading and processing the NLP datasets. Gluon NLP makes it easy to evaluate and train word embeddings. Here are examples to evaluate the pre-trained embeddings included in the Gluon NLP toolkit as well as example scripts for training embeddings on custom datasets. Fasttext models trained with the library of Facebook research are exported both in text and a binary format. Unlike the text format, the binary format preserves information about subword units and consequently supports the computation of word vectors for words unknown during training (and not included in the text format). ...
    Downloads: 0 This Week
  • 22
    DNSGen

    Intelligent DNS permutation tool for subdomain discovery

    DNSGen is an open source DNS name permutation tool designed primarily for security researchers and penetration testers who need to discover potential subdomains during reconnaissance and attack surface mapping. It analyzes existing domain names and generates numerous intelligent variations that may represent valid subdomains within an organization’s infrastructure. These generated permutations help identify hidden or unlisted services that may not appear in standard DNS queries or public...
    Downloads: 10 This Week
  • 23
    jieba

    Stuttering Chinese word segmentation

    "Jieba" Chinese word segmentation: built to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence most precisely and is suitable for text analysis. Full mode scans out all the words that can be formed from the sentence; it is very fast but cannot resolve ambiguity.
    Downloads: 3 This Week
  • 24
    textgenrnn

    Easily train your own text-generating neural network

    ...A modern neural network architecture that utilizes new techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality. Train on and generate text at either the character-level or word-level. Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text file, including large files. Train models on a GPU and then use them to generate text with a CPU. Utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations. ...
    Downloads: 0 This Week
  • 25
    PyTorch Natural Language Processing

    Basic Utilities for PyTorch Natural Language Processing (NLP)

    ...Now that you've set up your pipeline, you may want to ensure that some functions run deterministically. Wrap any code that's random with fork_rng and you'll be good to go. Now that you've computed your vocabulary, you may want to make use of pre-trained word vectors to set your embeddings.
    Downloads: 2 This Week
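
The idea behind wrapping random code in a forked generator state can be illustrated with the standard library alone. The sketch below is not the library's fork_rng: `fork_random_state` is a hypothetical helper that covers only Python's built-in `random` module. It saves the caller's state, optionally reseeds inside the block, and restores the saved state on exit.

```python
import contextlib
import random


@contextlib.contextmanager
def fork_random_state(seed=None):
    """Run a block with its own random state, then restore the caller's."""
    saved = random.getstate()
    if seed is not None:
        random.seed(seed)
    try:
        yield
    finally:
        # Restore the outer stream even if the block raised.
        random.setstate(saved)


random.seed(123)
expected = random.random()  # first draw of the outer stream

random.seed(123)
with fork_random_state(seed=42):
    inside = random.random()  # does not advance the outer stream
after = random.random()

print(after == expected)  # True
```

Because the outer state is restored on exit, code after the block sees the same random sequence whether or not the block ran.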