Showing 35 open source projects for "word frequency"

  • 1
    WordCount

    Count frequency of single, 2-word and 3-word clusters in a text

    The program can read a text file and count the occurrences of single words and clusters of 2 and 3 words. The resulting list will be sorted in descending order (highest frequency on top).
    Downloads: 0 This Week
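    The counting described for WordCount can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's code: the function names `count_clusters` and `frequency_report`, the tokenizing regular expression, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def count_clusters(text, n):
    """Count every run of n consecutive words in text (case-insensitive)."""
    # Assumed tokenizer: runs of letters, optionally with one inner apostrophe.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    # zip over n shifted copies of the word list yields each n-word window.
    clusters = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(cluster) for cluster in clusters)


def frequency_report(text, sizes=(1, 2, 3)):
    """Return {n: [(cluster, count), ...]} with the highest counts first."""
    return {n: count_clusters(text, n).most_common() for n in sizes}


# Hypothetical sample text, used only to exercise the functions.
sample = "the cat sat on the mat and the cat slept"
report = frequency_report(sample)
for n, rows in report.items():
    print(n, rows[:3])
```

    `Counter.most_common()` already returns entries in descending order of count, which matches the "highest frequency on top" ordering mentioned in the description.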
  • 2
    TEXminer

    Text Mining Classification for Texts in ASCII, Unicode and PDF Format.

    TEXminer uses generic Text Mining Methods to analyze Unicode Files as plain Text or PDF. The Text Database can be saved in XML where the original Text, the Sentence and Word Lists and additional Parameters (e.g. Abbreviations) are stored. TEXminer allows Language Detection by Letter Frequency Analysis, finding important Words by Cooccurrence Analysis, Determination of Central Expressions, Thematic Text Classification (also Semantic Groups), Fingerprint Comparison and Word Frequency. Because TEXminer is not designed to have a Reference Corpus, Thematic Model Statistics uses Language Models (lexicons) to have Background Knowledge about certain Languages (English, German, French, Spanish, Italian, Russian), which are derived from Decaleon Project. ...
    Downloads: 0 This Week
  • 3
    Kindle Mate(KMate)

    Kindle clippings and Kindle Vocabulary Builder manager

    KMate is the ultimate reading companion for Kindle users — and the all-new, cross-platform successor to Kindle Mate, the classic Kindle notes manager trusted by readers worldwide for over a decade. It is the only Kindle assistant that unifies cross-device import, cloud sync, vocabulary & dictionary management, flexible export, reading analytics, and AI-powered definitions — all in one app. ## KMate 3 for Windows latest (Store...
    Downloads: 32 This Week
  • 4
    Onda Sfasata

    An authentic Italian learning app.

    ...GitHub repository: https://github.com/Northstrix/onda-sfasata Check it out at: https://onda-sfasata.netlify.app/ This app is fully localized into English, Hebrew, and two dialects of German: Hochdeutsch and a mixture of Zurich and Basel dialects (approximately 64%–36%), labeled as “Schwiizerdütsch”. I picked the words for this app not based on predefined categories, usage frequency, or the fluency level to which the word might correspond, but on which words could be cleanly cut from the audio tracks. As a result, the word set turned out to be a bit odd, yet unique. Every single sound used in the app, except for success.wav, error.wav, and completed.wav, was extracted from public domain recordings. The success and error sounds are covered by the Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/); the completed sound is available under the Creative Commons 0 License (http://creativecommons.org/publicdomain/zero/1.0/).
    Downloads: 1 This Week
  • 5

    pyLogos

    Qualitative content analysis software.

    pyLogos is a program that supports the content analysis of texts. Documents (imported from txt and docx files) are stored in a database, and may have marked text segments associated with codes. It is possible to retrieve these segments in various ways, generate word clouds, and tabulate the frequency of codes and words, among other outputs.
    Downloads: 3 This Week
  • 6
    spectrograph

    The program analyzes sound when you talk into a Headset microphone

    ...The complete source code is included as an ASM file in 2 copies: one copy is ready for assembly with the Qeditor of the free MASM32 package, and the other copy is ready for assembly with the free FASMW assembler. Using the ASM file, one can try to improve the sound analysis. I have removed from it one word which was triggering a false positive in Avast!
    Downloads: 1 This Week
  • 7

    Linguistic Analyzer

    The Linguistic Analyzer is a tool for corpus analysis and comparison

    The Linguistic Analyzer (Almuhalil Alloghawy) is a free tool designed by a team from Al-Imam Muhammad bin Saud Islamic University that can be used for corpus analysis and comparison in terms of several linguistic characteristics, such as frequency list generation, concordances, collocation extraction, the difference between two words, and keyword identification.
    Downloads: 0 This Week
  • 8
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset, and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 9
    HSKinter

    Chinese Words Study (HSK 1–5) on Desktop and Phone

    ...Flashcards, practice of hanzi meaning, pinyin and tones, stats of accuracy. Optional pronunciation via gTTS (Google). Compatible with Pydroid 3 (runs on Android). The frequency of a word showing up depends on its retaining level and time since the last answer (age).
    Downloads: 0 This Week
  • 10
    Word frequency and diversity (distribution) across hundreds of corpora. You'll see both the lemma and the various forms.
    Downloads: 0 This Week
  • 11
    jieba

    Stuttering Chinese word segmentation

    "Jieba" Chinese word segmentation, built to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence most precisely and is suitable for text analysis. Full mode scans all the words that can be formed in the sentence; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode and splits long words again to improve the recall rate, which is suitable...
    Downloads: 0 This Week
  • 12

    Meaning Explorer

    A tool for analyzing the words of the Quran

    The main purpose of this tool is to help users in extracting syntagmatic relations between words, lemmas and roots available in the Quran; these relations include identifying significant collocates and words’ co-occurrences. In addition, the tool also provides other helpful functionalities that complement the primary purpose, which include a Key Word In Context (KWIC) concordance, in addition to frequency lists of all words, lemmas and roots in the holy Quran. The main intended users of this tool are Arabic Quranic scholars and linguists. The Meaning Explorer applies a new distributional semantic model to extract words’ significant co-occurrences from the Quran. This model is based on the Refined MI association measure applied to all words within a symmetric sliding window of five words surrounding the node word. ...
    Downloads: 0 This Week
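    The sliding-window idea in the Meaning Explorer description can be illustrated as follows. The description does not define the Refined MI measure, so this sketch substitutes plain pointwise mutual information, and it assumes that "window of five words" means up to five tokens on each side of the node word. The function names and the toy token list are hypothetical.

```python
import math
from collections import Counter


def window_cooccurrences(tokens, span=5):
    """Count (node, collocate) pairs within `span` tokens on either side."""
    pairs = Counter()
    for i, node in enumerate(tokens):
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:
                pairs[(node, tokens[j])] += 1
    return pairs


def pmi_scores(tokens, span=5):
    """Plain pointwise MI, log2(P(x, y) / (P(x) * P(y))), NOT Refined MI."""
    word_freq = Counter(tokens)
    pairs = window_cooccurrences(tokens, span)
    total_pairs = sum(pairs.values())
    total_words = len(tokens)
    scores = {}
    for (x, y), n_xy in pairs.items():
        p_xy = n_xy / total_pairs
        p_x = word_freq[x] / total_words
        p_y = word_freq[y] / total_words
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy input; a real corpus would be tokenized Arabic text.
toy_tokens = "a b c a b".split()
toy_pairs = window_cooccurrences(toy_tokens, span=1)
toy_scores = pmi_scores(toy_tokens, span=1)
```

    Because the window is symmetric, every pair is counted in both directions, so the score for (x, y) equals the score for (y, x).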
  • 13

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions: a. Frequency list for single word and N-Grams b. Concordance c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient) d. Lexical patterns search e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient f. Accept Windows and UTF-8 character encoding g. ...
    Downloads: 2 This Week
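    Two of the collocation measures listed for Ghawwas, Dice and Log Dice, are simple enough to show directly. The sketch below uses the commonly cited definitions (Dice = 2·f(x,y) / (f(x) + f(y)), Log Dice = 14 + log2 of that value); whether Ghawwas computes them in exactly this form is an assumption, and the function names are illustrative only.

```python
import math


def dice(f_xy, f_x, f_y):
    """Dice coefficient from the pair frequency and the two word frequencies."""
    return 2 * f_xy / (f_x + f_y)


def log_dice(f_xy, f_x, f_y):
    """Log Dice: 14 + log2(Dice); the maximum of 14 is reached when Dice is 1."""
    return 14 + math.log2(dice(f_xy, f_x, f_y))


# Hypothetical counts: the pair occurs 10 times, each word 20 times.
example_dice = dice(10, 20, 20)
example_log_dice = log_dice(10, 20, 20)
```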
  • 14
    mzitu

    Python crawler that downloads image galleries and analyzes titles

    ...It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. This makes the repository both a scraping example and a small data analysis experiment built around the collected content. Overall, mzitu serves as a learning-oriented implementation of Python web scraping, data processing, and visualization techniques.
    Downloads: 1 This Week
  • 15
    tinfoleak

    OSINT tool for extracting and analyzing Twitter intelligence data

    tinfoleak is an open source intelligence (OSINT) and social media intelligence (SOCMINT) tool designed to automate the collection and analysis of data from Twitter. It focuses on helping analysts extract large volumes of information from Twitter timelines using identifiers such as usernames, geographic coordinates, or keywords. Once the data is gathered, tinfoleak organizes it into structured information that can support intelligence analysis and investigative research. tinfoleak is capable...
    Downloads: 5 This Week
  • 16
    pydictor

    powerful and useful hacker dictionary builder for a brute-force attack

    ...You can use pydictor to generate a general blast wordlist, a custom wordlist based on Web content, a social engineering wordlist, and so on. You can use pydictor's built-in tools to safe delete, merge, unique, merge and unique, or count word frequency to filter a wordlist; you can also specify your own wordlist and use '-tool handler' to filter it. You can generate highly customized and complex wordlists by modifying multiple configuration files, adding your own dictionary, using leet mode, filtering by length, char occur times, types of different char, and regex, and even by adding customized encode scripts in the /lib/encode/ folder, your own plugin scripts in the /plugins/ folder, and your own tool scripts in the /tools/ folder.
    Downloads: 1 This Week
  • 17
    Word Doctor

    Nextgen word app. Word Docs made easy!

    Word Doctor is a word editor and writer's aid, designed to analyze writing "Content" and "Style". Inspire your creative process and get to work fast using dictation (Speech to Text), or the Ink-Blot test to inspire creativity. Analyze what you already have and identify imagery, weak writing structures, and more. Content is king, and Word Doctor can certainly help with that!
    Downloads: 0 This Week
  • 18

    Rikaisama (Legacy)

    Modification of Rikaichan with more features and customization options

    *** THIS ADD-ON IS NO LONGER SUPPORTED AND WILL NOT WORK WITH FIREFOX 57+ (however, it still works in Waterfox using a non-e10s window: "File > New Non-e10s Window") *** Rikaisama is a modification of the rikaichan Japanese-English popup dictionary that adds many features and customization options such as audio pronunciation, EPWING dictionary support, sanseido web dictionary support, word frequency, pitch accent, enhanced clipboard & save options, ability to create and add cards directly to an open Anki deck, "Super Sticky" mode, ability to remap shortcut keys, more fine-tuned startup options, and more. See http://rikaisama.sourceforge.net/ for more information. Supports Windows, Ubuntu, and newer versions of OSX.
    Downloads: 1 This Week
  • 19
    JGlossator

    Creates glosses for Japanese text

    JGlossator can create a gloss for Japanese text complete with de-inflected expressions, readings, audio pronunciation, example sentences, pitch accent, word frequency, kanji information, and grammar analysis. See http://jglossator.sourceforge.net/ for more information and screenshots. Inspired by Translation Aggregator, but aimed primarily at people learning Japanese.
    Downloads: 5 This Week
  • 20
    Japanese Text Analysis Tool

    Generate frequency and readability reports from Japanese texts.

    cb's Japanese Text Analysis Tool allows users to analyze Japanese text files and generate 4 kinds of reports: 1) Word Frequency Report, 2) Kanji Frequency Report, 3) Formula-based Readability Report, 4) User-based Readability Report. Portable and does not require installation.
    Downloads: 9 This Week
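    Of the four reports named above, the kanji frequency report is the simplest to illustrate, because it needs no word segmentation. The following is a minimal sketch under the assumption that "kanji" means characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF); the function names and the sample sentence are made up for the example and do not come from the tool.

```python
from collections import Counter


def is_kanji(ch):
    """True for characters in the CJK Unified Ideographs block (assumed range)."""
    return "\u4e00" <= ch <= "\u9fff"


def kanji_frequency(text):
    """Return [(kanji, count), ...] with the most frequent kanji first."""
    return Counter(ch for ch in text if is_kanji(ch)).most_common()


# Hypothetical sample text; kana and punctuation are ignored by the filter.
sample_text = "日本語の本を読む。日本に行く。"
kanji_report = kanji_frequency(sample_text)
```

    A word frequency report would additionally need a Japanese morphological analyzer to split the text into words, which this sketch deliberately leaves out.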
  • 21
    TuxWordSmith

    TuxWordSmith uses XDXF dictionaries to play in 88 languages

    Similar to the classic word game 'Scrabble', but with Unicode support for multiple languages and character sets. The game is currently distributed with eighty-eight (88) dictionary resources for playing Language[i]-Language[j] 'Scrabble'. For example, if configured to use the French-English dictionary, then the distribution of available tiles will be computed based on the frequency of occurrence of each character of Language[i] (French), and for each submission the corresponding definition will be given in Language[j] (English).
    Downloads: 1 This Week
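    One way to derive a tile distribution from character frequencies, as the description outlines, is sketched below. It is not taken from TuxWordSmith: the function name `tile_distribution`, the proportional rounding rule, the one-tile minimum, and the three-word French list are all assumptions chosen to keep the example small.

```python
import unicodedata
from collections import Counter


def tile_distribution(words, total_tiles=100):
    """Allocate tiles in proportion to how often each character occurs.

    Every character that appears gets at least one tile, so after rounding
    the tile counts may not add up to exactly total_tiles.
    """
    counts = Counter()
    for word in words:
        # NFC keeps accented letters such as 'é' as single characters.
        normalized = unicodedata.normalize("NFC", word.lower())
        counts.update(ch for ch in normalized if ch.isalpha())
    total = sum(counts.values())
    return {ch: max(1, round(total_tiles * n / total)) for ch, n in counts.items()}


# Hypothetical mini "dictionary"; a real run would use the headwords
# of the source language in the chosen dictionary.
french_words = ["été", "tête", "net"]
tiles = tile_distribution(french_words, total_tiles=20)
```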
  • 22

    word frequency counter

    Word Frequency Counter
    Downloads: 0 This Week
  • 23
    SubString is a set of shell scripts implementing substring reduction and frequency consolidation of word n-grams.
    Downloads: 0 This Week
  • 24
    Kanji Word Association Tool

    Learn kanji and words at the same time in the most optimal way

    This tool was created for students who want to learn kanji and words at the same time in the most optimal fashion possible. Based on a user-provided list of kanji, this tool will generate a list of words that are associated with each kanji and ensure that each word consists only of kanji that you have already studied up to that point and kana. In addition, words are sorted by frequency and no duplicate words are used. For example, assume that 径 is the 882nd kanji in the user-provided kanji list and that we are using the default options. The words that will be generated for 径 will only contain kanji from the 1st kanji in the list to the 882nd kanji in the list.
    Downloads: 0 This Week
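    The selection rule described above (each word may contain only kanji studied so far plus kana, words are ordered by frequency, and no word is used twice) can be sketched like this. The function names, the kana range check, and the word list with its frequencies are hypothetical and serve only to show the rule; the tool's actual options and data are not reproduced here.

```python
def is_kana(ch):
    """True for hiragana (U+3040-U+309F) and katakana (U+30A0-U+30FF)."""
    return "\u3040" <= ch <= "\u30ff"


def associate_words(kanji_order, word_freq):
    """Map each kanji to words built only from already-studied kanji and kana.

    kanji_order: kanji in the order the learner studies them.
    word_freq:   {word: frequency}; higher means more common.
    """
    by_freq = sorted(word_freq, key=word_freq.get, reverse=True)
    known, used, result = set(), set(), {}
    for kanji in kanji_order:
        known.add(kanji)
        picked = []
        for word in by_freq:
            if word in used or kanji not in word:
                continue
            if all(ch in known or is_kana(ch) for ch in word):
                picked.append(word)
                used.add(word)  # never list the same word under two kanji
        result[kanji] = picked
    return result


# Hypothetical study order and made-up frequencies, for illustration only.
study_order = ["日", "本", "語"]
frequencies = {"日": 1000, "日本": 900, "本": 800,
               "日本語": 500, "本日": 300, "日にち": 100}
associations = associate_words(study_order, frequencies)
```

    In this toy run, 日本 is held back until 本 has been studied, which mirrors the 径 example in the description: words for the 882nd kanji draw only on the first 882 kanji.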
  • 25

    Arabic New Words

    List of new words not included in current dictionaries

    ...It includes 476,349 new lemmatized words, and they are weighted and ordered so that there is a good likelihood that words which are most relevant (lexicographically) will surface to the top and the least relevant words will be pushed down the list. So, for example, if you take the first 10,000 words, there is a good chance that you'll find a large number of words fit to include in a dictionary. Please consider that the word list is not filtered by a spell checker, so many words will only be misspellings. Proper names are not filtered out because high-frequency proper names are usually included in morphological analysers to improve coverage, but in dictionaries people might want to exclude them.
    Downloads: 0 This Week