word frequency free download

Showing 24 open source projects for "word frequency"

1

TEXminer

Text Mining Classification for Texts in ASCII, Unicode and PDF Format.

TEXminer uses generic Text Mining Methods to analyze Unicode Files as plain Text or PDF. The Text Database can be saved in XML where the original Text, the Sentence and Word Lists and additional Parameters (e.g. Abbreviations) are stored. TEXminer allows Language Detection by Letter Frequency Analysis, finding important Words by Cooccurrence Analysis, Determination of Central Expressions, Thematic Text Classification (also Semantic Groups), Fingerprint Comparison and Word Frequency. Because TEXminer is not designed to have a Reference Corpus, Thematic Model Statistics uses Language Models (lexicons) to have Background Knowledge about certain Languages (English, German, French, Spanish, Italian, Russian), which are derived from the Decaleon Project. ...

Downloads: 0 This Week

Last Update: 2025-03-25
See Project
2

Kindle Mate (KMate)

Kindle clippings and Kindle Vocabulary Builder manager

KMate is the ultimate reading companion for Kindle users — and the all-new, cross-platform successor to Kindle Mate, the classic Kindle notes manager trusted by readers worldwide for over a decade. It is the only Kindle assistant that unifies cross-device import, cloud sync, vocabulary & dictionary management, flexible export, reading analytics, and AI-powered definitions — all in one app. ## KMate 3 for Windows latest (Store...

1 Review

Downloads: 32 This Week

Last Update: 5 days ago
See Project
3

Onda Sfasata

An authentic Italian learning app.

...GitHub repository: https://github.com/Northstrix/onda-sfasata Check it out at: https://onda-sfasata.netlify.app/ This app is fully localized into English, Hebrew, and two dialects of German: Hochdeutsch and a mixture of Zurich and Basel dialects (approximately 64%–36%), labeled as “Schwiizerdütsch”. I picked the words for this app not based on predefined categories, usage frequency, or the fluency level to which the word might correspond, but on which words could be cleanly cut from the audio tracks. As a result, the word set turned out to be a bit odd, yet unique. Every single sound used in the app, except for success.wav, error.wav, and completed.wav, was extracted from public domain recordings. The success and error sounds are covered by the Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/); the completed sound is available under the Creative Commons 0 License (http://creativecommons.org/publicdomain/zero/1.0/).

Downloads: 1 This Week

Last Update: 2026-02-23
See Project
4

pyLogos

Qualitative content analysis software.

...Documents (imported from txt and docx files) are stored in a database, and may have marked text segments associated with codes. It is possible to retrieve these segments in various ways, generate word clouds, and tabulate the frequency of codes and words, among other outputs.

Downloads: 3 This Week

Last Update: 2025-03-21
See Project
5

TXM

Unicode XML TEI text analysis platform

TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. DOWNLOAD LATEST VERSION OF TXM : http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP...

Downloads: 13 This Week

Last Update: 2024-12-09
See Project
6

Linguistic Analyzer

The Linguistic Analyzer is a tool for corpus analysis and comparison

The Linguistic Analyzer (Almuhalil Alloghawy) is a free tool designed by a team from Al-Imam Muhammad bin Saud Islamic University. It can be used for corpus analysis and comparison in terms of several linguistic characteristics, such as frequency list generation, concordances, collocation extraction, the difference between two words, and keyword identification.

Downloads: 0 This Week

Last Update: 2022-04-16
See Project
7

Texthero

Text preprocessing, representation and visualization from zero to hero

Texthero is a Python package to work with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset, and it provides a solid pipeline to clean and represent text data, from zero to hero.

Downloads: 0 This Week

Last Update: 2024-08-07
See Project
8

Arabic Word diversity

Word frequency and diversity (distribution) across hundreds of corpora. You'll see both the lemma and the various forms.

Downloads: 0 This Week

Last Update: 2020-05-15
See Project
9

Meaning Explorer

A tool for analyzing the words of the Quran

The main purpose of this tool is to help users in extracting syntagmatic relations between words, lemmas and roots available in the Quran; these relations include identifying significant collocates and words’ co-occurrences. In addition, the tool also provides other helpful functionalities that complement the primary purpose, which include a Key Word In Context (KWIC) concordance, in addition to frequency lists of all words, lemmas and roots in the holy Quran. The main intended users of this tool are Arabic Quranic scholars and linguists. The Meaning Explorer applies a new distributional semantic model to extract words’ significant co-occurrences from the Quran. This model is based on the Refined MI association measure applied to all words within a symmetric sliding window of five words surrounding the node word. ...

Downloads: 0 This Week

Last Update: 2019-12-03
See Project
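
Several tools in this list, including the one above, mention a Key Word In Context (KWIC) concordance. As a rough illustration only, and not the method used by any listed project, a minimal KWIC lookup might look like the sketch below. The function name `kwic`, the `window` parameter, and the regex-based tokenization are all assumptions made for this example.

```python
import re


def kwic(text, keyword, window=5):
    """Return (left context, match, right context) for each keyword hit.

    Hypothetical helper: tokenization is a naive split on word characters
    and matching is case-insensitive.
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, tok, right))
    return rows


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    for left, hit, right in kwic(sample, "the", window=2):
        print(f"{left:>15} [{hit}] {right}")
```

A real concordancer would typically add language-aware tokenization and sorting of the context columns.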
10

Ghawwas_V4

An open source system for Arabic corpora processing

Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions:
a. Frequency list for single words and N-grams
b. Concordance
c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient)
d. Lexical patterns search
e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient
f. Accepts Windows and UTF-8 character encodings
g. ...

1 Review

Downloads: 2 This Week

Last Update: 2018-12-09
See Project
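
The frequency list for single words and N-grams mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not Ghawwas's implementation: the function name `ngram_frequencies` is hypothetical, and the tokenizer is a naive regex that would split vocalized Arabic at diacritics, so real corpus work would need proper normalization first.

```python
import re
from collections import Counter


def ngram_frequencies(text, n=1):
    """Count word n-grams in text (hypothetical helper, naive tokenizer)."""
    tokens = re.findall(r"\w+", text.lower())
    # Zip n staggered copies of the token list to form consecutive n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    print(ngram_frequencies(sample, n=1).most_common(3))
    print(ngram_frequencies(sample, n=2).most_common(3))
```

Calling `most_common()` on the returned `Counter` gives the list sorted by descending frequency.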
11

mzitu

Python crawler that downloads image galleries and analyzes titles

...It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. This makes the repository both a scraping example and a small data analysis experiment built around the collected content. Overall, mzitu serves as a learning-oriented implementation of Python web scraping, data processing, and visualization techniques.

Downloads: 1 This Week

Last Update: 7 days ago
See Project
12

pydictor

A powerful and useful hacker dictionary builder for brute-force attacks

...You can use pydictor to generate a general blast wordlist, a custom wordlist based on web content, a social engineering wordlist, and so on. You can use pydictor's built-in tools to safely delete, merge, unique, or merge and unique wordlists, and to count word frequency to filter a wordlist; you can also specify your own wordlist and use '-tool handler' to filter it. You can generate highly customized and complex wordlists by modifying multiple configuration files, adding your own dictionary, using leet mode, and filtering by length, character occurrence count, character types, or regex. You can even add customized encode scripts in the /lib/encode/ folder, your own plugin scripts in the /plugins/ folder, and your own tool scripts in the /tools/ folder.

Downloads: 1 This Week

Last Update: 2023-02-22
See Project
13

TuxWordSmith

TuxWordSmith uses XDXF dictionaries to play in 88 languages

Similar to the classic word game 'Scrabble', but with Unicode support for multiple languages and character sets. The game is currently distributed with eighty-eight (88) dictionary resources for playing Language[i]-Language[j] 'Scrabble'. For example, if configured to use the French-English dictionary, then the distribution of available tiles will be computed based on the frequency of occurrence of each character of Language[i] (French), and for each submission the corresponding definition will be given in Language[j] (English).

1 Review

Downloads: 1 This Week

Last Update: 2023-02-17
See Project
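
The idea of computing a tile distribution from character frequency, as described above, can be illustrated with a short sketch. This is not TuxWordSmith's code: the function name `tile_distribution`, the default of 100 tiles, and the largest-remainder rounding are assumptions chosen for the example.

```python
from collections import Counter


def tile_distribution(words, total_tiles=100):
    """Split total_tiles across letters in proportion to their frequency.

    Hypothetical helper using largest-remainder rounding so that the
    tile counts always sum to total_tiles.
    """
    counts = Counter(ch for w in words for ch in w.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    quotas = {ch: n * total_tiles / total for ch, n in counts.items()}
    tiles = {ch: int(q) for ch, q in quotas.items()}
    leftover = total_tiles - sum(tiles.values())
    # Give leftover tiles to the largest fractional parts, ties alphabetical.
    order = sorted(quotas, key=lambda ch: (-(quotas[ch] - tiles[ch]), ch))
    for ch in order[:leftover]:
        tiles[ch] += 1
    return tiles


if __name__ == "__main__":
    headwords = ["bonjour", "merci", "chat", "chien", "maison"]
    print(tile_distribution(headwords, total_tiles=30))
```

With this proportional scheme, rare letters can receive zero tiles; a real game would likely enforce a minimum per letter.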
14

TextTools

TextTools is a freeware corpus linguistics tool developed in Python to aid in research. This program analyzes user-created corpora and displays information about word (token) frequency, n-grams, clusters, collocations, keyword in context (KWIC), and keyness. TextTools is designed to be user-friendly and intuitive and will run natively on Mac OS X.

Downloads: 0 This Week

Last Update: 2014-09-28
See Project
15

word frequency counter

Word Frequency Counter

1 Review

Downloads: 0 This Week

Last Update: 2013-12-14
See Project
16

SubString

SubString is a set of shell scripts implementing substring reduction and frequency consolidation of word n-grams.

Downloads: 0 This Week

Last Update: 2014-06-09
See Project
17

Arabic New Words

List of new words not included in current dictionaries

...It includes 476,349 new lemmatized words, and they are weighted and ordered so that there is a good likelihood that the words which are most relevant (lexicographically) will surface to the top and the least relevant words will be pushed down the list. So, for example, if you take the first 10,000 words, there is a good chance that you'll find a large number of words fit to include in a dictionary. Please consider that the word list is not filtered by a spell checker, so many words will only be misspellings. Proper names are not filtered out because high frequency proper names are usually included in morphological analysers to improve coverage, but in dictionaries people might want to exclude them.

1 Review

Downloads: 0 This Week

Last Update: 2013-05-11
See Project
18

Word Count of Modern Standard Arabic

A word count of Modern Standard Arabic from a 1 billion word corpus, sorted according to frequency counts

1 Review

Downloads: 0 This Week

Last Update: 2015-11-12
See Project
19

MatnPardaz

MatnPardaz calculates how many times a word appears in a Text.

1 Review

Downloads: 0 This Week

Last Update: 2014-07-08
See Project
20

Virtual Keyboard

Onscreen keyboard for eye tracking systems

1 Review

Downloads: 4 This Week

Last Update: 2017-01-09
See Project
21

PyWordGen

PyWordGen is a random word generator that generates statistics of word parts (frequency and combinations) of a given language and stores them persistently. That data is then used to generate random words with the same characteristics.

Downloads: 0 This Week

Last Update: 2013-05-08
See Project
22

Perlconc - a concordancing program

Perlconc is a Perl-CGI script to search corpora of text files for words/phrases, outputting either a word frequency count or a concordance.

Downloads: 0 This Week

Last Update: 2014-05-22
See Project
23

Toke : Explore, Index and Search the Web

Toke is a Java webmining toolkit for exploring, indexing and searching the web. Toke allows you to crawl public or private web sites in order to create web statistics, web Pajek graphs, Lucene indexes and word frequency files for data clustering.

Downloads: 1 This Week

Last Update: 2013-03-20
See Project
24

libtabe

libtabe is a library which provides useful Chinese functions/routines that can deal with fundamental elements such as pronunciation (BoPoMoFo), character frequency, word identification, and word frequency. It also comes with a large free word database.

Downloads: 4 This Week

Last Update: 2013-03-21
See Project