word frequency free download

Showing 30 open source projects for "word frequency"

1

WordCount

Count frequency of single, 2-word and 3-word clusters in a text

The program can read a text file and count the occurrences of single words and clusters of 2 and 3 words. The resulting list will be sorted in descending order (highest frequency on top).
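The counting described above can be sketched in a few lines of Python. This is an illustrative sketch only, not WordCount's own code: the function names, the tokenization rule (runs of word characters, lowercased), and the alphabetical tie-break are all assumptions.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive words in text (case-insensitive)."""
    words = re.findall(r"\w+", text.lower())
    # Zip n staggered views of the word list to form the n-word clusters.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def frequency_table(text, n):
    """Return (cluster, count) pairs, highest count first, ties alphabetical."""
    return sorted(ngram_counts(text, n).items(), key=lambda kv: (-kv[1], kv[0]))


sample = "the cat sat on the mat and the cat slept"
for n in (1, 2, 3):
    # Show the three most frequent clusters of each size.
    print(n, frequency_table(sample, n)[:3])
```

Reading the text from a file instead of a string only requires passing the result of open(path, encoding="utf-8").read() as the text argument.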

Downloads: 0 This Week

Last Update: 2025-02-01
See Project
2

TEXminer

Text Mining Classification for Texts in ASCII, Unicode and PDF Format.

TEXminer uses generic text mining methods to analyze Unicode files as plain text or PDF. The text database can be saved as XML, which stores the original text, the sentence and word lists, and additional parameters (e.g. abbreviations). TEXminer supports language detection by letter frequency analysis, finding important words by co-occurrence analysis, determination of central expressions, thematic text classification (also semantic groups), fingerprint comparison, and word frequency. Because TEXminer is not designed to use a reference corpus, its thematic model statistics rely on language models (lexicons) for background knowledge about certain languages (English, German, French, Spanish, Italian, Russian), which are derived from the Decaleon Project. ...

Downloads: 0 This Week

Last Update: 2025-03-25
See Project
3

Onda Sfasata

An authentic Italian learning app.

...GitHub repository: https://github.com/Northstrix/onda-sfasata (live app: https://onda-sfasata.netlify.app/). This app is fully localized into English, Hebrew, and two dialects of German: Hochdeutsch and a mixture of Zurich and Basel dialects (approximately 64%–36%), labeled as “Schwiizerdütsch”. I picked the words for this app not based on predefined categories, usage frequency, or the fluency level to which a word might correspond, but on which words could be cleanly cut from the audio tracks. As a result, the word set turned out to be a bit odd, yet unique. Every sound used in the app, except for success.wav, error.wav, and completed.wav, was extracted from public domain recordings. The success and error sounds are covered by the Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/); the completed sound is available under the Creative Commons 0 License (http://creativecommons.org/publicdomain/zero/1.0/).

Downloads: 1 This Week

Last Update: 2026-02-23
See Project
4

pyLogos

Qualitative content analysis software.

...Documents (imported from txt and docx files) are stored in a database and may have marked text segments associated with codes. These segments can be retrieved in various ways, and the program can generate word clouds and tabulate the frequency of codes and words, among other outputs.

Downloads: 3 This Week

Last Update: 2025-03-21
See Project
5

TXM

Unicode XML TEI text analysis platform

TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP...

Downloads: 13 This Week

Last Update: 2024-12-09
See Project
6

pipZuseZ3

An emulator for the Zuse Z3 computer, invented in 1941

An emulator for the Zuse Z3 computer, invented in 1941 by Konrad Zuse. The Z3 was the world's first working programmable, fully automatic digital computer. It was built with 2,600 relays, used a 22-bit word length, and ran at a clock frequency of about 5–10 Hz. Program code was stored on punched film. Initial values were entered manually.

Downloads: 13 This Week

Last Update: 2023-02-23
See Project
7

Linguistic Analyzer

The Linguistic Analyzer is a tool for corpus analysis and comparison

The Linguistic Analyzer (Almuhalil Alloghawy) is a free tool designed by a team from Al-Imam Muhammad bin Saud Islamic University. It can be used for corpus analysis and comparison across several linguistic characteristics, including frequency list generation, concordances, collocation extraction, the difference between two words, and keyword identification.

Downloads: 0 This Week

Last Update: 2022-04-16
See Project
8

Texthero

Text preprocessing, representation and visualization from zero to hero

Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset, and it provides a solid pipeline to clean and represent text data, from zero to hero.

Downloads: 0 This Week

Last Update: 2024-08-07
See Project
9

HSKinter

Chinese Words Study (HSK 1–5) on Desktop and Phone

...Flashcards, practice of hanzi meanings, pinyin and tones, and accuracy statistics. Optional pronunciation via gTTS (Google). Compatible with Pydroid 3 (runs on Android). How often a word shows up depends on its retention level and the time since the last answer (age).
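One way to make a word's chance of appearing depend on both its level and its age is weighted random selection. The sketch below is hypothetical and is not taken from HSKinter: the weight formula, the 0 to 5 level scale, the hour-based age, and the names review_weight and pick_word are all assumptions made for illustration.

```python
import random
import time


def review_weight(level, last_answered, now, max_level=5):
    """Hypothetical weight: lower levels and older answers score higher."""
    age_hours = max(0.0, (now - last_answered) / 3600.0)
    forgetting = max_level + 1 - level  # level 0 -> 6, level 5 -> 1
    return forgetting * (1.0 + age_hours)


def pick_word(cards, now=None, rng=random):
    """Pick one word; cards maps word -> (level, last_answered_timestamp)."""
    now = time.time() if now is None else now
    words = list(cards)
    weights = []
    for word in words:
        level, last_answered = cards[word]
        weights.append(review_weight(level, last_answered, now))
    # random.choices draws with probability proportional to each weight.
    return rng.choices(words, weights=weights, k=1)[0]
```

With this weighting, a well-known word answered a minute ago is rarely shown, while a poorly known word left untouched for days dominates the draw.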

Downloads: 0 This Week

Last Update: 2021-08-28
See Project
10

PSWordCloud

Create pretty word clouds with PowerShell!

Downloads: 0 This Week

Last Update: 2023-04-10
See Project
11

Arabic Word diversity

Word frequency and diversity (distribution) across hundreds of corpora. You'll see both the lemma and the various forms.

Downloads: 0 This Week

Last Update: 2020-05-15
See Project
12

jieba

Stuttering Chinese word segmentation

"Jieba" Chinese word segmentation, built to be the best Python Chinese word segmentation component. Four segmentation modes are supported. Precise mode tries to cut the sentence as precisely as possible and is suitable for text analysis. Full mode scans out every word that can be formed in the sentence; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode and splits long words again to improve recall, which is suitable...

Downloads: 0 This Week

Last Update: 2022-02-18
See Project
13

Meaning Explorer

A tool for analyzing the words of the Quran

The main purpose of this tool is to help users in extracting syntagmatic relations between words, lemmas and roots available in the Quran; these relations include identifying significant collocates and words’ co-occurrences. In addition, the tool also provides other helpful functionalities that complement the primary purpose, which include a Key Word In Context (KWIC) concordance, in addition to frequency lists of all words, lemmas and roots in the holy Quran. The main intended users of this tool are Arabic Quranic scholars and linguists. The Meaning Explorer applies a new distributional semantic model to extract words’ significant co-occurrences from the Quran. This model is based on the Refined MI association measure applied to all words within a symmetric sliding window of five words surrounding the node word. ...
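The windowed co-occurrence scoring described above can be illustrated with plain mutual information. This sketch does not implement the tool's Refined MI measure, whose definition is not given here. It uses the common corpus-linguistics form MI = log2(f_ab * N / (f_a * f_b)), and it assumes that "a window of five words" means five tokens on each side of the node word. The function name is made up for the example.

```python
import math
from collections import Counter


def window_mi(tokens, window=5):
    """Plain MI for unordered word pairs co-occurring within +/- window tokens."""
    word_freq = Counter(tokens)
    pair_freq = Counter()
    for i, node in enumerate(tokens):
        # Looking only to the right counts each co-occurrence exactly once,
        # which still covers a symmetric window for unordered pairs.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != node:
                pair_freq[tuple(sorted((node, other)))] += 1
    n = len(tokens)
    return {
        (a, b): math.log2(f_ab * n / (word_freq[a] * word_freq[b]))
        for (a, b), f_ab in pair_freq.items()
    }
```

Plain MI is known to favor rare words, so a real tool would normally add a minimum-frequency cutoff or use an adjusted measure before ranking collocates.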

Downloads: 0 This Week

Last Update: 2019-12-03
See Project
14

Ghawwas_V4

An open source system for Arabic corpora processing

Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions: a. Frequency list for single words and N-Grams; b. Concordance; c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient); d. Lexical patterns search; e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient; f. Accepts Windows and UTF-8 character encodings; g. ...
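Three of the collocation measures listed above can be computed directly from four counts. The sketch below uses the textbook definitions of the t-score, the Dice coefficient, and logDice. It is not Ghawwas code, and the function and parameter names are assumptions.

```python
import math


def association_scores(f_xy, f_x, f_y, n):
    """Score a word pair from its counts.

    f_xy: times the two words co-occur (must be > 0)
    f_x, f_y: individual word frequencies
    n: corpus size in tokens
    """
    expected = f_x * f_y / n  # co-occurrences expected if the words were independent
    dice = 2 * f_xy / (f_x + f_y)
    return {
        "t_score": (f_xy - expected) / math.sqrt(f_xy),
        "dice": dice,
        "log_dice": 14 + math.log2(dice),
    }
```

For example, association_scores(8, 16, 16, 1024) describes a pair that co-occurs 8 times where only 0.25 co-occurrences would be expected by chance.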

1 Review

Downloads: 2 This Week

Last Update: 2018-12-09
See Project
15

mzitu

Python crawler that downloads image galleries and analyzes titles

...It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. This makes the repository both a scraping example and a small data analysis experiment built around the collected content. Overall, mzitu serves as a learning-oriented implementation of Python web scraping, data processing, and visualization techniques.

Downloads: 1 This Week

Last Update: 6 days ago
See Project
16

tinfoleak

OSINT tool for extracting and analyzing Twitter intelligence data

tinfoleak is an open source intelligence (OSINT) and social media intelligence (SOCMINT) tool designed to automate the collection and analysis of data from Twitter. It focuses on helping analysts extract large volumes of information from Twitter timelines using identifiers such as usernames, geographic coordinates, or keywords. Once the data is gathered, tinfoleak organizes it into structured information that can support intelligence analysis and investigative research. tinfoleak is capable...

Downloads: 5 This Week

Last Update: 3 days ago
See Project
17

pydictor

Powerful and useful hacker dictionary builder for brute-force attacks

...You can use pydictor to generate a general blast wordlist, a custom wordlist based on web content, a social engineering wordlist, and so on. You can use pydictor's built-in tools to safely delete, merge, deduplicate, merge and deduplicate, or count word frequency to filter a wordlist; you can also specify your own wordlist and use '-tool handler' to filter it. You can generate highly customized and complex wordlists by modifying multiple configuration files, adding your own dictionary, using leet mode, or filtering by length, character occurrence counts, character types, or regex. You can also add customized encoding scripts in the /lib/encode/ folder, your own plugin scripts in the /plugins/ folder, and your own tool scripts in the /tools/ folder.

Downloads: 1 This Week

Last Update: 2023-02-22
See Project
18

TuxWordSmith

TuxWordSmith uses XDXF dictionaries to play in 88 languages

Similar to the classic word game 'Scrabble', but with Unicode support for multiple languages and character sets. The game is currently distributed with eighty-eight (88) dictionary resources for playing Language[i]-Language[j] 'Scrabble'. For example, if configured to use the French-English dictionary, the distribution of available tiles is computed from the frequency of occurrence of each character in Language[i] (French), and for each submission the corresponding definition is given in Language[j] (English).
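A frequency-based tile distribution of this kind could be computed as follows. This is a sketch under stated assumptions rather than TuxWordSmith's actual method: the 100-tile bag size, the largest-remainder rounding, and the function name tile_distribution are all illustrative choices.

```python
from collections import Counter


def tile_distribution(words, total_tiles=100):
    """Split total_tiles across letters in proportion to their frequency in words."""
    counts = Counter(ch for word in words for ch in word.lower() if ch.isalpha())
    total = sum(counts.values())
    # Exact proportional share of the bag for every letter.
    quotas = {ch: count * total_tiles / total for ch, count in counts.items()}
    tiles = {ch: int(quota) for ch, quota in quotas.items()}
    # Largest-remainder rounding: hand the leftover tiles to the letters whose
    # shares were truncated the most, so the bag sums exactly to total_tiles.
    leftover = total_tiles - sum(tiles.values())
    by_remainder = sorted(quotas, key=lambda ch: (tiles[ch] - quotas[ch], ch))
    for ch in by_remainder[:leftover]:
        tiles[ch] += 1
    return tiles
```

Because str.isalpha and str.lower are Unicode-aware, the same function works for accented or non-Latin dictionaries. Very rare letters can end up with zero tiles, so a real game would likely enforce a minimum of one tile per letter.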

1 Review

Downloads: 1 This Week

Last Update: 2023-02-17
See Project
19

word frequency counter

Word Frequency Counter

1 Review

Downloads: 0 This Week

Last Update: 2013-12-14
See Project
20

SubString

SubString is a set of shell scripts implementing substring reduction and frequency consolidation of word n-grams.

Downloads: 0 This Week

Last Update: 2014-06-09
See Project
21

Arabic New Words

List of new words not included in current dictionaries

...It includes 476,349 new lemmatized words, weighted and ordered so that the words that are most relevant (lexicographically) are likely to surface at the top and the least relevant are pushed down the list. So, for example, if you take the first 10,000 words, there is a good chance that you'll find a large number of words fit to include in a dictionary. Please note that the word list is not filtered by a spell checker, so many words will simply be misspellings. Proper names are not filtered out, because high-frequency proper names are usually included in morphological analysers to improve coverage, but in dictionaries people might want to exclude them.

1 Review

Downloads: 0 This Week

Last Update: 2013-05-11
See Project
22

Word Count of Modern Standard Arabic

A word count of Modern Standard Arabic from a 1 billion word corpus, sorted according to frequency counts

1 Review

Downloads: 0 This Week

Last Update: 2015-11-12
See Project
23

Alkindus

Alkindus is an automated solver for short monoalphabetic substitution ciphers without word divisions.

Downloads: 0 This Week

Last Update: 2013-04-18
See Project
24

MatnPardaz

MatnPardaz calculates how many times a word appears in a Text.

1 Review

Downloads: 0 This Week

Last Update: 2014-07-08
See Project
25

Virtual Keyboard

Onscreen keyboard for eye tracking systems

1 Review

Downloads: 4 This Week

Last Update: 2017-01-09
See Project