Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Scientific/Engineering
Linguistics Software
Search Results

Search Results for "void linux based"

x

Sort By:

Relevance

Clear All Filters

OS

Linux 24
Windows 20
Mac 17
More...
BSD 13
ChromeOS 9
Desktop Operating Systems 3
Mobile Operating Systems 1

Category

Scientific/Engineering 24
Artificial Intelligence 7
Education 5
Text Editors 2
Communications 1
Multimedia 1
Religion and Philosophy 1
Software Development 1

License

OSI-Approved Open Source 20
Creative Commons Attribution License 2
Other License 1
Public Domain 1

Translations

English 6
Arabic 4
Brazilian Portuguese 3
Dutch 3
More...
Afrikaans 2
French 2
Albanian 1
Indonesian 1
Portuguese 1
Spanish 1

Programming Language

Python 24
C++ 3
JavaScript 3
Unix Shell 3
C 2
More...
Perl 2
C# 1
Java 1
PHP 1

Status

Beta 8
Production/Stable 6
Alpha 4
Planning 2
More...
Pre-Alpha 1

Showing 24 open source projects for "void linux based"

View related business solutions

Linguistics Python Clear Filters & Widen Search

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
$300 in Free Credit Towards Top Cloud Services
Build VMs, containers, AI, databases, storage—all in one place.

Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.

Get Started
1

PDFMathTranslate

PDF scientific paper translation with preserved formats

PDFMathTranslate is a Python-based tool that uses AI translation to convert academic PDFs into bilingual (e.g. Chinese-English) documents while preserving formatting, including math notation. It supports OCR-enhanced content and offers CLI, GUI, Docker, and Zotero integration under AGPL v3.

Downloads: 47 This Week

Last Update: 2025-07-11
See Project
2

pyVideoTrans

Translate the video from one language to another and embed dubbing

pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the...

Downloads: 30 This Week

Last Update: 21 hours ago
See Project
3

CycleGAN and pix2pix in PyTorch

Image-to-Image Translation in PyTorch

CycleGAN and pix2pix in PyTorch repository is a PyTorch implementation of two influential image-to-image translation frameworks: CycleGAN (for unpaired translation) and pix2pix (for paired translation). This repo gives developers and researchers a convenient, modern (PyTorch-based) platform to train and test these methods — supporting both paired datasets (input to output) and unpaired datasets (domain-to-domain) with minimal changes. The code supports standard training and inference...

Downloads: 0 This Week

Last Update: 2025-12-09
See Project
4

SPPAS

SPPAS - the automatic annotation and analyses of speech

SPPAS is a scientific computer software package written and maintained by Brigitte Bigi of the Laboratoire Parole et Langage, in Aix-en-Provence, France. Available for free, with open source code, there is simply no other package for linguists to simple use in the automatic annotations of speech, the analyses of any kind of annotated data and the conversion of annotated files. SPPAS is able to produce automatically speech annotations from a recorded speech sound and its orthographic...

Downloads: 24 This Week

Last Update: 2026-04-06
See Project
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.

Start Free
5

Tokenized Text Aligner

Aligns tokens in two versions of a text with differing tokenization.

This tool performs token-by-token alignment of two versions of a text with differing tokenization by interpreting the results of a file diff (https://docs.python.org/3/library/difflib.html). It is intended for use in the preparation of annotated linguistic corpora, where differences in tokenization may arise (i) following corrections or modifications to the source text or (ii) through the creation of different layers of annotation (part-of-speech, treebank) requiring different tokenization....

Downloads: 0 This Week

Last Update: 2026-02-06
See Project
6

MITRE Annotation Toolkit

A toolkit for managing and manipulating text annotations

The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g.,...

Downloads: 5 This Week

Last Update: 2023-04-19
See Project
7

UnsupervisedMT

Phrase-Based & Neural Unsupervised Machine Translation

Unsupervised Machine Translation is a research repository that implements both phrase-based SMT and neural MT approaches for translation without parallel corpora. The neural component supports multiple architectures—seq2seq, biLSTM with attention, and Transformer—and allows extensive parameter sharing across languages to improve data efficiency. Training relies on denoising auto-encoding and back-translation, with on-the-fly, multithreaded generation of synthetic parallel data to continually...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
8

Arabic Corpus

Text categorization, arabic language processing, language modeling

The Arabic Corpus {compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ) The corpus Khaleej-2004 contains 5690 documents. It is divided to 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora would mention the two main references: (1) For Watan-2004 corpus ---------------------- M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...

Downloads: 14 This Week

Last Update: 2019-03-05
See Project
9

dadosSemiotica

Collecter and manager of semiotica annalisis data

This program is a web application to collect and organize data of text analysis. It works with sets of texts and the analysis are done on portions of the length of a sentence. One of the preprocessing modules is based on CoGroo (A LibreOffice & OpenOffice.org Portuguese Grammar Checker).

Downloads: 0 This Week

Last Update: 2018-11-01
See Project
Full-stack observability with actually useful AI | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
10

Presage

the intelligent predictive text entry platform

Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic...

3 Reviews

Downloads: 190 This Week

Last Update: 2018-10-11
See Project
11

RDRPOSTagger

A Rule-based Part-of-Speech and Morphological Tagging Toolkit

RDRPOSTagger is a robust, easy-to-use and language-independent rule-based toolkit for Part-of-Speech (POS) and morphological tagging. RDRPOSTagger obtains fast performance in both learning and tagging process. RDRPOSTagger also achieves a very competitive accuracy in comparison to the state-of-the-art results. RDRPOSTagger now supports pre-trained POS and morphological tagging models for Bulgarian, Czech, Dutch, English, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, Thai...

2 Reviews

Downloads: 0 This Week

Last Update: 2017-05-24
See Project
12

Arramooz Alwaseet Arabic Dictionary

Arramooz Alwaseet Open Arabic Dictionary for morphological analyze. To be useful for Arabic language processing. This dictionary is derived from the Ayaspell Arabic spell checker.

Downloads: 7 This Week

Last Update: 2016-12-22
See Project
13

poliqarp2

natural language corpora search engine

This project aims at building an efficient indexer and search engine for natural language corpora with multilevel annotations.

Downloads: 0 This Week

Last Update: 2016-12-19
See Project
14

Resources for Closely Related Languages

This project concerns the development of human language technology resources, based on the approach to share or recycle resources between closely related language. http://gerhard.pro/closely-related-languages/

Downloads: 0 This Week

Last Update: 2015-12-29
See Project
15

mwetoolkit

THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/ The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expresisons, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be...

1 Review

Downloads: 0 This Week

Last Update: 2019-05-01
See Project
16

Alfanous

Quran Search Engine

Alfanous (The Lantern - الفانوس ) is an Arabic search engine API provide the simple and advanced search in the Holy Quran , more features and many interfaces...

2 Reviews

Downloads: 0 This Week

Last Update: 2019-07-20
See Project
17

Aelius Brazilian Portuguese POS-Tagger

Python, NLTK-based package for shallow parsing of Brazilian Portuguese

Aelius is an ongoing open source project aiming at developing a suite of Python, NLTK-based modules and interfaces to external freely available tools for shallow parsing of Brazilian Portuguese. It also includes language resources such as language models, sample texts, and gold standards. Presently, Aelius already offers facilities for POS-tagging and chunking corpora and outputting annotations in different formats, such as in XML in the TEI P5 encoding scheme.

1 Review

Downloads: 0 This Week

Last Update: 2014-11-03
See Project
18

Mishkal: Arabic Text Vocalization

Arabic Text Vocalization system

Automatic system of vocalization of arabic text.

5 Reviews

Downloads: 34 This Week

Last Update: 2017-10-29
See Project
19

Automatic Compound Processing (AuCoPro)

Automatic compound splitting and semantic analysis of compounds

The central problem to be addressed in this project concerns a multidisciplinary (linguistics and computational linguistics) investigation into sharing of knowledge and resources between closely-related languages, specifically relating to the automatic processing of compounds. Specifically, we will explore the possibility to create new knowledge about closely-related languages, and efficiently develop additional, more advanced resources for (a) compound segmentation; and (b) the semantic...

Downloads: 0 This Week

Last Update: 2015-07-28
See Project
20

Language Constructor

Complete tool for constructing/manipulating languages in digital form

With this tool you can easily design a new language, digitize an existing one or incrementally reconstruct an ancient language. It allows for free experimentation of all aspects of the language, so it does not have to be made consistent on paper first. You can edit script, syntax, grammar, morphology, lexicon and phonology, as well as write documents in the language, as it might be too complex to be handled by current font technology. The information is stored in xml format for easy...

Downloads: 0 This Week

Last Update: 2013-12-19
See Project
21

Donatus Parsing Tools for Portuguese

Donatus is an on-going project consisting of Python, NLTK-based tools and grammars for deep parsing and syntactical annotation of Brazilian Portuguese corpora. It includes a user-friendly graphical user interface for building syntactic parsers with the NLTK, providing some additional functionalities.

Downloads: 0 This Week

Last Update: 2016-08-28
See Project
22

pygermanet

Python API to the german wordnet GermaNet. In the current state this can be only seen as a quickstart-help to access GermaNet. To be honest, this API can't be called API. To use it, you will need access to a licensed copy of GermaNet (version > 5).

Downloads: 0 This Week

Last Update: 2013-04-23
See Project
23

Pylero

Pylero is an open-source Python-based text generator.

Downloads: 0 This Week

Last Update: 2014-12-26
See Project
24

Open Source Linguistics

A public repository of open source scripts and small programs related to linguistics and language.

Downloads: 0 This Week

Last Update: 2014-08-22
See Project

Previous
You're on page 1
Next

Related Searches

ocr

translate

sppas

medical diagnosis system

arabic corpus

php grammar checker

predictive text

morphological analysis for amharic language

arabic stardict dictionary

corpus

Related Categories

Scientific/Engineering

Artificial Intelligence

Education

Text Editors

Communications

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise