Showing 348 open source projects for "gnu/linux"

View related business solutions
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 1
    Virastyar

    Virastyar

    Virastyar is an spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering,...
    Downloads: 70 This Week
    Last Update:
    See Project
  • 2

    mwetoolkit

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/ The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expresisons, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    KneeTex is an open–source, stand–alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions. As a result, formally structured and coded information allows for complex...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    AFEWC corpus is a multilingual comparable text articles in Arabic, French, and English languages. Each triple article is related to the same topic (aligned at article level). AFEWC corpus is collected from Wikipedia. The corpus is available for free for research purposes only. It is composed of 40K aligned articles, 91.3M English words, 57.8M French words, 22M Arabic words, 2.8M English unique words, 1.9M French unique words, and 1.5M Arabic unique words. Wikipedia text is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    BANNER is a named entity recognition system intended primarily for biomedical text. It uses conditional random fields as the primary recognition engine and includes a wide survey of the best techniques described in recent literature.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    eNTranslator

    To aid translation of satsangs of Paramhamsa Nithyananda

    To aid translation of satsangs of Paramhamsa Nithyananda. Can be used for general purpose by others as well. This translator desktop app uses google translator to translate English text. The auto generated translations are then enriched with human alternation using an easy graphical user interface. Time stamp information may be synched and a subtitle file or a simple textual output may be generated. Additionally it is planned to use google voice tools to also add voice over from these...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8

    Drug Extraction

    Drug name extraction

    Drug name recognition and normalisation/grounding to DrugBank ids and standard names. Package provides 2 taggers: 1. DrugTagger - CRF-based with DrugBank presence feature (see feature set for details). 2. DrugnameGazetteer - gazetteer/dictionary-based. Dictionary created from DrugBank.ca database. Both taggers include grounding/normalisation to DrugBank ids and standard names. Feature set: Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS DrugBankPresence...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Downloads: 0 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Alfanous

    Alfanous

    Quran Search Engine

    Alfanous (The Lantern - الفانوس ) is an Arabic search engine API provide the simple and advanced search in the Holy Quran , more features and many interfaces...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    This tool is made to score machine translation performance with the TER metric. This code is based on Snover's algorithm.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    LexSub

    A Lexical Substitution Framework

    Lexical substitution framework for supervised all-words lexical substitution using delexicalized features. For a runnable (but GPL-licensed) version of LexSub, see LexSub-GPL (sf.net/p/lexsub/lexsub-gpl)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14

    ISO GrAF

    Experimental Java library for reading and writing GrAF/XML files.

    The Graph Annotation Framework (GrAF) models linguistic annotations using a data model based on Graph theory and algorithms. The GrAF standard is a work product of ISO TC37SC4 Working Group 1. This Java library is NOT part of the GrAF standard and standoff annotation files produced by the library may not be GrAF compliant.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Text Expander, Inverse summarizer

    Text Expander, Inverse summarizer

    Expand text, inverse summarizer

    IT WILL WORK WITH A JAVA DEVELOPMENT KIT 1.7 ONLY !!! This program is a data-miner and a knowledge-miner. It does exactly the opposite of what the text summarizers do. A text summarizer produces a shortened text given some text as an input. An inverse summarizer takes the shortened input, a similar or a same text and does the process in reverse. This results in an expanded text. It can be used with any text or notes that have the knowledge gaps. It is a great aid to any creative...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    lexhoard_php
    Lexhoard is a dictionary tool for storing words in different languages with the ability to link entries (or 'translate') from one language to others. This php version is a very simple implementation created over a weekend. It can use either a mysql or a sqlite database (other databases may be used, in theory, as all db access is done with PDO).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DCTFinder

    DCTFinder

    Extract title and creation time from web page.

    Web pages do not offer reliable metadata concerning their creation date and time. However, getting the document creation time is a necessary step for allowing to apply temporal normalization systems to web pages. DCTFinder is a system that parses a web page and extracts from its content the title and the creation date of this web page. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    A project that aims to create reusable components (C++ libraries, COM components, and Edit controls) for Phonetic Transliteration of Indian languages, such as Telugu, Tamil, Kannada etc.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    ArabicDiacritizer

    ArabicDiacritizer

    An automatic restoration of Arabic diacritic marks

    This is a software of Arabic diacritical marks restoration. It is based mainly on deep architectures using deep neural network. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai, and Yassine BenAyed 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We appreciate it very much if you can cite our related work. ************** Installation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    This project is devoted to the development of natural language processing tools and resources for the Lingala language, which is spoken by tens of millions of people in central Africa.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    mgiza has now moved to github https://github.com/moses-smt/mgiza
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    Aelius Brazilian Portuguese POS-Tagger

    Python, NLTK-based package for shallow parsing of Brazilian Portuguese

    Aelius is an ongoing open source project aiming at developing a suite of Python, NLTK-based modules and interfaces to external freely available tools for shallow parsing of Brazilian Portuguese. It also includes language resources such as language models, sample texts, and gold standards. Presently, Aelius already offers facilities for POS-tagging and chunking corpora and outputting annotations in different formats, such as in XML in the TEI P5 encoding scheme.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Mishkal: Arabic Text Vocalization

    Mishkal: Arabic Text Vocalization

    Arabic Text Vocalization system

    Automatic system of vocalization of arabic text.
    Downloads: 31 This Week
    Last Update:
    See Project
  • 24
    This package contains different tools to add NLP capabilities for Lucene 4.x (it has been tested using Lucene version from 4.6.x to 4.8.1). Although it was originally developed for German, it is, mostly, language independent. It allows the user to lemmatize words to be indexed, to weight termy ba their parts of speech (e.g. weighting nouns mor hevaily than pronouns), and to add synonyms taken from GermaNet or a list you provide to the search index and thereby increase recall of lucene.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Mechaglot, Calculate Semantic Similarity

    Mechaglot, Calculate Semantic Similarity

    Calculate semantic similarity for any human and human-like languages

    WARNING: There are too many false-positives! This is Alpha release, expect many things to improve, including the algorithms. PLEASE GO TO BROWSE ALL FILES TO READ A FULL DESCRIPTION. The goal of this project is simple: Input two sentences of the same language, and obtain the number (from 0 to 1) denoting the similarity between the inputted sentences, according to semantic categories. This project models my previous...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB