Showing 43 open source projects for "data.6bin"

View related business solutions
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    go-i18n

    go-i18n

    Translate your Go program into multiple languages

    go-i18n is a Go package and a command that helps you translate Go programs into multiple languages. Supports pluralized strings for all 200+ languages in the Unicode Common Locale Data Repository (CLDR). Code and tests are automatically generated from CLDR data. Supports strings with named variables using text/template syntax. Supports message files of any format (e.g. JSON, TOML, YAML). Use goi18n extract to extract all i18n.Message struct literals in Go source files to a message file for translation. Create an empty message file for the language that you want to add (e.g. translate.es.toml). ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Fanyi

    Fanyi

    A 🇨🇳 and 🇺🇸 translate tool in your command line

    Fanyi is a tool for translating words between the Chinese and English languages, right in your command line. It’s a good supportive tool for learning and reading the Chinese language from English, or the other way around. All translation data is fetched from iciba.com and fanyi.youdao.com, and with each translation comprehensive and related samples are given for better understanding and proper usage. There are translations for words as well as sentences, and in Mac/Linux bash, words can even be pronounced by the ‘say’ command.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    SPPAS

    SPPAS

    SPPAS - the automatic annotation and analyses of speech

    ...Available for free, with open source code, there is simply no other package for linguists to simple use in the automatic annotations of speech, the analyses of any kind of annotated data and the conversion of annotated files. SPPAS is able to produce automatically speech annotations from a recorded speech sound and its orthographic transcription. SPPAS is helpful for the analysis of any annotated data: estimate statistical distributions, make requests, manage files, visualize annotations. SPPAS offers a file converter from/to a wide range of formats: xra, TextGrid, eaf, trs... ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 4

    TEI LingSIG

    Production space for the TEI Linguistics SIG

    This used to be the experimentation and production space for the Special Interest Group (SIG) of the Text Encoding Initiative (TEI) called "TEI for Linguists", LingSIG for short. Currently, this is a storage place for documents produced by the SIG. Use https://github.com/LingSIG to access the current production space.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 5
    TXM

    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. DOWNLOAD LATEST VERSION OF TXM : http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerfull CQP...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6

    modnlp-plugins

    External plugins for modnlp/teccli

    This is a general project for modnlp/teccli plugins, with focus on text visualizaton.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    ...It can be customized for specific tasks (e.g., named entity identification, de-identification of medical records). The goal of MAT is not to help you configure your training engine (in the default case, the Carafe CRF system) to achieve the best possible performance on your data. MAT is for "everything else": all the tools you end up wishing you had.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Argos Translate

    Argos Translate

    Open-source offline translation library written in Python

    Argos Translate uses OpenNMT for translations and can be used as either a Python library, command-line, or GUI application. Argos Translate supports installing language model packages which are zip archives with a ".argosmodel" extension containing the data needed for translation. LibreTranslate is an API and web-app built on top of Argos Translate. Argos Translate also manages automatically pivoting through intermediate languages to translate between languages that don't have a direct translation between them installed. For example, if you have a es → en and en → fr translation installed you are able to translate from es → fr as if you had that translation installed. ...
    Downloads: 166 This Week
    Last Update:
    See Project
  • 9
    Apertium: Machine Translation Toolbox

    Apertium: Machine Translation Toolbox

    The free and open-source rule-based machine translation platform

    Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs.
    Downloads: 10 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    UnsupervisedMT

    UnsupervisedMT

    Phrase-Based & Neural Unsupervised Machine Translation

    ...The neural component supports multiple architectures—seq2seq, biLSTM with attention, and Transformer—and allows extensive parameter sharing across languages to improve data efficiency. Training relies on denoising auto-encoding and back-translation, with on-the-fly, multithreaded generation of synthetic parallel data to continually refresh supervision signals. The project also provides scripts to fetch and preprocess monolingual data, learn BPE codes, and train cross-lingual embeddings that bootstrap unsupervised alignment between languages. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 11

    dadosSemiotica

    Collecter and manager of semiotica annalisis data

    This program is a web application to collect and organize data of text analysis. It works with sets of texts and the analysis are done on portions of the length of a sentence. One of the preprocessing modules is based on CoGroo (A LibreOffice & OpenOffice.org Portuguese Grammar Checker).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Open data for a Khmer language corpus and lexicographic data that can be used for the development of free language tools for Khmer language, such as automatic translators, dictionaries, linguistic analysis tools, etc.
    Leader badge
    Downloads: 61 This Week
    Last Update:
    See Project
  • 13
    Free Dictionaries
    Free translating dictionaries. Source format: TEI-P5 XML. Delivery formats: DICT, Stardict, etc. The dictionaries may include information on the pronunciation, etymology and such, in a platform-independent format. Access: web/plugins/standalone.
    Downloads: 111 This Week
    Last Update:
    See Project
  • 14
    Colloquium QDA

    Colloquium QDA

    A free and open source qualitative ethnographic interview coding tool.

    Colloquium QDA is a tool for custom coding and analyzing qualitative ethnographic interviews. To run, make sure you first have JRE 8 or later installed (http://www.oracle.com/technetwork/java/javase/downloads/). Colloquium QDA is an open source cross-platform Java Swing app utilizing an embedded Java DB with Lucene integrated search.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented. Project files contain: - simple code to hold/read/write data and perform sample processing. - BioC-formatted corpora - BioC tools that work with BioC corpora BioC goals - simplicity - interoperability - broad use - reuse There should be little investment required to learn to use a format or a software module to process that format. ...
    Leader badge
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16

    Eldamo

    Lexicon and data model for Elvish languages

    A lexicon of Tolkien's Elvish languages, backed by an XML data model of the lexicon's contents. This project has been migrated to Github: https://github.com/pfstrack/eldamo
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    ISO GrAF

    Experimental Java library for reading and writing GrAF/XML files.

    The Graph Annotation Framework (GrAF) models linguistic annotations using a data model based on Graph theory and algorithms. The GrAF standard is a work product of ISO TC37SC4 Working Group 1. This Java library is NOT part of the GrAF standard and standoff annotation files produced by the library may not be GrAF compliant.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Text Expander, Inverse summarizer

    Text Expander, Inverse summarizer

    Expand text, inverse summarizer

    IT WILL WORK WITH A JAVA DEVELOPMENT KIT 1.7 ONLY !!! This program is a data-miner and a knowledge-miner. It does exactly the opposite of what the text summarizers do. A text summarizer produces a shortened text given some text as an input. An inverse summarizer takes the shortened input, a similar or a same text and does the process in reverse. This results in an expanded text. It can be used with any text or notes that have the knowledge gaps.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Linux Guist - Multi Lingual OS for Asia

    Linux Guist - Multi Lingual OS for Asia

    A Single Click Language Changer and Publishing System for Web and DTP

    Linux Guist - is a Multi Lingual Live CD OS for most Asian Languages, with the ability to run of a CD & Old Hardware, with just 128 MB Memory, for DTP, Web Publishing & Data Entry purposes. This will help IT employers to take up Govt. Projects that require Data Collection, Entry & Publishing at a very very low cost, while providing Training & Job Opportunities to numerous students of these languages, in the various towns, of the country. Talk to your respective IT/HRD ministry to identify projects as well as Towns that have more Unemployed youths, to make a Difference in their Lives ! ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Unsupervised TXT classifier

    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    This program is made to address two most common issues with the known classifying algorithms. First, over-training and second, shortage of data for a training of categories. Instead, each TXT file is a category on its own, rather than an assigned category. In a way, this is similar to clustering but not really a clustering algorithm since there is some training involved. The summarizer from Classifier4J has been adjusted to accept two inputs (lets call them A and B). Then, the summarizer gets trained with A to summarize a document B, and vice versa. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Perstem
    Perstem is a Persian (Farsi) stemmer, morphological analyzer, transliterator, and partial part-of-speech tagger. Inflexional morphemes are separated or removed from their stems. Perstem can also tokenize and transliterate between various character set encodings and romanizations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    Hermes Natural Language Processing

    A repository of software, documentation and data for NLP

    Hermes is a repository of software, documentation and data for NLP. I am currently adding corpora extracted from Wikipedia (mostrly in Romance languages).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    ValiTerms

    ValiTerms

    Validation of terms in corpus

    ValiTerms is a tool that helps the validation of terms in corpus. It finds their occurrences and allows terminologists to choose if a term is relevant or not. ValiTerms is developed at LIPN (http://www-lipn.univ-paris13.fr), RCLN team. Please consult the wiki for instructions about installation and usage.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    EyeMap - Eye Movement Data Analyzer
    EyeMap is a visualization and analysis tool for text reading eye movement data. It can process Unicode, proportion/non-proportion and spaced/unspaced reading materials, which supports various languages and experiment methods.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB