corpus free download - SourceForge

iramuteq

IRAMUTEQ: R interface for multidimensional analyses of texts and questionnaires (Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires). Data-processing software for text corpora or individuals/characters data. Among other things, it supports "ALCESTE"-type analyses.

Downloads: 655 This Week

Last Update: 2024-11-03

Application Generator for Stemmers

This is an application generator for conflation algorithms in Perl. The system can generate Perl source code for a stemmer from a rule file, run any stemmer it supports, and parse a corpus file.

Downloads: 0 This Week

Last Update: 2021-06-20

CRFSharp

CRFSharp is a .NET (C#) implementation of Conditional Random Fields.

...It encodes model parameters using L-BFGS. It also improves significantly on CRF++, for example through fully parallel encoding and optimized memory usage. When training on a corpus, CRF# can, unlike CRF++, make full use of multi-core CPUs while using very little memory, and memory use grows slowly and smoothly as the size of the training corpus and the number of tags increase. With multi-threaded processing, CRF# is currently better suited than CRF++ to training on large data sets and tag sets. For example, on a 64 GB machine, CRF# quickly encodes a model with more than 450 million features.

Downloads: 0 This Week

Last Update: 2015-08-03

Corsis (formerly Tenka Text)

An open-source corpus analysis class library written in C#. The Tenka Text 0.1.3 GUI comes with Wordlister, an advanced, extremely fast graphical wordlist tool, and a simple regex concordance tool. Tenka Text: the open-source answer to WordSmith Tools.

Downloads: 1 This Week

Last Update: 2013-05-10

SALM

A toolkit that uses suffix array indexing for empirical natural language processing. It provides functions such as searching for occurrences of n-grams in a corpus, and a suffix array language model that can use an arbitrarily long history.

Downloads: 0 This Week

Last Update: 2015-04-16
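
To illustrate the n-gram lookup the SALM description mentions, here is a minimal Python sketch of suffix array search over a tokenized corpus. It is not SALM's code or API: the function names `build_suffix_array` and `find_ngram` are hypothetical, and the naive sort-based construction is an assumption chosen for brevity rather than speed.

```python
from typing import List, Sequence


def build_suffix_array(tokens: Sequence[str]) -> List[int]:
    """Return suffix start positions sorted by the token suffix they begin.

    Naive construction (sorts whole suffixes); fine for small examples only.
    """
    return sorted(range(len(tokens)), key=lambda i: list(tokens[i:]))


def find_ngram(tokens: Sequence[str], sa: List[int], ngram: Sequence[str]) -> List[int]:
    """Return the sorted corpus positions where `ngram` occurs.

    All suffixes starting with the n-gram are contiguous in the suffix
    array, so two binary searches find the block boundaries.
    """
    target = list(ngram)
    n = len(target)

    def prefix(k: int) -> List[str]:
        # First n tokens of the k-th suffix in sorted order.
        return list(tokens[sa[k]:sa[k] + n])

    # Lower bound: first suffix whose n-token prefix is >= target.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < target:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose n-token prefix is > target.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= target:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


corpus = "the cat sat on the mat the cat".split()
suffix_array = build_suffix_array(corpus)
positions = find_ngram(corpus, suffix_array, ["the", "cat"])
print(positions, len(positions))
```

The number of occurrences is simply the size of the matching block, which is why the same index can serve frequency counts for n-grams of any length.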

LookIng4LO

This project presents a system that, from a corpus of documents, extracts information about a subject area and a collection of pedagogical components. This information is packaged into fine-grained learning objects (metadata included).

Downloads: 0 This Week

Last Update: 2013-04-08

Top Ranked Phrases in a Corpus

This project is intended to list the top R ranked terms whose length is between M and N. It is designed to extract these phrases from a given corpus in an input folder.

Downloads: 0 This Week

Last Update: 2013-03-22
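
The ranking described above can be sketched in Python as follows. This is only one reading of the description, not the project's code: it assumes that "length" means the number of words in a phrase, that ranking is by raw frequency, and that the input folder holds UTF-8 `.txt` files. The names `top_phrases` and `read_folder` are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Iterable, List, Tuple


def top_phrases(texts: Iterable[str], r: int, m: int, n: int) -> List[Tuple[str, int]]:
    """Return the r most frequent word n-grams of length m..n (inclusive)."""
    counts: Counter = Counter()
    for text in texts:
        # Assumption: lowercase the text and treat runs of word characters as tokens.
        words = re.findall(r"\w+", text.lower())
        for size in range(m, n + 1):
            for i in range(len(words) - size + 1):
                counts[" ".join(words[i:i + size])] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:r]


def read_folder(folder: str) -> List[str]:
    """Read every .txt file in `folder` (assumed layout of the input folder)."""
    return [p.read_text(encoding="utf-8") for p in sorted(Path(folder).glob("*.txt"))]


sample = ["a b a b", "a b c"]
print(top_phrases(sample, r=2, m=1, n=2))
```

For a real corpus, `top_phrases(read_folder("input"), r=10, m=2, n=4)` would list the ten most frequent phrases of two to four words, where `"input"` stands in for whatever folder holds the documents.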

BabyTALK

BabyTALK aims to add another brick in the wall of natural language learning. The baby needs to structure a corpus of texts when his tutor points to and talks about a particular part of the corpus. The baby should also describe any selected part of the corpus.

Downloads: 0 This Week

Last Update: 2016-08-22

CRFChunker: CRF English Phrase Chunker

CRFChunker: Conditional Random Fields Phrase Chunker (Phrase Chunking Tool) for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (F1-score of 95.77). Chunking speed: 700 sentences/s.

Downloads: 1 This Week

Last Update: 2013-03-11

CRFTagger: CRF English POS Tagger

CRFTagger: Conditional Random Fields Part-of-Speech (POS) Tagger for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (accuracy of 97.00%). Tagging speed: 500 sentences/s.

Downloads: 0 This Week

Last Update: 2013-03-25

AmiGram

AmiGram is the AMI Graphical Representation and Annotation Module. It is a general-purpose tool for multimodal corpus annotation and allows timeline-based annotation of NXT corpora in a layer-based environment.

Downloads: 0 This Week

Last Update: 2013-03-08

corpuscifre

Labeled corpus of Italian digits, good for speech recognition. The corpus is segmented and suited to experiments in speech recognition and phonetic recognition.

Downloads: 0 This Week

Last Update: 2016-08-20

Hybrid parser for French

TagHybrida is a French hybrid syntactic parser. TagHybrida is a four-stage parser combining hand-written and corpus-based information.

Downloads: 0 This Week

Last Update: 2016-06-02

Search Results for "corpus"

Showing 13 open source projects for "corpus"
