Search Results for "corpus"

Showing 19 open source projects for "corpus"

Filters applied: Information Analysis, Linux

1

iramuteq

IRAMUTEQ: R Interface for Multidimensional Analyses of Texts and Questionnaires (Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires). Data-processing software for text corpora or for individual/character-type data. Notably supports "ALCESTE"-type analyses.

Downloads: 637 This Week

Last Update: 2024-11-03
See Project
2

Application Generator for Stemmers

This is an application generator for conflation algorithms in Perl. The system supports generating Perl source code for a stemmer from a rule file, running a stemmer supported by the system, and parsing a corpus file.

Downloads: 0 This Week

Last Update: 2021-06-20
See Project
3

Korean Analyzer Rhino

Parsing Korean words by morpheme and part-of-speech

RHINO parses Korean words by morpheme and part of speech. Its dictionaries are based on the Korean Modern Tagged Corpus (on the scale of 12 million phrases), which was created by the Korean government, so it can analyze many cases of stems and endings. Its newly developed Dynamic Dictionary Technology lets words react to their context; in effect, it is a programmed database. For more information, see the files in the help folder.

Downloads: 2 This Week

Last Update: 2020-10-11
See Project
4

concordia

Powerful search library, best suited for computer-aided translation

...Concordance searcher - a tool for translators who need their translations to "agree" with one standard. Concordia is a C++ library for fast text lookup in large corpora. It uses an index stored in RAM, which takes up approximately 600MB of memory for a corpus of 2 million sentences. It is based on the idea of a suffix array, enhanced by other auxiliary data structures. Concordia can perform simple substring lookup at a rate of 5000 queries per second (on a personal computer), a speed the project claims no other search library can match. Moreover, Concordia can perform its own "concordia search". ...

Downloads: 0 This Week

Last Update: 2019-02-28
See Project
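
The Concordia entry above mentions substring lookup backed by a suffix array. The following is a minimal Python sketch of that general idea only; it is not Concordia's C++ API or its actual index layout, and the function names `build_suffix_array` and `find_occurrences` are hypothetical. The naive construction shown here is meant for small texts, not for corpora of millions of sentences.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, sorted lexicographically.

    Naive construction (sorts full suffixes); fine for small illustrative inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of pattern in text using two binary searches."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in [start, lo) begins with pattern.
    return sorted(suffix_array[start:lo])


if __name__ == "__main__":
    corpus = "the cat sat on the mat"
    sa = build_suffix_array(corpus)
    print(find_occurrences(corpus, sa, "the"))
```

Because the suffixes are sorted, all suffixes that begin with the query form one contiguous block, so each lookup costs two binary searches rather than a scan of the whole text.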
5

Bitextor

CODE MOVED TO GITHUB: https://github.com/bitextor
Bitextor is an application created to generate translation memories using multilingual websites as a corpus source. It downloads an entire website and applies a set of heuristics (based mainly on HTML tag structure and text block length) to find bitexts.

Downloads: 0 This Week

Last Update: 2018-04-17
See Project
6

Corpus Manager

Yet another corpus manager. Allows HTTP access to annotated text corpora; the client does not need to install any special software to access the server (any browser with JavaScript support will do).

Downloads: 0 This Week

Last Update: 2017-10-05
See Project
7

CRFSharp

CRFSharp is a .NET (C#) implementation of Conditional Random Fields

...It encodes model parameters by L-BFGS. Moreover, it has many significant improvements over CRF++, such as fully parallel encoding and optimized memory usage. Currently, when training on a corpus, CRF# can, compared with CRF++, make full use of multi-core CPUs while using very little memory, and memory usage grows smoothly and slowly as the size of the training corpus and the number of tags increase. With multi-threaded processing, CRF# is now more suitable than CRF++ for training on large data and tag sets. For example, on a machine with 64GB, CRF# quickly encodes a model with more than 450 million features.

Downloads: 0 This Week

Last Update: 2015-08-03
See Project
8

Corsis (formerly Tenka Text)

An open-source corpus analysis class library written in C#. The GUI of Tenka Text 0.1.3 comes with Wordlister, an advanced, extremely fast graphical wordlist tool, and a simple regex concordance tool. Tenka Text - the open-source answer to WordSmith Tools

Downloads: 1 This Week

Last Update: 2013-05-10
See Project
9

SALM

A toolkit that uses suffix array indexing for empirical natural language processing. It provides functions such as searching for occurrences of n-grams in the corpus and a suffix array language model that can use arbitrarily long history.

Downloads: 0 This Week

Last Update: 2015-04-16
See Project
10

LookIng4LO

This project presents a system that, from a corpus of documents, extracts information about a subject area and a collection of pedagogical components. This information is packaged into fine-granularity learning objects (metadata included).

Downloads: 0 This Week

Last Update: 2013-04-08
See Project
11

Get 1T

Get1T is a tool for filtering through the massive quantity of data available in the Web 1T corpus and extracting only the counts you need - including for simple wildcard patterns.

Downloads: 0 This Week

Last Update: 2013-04-19
See Project
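
As a rough illustration of what extracting counts for a simple wildcard pattern can look like, here is a small Python sketch. It is not Get1T's implementation or interface. It assumes each input line has the form n-gram, a tab, then a count, and it treats `*` as matching exactly one token; the names `matches` and `filter_counts` and the sample counts are hypothetical.

```python
def matches(pattern, ngram):
    """True if ngram fits pattern token by token; '*' matches any single token."""
    pattern_tokens = pattern.split()
    ngram_tokens = ngram.split()
    if len(pattern_tokens) != len(ngram_tokens):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern_tokens, ngram_tokens))


def filter_counts(lines, pattern):
    """Collect the counts of all n-grams in lines that fit pattern.

    Assumes each line looks like 'w1 w2 ... wn<TAB>count' (an assumption
    about the input layout, labeled as such).
    """
    selected = {}
    for line in lines:
        ngram, _, count = line.rstrip("\n").partition("\t")
        if matches(pattern, ngram):
            selected[ngram] = int(count)
    return selected


if __name__ == "__main__":
    # Made-up sample counts, for illustration only.
    sample = ["serve as the\t100", "serve as a\t50", "serve in the\t7"]
    print(filter_counts(sample, "serve * the"))
```

Since `filter_counts` accepts any iterable of lines, an open file object can be passed directly, so a large count file is streamed line by line rather than loaded into memory.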
12

Top Ranked Phrases in a Corpus

This project lists the top R ranked terms whose length is between M and N. It is designed to extract these phrases from a given corpus in an input folder.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
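
The listing above does not say how terms are ranked or how length is measured. The sketch below is one possible reading, under two labeled assumptions: length is counted in words, and rank is raw frequency. The function name `top_phrases` is hypothetical and is not taken from the project.

```python
from collections import Counter


def top_phrases(tokens, m, n, r):
    """Return the r most frequent phrases of m to n words (inclusive).

    Assumptions: phrase length is measured in words, ranking is by raw count.
    """
    counts = Counter()
    for size in range(m, n + 1):
        # Slide a window of `size` words across the token list.
        for i in range(len(tokens) - size + 1):
            counts[" ".join(tokens[i:i + size])] += 1
    return counts.most_common(r)


if __name__ == "__main__":
    words = "a b a b c".split()
    print(top_phrases(words, 2, 3, 3))
```

Reading every file in an input folder would only require concatenating their token lists (or updating one shared `Counter`) before calling `most_common`.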
13

cl-cc-bnc

cl-cc-bnc provides a frontend for learners of English. You can enter a URI, whose content is analyzed by word frequency and compared to word frequencies in the British National Corpus.

Downloads: 0 This Week

Last Update: 2014-05-05
See Project
14

BabyTALK

BabyTALK aims to add another brick in the wall of natural language learning. The baby needs to structure a corpus of texts when its tutor points to and talks about a particular part of the corpus. The baby should also be able to describe any selected part of the corpus.

Downloads: 0 This Week

Last Update: 2016-08-22
See Project
15

CRFChunker: CRF English Phrase Chunker

CRFChunker: Conditional Random Fields Phrase Chunker (Phrase Chunking Tool) for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (F1-score of 95.77). Chunking speed: 700 sentences/s.

Downloads: 1 This Week

Last Update: 2013-03-11
See Project
16

CRFTagger: CRF English POS Tagger

CRFTagger: Conditional Random Fields Part-of-Speech (POS) Tagger for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (accuracy of 97.00%). Tagging speed: 500 sentences/s.

Downloads: 0 This Week

Last Update: 2013-03-25
See Project
17

AmiGram

AmiGram is the AMI Graphical Representation and Annotation Module. It is a general-purpose tool for multimodal corpus annotation and allows timeline-based annotation of NXT corpora in a layer-based environment.

Downloads: 0 This Week

Last Update: 2013-03-08
See Project
18

corpuscifre

Segmented and labeled corpus of Italian digits, suitable for speech recognition and phonetic recognition experiments.

Downloads: 0 This Week

Last Update: 2016-08-20
See Project
19

Hybrid parser for French

TagHybrida is a French hybrid syntactic parser. It is a four-stage parser combining hand-written and corpus-based information.

Downloads: 0 This Week

Last Update: 2016-06-02
See Project
