Search Results for "corpus linguistics" - Page 2

Showing 41 open source projects for "corpus linguistics"

1

Pacx

Platform for Annotated Corpora in XML. An integrated tool for corpus linguists, built on Eclipse, Vex, Subversive, etc., for creating and editing transcriptions and annotations, querying, managing version-controlled data, and building a shippable corpus.

Downloads: 0 This Week

Last Update: 2014-03-15
See Project
2

TF-IDF Measure

TF-IDF.jar is a Java archive that measures the TF-IDF of each document in a document collection (corpus). The jar can be used to (a) get all the terms in the corpus; (b) get the document frequency (DF) and inverse document frequency (IDF) of every term in the corpus; (c) get the TF-IDF of each document in the corpus; and (d) get, for every document, each term with its frequency (number of occurrences), term frequency (TF), and TF-IDF.

Downloads: 0 This Week

Last Update: 2015-12-17
See Project
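The quantities listed in the TF-IDF Measure entry can be sketched in a few lines of Python. This is only an illustration, not the jar's code: the listing does not say which weighting the jar uses, so the sketch assumes TF = count / document length and IDF = log(N / DF), and the function name `tf_idf` and the sample documents are made up.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """corpus: dict mapping a document name to its list of tokens.

    Returns (df, idf, scores), where scores[doc][term] is TF * IDF.
    Assumed weighting: TF = count / doc length, IDF = log(N / DF).
    """
    n_docs = len(corpus)
    counts = {name: Counter(tokens) for name, tokens in corpus.items()}

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())

    idf = {term: math.log(n_docs / df[term]) for term in df}

    scores = {}
    for name, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[name] = {
            term: (n / total) * idf[term] for term, n in term_counts.items()
        }
    return df, idf, scores


docs = {
    "d1": "the cat sat on the mat".split(),
    "d2": "the dog sat".split(),
}
df, idf, scores = tf_idf(docs)
```

With this weighting, a term that occurs in every document (such as "the" above) gets an IDF of zero, so it contributes nothing to any document's score.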
3

Uplug corpus tools

Various tools for creating annotated parallel corpora, including pre-trained tagging and parsing models for various languages, sentence alignment tools, and word alignment tools. Uplug also includes a web-based interface for interactive sentence and word alignment, and scripts for indexing and querying parallel corpora using the Corpus Workbench (CWB). Download 'uplug-main' first and then add other packages.

Downloads: 1 This Week

Last Update: 2013-04-29
See Project
4

ValiTerms

Validation of terms in corpus

ValiTerms is a tool that helps validate terms in a corpus. It finds their occurrences and lets terminologists decide whether each term is relevant. ValiTerms is developed at LIPN (http://www-lipn.univ-paris13.fr) by the RCLN team. Please consult the wiki for installation and usage instructions.

Downloads: 0 This Week

Last Update: 2015-10-06
See Project
5

Corpus redundancy manager

Redundancy due to cut-and-paste operations in text creates bias in machine learning for NLP. This module takes a directory and returns a list containing a subset of the files in that directory, chosen so that the similarity between any two files stays below an upper bound.

Downloads: 0 This Week

Last Update: 2014-06-30
See Project
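The idea behind the Corpus redundancy manager entry (keep a subset of files whose pairwise similarity stays under a bound) can be sketched as a greedy filter. The listing does not say which similarity measure or selection strategy the module uses; the sketch below assumes Jaccard similarity over word 3-grams, and every function name and default value in it is hypothetical.

```python
from pathlib import Path


def shingles(text, k=3):
    """Return the set of word k-grams in text (assumed similarity unit)."""
    words = text.split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_non_redundant(texts, max_similarity=0.5, k=3):
    """texts: dict mapping file name to contents.

    Greedy pass in sorted name order: a file is kept only if its
    similarity to every file kept so far is at most max_similarity.
    """
    kept, kept_shingles = [], []
    for name in sorted(texts):
        current = shingles(texts[name], k)
        if all(jaccard(current, other) <= max_similarity for other in kept_shingles):
            kept.append(name)
            kept_shingles.append(current)
    return kept


def select_from_directory(directory, max_similarity=0.5):
    """Apply the same filter to every regular file in a directory."""
    texts = {
        p.name: p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(directory).iterdir()
        if p.is_file()
    }
    return select_non_redundant(texts, max_similarity)
```

A greedy pass is simple and deterministic, but the result depends on the order in which files are visited; it does not guarantee the largest possible subset.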
6

Australian National Corpus

An ongoing project to collate and provide access to language data

Includes:
• Scripts for the program/code developed
• High-level architecture diagrams
• Install guides for developers
• Links to end-user documentation on the AusNC website

Note: The BSD license applies to customised plug-ins, scripts and ingest programs developed by the AusNC project team. Additional open source, 3rd party software products used by the AusNC solution are referenced on our SF wiki space.

Downloads: 0 This Week

Last Update: 2016-11-29
See Project
7

WebSynonymExtractor

a synonym extractor based on web-corpora and a multilingual translator

This project is an approach to synonym extraction and to extending WordNet with the synonyms found. The Python application is realised as a kind of pipe that starts with a web-corpus reader, followed by several workers (tokenizers, lemmatizers, ...), and completed by a result writer. In contrast to state-of-the-art approaches, this implementation is based on single words found on the web, used as a corpus, and translated to other languages. If translations of different...

Downloads: 0 This Week

Last Update: 2016-11-18
See Project
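The pipe layout described in the WebSynonymExtractor entry (reader, then a chain of workers, then a writer) maps naturally onto Python generators. The sketch below only illustrates that shape; none of the names come from the project, the "pages" are plain strings rather than fetched web content, and `normalise` is a trivial lowercasing stand-in where a real lemmatizer would go.

```python
def read_corpus(pages):
    """Reader stage: yield raw page texts one at a time."""
    for page in pages:
        yield page


def tokenize(texts):
    """Worker: split each text on whitespace and strip edge punctuation."""
    for text in texts:
        for token in text.split():
            yield token.strip(".,;:!?")


def normalise(tokens):
    """Worker: lowercase tokens (placeholder for a real lemmatizer)."""
    for token in tokens:
        if token:
            yield token.lower()


def write_results(tokens):
    """Writer stage: here it simply collects the stream into a list."""
    return list(tokens)


def run_pipeline(pages, workers):
    """Chain reader -> workers (in order) -> writer."""
    stream = read_corpus(pages)
    for worker in workers:
        stream = worker(stream)
    return write_results(stream)


result = run_pipeline(["Fast cars.", "Quick, fast!"], [tokenize, normalise])
```

Because every stage is a generator, items flow through the chain one at a time, and workers can be added, removed or reordered without touching the reader or writer.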
8

CRFSharp

CRFSharp is a .NET(C#) implementation of Conditional Random Field

... encoding, optimizing memory usage and so on. Compared with CRF++, CRF# currently makes full use of multi-core CPUs during training and uses very little memory; memory use grows slowly and smoothly as the training corpus and the number of tags increase. With multi-threaded processing, CRF# is now better suited than CRF++ to training on large data and tag sets. For example, on a machine with 64GB, CRF# quickly encodes a model with more than 450 million features.

Downloads: 0 This Week

Last Update: 2015-08-03
See Project
9

CorpSe

CORPSE (CORPus SEarch) is a powerful search engine written in Java. The aim is to provide an efficient implementation of word-level inverted index search, with various cool functions, that can be used on very large corpora.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
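For readers unfamiliar with the term used in the CorpSe entry, a word-level inverted index maps each word to the documents and positions where it occurs. The sketch below is a minimal in-memory illustration in Python, not CorpSe's Java implementation; the function names are hypothetical and tokenization is a plain lowercase whitespace split.

```python
from collections import defaultdict


def build_index(docs):
    """docs: dict mapping a document id to its text.

    Returns index[word][doc_id] -> list of word positions in that document.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index


def search_all(index, *words):
    """Return the ids of documents that contain every query word."""
    postings = [set(index.get(word.lower(), {})) for word in words]
    return set.intersection(*postings) if postings else set()


docs = {1: "The cat sat", 2: "the dog sat down", 3: "a cat"}
index = build_index(docs)
```

Storing positions as well as document ids is what makes the index word-level: phrase and proximity queries can then be answered from the index without rereading the documents.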
10

ug

Ug is a program to generate pseudo-realistic usernames using a categorised BNC (British National Corpus) wordlist. It combines words using pre-defined strategies such as adjective-noun or adjective-conjunction-adjective.

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
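The strategy-based combination described in the ug entry can be illustrated as follows. This is a sketch only: the word lists are tiny made-up samples rather than the BNC-derived list, and the names `WORDS`, `STRATEGIES` and `generate_username` are hypothetical. Only the two strategies named in the description are included.

```python
import random

# Hypothetical categorised word list (not taken from the BNC).
WORDS = {
    "adjective": ["quiet", "brave", "amber"],
    "noun": ["river", "falcon", "lantern"],
    "conjunction": ["and", "or"],
}

# Each strategy is a sequence of word categories to draw from.
STRATEGIES = [
    ("adjective", "noun"),
    ("adjective", "conjunction", "adjective"),
]


def generate_username(rng, strategy=None):
    """Pick one word per category in the strategy and join them."""
    if strategy is None:
        strategy = rng.choice(STRATEGIES)
    return "".join(rng.choice(WORDS[category]) for category in strategy)


rng = random.Random(0)  # seeded so repeated runs give the same names
name = generate_username(rng, ("adjective", "noun"))
```

Passing the random generator in explicitly keeps the function easy to test and makes runs reproducible when a seed is fixed.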
11

PyAnnotation

PyAnnotation is a Python library for accessing and manipulating linguistically annotated corpus files. Supported file formats are Kura XML, Elan XML and Toolbox files. A Corpus Reader API is provided to support statistical analysis within the NLTK.

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
12

MedTag - Annotated Corpora

A database of linguistic annotation of medical text (from MEDLINE), including corpora used with ABGene, BioCreative I and II, and the MedPost training corpus.

Downloads: 0 This Week

Last Update: 2014-02-05
See Project
13

ClipSyll

ClipSyll is a collection of scripts and programs for downloading, codifying, and analysing (using NLTK) CLIPS, the largest Italian corpus of spoken language. It includes a syllabification module based on the SSP: http://sourceforge.net/projects/silly

Downloads: 0 This Week

Last Update: 2013-04-02
See Project
14

Cunei Machine Translation Platform

Cunei is a data-driven machine translation system that builds dynamic, statistical models based on instances of known translations found in a corpus.

1 Review

Downloads: 0 This Week

Last Update: 2013-06-05
See Project
15

Sanchay

Sanchay is a collection of tools and APIs for language researchers. It has some implementations of NLP algorithms, some flexible APIs, several user-friendly annotation interfaces, and the Sanchay Query Language for language resources.

Downloads: 0 This Week

Last Update: 2013-04-11
See Project
16

NetLing Internet Corpus Linguistics

An Internet Corpus Linguistics analysis tool focusing on lexical variation over time and by geographical location. Initially the project will analyze word frequencies over time in the Linux Kernel Mailing List.

1 Review

Downloads: 0 This Week

Last Update: 2014-03-31
See Project
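The kind of analysis the NetLing entry describes, word frequencies over time, can be sketched by grouping token counts by period. The code below is an illustration under stated assumptions, not NetLing's implementation: messages are assumed to arrive as (year, text) pairs, the sample messages are invented, and both function names are hypothetical.

```python
from collections import Counter, defaultdict


def frequencies_by_year(messages):
    """messages: iterable of (year, text) pairs.

    Returns a dict mapping each year to a Counter of lowercase word counts.
    """
    counts = defaultdict(Counter)
    for year, text in messages:
        counts[year].update(text.lower().split())
    return counts


def relative_frequency(counts, word):
    """Share of all tokens in each year that are the given word."""
    return {
        year: year_counts[word] / sum(year_counts.values())
        for year, year_counts in sorted(counts.items())
        if year_counts
    }


messages = [
    (2001, "patch for the scheduler"),
    (2001, "the patch works"),
    (2002, "git git patch"),
]
counts = frequencies_by_year(messages)
git_share = relative_frequency(counts, "git")
```

Relative rather than raw counts are used so that years with more mail traffic do not automatically appear to use a word more often.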

