Showing 8 open source projects for "corpus"

View related business solutions
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    Software, information, data sets and documentation for the Web as Corpus community.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    concordia

    concordia

    Powerful search library, best suited for computer-aided translation

    ...Concordance searcher - tool for translators who need their translations to "agree" with one standard. Concordia is a C++ library for fast text lookup in large corpora. It uses a RAM stored index, which takes up approximately 600MB of memory for a corpus of 2 million sentences. It is based on the idea of a suffix array, enhanced by the presence of other auxiliary data structures. The effects are stunning - Concordia is able to do simple substring lookup at the pace of 5000 queries per second (on personal PC) - a speed which can not be achieved by any other search library. Moreover, Concordia can perform its own "concordia search". ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3

    rcqp

    R interface to the Corpus Query Protocol

    Implements the Corpus Query Protocol as a package for the R statistical environment. It allows to query linguistic corpora and manipulate the data as native R objects. It is based on the CWB software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4

    mwetoolkit

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

    ...These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expresisons, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Its development started in 2010 as a PhD thesis but the project keeps active (see the SVN logs). Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 5
    This is a library to extract raw unicode text from any written documents (office documents such as PDF, Word, OpenOffice, ...). It should be useful to developpers of search engine, text processing, corpus analysis, ....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Get1T is a tool for filtering through the massive quantity of data available in the Web 1T corpus and extracting only the counts you need - including for simple wildcard patterns.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    This project is supposed to list the Top R ranked terms that are of between M and N length. It is designed to extract these phrases from a given corpus in a input folder.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    reputron is a knowledge extraction engine platform that covers all aspect of text mining, relevance, indexing and querying on a corpus of text documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo