Showing 17 open source projects for "documents"

  • 1
    PDFMathTranslate

    PDF scientific paper translation with preserved formats

    PDFMathTranslate is a Python-based tool that uses AI translation to convert academic PDFs into bilingual (e.g. Chinese-English) documents while preserving formatting, including math notation. It supports OCR-enhanced content and offers CLI, GUI, Docker, and Zotero integration under AGPL v3.
    Downloads: 17 This Week
  • 2

    multinotes

    Text architecture for music theory.

    The text structures of notes and publications in music theory and musical analysis bring challenging requirements: how to include music notation excerpts, graphics, and even combinations thereof, into the typeset flow of paragraphs and into the workflow, and how to integrate navigable references to these and to single domain entities into running text. Furthermore, dynamic interactive documents can present complicated interdependencies to the reader more clearly, far beyond conventional paper publication. The multiNotes text architecture and processing pipeline is based on d2d and standard technologies (XSLT, ECMAScript, LilyPond, PostScript, etc.) and addresses these issues. An overview of the software architecture and its operation is given in: Journal of the Text Encoding Initiative, Open Issue 18/2024: "Using d2d for Writing XML --- The multiNotes Text Architecture for Musical Analysis" https://doi.org/10.4000/132ex
    Downloads: 0 This Week
  • 3

    TEI LingSIG

    Production space for the TEI Linguistics SIG

    This used to be the experimentation and production space for the Special Interest Group (SIG) of the Text Encoding Initiative (TEI) called "TEI for Linguists", LingSIG for short. Currently, this is a storage place for documents produced by the SIG. Use https://github.com/LingSIG to access the current production space.
    Downloads: 1 This Week
  • 4
    Edge Translate

    A translation extension

    ...We use the API provided by Google Translate to translate words and sentences, which guarantees the accuracy of translation results to a certain extent. We support translating text in PDF files, which removes a reading barrier for many users of PDF documents (due to a bug in Firefox, this feature is temporarily unavailable in that browser). We chose a friendly side pop-up to show the translation results. The pop-up panel pushes the page content aside so that it does not cover what the user is reading. Please refresh any page that needs to be translated after installing or updating the extension!
    Downloads: 1 This Week
  • 5

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g., named entity identification, de-identification of medical records). ...
    Downloads: 10 This Week
  • 6
    TIES

    A smart search engine for medical documents

    TIES (Text Information Extraction System) is a clinical text search engine that uses natural language processing techniques to extract medical concepts from free-text clinical reports. It provides secure, de-identified access to this information and has built-in collaboration tools and honest broker functionality. It is licensed for academic use under the BSD license. For commercial use, please contact Nexi at http://nexihub.com *** NOTICE: this software and forum are no longer...
    Downloads: 0 This Week
  • 7

    Arabic Corpus

    Text categorization, arabic language processing, language modeling

    The Arabic Corpus, compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ). The Khaleej-2004 corpus contains 5690 documents divided into 4 topics (categories). The Watan-2004 corpus contains 20291 documents organized into 6 topics (categories). Researchers who use these two corpora should cite the two main references: (1) For the Watan-2004 corpus: M. Abbas, K. Smaili, D. Berkani (2011), Evaluation of Topic Identification Methods on Arabic Corpora, Journal of Digital Information Management, vol. 9, no. 5, pp. 185-192. (2) For the Khaleej-2004 corpus: M. ...
    Downloads: 1 This Week
  • 8
    XML-Print

    XML-Print: typesetting arbitrary XML documents in high quality

    "XML-Print" is a joint project of the FH Worms (Prof. Marc W. Küster) and the University of Trier (Prof. Claudine Moulin) with support from TU Darmstadt (Prof. Andrea Rapp). Its goal is the creation of an XML formatter designed especially for the needs of the “Digital Humanities”. The project is funded by the DFG. Please visit https://sites.google.com/a/budabe.eu/xmlprint_de/kontakt and let us know what you think about XML-Print – Does it meet your expectations? – What is missing? –...
    Downloads: 0 This Week
  • 9

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to sharing text documents and data annotations. It allows a large number of different annotations to be represented.

    Project files contain:
      - simple code to hold/read/write data and perform sample processing
      - BioC-formatted corpora
      - BioC tools that work with BioC corpora

    BioC goals:
      - simplicity
      - interoperability
      - broad use
      - reuse

    There should be little investment required to learn to use a format or a software module to process that format. ...
    Downloads: 0 This Week
  • 10
    FALCON - Text Search Java Project

    JSON based text search Java Project

    What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based rather than word-based, unlike most document search tools or document readers. It also handles jumbled word order within a query and spelling mistakes. Techniques commonly used in this project are natural language processing, information extraction, and question-answering architecture.

    Latest version: details of the latest version can be found on the project website, http://geekdadaji.com

    Contact details: CREATOR: SWAPNIL A JADHAV (saj1919); EMAIL ID: dadajibudhau@gmail.com; WEBSITE: http://geekdadaji.com; LICENSE: CC BY-NC 4.0
    Downloads: 0 This Week
  • 11
    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ...The summarizer from Classifier4J has been adjusted to accept two inputs (let's call them A and B). The summarizer is trained on A to summarize document B, and vice versa. This extracts a relevant structure from both documents (and thus avoids over-training); the two structures are then compared using vector-space analysis to give a degree of belonging of one document to the other (and thus avoids the shortage of information). The method can also be used to create user-defined classes by merging texts of certain categories and then calculating the relevant distances between the documents, but this is not necessary.
    Downloads: 0 This Week
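The vector-space comparison step mentioned in the description above can be illustrated with a minimal sketch. This is not the project's Java code and it does not reproduce the Classifier4J summarizer; it only assumes that each text is reduced to a term-frequency vector and that the two vectors are compared by cosine similarity. The function names `term_vector` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Map a text to a term-frequency vector (lowercased word counts)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts.

    Returns 0.0 when either text contains no words.
    """
    va, vb = term_vector(text_a), term_vector(text_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Texts with no words in common score 0.0 and texts with identical word counts score approximately 1.0, so the value can be read as a rough degree of belonging of one text to the other.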
  • 12

    Language Constructor

    Complete tool for constructing/manipulating languages in digital form

    ...It allows for free experimentation with all aspects of the language, so it does not have to be made consistent on paper first. You can edit script, syntax, grammar, morphology, lexicon and phonology, as well as write documents in the language, as it might be too complex to be handled by current font technology. The information is stored in XML format for easy integration with other software.
    Downloads: 0 This Week
  • 13
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week
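The idea in item 13 of selecting a subset of files with an upper bound on pairwise similarity can be sketched as follows. The description does not say which similarity measure or selection strategy the module uses, so this sketch assumes Jaccard similarity over word sets and a greedy pass in sorted name order. The names `select_dissimilar` and `max_similarity` are hypothetical, and the input is a name-to-text mapping rather than a directory so the example stays self-contained.

```python
import re


def tokens(text):
    """Set of lowercased words in a text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_dissimilar(docs, max_similarity):
    """Greedily pick document names so that no two picked documents
    have Jaccard similarity above max_similarity.

    docs: mapping of file name to file text (assumed already loaded).
    """
    kept_names = []
    kept_tokens = []
    for name in sorted(docs):
        current = tokens(docs[name])
        # Keep the document only if it is far enough from every kept one.
        if all(jaccard(current, other) <= max_similarity for other in kept_tokens):
            kept_names.append(name)
            kept_tokens.append(current)
    return kept_names
```

To run it on a directory, the mapping could be built with something like `{p.name: p.read_text() for p in pathlib.Path(directory).iterdir() if p.is_file()}`. The greedy pass depends on file order, so a different order may keep a different subset that still satisfies the bound.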
  • 14
    TikZ-dependency

    A LaTeX library to draw all sorts of dependency trees and graphs

    TikZ-dependency allows you to draw dependency graphs in LaTeX documents with little or no effort. The package has a very easy to learn, high level interface that can be used to draw simple dependency trees, complex non projective graphs, bubble parses, and in general any kind of graph which is based on a sequence of nodes and edges among these. It is based on PGF/TikZ and it can be used either with latex or pdflatex.
    Downloads: 0 This Week
  • 15
    A system to perform analysis of large documents for the purpose of cataloging similar documents. Similarity is based upon contextual analysis of these documents done by identifying common words and proper nouns.
    Downloads: 0 This Week
  • 16
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 4 This Week
  • 17
    ConTextKit is a Java-based implementation of Wendy Chapman's ConText algorithm for annotating the context of medical documents, specifically the negation, temporality, and experiencer.
    Downloads: 0 This Week