documents free download

Showing 34 open source projects for "documents"

Filters: Artificial Intelligence, Java

1

GROBID

A machine learning software for extracting information

GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents, with a particular focus on technical and scientific publications. First developments started in 2008 as a hobby. In 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such.

Downloads: 2 This Week

Last Update: 2026-04-07
See Project
2

Discourse Network Analyzer (DNA)

The Java software Discourse Network Analyzer (DNA) is a qualitative content analysis tool with network export facilities. You import text files and annotate statements that persons or organizations make, and the program will return network matrices of actors connected by shared concepts.
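
The idea of "network matrices of actors connected by shared concepts" can be illustrated with a minimal sketch. It is not DNA's code or export format: the `statements` data, the `actor_network` function, and the choice to count shared concepts as edge weights are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical annotated statements: (actor, concept) pairs.
statements = [
    ("Actor A", "carbon tax"),
    ("Actor B", "carbon tax"),
    ("Actor B", "emissions trading"),
    ("Actor C", "emissions trading"),
    ("Actor A", "carbon tax"),  # repeated statement, counted once
]


def actor_network(statements):
    """Return (actors, matrix); matrix[i][j] is the number of concepts
    that actors i and j both refer to. The diagonal is left at zero."""
    concepts_by_actor = defaultdict(set)
    for actor, concept in statements:
        concepts_by_actor[actor].add(concept)

    actors = sorted(concepts_by_actor)
    index = {actor: i for i, actor in enumerate(actors)}
    matrix = [[0] * len(actors) for _ in actors]

    # Compare every unordered pair of actors and fill both triangles.
    for a, b in combinations(actors, 2):
        shared = len(concepts_by_actor[a] & concepts_by_actor[b])
        matrix[index[a]][index[b]] = shared
        matrix[index[b]][index[a]] = shared
    return actors, matrix


actors, matrix = actor_network(statements)
for actor, row in zip(actors, matrix):
    print(actor, row)
```

Using sets per actor means repeated statements do not inflate the weights; a real tool may well weight edges differently.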

Downloads: 5 This Week

Last Update: 2024-08-20
See Project
3

OpenKM Document Management - DMS

Document Management System and Content Management System

...Collaborate with colleagues on documents and projects. Capitalize on accumulated knowledge by locating documents and information sources. Control business processes with an embedded workflow engine. Automate tasks. For a complete feature list visit: http://goo.gl/au8cQy

32 Reviews

Downloads: 261 This Week

Last Update: 2026-04-17
See Project
4

CCIL

An SOA framework for web content classification, clustering and automated interlinking of terms between documents. It will provide an expandable set of services such as semantic search, ranking, retrieval and classification of large-scale web resources.

Downloads: 0 This Week

Last Update: 2026-03-26
See Project
5

TXM

Unicode XML TEI text analysis platform

TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP...

Downloads: 2 This Week

Last Update: 2024-12-09
See Project
6

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. It is a command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, Maven dependencies and web resources. CRGREP will search resources within resources in any combination and to any depth, for example text within a document within a zip archive, and so on. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). ...
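
Searching "resources within resources" amounts to a recursive descent. The sketch below is not CRGREP's implementation: it handles only zip archives and plain text, and the function names (`search_resource`, `make_zip`) and the `!` path separator are hypothetical choices for illustration.

```python
import io
import zipfile


def search_resource(path, data, needle, hits):
    """Record the path of every nested resource whose text contains needle.

    Zip archives are opened and each member is searched recursively;
    anything else is treated as text.
    """
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for member in archive.namelist():
                if member.endswith("/"):
                    continue  # skip directory entries
                search_resource(f"{path}!{member}", archive.read(member), needle, hits)
    elif needle in data.decode("utf-8", errors="ignore"):
        hits.append(path)
    return hits


def make_zip(files):
    """Build an in-memory zip archive from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# A text file inside a zip inside another zip.
inner = make_zip({"notes.txt": b"the quarterly report is attached"})
outer = make_zip({"readme.txt": b"nothing here", "inner.zip": inner})

hits = search_resource("outer.zip", outer, "report", [])
print(hits)
```

Each hit is reported with its full nesting path, so a match two archives deep can still be located.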

3 Reviews

Downloads: 0 This Week

Last Update: 2023-04-23
See Project
7

Next Generation Programming

Compose Software Without Writing Any Programming Code

...Please start by looking at examples from the website first. In this way, you can learn the features of the software and how to use the software in a very short time. More Information (Documents, Videos, Examples ...) : negep.epizy.com

2 Reviews

Downloads: 1 This Week

Last Update: 2022-01-14
See Project
8

TIES

A smart search engine for medical documents

TIES (Text Information Extraction System) is a clinical text search engine that uses Natural Language Processing techniques to extract medical concepts from free-text clinical reports. It provides secure de-identified access to this information and has built-in collaboration tools and honest broker functionality. It is licensed for academic use under the BSD license. For commercial use please contact Nexi at http://nexihub.com *** NOTICE: this software and forum are no longer...

1 Review

Downloads: 0 This Week

Last Update: 2019-09-09
See Project
9

Service Grid - Language Grid Base System

SOA infrastructure initially developed by NICT Language Grid Project

...Resources with complicated intellectual property issues are wrapped as Web services and shared on the Service Grid. If you release your software by using the software of this project, please include the following description in the documents or on the website.

* This software uses the [SOFTWARE] by the Language Grid project (http://langrid.org/).

[SOFTWARE] is one of:

* Service Grid Server Software (http://langrid.org/oss-project/en/service_grid.html)
* Language Service Development Libraries (http://langrid.org/oss-project/en/language_service.html)
* Language Grid Toolbox (http://langrid.org/oss-project/en/toolbox.html)

If you publish a paper by using the software of this project, please cite the following book...

Downloads: 0 This Week

Last Update: 2017-11-26
See Project
10

BioC

We describe a simple XML format to share text documents and annotation

A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented.

Project files contain:
- simple code to hold/read/write data and perform sample processing
- BioC-formatted corpora
- BioC tools that work with BioC corpora

BioC goals:
- simplicity
- interoperability
- broad use
- reuse

There should be little investment required to learn to use a format or a software module to process that format. ...

Downloads: 8 This Week

Last Update: 2016-08-08
See Project
11

Carrot2

Project moved to GitHub! https://github.com/carrot2/carrot2 Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize small collections of documents, e.g. search results, into thematic categories. Carrot2 integrates very well with both Open Source and proprietary search engines.

Downloads: 0 This Week

Last Update: 2015-09-10
See Project
12

DJVU++

The DjVu complete solution, with OCR technology (Arabic, English).

...The main features of DjVu++ program are:
o Manipulate DjVu files.
o Support smaller size than PDF with the same performance.
o DjVu++ supports two languages in the OCR technique (Arabic and English).
o Read multiple documents at the same time with the new tabs feature.
o DjVu++ supports multiple formats:
   - Convert PDF document into DjVu format with smaller file size and the same performance.
   - Convert DjVu into PDF format.
   - Combine images to a single DjVu document.
   - Perform OCR operations on multiple image formats.

4 Reviews

Downloads: 7 This Week

Last Update: 2015-08-24
See Project
13

DjVuPlus

DjVu Read Documents, With OCR Technology (Arabic, English), Small Size

The DjVu Reference Library 3.5 was released by Lizardtech under the GNU General Public License version 2. DjVuLibre-3.5 was developed by Leon Bottou and others as a "Derived Work" of the DjVu Reference Library 3.5. As such, it is also subject to the GNU General Public License version 2. Several patents apply to two very specific aspects of DjVu and DjVuLibre. The patents cover a particular aspect of the ZP-coder (the arithmetic coder used in DjVu and implemented in libdjvu/ZPCodec.cpp)...

1 Review

Downloads: 0 This Week

Last Update: 2015-06-25
See Project
14

OpenIMAJ

OpenIMAJ: The Open toolkit for Intelligent Multimedia Analysis in Java. OpenIMAJ contains a large collection of pure-Java classes for analysing multimedia documents, from tools for extracting image features, to tools for analysing web pages.

Downloads: 0 This Week

Last Update: 2015-03-18
See Project
15

FALCON - Text Search Java Project

JSON based text search Java Project

What is it?
"Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based rather than word-based, unlike most document search tools or document readers. It also tolerates jumbled word order within a query and spelling mistakes. Commonly used techniques in this project are Natural Language Processing, Information Extraction and Question-Answering Architecture.

Latest Version
Details of the latest version can be found on the project website - http://geekdadaji.com

Contact Details
CREATOR : SWAPNIL A JADHAV (saj1919)
EMAIL ID : dadajibudhau@gmail.com
WEBSITE : http://geekdadaji.com
LICENSE : CC BY-NC 4.0

Downloads: 0 This Week

Last Update: 2014-04-18
See Project
16

Consilium Sentence Suggestions Tools

Consilium – User Defined sentence Suggestion Tool.

Many tools on the market provide spelling or grammar correction while writing documents, but very few provide sentence completion based on previously entered text. Those that do suggest completions only for sentences that are commonly used by everyone in the same way. In reality, writing style varies from person to person. Our aim is to provide a sentence suggestion tool that suggests how to complete a sentence according to data previously entered by the user. ...

Downloads: 0 This Week

Last Update: 2014-02-24
See Project
17

Unsupervised TXT classifier

Classify any two TXT documents, no training required - JAVA

...The summarizer from Classifier4J has been adjusted to accept two inputs (let's call them A and B). The summarizer is trained with A to summarize document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids over-training); the two structures are then compared using vector-space analysis to give a degree of belonging of one document to the other (and thus avoids the shortage of information). This method can also be used to create user-defined classes by merging texts of certain categories and then calculating the relevant distances between the documents, but this is not necessary.
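
The final vector-space comparison step can be sketched as a cosine similarity between term-frequency vectors. This is only an illustration of that one step, not the project's code: it skips the Classifier4J cross-summarization entirely, and the tokenizer and the function names `term_vector` and `cosine_similarity` are assumptions.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Map a text to a bag-of-words vector of lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors.

    Returns 0.0 when either vector is empty.
    """
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical inputs standing in for the two extracted structures.
doc_a = "Clustering groups similar documents without labels."
doc_b = "Unlabelled documents can be grouped by clustering."
score = cosine_similarity(term_vector(doc_a), term_vector(doc_b))
print(f"similarity: {score:.3f}")
```

A score near 1 means the two texts use much the same vocabulary in similar proportions; a score of 0 means they share no terms.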

Downloads: 0 This Week

Last Update: 2013-12-19
See Project
18

DocCO

Non-disjoint grouping of documents based on a word sequence approach

This is a GUI for learning non-disjoint groups of documents, based on the Weka machine learning framework. It offers non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases or from a specified URL.

Downloads: 0 This Week

Last Update: 2013-08-17
See Project
19

RapidMiner Information Extraction Plugin

The Information Extraction Plugin allows the use of information extraction techniques within RapidMiner. It can be seen as an interface between natural language and IE- or datamining-methods, by extracting interesting information out of documents.

Downloads: 0 This Week

Last Update: 2015-08-07
See Project
20

TestEl

TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.

1 Review

Downloads: 0 This Week

Last Update: 2014-06-09
See Project
21

Topic Modeling Tool

A graphical tool to discover topics from collections of text documents.

Downloads: 0 This Week

Last Update: 2016-05-24
See Project
22

text-analysis

This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- or multiple-document summarization, and plagiarism detection.

Downloads: 0 This Week

Last Update: 2014-05-20
See Project
23

ANts P2P

ANts P2P realizes a third generation P2P net. It protects your privacy while you are connected and makes you untraceable, hiding your identity (IP) and encrypting everything you send to or receive from others.

20 Reviews

Downloads: 10 This Week

Last Update: 2013-04-15
See Project
24

Karol Zalewski - Master's Thesis

Master's Thesis subject: "Knowledge repositories for effective and secure services executing in agent environment." Goal: Developing optimal method for storing knowledge in distributed agent applications. Java code + LaTeX documents.

Downloads: 0 This Week

Last Update: 2013-05-15
See Project
25

TableSeer

TableSeer is a tool that automatically identifies tables in digital documents and extracts the contents in the cells of the tables as well as table metadata

1 Review

Downloads: 0 This Week

Last Update: 2016-10-07
See Project