Search Results for "document analysis"

Showing 19 open source projects for "document analysis"

Active filters: Information Analysis, Java

1

DynaQ

Innovative text document search. See http://dynaq.opendfki.de for details.

The goal of DynaQ is to develop an inquiry system for exploring the personal information space, supporting you with the 'orienteering' search paradigm. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. See our GitLab homepage for source code and documentation: http://dynaq.opendfki.de

Downloads: 0 This Week

Last Update: 2021-08-05
2

jLDADMM

A Java package for the LDA and DMM topic models

The Java package jLDADMM is released to provide alternative choices for topic modeling on normal or short texts. It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e. mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models. For usage details, see the jLDADMM website at http://jldadmm.sourceforge.net/

1 Review

Downloads: 0 This Week

Last Update: 2016-03-13
3

DCTFinder

Extract the title and creation time from a web page.

Web pages do not offer reliable metadata about their creation date and time. However, obtaining the document creation time is a necessary step before temporal normalization systems can be applied to web pages. DCTFinder is a system that parses a web page and extracts the title and the creation date of the page from its content. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation...

Downloads: 0 This Week

Last Update: 2016-10-21
4

SCAN

SCAN (Smart Content Aggregation and Navigation) is a universal semantic content aggregator. It combines search, text analysis, tagging and metadata functions to provide a new user experience for desktop navigation and document management.

3 Reviews

Downloads: 2 This Week

Last Update: 2014-06-19
5

FALCON - Text Search Java Project

JSON-based text search Java project

What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also handles jumbled word order within a query and spelling mistakes. Commonly used techniques in this project are Natural Language...

Downloads: 0 This Week

Last Update: 2014-04-18
6

Texalyzer

Text analyzer

Analyzes a text document using TF-IDF and, optionally, a stopword list, and extracts important keywords.

Downloads: 0 This Week

Last Update: 2017-04-04
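
As a rough illustration of TF-IDF keyword extraction with an optional stopword list, here is a minimal Python sketch. It is not Texalyzer's code: the function names, the tokenizer, and the smoothed IDF formula are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]


def tfidf_keywords(target, corpus, stopwords=frozenset(), top_n=3):
    """Rank the terms of `target` by TF-IDF against `corpus` (hypothetical helper)."""
    docs = [set(tokenize(d, stopwords)) for d in corpus]
    terms = tokenize(target, stopwords)
    tf = Counter(terms)
    n_docs = len(docs)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in docs if term in d)           # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0    # smoothed IDF (an assumption)
        scores[term] = (count / len(terms)) * idf
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
keywords = tfidf_keywords(corpus[0], corpus, stopwords={"the", "on"})
```

In this toy corpus "mat" appears in only one document, so it ranks above "cat" and "sat", which each appear in two.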
7

Unsupervised TXT classifier

Classify any two TXT documents, no training required (Java)

...In a way, this is similar to clustering, but it is not really a clustering algorithm since some training is involved. The summarizer from Classifier4J has been adjusted to accept two inputs (let's call them A and B). The summarizer is trained with A to summarize document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids over-training); the two structures are then compared using vector-space analysis to give a degree of belonging of one document to the other (and thus avoids the shortage of information). This method can be used to create user-defined classes by merging texts of certain categories and then calculating the relevant distances between the documents, but this is not necessary.

Downloads: 0 This Week

Last Update: 2013-12-19
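
The vector-space comparison step mentioned above can be sketched in Python as a cosine similarity between term-frequency vectors. This is only an illustration under that assumption: it does not reproduce the Classifier4J summarizer step, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-frequency vectors (0.0 to 1.0)."""
    va, vb = term_vector(text_a), term_vector(text_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document shares nothing with any other
    return dot / (norm_a * norm_b)


similarity = cosine_similarity("a b", "a c")
```

Two texts with identical term counts score 1.0 and texts with no shared terms score 0.0, so the value can be read as a degree of belonging of one document to the other.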
8

XmlView

GUI utility in pure Java for viewing and editing XML content; an example of an application built with Superficial (http://superficial.sourceforge.net).

Downloads: 0 This Week

Last Update: 2012-05-22
9

ILEDocs

ILEDocs is a documentation tool that helps software developers document their programs conveniently, in a way similar to javadoc.

Downloads: 0 This Week

Last Update: 2012-09-14
10

OpenSHORE

OpenSHORE is an XML-based Semantic Document Repository (SDR) with a freely definable metamodel that builds a semantic network from sections and relations in documents. The acronym SHORE stands for Semantic Hypertext Object Repository.

Downloads: 0 This Week

Last Update: 2013-04-15
11

Maui Topic Indexer

Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.

Downloads: 0 This Week

Last Update: 2014-04-25
12

RDF Document Manager

RDF-DocMan is a document manager based on a Sesame (RDF repository) backend. Documents are stored in the filesystem and their metadata in a Sesame repository. It was developed for porQual web content generator (also in sf.net).

Downloads: 0 This Week

Last Update: 2013-04-23
13

Trainable Relation Extraction framework

T-Rex (Trainable Relation Extraction) is a highly configurable, machine-learning-based framework for information extraction from text. It includes tools for document classification, entity extraction and relation extraction.

Downloads: 0 This Week

Last Update: 2013-05-02
14

iDocs

iDocs is an intellectual document workflow project with text mining options.

Downloads: 0 This Week

Last Update: 2014-04-08
15

Flesh

Flesh is a Java application designed to analyze a document (plain text, rich text, Word documents, and PDFs) and display how difficult it is to comprehend, using the Flesch-Kincaid Grade Level and the Flesch Reading Ease Score.

2 Reviews

Downloads: 3 This Week

Last Update: 2013-04-03
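
For reference, both scores are simple functions of words per sentence and syllables per word. The Python sketch below uses the standard published coefficients; the sentence splitter and the vowel-group syllable counter are crude assumptions for illustration and are not taken from Flesh itself.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels (a rough heuristic, at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade


ease, grade = readability("The cat sat. The dog ran.")
```

Higher Reading Ease means easier text, while the Grade Level approximates a US school grade, so very simple text such as the sample above can score above 100 on ease and below zero on grade.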
16

Qualiweb

Qualiweb aims to provide semantic web metrics for modeling website visitors' needs according to a given taxonomy or document classification. The web metrics provided by Qualiweb indicate how successful each of the website's topics has been.

Downloads: 0 This Week

Last Update: 2013-03-19
17

vyasa

vyasa is a digital library application that incorporates the functions of digital asset and document management systems. It facilitates information retrieval and knowledge discovery by providing comprehensive metadata generation and semantic analysis.

Downloads: 0 This Week

Last Update: 2013-04-24
18

Phoenix Information Extraction

Phoenix is an information extraction engine written in Java. Controlled by rules (declared in XML), it extracts information from any XML document (unstructured XHTML/OpenOffice documents). Supports XPath, additional conditions and top-down decomposit

Downloads: 3 This Week

Last Update: 2013-03-14
19

Judge

JUDGE (Java Utility for Document Genre Eduction) features automatic classification and clustering of documents, optionally as a webservice. The program is written entirely in Java and makes use of the Weka machine learning toolkit.

Downloads: 0 This Week

Last Update: 2015-12-01


© 2026 Slashdot Media. All Rights Reserved.