file analysis free download

Showing 25 open source projects for "file analysis"

1

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, Maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, such as text within a document within a zip archive, and so on. Here you...

3 Reviews

Downloads: 3 This Week

Last Update: 2023-04-23
See Project
2

DynaQ

Innovative text document search. See http://dynaq.opendfki.de for details.

The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for source code and documentation: http://dynaq.opendfki.de

Downloads: 0 This Week

Last Update: 2021-08-05
See Project
3

DSTK - DataScience ToolKit

DSTK - DataScience ToolKit for All of Us

DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify...

Downloads: 0 This Week

Last Update: 2018-05-08
See Project
4

Twitter Research Data Collector

Collects tweets through the Twitter Streaming API according to different search criteria and saves them in CSV and ARFF (WEKA) file formats.

Downloads: 0 This Week

Last Update: 2016-10-16
See Project
5

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 0 This Week

Last Update: 2015-10-06
See Project
6

SuperSenseTagger

The software annotates text with 41 broad semantic categories (Wordnet supersenses) for both nouns and verbs; i.e., it performs both sense disambiguation and named-entity recognition. The tagger implements a discriminatively-trained Hidden Markov Model.

Downloads: 0 This Week

Last Update: 2014-10-22
See Project
7

Flamingo Project

Workflow Designer, Hive Editor, Pig Editor, File System Browser

Flamingo is an open-source Big Data platform that combines an Ajax rich web interface, workflow engine, workflow designer, MapReduce, Hive editor, and Pig editor. 1. An easy tool for big data. 2. Comfortable to use with Hadoop ecosystem projects. 3. Licensed under GPL v3. Supports Pig IDE, Hive IDE, HDFS Browser, Scheduler, Hadoop Job Monitoring, Workflow Engine, Workflow Designer, and MapReduce.

3 Reviews

Downloads: 1 This Week

Last Update: 2016-11-29
See Project
8

Unsupervised TXT classifier

Classify any two TXT documents, no training required - JAVA

This program addresses the two most common issues with known classification algorithms: first, over-training, and second, a shortage of training data for categories. Instead, each TXT file is a category on its own, rather than an assigned category. In a way, this is similar to clustering, but it is not really a clustering algorithm since there is some training involved. The summarizer from Classifier4J has been adjusted to accept two inputs (let's call them A and B). Then, the summarizer gets...

Downloads: 0 This Week

Last Update: 2013-12-19
See Project
9

DocCO

Non-disjoint grouping of documents based on a word sequence approach

This is a GUI for learning non-disjoint groups of documents, based on the Weka machine learning framework. It offers non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases or from a specified URL. All the preprocessing techniques implemented in WEKA can be used before performing the learning.

Downloads: 0 This Week

Last Update: 2013-08-17
See Project
10

text-analysis

This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- or multiple-document summarization, and plagiarism detection.

Downloads: 0 This Week

Last Update: 2014-05-20
See Project
11

Deep Email Miner

The Deep Email Miner Application is a software solution for the multistage analysis of an email corpus. Social network analysis and text mining techniques are connected to enable an in-depth view into the underlying information. The self-executable Version 1.1 jar file will now run on Java 1.5 or higher. A Windows executable file of Version 1.1 is also provided in the Files section. Documentation can be found on the project homepage.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
12

TextMarker

TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full featured editor of the rule language and the build process of UIMA descriptors are complemented with components for visualization, explanation, testing and rule learning.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
13

Optex Analyzer

Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. It has a GUI that lets you select a set of input files containing raw algorithm results. The analysis is shown with tables and charts.

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
14

D.U.C.K

D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.

Downloads: 0 This Week

Last Update: 2015-08-05
See Project
15

Evolving Game for Unnatural Intelligence

Java package to study a clustering model described in the paper "Novel Clustering Algorithm Based Upon Games on Evolving Network" by Q. Li, Z. Chen, Y. He and J-P. Jiang (on arXiv: http://arxiv.org/pdf/0812.5064v1), generalizations and similar issues.

Downloads: 0 This Week

Last Update: 2016-02-03
See Project
16

JWebPro: A Java Web Processing Toolkit

JWebPro: A Java tool that can interact with Google search and then process the returned Web documents in a couple of ways. The outputs can serve as inputs for NLP, IR, information extraction, Web mining, and online social network extraction/analysis applications.

Downloads: 2 This Week

Last Update: 2013-03-13
See Project
17

bios sequential tagger

Bios is a suite of syntactico-semantic analyzers that includes the most common tools needed for the shallow analysis of English text.

Downloads: 0 This Week

Last Update: 2013-03-25
See Project
18

JVnSegmenter: Vietnamese Word Segmenter

JVnSegmenter is a Java-based, open-source Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool should be useful for the Vietnamese NLP community.

Downloads: 1 This Week

Last Update: 2013-03-22
See Project
19

Qualiweb

Qualiweb aims to provide semantic web metrics for modeling a website's visitors' needs according to a given taxonomy or document classification. The web metrics provided by Qualiweb give an indication of how successful each of the website's topics has been.

Downloads: 3 This Week

Last Update: 2013-03-19
See Project
20

JTextPro: A Java Text Processing Toolkit

JTextPro: A Java-based Text Processing tool that includes sentence boundary detection (using maximum entropy classifier), word tokenization (following Penn conventions), part-of-speech tagging (using CRFTagger), and phrase chunking (using CRFChunker).

Downloads: 1 This Week

Last Update: 2013-03-13
See Project
21

CRFChunker: CRF English Phrase Chunker

CRFChunker: Conditional Random Fields Phrase Chunker (Phrase Chunking Tool) for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (F1-score of 95.77). Chunking speed: 700 sentences/s.

Downloads: 0 This Week

Last Update: 2013-03-11
See Project
22

CRFTagger: CRF English POS Tagger

CRFTagger: Conditional Random Fields Part-of-Speech (POS) Tagger for English. The model was trained on sections 01..24 of the WSJ corpus, using section 00 as the development test set (accuracy of 97.00%). Tagging speed: 500 sentences/s.

Downloads: 0 This Week

Last Update: 2013-03-25
See Project
23

lastcall

Application of a neural network to predict the future of stock exchange, aerospace (UFO trajectory), sound, image, noise, scientific or medical data. You need, for example, 100 days of numeric data and the number of days to predict (for example, 7 days).

Downloads: 0 This Week

Last Update: 2015-06-21
See Project
24

AutoSummary Semantic Analysis Engine

AutoSummary uses Natural Language Processing to generate a contextually-relevant synopsis of plain text. It uses statistical and rule-based methods for part-of-speech tagging, word sense disambiguation, sentence deconstruction and semantic analysis.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-25
See Project
25

Numerical Cruncher

Pattern recognition software package. It includes several classification and clustering algorithms. It can read data from a set of images, an ASCII file or a JDBC connection. A small TCP data server with its corresponding JDBC driver is included.

Downloads: 0 This Week

Last Update: 2013-02-25
See Project