Page 3 | Best Open Source Java Machine Learning Software

BorderFlow

BorderFlow implements a general-purpose graph clustering algorithm. It maximizes the inner to outer flow ratio from the border of each cluster to the rest of the graph.

Downloads: 0 This Week

Last Update: 2013-04-18

See Project

Ceka

Crowd Environment and its Knowledge Analysis

A knowledge analysis tool for crowdsourcing based on Weka. We also have a Python version of Crowdsourcing Learning: CrowdwiseKit on GitHub (https://github.com/tssai-lab/CrowdwiseKit).

Downloads: 0 This Week

Last Update: 2023-04-20

See Project

Chordalysis

Log-linear analysis (data modelling) for high-dimensional data

Project moved to https://github.com/fpetitjean/Chordalysis

Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential complexity, previous approaches did not scale beyond about a dozen variables. Chordalysis is a log-linear analysis method for big data. It exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. This makes it possible to discover the structure of datasets with thousands of variables on a standard desktop computer. See http://www.francois-petitjean.com/Research/ for the associated papers at ICDM 2013, ICDM 2014 and SDM 2015.

YourKit supports the Chordalysis open source project with its full-featured Java Profiler. YourKit is the creator of innovative and intelligent tools for profiling Java and .NET applications. http://www.yourkit.com

Downloads: 0 This Week

Last Update: 2015-01-29

See Project

Cinefile

A category-based approach to exploring film data.

Cinefile is a prototype of a category-based method of database exploration. It allows the user to identify abstract categories of films by providing examples of category members, learns to classify films as belonging or not belonging to those categories, and provides a graphical interface for exploring and comparing categories. Cinefile is designed to work with data retrieved from the Internet Movie Database (imdb.com). This data is used for classification and is the subject of the category-based analysis. Cinefile was developed by the University of Mary Washington's Computer Science department (http://cas.umw.edu/computerscience).

Downloads: 0 This Week

Last Update: 2016-11-18

See Project

Consilium Sentence Suggestions Tools

Consilium – User Defined sentence Suggestion Tool.

Many tools on the market offer spelling or grammar correction while you write documents, but very few offer sentence completion based on previously entered text. Those that do suggest completions for sentences that most people use regularly and in the same way. In reality, writing style varies from person to person. Our aim is to provide a sentence suggestion tool that completes sentences according to the data the user has previously entered. The suggestion for the same sentence or word therefore differs from person to person, depending on what each person has typed before. This makes it easier to type documents, SMS messages, emails, chat messages, and so on.

Downloads: 0 This Week

Last Update: 2014-02-24

See Project

DE-HEoC

DE-based Weight Optimisation for Heterogeneous Ensemble

We propose using the Differential Evolution (DE) algorithm to adjust the weights of the base classifiers in a weighted-voting heterogeneous ensemble of classifiers. The average Matthews Correlation Coefficient (MCC) score, calculated over 10-fold cross-validation, is used as the measure of ensemble quality. The DE/rand/1/bin algorithm is used to maximize the average MCC score calculated using 10-fold cross-validation on the training dataset. The voting weights of the base classifiers are optimized so that the heterogeneous ensemble attains better generalization performance on testing datasets.
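The idea described above can be illustrated with a minimal sketch. It is not the project's code: the toy labels and votes, the function names, the clipping of weights to [0, 1], and the parameter values (`f`, `cr`, population size) are all assumptions made for illustration, and a single MCC on toy data stands in for the 10-fold cross-validated average used by the authors.

```python
import random


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def weighted_vote(weights, votes):
    """votes[i][j] is classifier i's 0/1 prediction for sample j."""
    total = sum(weights)
    result = []
    for j in range(len(votes[0])):
        score = sum(w for w, v in zip(weights, votes) if v[j] == 1)
        # Predict 1 when more than half of the total weight votes for 1.
        result.append(1 if 2 * score > total else 0)
    return result


def de_rand_1_bin(fitness, dim, pop_size=12, f=0.5, cr=0.9,
                  generations=40, seed=0):
    """Maximise `fitness` with DE/rand/1/bin; weights are clipped to [0, 1]."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # "rand/1": three distinct random individuals, none equal to i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:  # "bin": binomial crossover
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))
                else:
                    trial.append(pop[i][j])
            s = fitness(trial)
            if s >= scores[i]:  # greedy selection keeps the better vector
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]


# Hypothetical toy data: true labels and the votes of three base classifiers.
Y_TRUE = [1, 0, 1, 1, 0, 0, 1, 0]
VOTES = [
    [1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 1, 0],
]


def fitness(weights):
    return mcc(Y_TRUE, weighted_vote(weights, VOTES))


weights, score = de_rand_1_bin(fitness, dim=len(VOTES))
print("weights:", [round(w, 3) for w in weights], "MCC:", round(score, 3))
```

In a real setting the fitness function would retrain or re-evaluate the ensemble on each cross-validation fold and return the mean MCC, which is considerably more expensive than this toy objective.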

Downloads: 0 This Week

Last Update: 2015-11-24

See Project

DGRLVQ

Dynamic Generalized Relevance Learning Vector Quantization

Learning vector quantization (LVQ) based methods share some common problems. For multimodal data structures, one cannot reliably guess the number of prototypes required at initialization; that is, these algorithms are very sensitive to prototype initialization, and the optimal number of prototypes has to be defined before running the algorithm. If a prototype is, for some reason, ‘outside’ the cluster it should represent, and points of a different category lie in between, those points act as a barrier and the prototype will not find its optimum position during training. Since the model complexity is not known in many cases, we avoid this problem by introducing a "dynamic" version of LVQ: Dynamic-GRLVQ (DGRLVQ) adapts the model complexity to the given problem during training by adding or removing prototypes dynamically (in real time), one by one for each category, until satisfactory classification results are achieved.

Downloads: 0 This Week

Last Update: 2018-04-03

See Project

DSTK - DataScience ToolKit

DSTK - DataScience ToolKit for All of Us

DSTK - DataScience ToolKit is free, open source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broad workbench that also covers text analysis and predictive analytics. You can still use JASP for advanced data editing and RapidMiner for advanced prediction modeling. DSTK is written in C#, Java and Python to interface with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for Big Data analytics with Microsoft Azure HDInsight (Spark server) with Livy. License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each have their own licenses.

Downloads: 0 This Week

Last Update: 2018-05-08

See Project

Darwin Genetic Programming Environment

The Darwin Genetic Programming Environment is a graphical Genetic Programming Environment for the facilitation of research in Genetic Programming.

Downloads: 0 This Week

Last Update: 2016-08-02

See Project

Data Mining Platform

Data Mining Platform is a platform for data mining and analysis. It contains many new and sophisticated methods, such as kernel-based classification, two-way clustering, Bayesian networks, pattern recognition for time series analysis and many other

Downloads: 0 This Week

Last Update: 2013-04-18

See Project

DistributedLDA

This project aims to develop a system that trains an LDA model in a distributed environment. I studied a Hadoop-based solution and found that Hadoop is not a good fit for distributed LDA training. In this project I implement a socket-based platform instead.

Downloads: 0 This Week

Last Update: 2016-08-08

See Project

DocCO

Non-disjoint grouping of documents based on a word sequence approach

This is a GUI for learning non-disjoint groups of documents, based on the Weka machine learning framework. It supports non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases or from a specified URL. All the preprocessing techniques implemented in WEKA can be applied before learning.

Downloads: 0 This Week

Last Update: 2013-08-17

See Project

Domino Brain

The goal is to create an artificial intelligence for the game of dominoes.

Downloads: 0 This Week

Last Update: 2013-04-18

See Project

Drug Extraction

Drug name extraction

Drug name recognition and normalisation/grounding to DrugBank ids and standard names.

The package provides 2 taggers:
1. DrugTagger - CRF-based with DrugBank presence feature (see feature set for details).
2. DrugnameGazetteer - gazetteer/dictionary-based. Dictionary created from the DrugBank.ca database.

Both taggers include grounding/normalisation to DrugBank ids and standard names.

Feature set: Word, Word-1, Word+1, Word-1_Word, Word_Word+1, DrugBankPresence, POS. The DrugBankPresence feature indicates the presence of the drug name in DrugBank.

Using CONLL-Evaluation: processed 32065 tokens with 3656 phrases; found: 3251 phrases; correct: 2786. accuracy: 95.25%; precision: 85.70%; recall: 76.20%; FB1: 80.67

Using GATE Corpus Benchmark:
Strict: P: 0.65 R: 0.73 F1: 0.69
Lenient: P: 0.74 R: 0.84 F1: 0.78

For details of how to reproduce the evaluation, see the README. To use the standalone version for tagging, download DrugExtractionStandalone.tar.gz from Files.
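The CONLL-Evaluation precision, recall and FB1 figures quoted above follow directly from the reported phrase counts. The short check below is only an illustration of that arithmetic; the function name `prf` is made up here, and the token-level accuracy (95.25%) cannot be recomputed because the number of correctly tagged tokens is not given.

```python
def prf(total_phrases, found, correct):
    """Phrase-level precision, recall and F1 from raw counts."""
    precision = correct / found          # correct phrases among those found
    recall = correct / total_phrases     # correct phrases among all gold phrases
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the CONLL-Evaluation line in the project description.
p, r, f1 = prf(total_phrases=3656, found=3251, correct=2786)
print(f"precision: {p:.2%}; recall: {r:.2%}; FB1: {f1 * 100:.2f}")
# prints "precision: 85.70%; recall: 76.20%; FB1: 80.67"
```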

Downloads: 0 This Week

Last Update: 2015-06-12

See Project

E-learning Miner (ELM)

E-learning Miner, formerly DŽEMUj, is a tool for mining e-learning data, aimed at teachers.

Downloads: 0 This Week

Last Update: 2013-04-18

See Project

Easy Machine Learning

Easy Machine Learning is a general-purpose dataflow-based system

Machine learning algorithms have become the key components in many big data applications. However, the full potential of machine learning is still far from being realized because using machine learning algorithms is hard, especially on distributed platforms such as Hadoop and Spark. The key barriers come from not only the implementation of the algorithms themselves but also the processing for applying them to real applications which often involve multiple steps and different algorithms. Our platform Easy Machine Learning presents a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real-world tasks. In the system, a learning task is formulated as a directed acyclic graph (DAG) in which each node represents an operation (e.g. a machine learning algorithm), and each edge represents the flow of the data from one node to its descendants.
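The DAG formulation described above can be sketched in a few lines. This is not Easy Machine Learning's API: the function `run_dataflow`, the node names and the toy operations are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) is used here simply to run each node after all of its predecessors.

```python
from graphlib import TopologicalSorter


def run_dataflow(nodes, edges, source_data):
    """Run a dataflow DAG.

    nodes: name -> function taking a list of upstream outputs
    edges: list of (src, dst) pairs; data flows from src to dst
    source_data: input handed to nodes that have no predecessors
    """
    preds = {name: [] for name in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    results = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        inputs = [results[p] for p in preds[name]] or [source_data]
        results[name] = nodes[name](inputs)
    return results


# Hypothetical four-node task: load -> (scale, mean) -> report.
nodes = {
    "load": lambda ins: list(ins[0]),
    "scale": lambda ins: [x / max(ins[0]) for x in ins[0]],
    "mean": lambda ins: sum(ins[0]) / len(ins[0]),
    "report": lambda ins: {"scaled": ins[0], "mean": ins[1]},
}
edges = [("load", "scale"), ("load", "mean"),
         ("scale", "report"), ("mean", "report")]

results = run_dataflow(nodes, edges, [1.0, 2.0, 4.0, 1.0])
print(results["report"])
```

A distributed system would schedule independent nodes (here `scale` and `mean`) in parallel; the sequential loop above only shows the ordering constraint that the edges impose.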

Downloads: 0 This Week

Last Update: 2024-08-13

See Project

Edlin

The Edlin toolkit provides a machine learning framework for linear models, designed to be easy to read and understand. The main goal is to provide an architecture and easy-to-edit working examples of implementations of popular learning algorithms.

1 Review

Downloads: 0 This Week

Last Update: 2014-03-10

See Project

EpochX

EpochX is an open source genetic programming framework, specifically for analysing the properties of evolutionary automatic programming. It supports 3 popular representations - Strongly-Typed GP, Context-Free Grammar GP and Grammatical Evolution.

Downloads: 0 This Week

Last Update: 2013-04-16

See Project

FENNIX

Fast EXperimentation with Neural Networks

FENNIX is a simulator of artificial neural networks written in Java. It lets you describe a complete simulation either with a simple text script language or by adding nodes to a tree of tasks in the graphical user interface. Moreover, FENNIX is composed of pluggable tools that can be easily modified to add new functionality to the simulator.

Downloads: 0 This Week

Last Update: 2015-10-31

See Project

Face Detector

Detect faces in real time

This Face Detector app can detect multiple faces in real time or in images stored on the device. In real-time detection mode, the user must grant the app access to the device camera and voice recorder. In Gallery mode, the user selects images from external storage, and the app detects the faces in them. It is a short, simple AI-based project that runs on the Firebase ML Kit API and the Google Play Vision API. The app is completely free to download.

1 Review

Downloads: 0 This Week

Last Update: 2020-03-23

See Project

Fraeser: errors-in-variables estimation

A graphical MATLAB framework for parameter estimation, modeling and simulation of static and dynamic linear and polynomial systems in the errors-in-variables context, intended for comparing various estimation strategies.

Downloads: 0 This Week

Last Update: 2013-04-03

See Project

GA-EoC

GeneticAlgorithm-based search for Heterogeneous Ensemble Combinations

In data classification, no single classifier performs consistently well in every case. This is even worse for datasets that are both high-dimensional and class-imbalanced. To overcome the limitations of class-imbalanced data, we split the dataset using random sub-sampling to balance the classes. Then, we apply the (alpha,beta)-k feature set method to select a better subset of features and combine their outputs to get a consolidated feature set for classifier training. To enhance classification performance, we propose an ensemble of classifiers that combines the classification outputs of the base classifiers using the simple and widely used majority voting approach. Instead of creating the ensemble from all base classifiers, we have implemented a genetic algorithm (GA) to search for the best combination of heterogeneous base classifiers. The classification performance achieved by the proposed method on the chosen datasets is promising.
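Majority voting over a chosen subset of base classifiers can be sketched as follows. The classifier names and predictions are invented toy data, and the search is a plain exhaustive search over odd-sized subsets, used here as a stand-in for the genetic algorithm the project describes (a GA becomes necessary when the number of base classifiers makes enumeration impractical).

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """predictions: one label list per classifier; returns per-sample majority.

    Ties are resolved in favour of the label seen first, so the search below
    only considers odd-sized ensembles to avoid ties on binary labels.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def best_combination(y_true, classifier_preds):
    """Exhaustively score every odd-sized subset of base classifiers."""
    names = sorted(classifier_preds)
    best_names, best_acc = None, -1.0
    for size in range(1, len(names) + 1, 2):
        for subset in combinations(names, size):
            voted = majority_vote([classifier_preds[n] for n in subset])
            acc = accuracy(y_true, voted)
            if acc > best_acc:
                best_names, best_acc = subset, acc
    return best_names, best_acc


# Hypothetical validation labels and base-classifier outputs.
y_true = [1, 0, 1, 1, 0, 0]
classifier_preds = {
    "A": [1, 0, 1, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 0],
    "C": [0, 0, 1, 1, 0, 1],
    "D": [0, 1, 0, 0, 1, 1],
}

subset, acc = best_combination(y_true, classifier_preds)
print(subset, acc)  # prints ('A', 'B', 'C') 1.0
```

In this toy case no single classifier is perfect, yet the three imperfect classifiers A, B and C correct each other's errors, which is the motivation for searching over combinations rather than using all base classifiers.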

Downloads: 0 This Week

Last Update: 2016-04-04

See Project

GAME

GAME stands for Generic Architecture based on Multiple Experts. Its main purpose is to make it easy to prototype, test and release prediction systems. Released by the IASC group, University of Cagliari.

Downloads: 0 This Week

Last Update: 2014-06-26

See Project

GNAT

GNAT recognizes gene names in text and maps them to NCBI Entrez Gene

GNAT is a BioNLP/text mining tool to recognize and identify gene/protein names in natural language text. It will detect mentions of genes in text, such as PubMed/Medline abstracts, and disambiguate them to remove false positives and map them to the correct entry in the NCBI Entrez Gene database by gene ID. March 2017: We started to upload GNAT output on Medline. See files/results/medline/.

Downloads: 0 This Week

Last Update: 2017-12-14

See Project

GUAJE FUZZY

Free software for generating understandable and accurate fuzzy systems

GUAJE stands for Generating Understandable and Accurate fuzzy models in a Java Environment. It is a free software tool (licensed under GPL-v3) that supports the design of interpretable and accurate fuzzy systems by combining several preexisting open source tools and taking advantage of the main strengths of each. It is a user-friendly, portable tool designed and developed to make knowledge extraction and representation for fuzzy systems easier, paying special attention to interpretability issues. GUAJE lets the user define expert variables and rules, and also provides supervised and fully automatic learning capabilities. Both types of knowledge, expert and induced, are integrated under expert supervision, ensuring interpretability, simplicity and consistency of the knowledge base throughout the whole process. Note that GUAJE is an upgraded version of the free software called KBCT (Knowledge Base Configuration Tool).

1 Review

Downloads: 0 This Week

Last Update: 2016-08-22

See Project
