Showing 58 open source projects for "clustering"

  • 1
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 4 This Week
  • 2
    NGSEP

    NGSEP (Next Generation Sequencing Experience Platform)

    ...The current version provides functionalities for both de-novo and reference-guided analysis of sequencing data, including genome assembly, read mapping, variant detection and genotyping, and de-novo analysis of data generated from reduced representation protocols. NGSEP also provides modules for analysis of genomic variation databases (VCF files), including functional annotation, filtering, format conversion, comparison, clustering, imputation, introgression analysis and different kinds of statistics. Since version 4, we provide functionalities for management of genomes and transcriptomes, including genome alignment and annotation of transposable elements. A complete list of functionalities is available in our wiki (https://sourceforge.net/p/ngsep/wiki/Home/). ...
    Downloads: 0 This Week
  • 3
    DynaQ

    Innovative text document search. See http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system for exploring your personal information space, supporting the 'orienteering' search paradigm. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. See our GitLab homepage for source code and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
  • 4
    NOVA

    Analysis and visualization of complexome profiling data.

    NOVA is a program designed to analyze complexome profiling data (Heide et al., 2012). A graphical user interface (GUI) provides various visualization tools, such as heat maps and 2D plots. Several hierarchical clustering algorithms (e.g., single linkage, average linkage, Ward's linkage), different distance measures (e.g., Euclidean distance, Manhattan distance, Pearson distance), and various normalization techniques are implemented. Many additional functions like zooming, searching for proteins, image export, and automatic file format recognition support intuitive handling for biologists. ...
    Downloads: 5 This Week
  • 5

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    MarDRe is a de novo MapReduce-based parallel tool that removes duplicate and near-duplicate DNA reads by clustering single-end and paired-end sequences from FASTQ/FASTA datasets. It lets bioinformaticians avoid analyzing unnecessary reads, reducing the time of subsequent procedures on the dataset. MarDRe is the Big Data counterpart of ParDRe (link above), which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. ...
    Downloads: 0 This Week
  • 6

    popt4jlib

    Parallel Optimization Library for Java

    popt4jlib is an open-source parallel optimization library for the Java programming language supporting both shared memory and distributed message passing models. It implements a number of meta-heuristic algorithms for Non-Linear Programming, including Genetic Algorithms, Differential Evolution, Evolutionary Algorithms, Simulated Annealing, Particle Swarm Optimization, Firefly Algorithm, Monte-Carlo Search, Local Search algorithms, Gradient-Descent-based algorithms, as well as some well-known...
    Downloads: 0 This Week
  • 7
    Maltcms
    Maltcms (Modular Application Toolkit for Chromatography Mass-Spectrometry) is a Java API for preprocessing, alignment, analysis and visualization of data stored in open file formats used in proteomics and metabolomics research.
    Downloads: 0 This Week
  • 8

    jLDADMM

    A Java package for the LDA and DMM topic models

    ...It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e., mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models. See the usage of jLDADMM on its website at http://jldadmm.sourceforge.net/
    Downloads: 0 This Week
  • 9
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
  • 10
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
  • 11
    BiMS

    BiMS (biclustering for mass spectrometry data) is a Java application designed to allow the application of biclustering algorithms to mass spectrometry datasets. Users can load their MS datasets and apply different clustering and biclustering algorithms (the current version includes Bimax and BiBit). In addition, users can load raw datasets (in mzML or mzXML formats) and preprocess them using the MALDIquant package and MassSpecWavelet.
    Downloads: 0 This Week
  • 12
    Luscinia is a program for archiving and analyzing field sound recordings (especially of animals). It incorporates an interface to a database, spectrogram measurement algorithms, sound comparison algorithms, and statistical analysis.
    Downloads: 1 This Week
  • 13

    Deem

    Analyze time-course data with significance tests, clustering, modeling

    Use statistical methods to analyze time-course data (gene expression microarray and RNA-seq data in particular, but not limited to these). Apply significance tests to keep only significant genes or time series. Cluster time series into similar groups. Generate network models, including linear or non-linear models. Variable selection and optimization routines included. Written in Scala and R. The application is a cross-platform desktop app with a simple GUI and is fully functional...
    Downloads: 0 This Week
  • 14
    Java Machine Learning Library is a library of machine learning algorithms and related datasets. Machine learning techniques include: clustering, classification, feature selection, regression, data pre-processing, ensemble learning, voting, ...
    Downloads: 8 This Week
  • 15
    Weka4OC GUI for Overlapping clustering

    Weka4OC: Weka for Overlapping Clustering is a GUI extending WEKA

    This is a GUI application for learning non-disjoint groups based on the Weka machine learning framework. It offers a variety of k-means-based learning methods able to produce overlapping clusters. The application also contains an evaluation framework that calculates several external validation measures, and a visualization tool for discovering overlapping groups.
    Downloads: 0 This Week
  • 16

    ktree

    clustering, machine learning, algorithms

    This project has moved to GitHub at http://lmwtree.devries.ninja.
    Downloads: 0 This Week
  • 17
    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ...First, over-training, and second, a shortage of data for training categories. Instead, each TXT file is a category on its own, rather than being assigned to a category. In a way, this is similar to clustering, but it is not really a clustering algorithm since some training is involved. The summarizer from Classifier4J has been adjusted to accept two inputs (let's call them A and B). The summarizer is trained with A to summarize document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids over-training); the two structures are then compared using vector-space analysis to give a degree of belonging of one document to the other (and thus avoids the shortage of information). ...
    Downloads: 0 This Week
  • 18
    DocCO

    Non-disjoint grouping of documents based on a word sequence approach

    This is a GUI for learning non-disjoint groups of documents based on the Weka machine learning framework. It supports non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases, or from a specified URL. All the preprocessing techniques implemented in WEKA can be applied before learning.
    Downloads: 0 This Week
  • 19

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 20
    Downloads: 0 This Week
  • 21
    XDAQ is a software platform designed specifically for the development of distributed data acquisition systems. The development is carried out at CERN, the European Organization for Nuclear Research. Please visit http://xdaq.web.cern.ch
    Downloads: 0 This Week
  • 22
    The final build of this software is now distributed in R, embedded in "RedeR", an R/Bioconductor package for hierarchical and nested network analysis. More about RedeR: http://bioconductor.org/packages/2.9/bioc/html/RedeR.html
    Downloads: 0 This Week
  • 23
    TAXOMO
    Data mining tool for sequences (e.g. trajectories on a map, visited web pages, etc.) that creates a succinct description of the sequences, given a taxonomy (e.g. regions and sub-regions in the map, categories and sub-categories of pages, etc.).
    Downloads: 3 This Week
  • 24
    GridSim allows modeling and simulation of entities in parallel and distributed computing systems such as users, applications, resources, and resource brokers/schedulers for design and evaluation of scheduling algorithms. http://www.gridbus.org/gridsim
    Downloads: 0 This Week
  • 25
    BorderFlow
    BorderFlow implements a general-purpose graph clustering algorithm. It maximizes the inner-to-outer flow ratio from the border of each cluster to the rest of the graph.
    Downloads: 0 This Week