Showing 54 open source projects for "data"

  • 1
    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers state-of-the-art performance. According to a third-party benchmark, Smile significantly outperforms R, Python, Spark, H2O, and xgboost, and is a couple of times faster than the closest competitor. Its memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster?
    Downloads: 3 This Week
  • 2
    Angel

    A Flexible and Powerful Parameter Server for large-scale ML

    Angel is a high-performance distributed machine learning and graph computing platform based on the philosophy of Parameter Server. It is tuned for performance with big data from Tencent and has a wide range of applicability and stability, demonstrating an increasing advantage in handling higher-dimension models. Angel is jointly developed by Tencent and Peking University, taking account of both high availability in industry and innovation in academia. With a model-centered core design concept, Angel partitions the parameters of complex models into multiple parameter-server nodes and implements a variety of machine learning algorithms and graph algorithms using efficient model-updating interfaces and functions, as well as a flexible consistency model for synchronization. ...
    Downloads: 0 This Week
  • 3
    Tribuo

    Tribuo - A Java machine learning library

    ...Tribuo's Models, Datasets, and Evaluations have provenance, meaning they know exactly what parameters, transformations, and files were used to create them. Provenance data allows each model to be rebuilt verbatim from scratch and for evaluations to track the models and datasets used for each experiment.
    Downloads: 0 This Week
  • 4
    Weka

    Machine learning software to solve data mining problems

    Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code.
    Downloads: 9,986 This Week
  • 5
    UnBBayes

    Framework & GUI for Bayes Nets and other probabilistic models.

    UnBBayes is a probabilistic network framework written in Java. It has both a GUI and an API with inference, sampling, learning, and evaluation. It supports Bayesian networks, influence diagrams, MSBN, OOBN, HBN, MEBN/PR-OWL, and PRM, as well as structure, parameter, and incremental learning. Please visit our wiki (https://sourceforge.net/p/unbbayes/wiki/Home/) for more information. Check out the license section (https://sourceforge.net/p/unbbayes/wiki/License/) for our licensing policy.
    Downloads: 21 This Week
  • 6
    ADAMS

    ADAMS is a workflow engine for building complex knowledge workflows.

    ADAMS is a flexible workflow engine aimed at quickly building and maintaining data-driven, reactive workflows, easily integrated into business processes. Instead of placing operators on a canvas and manually connecting them, a tree structure and flow control operators determine how data is processed (sequentially/parallel). This allows rapid development and easy maintenance of large workflows, with hundreds or thousands of operators.
    Downloads: 2 This Week
  • 7
    Alink

    Alink is the Machine Learning algorithm platform based on Flink

    Alink is Alibaba’s scalable machine learning algorithm platform built on Apache Flink, designed for batch and stream data processing. It provides a wide variety of ready-to-use ML algorithms for tasks like classification, regression, clustering, recommendation, and more. Written in Java and Scala, Alink is suitable for enterprise-grade big data applications where performance and scalability are crucial. It supports model training, evaluation, and deployment in real-time environments and integrates seamlessly into Alibaba’s cloud ecosystem.
    Downloads: 1 This Week
  • 8
    Synthetic Mixed Data Generator
    A Synthetic Data Generator for producing mixed datasets described by relevant, irrelevant, and redundant features.
    Downloads: 0 This Week
  • 9
    ModelDB

    Open Source ML Model Versioning, Metadata, and Experiment Management

    ModelDB is an open-source system for machine learning model versioning, metadata, and experiment management. It versions machine learning models together with their ingredients (code, data, config, and environment) and tracks ML metadata across the model lifecycle.
    Downloads: 0 This Week
  • 10
    Oryx

    Lambda architecture on Apache Spark, Apache Kafka for real-time

    Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large-scale machine learning. It is a framework for building applications but also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering. The application is written in Java, using Apache Spark, Hadoop, Tomcat, Kafka, Zookeeper and more. Configuration uses a single Typesafe Config config file, wherein...
    Downloads: 0 This Week
  • 11
    Easy Machine Learning

    Easy Machine Learning is a general-purpose dataflow-based system

    ...In the system, a learning task is formulated as a directed acyclic graph (DAG) in which each node represents an operation (e.g. a machine learning algorithm), and each edge represents the flow of the data from one node to its descendants.
    Downloads: 0 This Week
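The DAG formulation described for Easy Machine Learning can be illustrated with a small, generic sketch. This is not the project's code or API: `run_dag`, the node names, and the toy operations are all hypothetical, and the example only shows the general idea of running each node after the nodes that feed it.

```python
from graphlib import TopologicalSorter


def run_dag(operations, edges, inputs):
    """Run every node after all of its upstream nodes have produced output.

    operations: node name -> function taking a list of upstream outputs
    edges:      list of (upstream, downstream) pairs (the data flow)
    inputs:     node name -> initial value for nodes with no upstream node
    """
    predecessors = {name: [] for name in operations}
    for upstream, downstream in edges:
        predecessors[downstream].append(upstream)

    results = {}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        if predecessors[node]:
            args = [results[p] for p in predecessors[node]]
        else:
            args = [inputs[node]]
        results[node] = operations[node](args)
    return results


# Hypothetical three-step task: load -> scale -> mean.
operations = {
    "load": lambda args: args[0],
    "scale": lambda args: [x / max(args[0]) for x in args[0]],
    "mean": lambda args: sum(args[0]) / len(args[0]),
}
edges = [("load", "scale"), ("scale", "mean")]
results = run_dag(operations, edges, {"load": [2.0, 4.0, 8.0]})
```

A topological order guarantees that a node's inputs exist before it runs; a cycle in `edges` would make `TopologicalSorter` raise `CycleError`, which matches the requirement that the graph be acyclic.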
  • 12

    DGRLVQ

    Dynamic Generalized Relevance Learning Vector Quantization

    A common problem with learning vector quantization (LVQ) based methods is that one cannot reliably guess the number of prototypes required for initialization on multimodal data structures; i.e., these algorithms are very sensitive to the initialization of prototypes, and the optimal number of prototypes has to be predefined before running the algorithm. If a prototype, for some reason, is "outside" the cluster it should represent and there are points of a different category in between, then those points act as a barrier and the prototype will not find its optimum position during training. ...
    Downloads: 0 This Week
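For context on why prototype placement matters, here is a minimal sketch of one update step of plain LVQ1, the textbook baseline rather than the DGRLVQ algorithm itself. The function name `lvq1_step`, the learning rate, and the toy data are assumptions made for illustration.

```python
def lvq1_step(prototypes, labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place; return the index of the winning prototype."""

    def sq_dist(p):
        return sum((pi - xi) ** 2 for pi, xi in zip(p, x))

    # The winner is the prototype closest to the sample x.
    winner = min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i]))
    # Attract the winner if its label matches y, otherwise repel it.
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] = [
        pi + sign * lr * (xi - pi) for pi, xi in zip(prototypes[winner], x)
    ]
    return winner


# Two hypothetical prototypes, one per class.
prototypes = [[0.0, 0.0], [4.0, 4.0]]
labels = ["a", "b"]
winner = lvq1_step(prototypes, labels, x=[1.0, 1.0], y="a", lr=0.5)
```

Because only the nearest prototype moves, and it is pushed away by samples of other classes, a badly initialized prototype can be repelled by the points lying between it and its own cluster. That is the barrier effect the description refers to.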
  • 13
    Seldon Server

    Machine learning platform and recommendation engine on Kubernetes

    ...Seldon Core is a progression of the goals of the Seldon-Server project, but with a more restricted focus on solving the final step in a machine learning project: serving models in production. Seldon Server is a machine learning platform that helps your data science team deploy models into production. It provides an open-source data science stack that runs within a Kubernetes cluster. You can use Seldon to deploy machine learning and deep learning models into production on-premises or in the cloud (e.g. GCP, AWS, Azure).
    Downloads: 0 This Week
  • 14
    Genetic Oversampling Weka Plugin

    A Weka Plugin that uses a Genetic Algorithm for Data Oversampling

    A Weka filter plugin that uses a genetic algorithm to create new synthetic instances, addressing the imbalanced dataset problem. See my master's thesis, available for download, for further details.
    Downloads: 0 This Week
  • 15

    OWL Machine Learning

    Machine learning algorithm using OWL

    ...Usually, these are very time-consuming and complex tasks because the features have to be manually crafted. The features are aggregated, combined, or split to create features from raw data. This project makes use of ontologies to automatically generate features for the ML algorithms. The features are generated by combining the concepts and relationships that are already in the knowledge base, expressed in the form of an ontology.
    Downloads: 0 This Week
  • 16
    ... - MECore is a shell-based system that allows the user to create propositional knowledge bases, to perform a variety of belief change operations, and to query a knowledge base with respect to the principle of optimum entropy. - Log4KR is a library providing data structures to represent knowledge bases in various logic formalisms (propositional, relational, conditional, probabilistic, ...) and providing algorithms to perform reasoning operations
    Downloads: 0 This Week
  • 17
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    ...Of course you may specify JASP for advanced data editing and RapidMiner for advanced prediction modeling. DSTK is written in C#, Java and Python to interface with R, NLTK, and Weka. It can be expanded with plugins using R Scripts. We have also created plugins for more statistical functions, and Big Data Analytics with Microsoft Azure HDInsights (Spark Server) with Livy.
    Downloads: 5 This Week
  • 18

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source, and developed in the Java programming language. It is distributed under the GNU license. It incorporates several technologies for processing information while applying AI techniques, in order to build predictive models for text classification. Through a flexible structure of interfaces and classes, JCLTP provides the opportunity to extend, adapt, and add functionality. Thus, analysis of new types...
    Downloads: 0 This Week
  • 19

    LAML:Linear Algebra and Machine Learning

    A stand-alone Java library for linear algebra and machine learning

    ...The goal is to build efficient and easy-to-use linear algebra and machine learning libraries. Linear algebra and machine learning are built together because full control of the basic data structures for matrices and vectors is required for fast implementations of machine learning methods. Additionally, LAML provides many commonly used matrix functions with the same signatures as in MATLAB, so it can also be used to manually convert MATLAB code to Java code.
    Downloads: 0 This Week
  • 20

    GA-EoC

    GeneticAlgorithm-based search for Heterogeneous Ensemble Combinations

    In data classification, no particular classifier performs consistently in every case. This is even worse for datasets that are both high-dimensional and class-imbalanced. To overcome the limitations of class-imbalanced data, we split the dataset using random sub-sampling to balance it. Then, we apply the (alpha,beta)-k feature set method to select a better subset of features and combine their outputs to get a consolidated feature set for classifier training. ...
    Downloads: 1 This Week
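The random sub-sampling step mentioned in the GA-EoC description can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: it assumes balancing means undersampling every class to the size of the smallest one, and `balanced_subsample` and its `seed` parameter are hypothetical names.

```python
import random
from collections import defaultdict


def balanced_subsample(samples, labels, seed=0):
    """Randomly undersample each class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the rarest class.
    target = min(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: eight majority-class items and two minority-class items.
samples = list(range(10))
labels = ["maj"] * 8 + ["min"] * 2
xs, ys = balanced_subsample(samples, labels)
```

Calling the function with different seeds yields different balanced subsets, which is one way several balanced splits could be drawn from the same imbalanced dataset.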
  • 21
    This site contains four packages for mass estimation and mass-based density estimation. 1. The first package covers basic mass estimation (including one-dimensional mass estimation and Half-Space Tree based multi-dimensional mass estimation). This package contains the code needed to run on MATLAB. 2. The second package includes source and object files of DEMass-DBSCAN to be used with the WEKA system. 3. The third package, DEMassBayes, includes the source and object files of a Bayesian...
    Downloads: 1 This Week
  • 22
    android-activity-miner

    Activity-Miner for Android

    A mobile application for creating accelerometer-based activity recognition models directly on the phone. Configuring the segmentation and feature extraction process chain requires expert knowledge. The prototype was developed in 2012 as part of a bachelor thesis at the University of Kassel and was optimized and enhanced for an experiment in 2015.
    Downloads: 1 This Week
  • 23

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the JCLAL framework to text tasks. JCLALtext is free, open source, and developed in the Java programming language. It is distributed under the GNU license. Researchers can use the class library by adding it to their project.
    Downloads: 0 This Week
  • 24

    SuRankCo

    Supervised Ranking of Contigs in de novo Assemblies

    SuRankCo is a machine learning based software to score and rank contigs from de novo assemblies of next generation sequencing data. It trains with alignments of contigs with known reference genomes and predicts scores and ranking for contigs which have no related reference genome yet. For more details about SuRankCo and its functioning, please see "SuRankCo: Supervised Ranking of Contigs in de novo Assemblies" Mathias Kuhring, Piotr Wojtek Dabrowski, Andreas Nitsche and Bernhard Y. ...
    Downloads: 0 This Week
  • 25
    Intelligent Keyword Miner

    Intelligent SEO keyword miner and predicting tool

    ...Programs with a similar idea include Google AdWords, SERPWoo's Keyword Finder, Wordpot, and others. The difference is that this program is intelligent: it accepts your input data and then predicts keywords based on your likes or dislikes. As its main engine, it uses the SMOReg algorithm to analyze and map the keyword frequencies of your data. This can be a great SEO tool to help increase the traffic of any website featuring a product.
    Downloads: 0 This Week