Showing 86 open source projects for "big data"

  • 1

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    MarDRe is a de novo MapReduce-based parallel tool to remove duplicate and near-duplicate DNA reads through the clustering of single-end and paired-end sequences from FASTQ/FASTA datasets. This tool allows bioinformatics researchers to avoid analyzing unnecessary reads, reducing the time of subsequent procedures with the dataset. MarDRe is the Big Data counterpart of ParDRe (link above), which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve on ParDRe's performance on distributed systems, especially on cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for Big Data processing.
    Downloads: 0 This Week
  • 2

    HSRA

    Hadoop spliced read aligner for RNA-seq data

    ...This tool allows bioinformatics researchers to efficiently distribute their mapping tasks over the nodes of a cluster by combining a fast multithreaded spliced aligner (HISAT2) with Apache Hadoop, a distributed computing framework for scalable Big Data processing. HSRA currently supports single-end and paired-end read alignments from FASTQ/FASTA datasets. Moreover, the tool uses the Hadoop Sequence Parser (HSP) library (link above) to efficiently read input datasets stored on the Hadoop Distributed File System (HDFS), and it can process datasets compressed with the Gzip and BZip2 codecs.
    Downloads: 0 This Week
  • 3

    X10

    Performance and Productivity at Scale

    ...Both its modern, type-safe sequential core and simple programming model for concurrency and distribution contribute to making X10 a high-productivity language in the HPC and Big Data spaces. User productivity is further enhanced by providing tools such as an Eclipse-based IDE (X10DT). Implementations of X10 are available for a wide variety of hardware and software platforms ranging from laptops, to commodity clusters, to supercomputers.
    Downloads: 3 This Week
  • 4

    fooltrader

    Quant framework for stock

    Build a standard data schema, and then implement various connectors to import systems you are familiar with for analysis. fooltrader is a quantitative analysis trading system designed using big data technology, including data capture, cleaning, structuring, calculation, display, backtesting and trading. Its goal is to provide a unified framework for the whole market (stock, futures, bonds, foreign exchange, digital currency, macroeconomics, etc.) for research, backtesting, forecasting, and trading. ...
    Downloads: 0 This Week
  • 5

    Redis Desktop Manager

    Cross-platform GUI management tool for Redis

    Redis Desktop Manager is a fast, open source Redis database management application based on Qt 5. It is available for Windows, Linux and macOS and offers an easy-to-use GUI to access your Redis DB. With Redis Desktop Manager you can perform basic operations such as viewing keys as a tree, creating, reading, updating and deleting keys, and executing commands via a shell. It also supports SSL/TLS encryption, SSH tunnels and cloud Redis instances such as Amazon ElastiCache, Microsoft Azure Redis Cache and Redis Labs.
    Downloads: 0 This Week
  • 6

    paralline

    Big Data tool

    Paralline executes a Python function (or lambda function) or a script over each line of huge text files in parallel processes, and aggregates the results into a list.
    Downloads: 0 This Week
  • 7

    Vaex

    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python

    Data science solutions, insights, dashboards, machine learning, deployment. We start at 100GB. Vaex is a high-performance Python library for lazy Out-of-Core data frames (similar to Pandas), used to visualize and explore big tabular datasets. It calculates statistics such as mean, sum, count, standard deviation, etc., on an N-dimensional grid for more than a billion (10^9) samples/rows per second.
    Downloads: 0 This Week
  • 8

    PanoramaServer

    Open source panorama server for free virtual tours with 360-degree views

    Ideal for creating virtual tours from panoramic views of all sorts, including property exhibitions for brokers at real estate agencies and property agents, tour guides for indoor and outdoor venues, information on public and private facilities for curators, travel journals for tourists, backdrop settings for storytelling, treasure-hunt-style games, and big data mining for patterns through computer vision in artificial intelligence. It is like creating your own Google Maps Street View. All the user needs is photos in equirectangular (panorama) format, taken with the 3D cameras commonly used for on-site premises. The PanoramaServer can reference these images to create virtual tours with 360-degree views, where viewers can navigate to different locations, view information, etc. ...
    Downloads: 0 This Week
  • 9

    Neuro

    The Neuro cryptocurrency

    The Neuro NRO cryptocurrency is designed to support solutions of machine learning tasks, big data and neural networks. Neuro is a scientific-technical project uniting scientists, engineers and programmers inspired by the idea to build something big, kind and bright. From the first stages of work, we will be engaged in the development of new architectures and algorithms of neural networks. Someday we will undoubtedly enter the annual ImageNet Challenge contest to compete with such giants as GoogLeNet Inception and Microsoft ResNet. ...
    Downloads: 0 This Week
  • 10

    Random Bits Forest

    RBF: a Strong Classifier/Regressor for Big Data

    We present a classification and regression algorithm called Random Bits Forest (RBF). RBF integrates neural networks (for depth), boosting (for wideness) and random forests (for accuracy). It first generates and selects ~10,000 small three-layer threshold random neural networks as a basis using a gradient boosting scheme. These binary bases are then fed into a modified random forest algorithm to obtain predictions. In conclusion, RBF is a novel framework that performs strongly especially on data...
    Downloads: 0 This Week
  • 11

    json4sapnw

    Another JSON extension for SAP ABAP

    This is an SAP add-on for handling JSON data within SAP ABAP programs. It comes in the customer exchange namespace /CEX/ and has to be installed as an SAP transport request. The add-on supports object-oriented JSON methods for processing deeply structured JSON data. Building JSON data from SAP data objects and parsing JSON data back into SAP data objects are both supported. See the wiki for some examples.
    Downloads: 0 This Week
  • 12

    Random Bits Regression

    Random Bits Regression is a strong general predictor.

    ...The speed of our method not only allows big data analysis but also enables real-time recognition and prediction. The RBR framework also hints at the mechanism of brain function and leads to a "wide learning" hypothesis. We believe that this method will make a great impact and enable many downstream applications.
    Downloads: 0 This Week
  • 13

    iCubing

    Several OLAP algorithms, data structures and HPC OLAP versions

    OLAP technology is very useful for decision makers and data mining tools working with big data. To that end, the iCubing project implements several multidimensional data cube approaches for cube indexing, querying, updating and mining. There are also several cube types, i.e. alphanumeric cubes, text cubes with unstructured data, and geo cubes with geo types, dimensions, measures and hierarchies, so OLAP remains a hard challenge more than 20 years after the seminal paper by Jim Gray et al. in 1997. ...
    Downloads: 0 This Week
  • 14

    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [Features of ankus]
    * ankus is a 'web-based big data mining project and tool'.
      - MapReduce-based data mining/machine learning algorithms library
      - Hadoop-based distributed big data system
      - Web-based GUI for easy use
    [The ankus project & License]
    * The ankus project consists of three as an open source.
    * ankus is dual-licensed under community and commercial licenses.
    Downloads: 0 This Week
  • 15
    GOBIG
    GOBIG is a toolbox for detecting genetic variations. The project is intended to handle big data. More importantly, it can be used to detect clusters of SNP variants. The toolbox is intended for use with both common and rare variants, for example to find the genetic map of genes causing complex diseases.
    Downloads: 0 This Week
  • 16

    BEAR

    CBR Meets Big Data

    Case-based regression learner for big data. The package contains source and binary files for running BEAR's method. BEAR utilizes EAR4 and locality sensitive hashing in its implementation.
    Downloads: 0 This Week
  • 17

    Universal Java Matrix Package

    sparse and dense matrix, linear algebra, visualization, big data

    The Universal Java Matrix Package (UJMP) is an open source Java library which provides sparse and dense matrix classes, as well as a large number of calculations for linear algebra such as matrix multiplication or matrix inverse. Operations such as mean, correlation, standard deviation, replacement of missing values or the calculation of mutual information are supported, too. The Universal Java Matrix Package provides various visualization methods, import and export filters for a large...
    Downloads: 0 This Week
  • 18

    Relation Tags

    Source code for using Relation Tags.

    ...Please read the "readme" file. It is recommended to use a binary matrix class like BinMatrix in order to have enough speed for calculations of implicit relations in a system of bogus tags with big data. Needs to be compiled with C++11 and the Qt libraries.
    Downloads: 0 This Week
  • 19

    Sample Level Musical Timeline

    Sample Level Modulation of Musical Timeline

    Sample Level Modulation of Musical Timeline. Mingfeng Zhang, Dept. of Electrical and Computer Engineering, University of Rochester. In this toolbox we provide signal processing tools to allocate music events (samples of musical notes) to specified time locations with sample-level accuracy. In this implementation, we use computational tools to add micro-timing variations to J.S. Bach four-part chorales as a "visualizer" for big data. By extracting data patterns from multiple time scales, we implement a tool with which musicians can perform the big data at different resolutions. This toolbox needs the following supporting toolboxes: MIDI TOOLBOX (https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/miditoolbox) and MIR TOOLBOX (https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox). Please add the paths for these two toolboxes in MATLAB. ...
    Downloads: 0 This Week
  • 20
    PROPER is a package for visual evaluation of ranking classifiers for biological big data mining studies in the mathematical language MATLAB. It is an efficient tool for optimization and comparison of the state-of-the-art ranking classifiers by generating over 20 different high quality two- and three-dimensional performance curves.
    Downloads: 0 This Week
  • 22
    This is a Big Data project
    Downloads: 0 This Week
  • 23

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    ...However, due to its exponential nature, previous approaches did not scale to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. Chordalysis makes it possible to discover the structure of datasets with thousands of variables on a standard desktop computer. Associated papers at ICDM 2013, ICDM 2014 and SDM 2015 can be found at http://www.francois-petitjean.com/Research/ . YourKit is supporting the Chordalysis open source project with its full-featured Java Profiler. ...
    Downloads: 0 This Week
  • 24

    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    ...With a flexible Open Data Access framework, developers can write custom data drivers to access data from any source, including Big Data sources like Apache Hadoop, Cassandra, and MongoDB, along with all traditional relational databases, Flat Files, XML data streams, and data stored in proprietary systems. Built for embedding, BIRT includes APIs for data access, chart generation, output formats, content execution, and integration within larger applications.
    Downloads: 1 This Week
  • 25

    giServer

    giServer: the easy-to-use and extensible batch and integration server

    ...Instead of using complex XML configuration files, an elaborate GUI for batch job management is included. Some possible usage scenarios are:
    - Automatic processing of incoming data files
    - Big Data applications
    - Process automation
    - Data mining/aggregation applications
    - Automatic reporting
    - Processing and analysis of database records
    Downloads: 0 This Week