Showing 144 open source projects for "hadoop"

  • 1

    HSRA

    Hadoop spliced read aligner for RNA-seq data

    HSRA is a MapReduce-based parallel tool for mapping reads from RNA sequencing (RNA-seq) experiments. RNA-seq analyses typically begin by mapping reads to a reference genome to determine the location from which the reads originated, which is a very time-consuming step. This tool allows bioinformatics researchers to efficiently distribute their mapping tasks over the nodes of a cluster by combining a fast multithreaded spliced aligner (HISAT2) with Apache Hadoop, which...
    Downloads: 0 This Week
  • 2

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    ..., MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for Big Data processing.
    Downloads: 0 This Week
  • 3
    apache spark data pipeline osDQ

    osDQ is dedicated to creating Apache Spark based data pipelines using JSON

    ... file
    Windows: java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json
    Mac/UNIX: java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json
    On Windows, you need a Hadoop distribution unzipped on a local drive and HADOOP_HOME set. Also copy winutils.exe from here into HADOOP_HOME\bin
    Downloads: 0 This Week
  • 4
    X-RIME is an open source project devoted to providing a Hadoop-based solution for large-scale social network analysis.
    Downloads: 0 This Week
  • 5
    Downloads: 0 This Week
  • 6
    Easy Machine Learning

    Easy Machine Learning is a general-purpose dataflow-based system

    Machine learning algorithms have become the key components in many big data applications. However, the full potential of machine learning is still far from being realized because using machine learning algorithms is hard, especially on distributed platforms such as Hadoop and Spark. The key barriers come from not only the implementation of the algorithms themselves but also the processing for applying them to real applications which often involve multiple steps and different algorithms. Our...
    Downloads: 0 This Week
  • 7
    SlimGrid is a Java library for grid computations that is lighter than alternatives such as JPPF and Hadoop. The main design goals are minimalism, simplicity, and pervasiveness. If you need something that does not require you to comprehend massive, complex APIs or perform exhaustive configuration and installation, that is robust and reliable, and that uses just one port for all management and communication, then SlimGrid is the right choice. SlimGrid is built on top of Apache's ZooKeeper library...
    Downloads: 0 This Week
  • 8
    Voldemort

    A distributed key-value storage system

    Voldemort is a distributed database that’s an open source clone of Amazon’s Dynamo. It automatically replicates data over multiple servers and automatically partitions it so that each server contains only a subset of the total data. It offers many other features, such as pluggable serialization support, data item versioning, and an SSD Optimized Read Write storage engine. Voldemort is not a relational database or an object database. It is essentially a big, distributed, persistent,...
    Downloads: 1 This Week
  • 9
    Chronos

    Fault tolerant job scheduler for Mesos to handle dependencies

    Chronos is a replacement for cron. It is a distributed, fault-tolerant scheduler that runs on top of Apache Mesos and can be used for job orchestration. It supports custom Mesos executors as well as the default command executor, so by default Chronos executes sh (on most systems bash) scripts. Chronos can be used to interact with systems such as Hadoop (incl. EMR), even if the Mesos slaves on which execution happens do not have Hadoop installed. Chronos is also natively able to schedule...
    Downloads: 0 This Week
  • 10
    org.framework.hadoop
    Downloads: 0 This Week
  • 11

    HTSFinder

    High-Throughput DNA Signature Finder

    HTSFinder consists of three computational phases. The pipeline first generates all possible k-mers for every genome individually, then determines their frequency across the entire database. Finally, it obtains the DNA signatures of every species or strain in the database or databases involved in the pipeline. HTSFinder uses Hadoop, a parallel and distributed computing framework, for the second and third phases.
    Downloads: 0 This Week
  • 12
    An Apache Zookeeper-based utility for assigning unique, sequential ID numbers in a distributed system (such as a Hadoop Map/Reduce job).
    Downloads: 0 This Week
  • 13
    Vappio is a framework for building virtual appliances that supports distributed data processing in cloud computing environments using Sun Grid Engine or Hadoop. The primary target application of Vappio is bioinformatics.
    Downloads: 0 This Week
  • 14
    Downloads: 0 This Week
  • 15
    HareDB HBase Client

    GUI Tools for HBase (including Pig and high speed Hive Query)

    Most people are not familiar with command mode, yet command mode is the only interface in the world of Hadoop and HBase. For this reason, we are developing a set of tools, “HBase Client”, that is easier to use and has a friendlier interface.
    Downloads: 1 This Week
  • 16
    Downloads: 0 This Week
  • 17

    VoltMR

    Pure Java NGS mapping software that runs on Hadoop 2.0

    VoltMR is pure Java NGS (DNA/RNA) mapping and realignment software that runs on Hadoop 2.0. Its accuracy is comparable to BWA-MEM and Novoalign, and it is faster than those aligners. Using 100 cores, VoltMR finishes a typical exome sample (10 GB), including mapping, sorting, duplicate marking, and local realignment, in 30 minutes. It uses about 10 GB to 15 GB of RAM for each Hadoop mapper and reducer. Currently, VoltMR takes FASTQ as input and outputs BAM/ADAM format. For DNA mapping, GATK compatible realignment/recalibration...
    Downloads: 0 This Week
  • 18

    RSS Atom Feed Analytics With MapReduce

    A data analytics project for RSS feeds using Hadoop MapReduce

    This project takes the output of the jatomrss project as its input and applies MapReduce logic to it to perform the analytics.
    Downloads: 0 This Week
  • 19
    Hadoop configuration files

    Hadoop 1.x and 2.x configuration files, plus other files for configuring a Hadoop cluster
    Downloads: 0 This Week
  • 20
    10-on-10

    Recommending the top 10 items a user MUST see now

    The application will be near-real-time and will use many technologies as need and ability allow. The goal is to apply multiple Hadoop technologies, Spark, and machine learning to the data to recommend the best items. There will be many data sources, and analysing them will be a real challenge. We need to try our best to make our recommendations an awesome "10-on-10".
    Downloads: 0 This Week
  • 21
    Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. The file formats currently supported are BAM, SAM, FASTQ, FASTA, QSEQ, BCF, and VCF. For a longer high-level description of Hadoop-BAM, refer to the article "Hadoop-BAM: directly manipulating next generation sequencing data in the cloud" in Bioinformatics Volume 28 Issue 6 pp. 876-877, available...
    Downloads: 0 This Week
  • 22
    Hadoop

    Use Hadoop in Scientific Workflows

    This project provides an application that integrates Hadoop with WS-PGRADE workflows. It uses an OpenStack cloud to create user-specified Hadoop clusters and execute jobs.
    Downloads: 0 This Week
  • 23
    MIREX
    MIREX (MapReduce Information Retrieval Experiments) provides solutions to easily and quickly run large-scale information retrieval experiments on a cluster of machines using Hadoop. Version 0.3 has tools for the TREC ClueWeb09 and ClueWeb12 collections.
    Downloads: 0 This Week
  • 24

    DAWG

    Dynamic Event List for Fault Management

    A next-generation event/alert/alarm list with a PostgreSQL and Hadoop back end and an HTML5 and JavaScript front end.
    Downloads: 0 This Week
  • 25
    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [The feature of ankus]
    * ankus is a 'web-based big data mining project and tool'.
      - MapReduce-based data mining/machine learning algorithms library
      - Hadoop-based distributed bigdata system
      - offering a web-based GUI for easy use
    [The ankus project & License]
    * The ankus project consists of three as an open source.
    * ankus is dual-licensed under the community and commercial licenses.
    * The community license follows GPLv3
      - Some algorithms in Core Project do not under the OSS...
    Downloads: 0 This Week