Showing 23 open source projects for "hadoop"

  • 1
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    Luigi is a Python package (tested on 3.6, 3.7, 3.8, and 3.9) that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, failure handling, command line integration, and much more. The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks and automate them, and failures will happen. These tasks can be anything, but are typically long-running things like Hadoop...
    Downloads: 4 This Week
  • 2
    Apache HBase

    Get random, realtime read/write access to your Big Data

    ... HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. Features include a Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options; support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX; and convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
    Downloads: 2 This Week
  • 3
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (cloud stores, HDFS, or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low-latency, minute-level analytics. Hudi provides...
    Downloads: 1 This Week
  • 4
    Genie

    Distributed Big Data Orchestration Service

    Genie is a completely open source distributed job orchestration engine developed by Netflix. Genie provides REST-ful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Spark, Presto, Sqoop and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications which run on them.
    Downloads: 1 This Week
  • 5
    XGBoost

    Scalable and Flexible Gradient Boosting

    ... can be used for Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.
    Downloads: 1 This Week
  • 6
    HugeGraph

    A graph database that supports more than 100 billion data items

    HugeGraph is a convenient, efficient, and adaptable graph database compatible with the Apache TinkerPop3 framework and the Gremlin query language. HugeGraph supports fast import of graphs with more than 10 billion vertices and edges, offers millisecond-level OLTP query capability, and can be integrated into big data platforms like Hadoop or Spark for OLAP analysis. The main scenarios of HugeGraph include correlation search, fraud detection, and knowledge graph. Not only supports...
    Downloads: 0 This Week
  • 7
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT-native database with high performance for data management and analysis, deployable on the edge and the cloud. Thanks to its lightweight architecture, high performance, and rich feature set, together with its deep integration with Apache Hadoop, Spark, and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion, and complex data analysis in industrial IoT fields. In the scene of factories...
    Downloads: 0 This Week
  • 8
    spatial-framework-for-hadoop

    The Spatial Framework for Hadoop allows developers

    The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis. For tools, samples, and tutorials that use this framework, head over to GIS Tools for Hadoop. At the root level of this repository, you can build a single jar with everything in the framework using Apache Ant. Alternatively, you can build a jar at the root level of each framework component. Custom MapReduce jobs that use the Esri Geometry API require...
    Downloads: 1 This Week
  • 9
    geometry-api-java

    The Esri Geometry API for Java enables developers to write apps

    The Esri Geometry API for Java can be used to enable spatial data processing in 3rd-party data-processing solutions. Developers of custom MapReduce-based applications for Hadoop can use this API for spatial processing of data in the Hadoop system. The API is also used by the Hive UDFs and could be used by developers building geometry functions for 3rd-party applications such as Cassandra, HBase, Storm and many other Java-based “big data” applications.
    Downloads: 0 This Week
  • 10
    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    ..., Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytics. It also has Hadoop (Big Data) support to move files to/from the Hadoop grid and to create, load, and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 23 This Week
  • 11

    Custom Apache Big data Distribution

    A Custom Apache Distribution including Spark and Hadoop, for Windows.

    This distribution has been customized to work out of the box: download it, unzip it, and set the PATH entries for the bin folders along with HADOOP_HOME, SPARK_HOME, and JAVA_HOME. You can then use Hadoop and Spark natively on Windows.
    Downloads: 0 This Week
  • 12

    HSRA

    Hadoop spliced read aligner for RNA-seq data

    HSRA is a MapReduce-based parallel tool for mapping reads from RNA sequencing (RNA-seq) experiments. RNA-seq analyses typically begin by mapping reads to a reference genome to determine the locations from which the reads originated, which is a very time-consuming step. This tool allows bioinformatics researchers to efficiently distribute their mapping tasks over the nodes of a cluster by combining a fast multithreaded spliced aligner (HISAT2) with Apache Hadoop, which...
    Downloads: 0 This Week
  • 13

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    .... Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for Big Data processing.
    Downloads: 1 This Week
  • 14
    apache spark data pipeline osDQ

    osDQ is dedicated to creating Apache Spark based data pipelines using JSON

    ... the zip file
    Windows: java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json
    Mac/UNIX: java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json
    On Windows, you need a Hadoop distribution unzipped on a local drive and HADOOP_HOME set. Also copy winutils.exe into HADOOP_HOME\bin.
    Downloads: 0 This Week
  • 15
    X-RIME is an open source project devoted to providing a Hadoop-based solution for large-scale social network analysis.
    Downloads: 0 This Week
  • 16
    Chronos

    Fault tolerant job scheduler for Mesos to handle dependencies

    Chronos is a replacement for cron. It is a distributed and fault-tolerant scheduler that runs on top of Apache Mesos and can be used for job orchestration. It supports custom Mesos executors as well as the default command executor. Thus, by default, Chronos executes sh (on most systems bash) scripts. Chronos can be used to interact with systems such as Hadoop (incl. EMR), even if the Mesos slaves on which execution happens do not have Hadoop installed. Chronos is also natively able to schedule...
    Downloads: 0 This Week
  • 17

    RSS Atom Feed Analytics With MapReduce

    This is a data analytics project for RSS feeds using Hadoop MapReduce

    This project takes the output of the jatomrss project as its input and applies MapReduce logic to it to perform the analytics.
    Downloads: 0 This Week
  • 18
    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [The features of ankus]
    * ankus is a 'web-based big data mining project and tool'.
      - MapReduce-based data mining/machine learning algorithms library
      - Hadoop-based distributed big data system
      - offering a web-based GUI for easy use
    [The ankus project & License]
    * The ankus project consists of three open source projects.
    * ankus is dual-licensed under the community and commercial licenses.
    * The community license follows GPLv3
      - Some algorithms in Core Project do...
    Downloads: 0 This Week
  • 19
    Flamingo Project

    Workflow Designer, Hive Editor, Pig Editor, File System Browser

    Flamingo is an open-source Big Data platform that combines an Ajax rich web interface, workflow engine, workflow designer, MapReduce, Hive editor, and Pig editor. 1. Easy tool for big data. 2. Comfortable to use with Hadoop ecosystem projects. 3. Licensed under GPL v3. It supports a Pig IDE, Hive IDE, HDFS browser, scheduler, Hadoop job monitoring, workflow engine, workflow designer, and MapReduce.
    Downloads: 0 This Week
  • 20
    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    .... With a flexible Open Data Access framework, developers can write custom data drivers to access data from any source, including Big Data sources like Apache Hadoop, Cassandra, and MongoDB, along with all traditional relational databases, Flat Files, XML data streams, and data stored in proprietary systems. Built for embedding, BIRT includes APIs for data access, chart generation, output formats, content execution, and integration within larger applications.
    Downloads: 22 This Week
  • 21
    R Hadoop for Big Data

    Download Free Associated R open source script files for big data analy

    Download free associated R open source script files for big data analysis with Hadoop and R. These are R script source files from Ram Venkat from a past Meetup we did (http://www.meetup.com/R-Matlab-Users/events/85160532/). There is also a long video and a PDF of the PowerPoint presentation slides, with R files (http://quantlabs.net/blog/2012/11/how-to-use-hadoop-and-r-for-big-data-parallel-processing-free-download-pdf/). Download source files from http://quantlabs.net/blog/2012/11/download-free...
    Downloads: 0 This Week
  • 22

    HadStat

    HadStat is a cloud service for data analysis using Hadoop MapReduce.

    HadStat is a cloud service that lets you analyze data in the cloud and returns the results as graphs. The service is free; you can redistribute it and/or modify it under the terms of the GNU General Public License. It uses many technologies, such as Hadoop MapReduce, HTML, PHP, web service applications, Linux servers, Java, and the Eclipse IDE, and offers many indicators: Simple moving average (SMA), Exponential moving average (EMA), Smoothed simple moving average (SMMA), Linear weighted...
    Downloads: 0 This Week
  • 23
    Syoncloud

    Hadoop, HBase, HBase Web Client, Flume based log analytics system

    Syoncloud Logs enables you to process log files from various applications using Hadoop, Flume, and HBase. It has an easy installation and configuration interface and includes the Syoncloud HBase web client, which displays a tree of HBase tables and column families linked to a paginated grid of data.
    Downloads: 0 This Week