Showing 103 open source projects for "hadoop"

  • 1
    MPJ Express: Parallel Computing for Java
    MPJ Express is an implementation of an MPI-like API—standardized by the Java Grande forum—used to write parallel Java applications, which can execute on a variety of parallel platforms ranging from multicore processors to compute clusters/clouds.
    Downloads: 12 This Week
  • 2
    UI To the Hadoop HBase Project
    Downloads: 2 This Week
  • 3

    HDFSFileTransfer

    File transfer from local FS to HDFS

    The HDFSFileTransfer project was created to help Hadoop users quickly copy varied files (flat, structured, unstructured, big, and small) from Linux to the Hadoop Distributed File System (HDFS). It allows users to transfer files: - within the same physical machine - from the local file system (Linux) into HDFS - between two physical machines - from a local file system (Linux) with an HDFS cluster installed to another HDFS cluster. Sample - one can have two single clustered Hadoop...
    Downloads: 0 This Week
  • 4
    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    .... With a flexible Open Data Access framework, developers can write custom data drivers to access data from any source, including Big Data sources like Apache Hadoop, Cassandra, and MongoDB, along with all traditional relational databases, Flat Files, XML data streams, and data stored in proprietary systems. Built for embedding, BIRT includes APIs for data access, chart generation, output formats, content execution, and integration within larger applications.
    Downloads: 10 This Week
  • 5
    Crowbar

    A complete operations platform to deploy, maintain and scale clusters.

    The Crowbar Project is an effort to build a complete, easy to use operational platform for everyone. It allows for any number of physical nodes to be moved from bare-metal to production cluster within hours. Specific applications include (but are not limited to) Hadoop and OpenStack.
    Downloads: 5 This Week
  • 6
    Pydoop is a Python MapReduce and HDFS API for Hadoop.
    Downloads: 0 This Week
  • 7
    Aspose for Hadoop

    This project holds the source code for the Aspose for Hadoop project.

    The Aspose for Hadoop project enables Apache Hadoop / MapReduce developers to work with various binary file formats. Developers can create binary sequence files and convert them into text sequence files.
    Downloads: 1 This Week
  • 8
    Seal

    A toolkit for distributed processing of HT sequencing data.

    Seal is a Hadoop-based distributed short read alignment and analysis toolkit. Currently Seal includes tools for: read demultiplexing, read alignment, duplicate read removal, sorting read mappings, and calculating statistics for empirical base quality recalibration. Seal scales, easily handling TB of data.
    Downloads: 0 This Week
  • 9

    PureHadoop

    pHd - Pure Apache Hadoop Distribution

    A pure build of Apache Hadoop 2.2 from the source. This represents the purest form of Hadoop available. Canned CentOS 6.5 Single node VM available for fast start and quick sandbox. This is NOT a vendor distro from Cloudera, Hortonworks, or MapR. No junk! Just pure Apache. VM is 64 bit native build.
    Downloads: 0 This Week
  • 10
    This is a toolkit to help developers install a Hadoop cluster on hiCloud VMs. Note: hiCloud is an Amazon EC2-like service provided by CHT, Taiwan.
    Downloads: 0 This Week
  • 11
    A next-gen sequencing analysis pipeline designed to run on Hadoop/HDFS, written in Java and Pig. For more info, contact Zack Ramjan at USC.
    Downloads: 0 This Week
  • 12
    Jxtadoop

    This project aims to provide P2P capabilities with Hadoop DFS.

    Hadoop is designed to work in large datacenters with thousands of servers connected to each other in the Hadoop cloud. This project focuses on the distributed file system part of Hadoop (HDFS). Its goal is to provide an alternative to the direct IP connectivity that Hadoop requires. Instead, the DFS layer has been modified to use a peer-to-peer framework, which allows direct connectivity in datacenters as well as indirect connectivity to bypass firewall constraints. The typical use...
    Downloads: 0 This Week
  • 13

    DynamicMR

    A Dynamic Slot Allocation and Scheduling System for MapReduce Clusters

    DynamicMR is a dynamic slot allocation and scheduling framework that aims to improve the performance of Hadoop under the Hadoop Fair Scheduler (HFS) by maximizing slot utilization while guaranteeing fairness across pools. It consists of three levels of scheduling components: the Dynamic Hadoop Fair Scheduler (DHFS), the Dynamic Speculative Task Scheduler (DSTS), and the Data Locality Maximization Scheduler (DLMS).
    Downloads: 0 This Week
  • 14
    Downloads: 0 This Week
  • 15

    Hadoop

    Integration of Virtualization with Hadoop tools.

    Downloads: 0 This Week
  • 16

    DHFS

    A dynamic slot allocation technique to improve performance for HFS.

    Dynamic Hadoop Fair Scheduler (DHFS) is an optimized Hadoop Fair Scheduler that improves the performance of Hadoop by maximizing slot utilization while guaranteeing fairness across pools. It is based on the observation that at different periods of time there may be idle map (or reduce) slots, as the job proceeds from the map phase to the reduce phase. We can use the unused map slots for overloaded reduce tasks to improve the performance of the MapReduce workload, and vice versa, by breaking...
    Downloads: 0 This Week
  • 17
    JUMMP

    JUMMP: Job Uninterrupted Maneuverable MapReduce Platform

    JUMMP is an automated scheduling platform that provides a customized Hadoop environment within a batch-scheduled cluster environment. JUMMP enables an interactive pseudo-persistent MapReduce platform within the existing administrative structure of an academic high performance computing center by “jumping” between nodes with minimal administrative effort. Jumping is implemented by the synchronization of stopping and starting daemon processes on different nodes in the cluster. Use...
    Downloads: 0 This Week
  • 18
    R Hadoop for Big Data

    Download Free Associated R open source script files for big data analy

    Download free associated R open source script files for big data analysis with Hadoop and R. These are R script source files from Ram Venkat, from a past Meetup we held at http://www.meetup.com/R-Matlab-Users/events/85160532/ There is also a long video and a PowerPoint presentation slide PDF with R files at: http://quantlabs.net/blog/2012/11/how-to-use-hadoop-and-r-for-big-data-parallel-processing-free-download-pdf/ Download source files from http://quantlabs.net/blog/2012/11/download-free...
    Downloads: 0 This Week
  • 19
    hadoop4win
    Hadoop for Windows using Cygwin
    Downloads: 0 This Week
  • 20

    distmap

    A toolkit for distributed short read mapping

    DistMap is a user-friendly pipeline designed to map short reads in a MapReduce framework on a local Hadoop cluster. It is designed to be easily implemented by researchers who do not have expert knowledge of bioinformatics. As it does not have any dependencies, DistMap provides full flexibility and control to the user. The user can use any version of a compatible mapper and any reference genome assembly. There is no need to maintain the mapper, reference or DistMap source code on each...
    Downloads: 0 This Week
  • 21

    HadStat

    HadStat is a cloud service for data analysis using Hadoop MapReduce.

    HadStat is a cloud service that lets you analyze data in the cloud and returns the results as graphs. The service is free; you can redistribute it and/or modify it under the terms of the GNU General Public License. It uses many technologies, including Hadoop MapReduce, HTML, PHP, web service applications, Linux servers, Java, and the Eclipse IDE, and offers many indicators: Simple moving average (SMA), Exponential moving average (EMA), Smoothed simple moving average (SMMA), Linear weighted...
    Downloads: 0 This Week
  • 22
    Standalone HDFS
    Hadoop is a great project for deep analytics based on its MapReduce features. It also includes a powerful distributed file system designed to ensure that analytics workloads can locally access the data to be processed, minimizing the network bandwidth impact. I found this filesystem very useful for leveraging storage from all my PCs and even from some of my online storage, such as S3. However, I did not want to deploy the full Hadoop stack. Hence my decision to create a standalone distribution...
    Downloads: 0 This Week
  • 23

    HadoopFileManager

    Console file manager for Hadoop, written in Java.

    Console file manager for Hadoop, written in Java. For Linux only. The left panel contains local files; the right panel contains files from HDFS. It uses the Lanterna library for the UI, which is included in the main jar to avoid additional classpath entries. The current version is just a demo to check display capability. To run, execute: hadoop jar HadoopFileManager-0.1.0-DEMO.jar
    Downloads: 0 This Week
  • 24

    OozieWorkflowViewer

    Eclipse plugin for viewing Apache Oozie workflow structure

    Eclipse plugin for viewing Apache Hadoop Oozie workflow structure. The plugin contains one view in the Oozie category. Put the plugin in the eclipse/plugins directory. Run Eclipse and open a workflow in the editor. Open the view from the menu Window>Show View>Other>Oozie. The workflow structure is displayed in the view as a tree.
    Downloads: 0 This Week
  • 25

    oozie-workflow-checker

    Validation of complex Apache Oozie Hadoop workflows

    Library that validates complex Oozie workflows (http://oozie.apache.org/). Two usage scenarios: 1) Execute a workflow with specified parameters and get the list of passed nodes as a result. Sample in WorkflowDirProcessorIntegrationTest. Note: of all workflow functions, only "wf:conf" is currently supported. 2) Check that called actions exist, or build the full call tree in XML format. Sample in OozieWorkflowCheckerTest. You can override properties from "config-default.xml" and "job.properties" by file with name...
    Downloads: 0 This Week