Showing 24 open source projects for "hadoop"

  • 1
    ANTLR

    Parser generator to read, process, or translate structured text

    ... and Pig, the data warehouse and analysis systems for Hadoop, both use ANTLR. Lex Machina uses ANTLR for information extraction from legal texts. Oracle uses ANTLR within SQL Developer IDE and their migration tools. NetBeans IDE parses C++ with ANTLR. The HQL language in the Hibernate object-relational mapping framework is built with ANTLR.
    Downloads: 11 This Week
  • 2
    SageMaker Spark

    A Spark library for Amazon SageMaker

    ... trained models, and, if you have your own ML algorithms built into SageMaker compatible Docker containers, you can use SageMaker Spark to train and infer on DataFrames with your own algorithms -- all at Spark scale. SageMaker Spark depends on hadoop-aws-2.8.1. To run Spark applications that depend on SageMaker Spark, you need to build Spark with Hadoop 2.8. However, if you are running Spark applications on EMR, you can use Spark built with Hadoop 2.7.
    Downloads: 0 This Week
  • 3
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    Luigi is a Python (3.6, 3.7, 3.8, 3.9 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more. The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but are typically long running things like Hadoop...
    Downloads: 1 This Week
  • 4
    Apache Drill

    Apache Drill is a distributed MPP query layer for self describing data

    Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel. Get faster insights without the overhead (data loading, schema creation and maintenance, transformations, etc.). Analyze the multi-structured and nested data in non-relational datastores directly without transforming or restricting the data. Leverage your existing SQL skillsets and BI tools including Tableau...
    Downloads: 0 This Week
  • 5
    XGBoost

    Scalable and Flexible Gradient Boosting

    ... can be used for Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.
    Downloads: 1 This Week
  • 6

    manufacture

    manufacture is a continuous integration platform based on Maven

    manufacture is a continuous integration, delivery and distributed computing platform based on open source frameworks such as POSIX, Java, J2EE, Maven, Jenkins, Tomcat and Hadoop.
    Downloads: 0 This Week
  • 7
    X-RIME is an open source project devoted to providing a Hadoop-based solution for large-scale social network analysis.
    Downloads: 0 This Week
  • 8
    SlimGrid is a Java library for grid computations that is lighter than alternatives (JPPF, Hadoop, ...). The main design goals are minimalism, simplicity, and pervasiveness. If you need something that does not require you to comprehend massive and complex APIs or perform exhaustive configuration and installation, that is robust and reliable, and that uses just one port for all management and communication, then SlimGrid is the right choice. SlimGrid is built on top of Apache's ZooKeeper library...
    Downloads: 0 This Week
  • 9
    An Apache ZooKeeper-based utility for assigning unique, sequential ID numbers in a distributed system (such as a Hadoop MapReduce job).
    Downloads: 0 This Week
  • 10
    Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. The file formats currently supported are BAM, SAM, FASTQ, FASTA, QSEQ, BCF, and VCF. For a longer high-level description of Hadoop-BAM, refer to the article "Hadoop-BAM: directly manipulating next generation sequencing data in the cloud" in Bioinformatics Volume 28 Issue 6 pp. 876-877, available...
    Downloads: 0 This Week
  • 11

    CSVTOHIVE

    Generate Hive Scripts Automatically from CSV Files

    Generates Hive scripts automatically from CSV files. 1. The script copies the CSV files to the Hadoop file system. 2. It generates CREATE statements to create tables. 3. It generates .hive files in the same folder as the CSV files and also generates run.sh with all consolidated files. Switch to the folder where the .hive scripts reside and run run.sh (./run.sh). The tool also sets execute permissions on the .hive and run.sh scripts, so you can execute run.sh directly.
    Downloads: 0 This Week
  • 12
    Flamingo Project

    Workflow Designer, Hive Editor, Pig Editor, File System Browser

    Flamingo is an open-source Big Data platform that combines an Ajax rich web interface, workflow engine, workflow designer, MapReduce, Hive editor, and Pig editor. 1. Easy tool for big data. 2. Comfortable to use with Hadoop ecosystem projects. 3. Licensed under GPL v3. Supports Pig IDE, Hive IDE, HDFS browser, scheduler, Hadoop job monitoring, workflow engine, workflow designer, and MapReduce.
    Downloads: 0 This Week
  • 13
    OpenCrowbar

    Data Center Bare Metal configuration platform

    The principal motivation for the creation of OpenCrowbar is the transition from a bare metal installer into a tool that manages ongoing operations. OpenCrowbar enables upgrade and continuous deployment automation. This capability is important for large-scale deployments of evolving complex projects like OpenStack, Hadoop, and Ceph. OpenCrowbar provides the foundation for operations automation. OpenCrowbar is an open reference implementation that can be reliably deployed in large-scale, multi-site...
    Downloads: 0 This Week
  • 14

    HDFSFileTransfer

    File transfer from local FS to HDFS

    The HDFSFileTransfer project was created to help Hadoop users quickly copy varied files (flat, structured, unstructured, big and small) from Linux to the Hadoop Distributed File System (HDFS). It allows users to transfer files: - within the same physical machine - from the local file system (Linux) into HDFS - between two physical machines - from a local file system (Linux) with an HDFS cluster installed to another HDFS cluster. Sample - one can have two single clustered Hadoop...
    Downloads: 0 This Week
  • 15
    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    ... With a flexible Open Data Access framework, developers can write custom data drivers to access data from any source, including Big Data sources like Apache Hadoop, Cassandra, and MongoDB, along with all traditional relational databases, flat files, XML data streams, and data stored in proprietary systems. Built for embedding, BIRT includes APIs for data access, chart generation, output formats, content execution, and integration within larger applications.
    Downloads: 10 This Week
  • 16
    Aspose for Hadoop

    This project holds the source code for the Aspose for Hadoop project.

    The Aspose for Hadoop project enables Apache Hadoop / MapReduce developers to work with various binary file formats. Developers can create binary sequence files and convert them into text sequence files.
    Downloads: 1 This Week
  • 17
    R Hadoop for Big Data

    Download Free Associated R open source script files for big data analy

    Download free associated R open source script files for big data analysis with Hadoop and R. These are R script source files from Ram Venkat from a past Meetup we did at http://www.meetup.com/R-Matlab-Users/events/85160532/ Also, there is a long video and PowerPoint presentation slide PDF with R files at: http://quantlabs.net/blog/2012/11/how-to-use-hadoop-and-r-for-big-data-parallel-processing-free-download-pdf/ Download source files from http://quantlabs.net/blog/2012/11/download-free...
    Downloads: 0 This Week
  • 18

    HadoopFileManager

    Console File Manager for Hadoop, written in Java.

    Console file manager for Hadoop, written in Java. For Linux only. The left panel contains local files; the right panel contains files from HDFS. Run it with: hadoop jar HadoopFileManager-0.1.0-DEMO.jar (the UI uses the Lanterna library, which is bundled into the main jar to avoid additional classpath setup). The current version is just a demo to check display capability.
    Downloads: 0 This Week
  • 19

    oozie-workflow-checker

    Validation of complex Apache Oozie Hadoop workflows

    Library for validating complex Oozie workflows (http://oozie.apache.org/). Two usage scenarios: 1) Execute a workflow with specified parameters and get the list of passed nodes as the result. Sample in WorkflowDirProcessorIntegrationTest. Note: of all workflow functions, only "wf:conf" is currently supported. 2) Check that called actions exist, or build the full call tree in XML format. Sample in OozieWorkflowCheckerTest: You can override properties from "config-default.xml" and "job.properties" by file with name...
    Downloads: 0 This Week
  • 20

    ForeIndex

    Distributed Index with Apache Hadoop, Apache Lucene and Apache Tika

    This is a distributed index framework that uses Apache Hadoop, Apache Lucene and Apache Tika to index large volumes of data.
    Downloads: 0 This Week
  • 21

    Ocean Sync

    Hadoop Management System

    OceanSync is a Hadoop management system that allows users to control a variety of aspects of Hadoop. It includes a graphical user interface that lets a user perform HDFS maintenance tasks and submit new jobs to the cluster. The OceanSync product sits on top of any Hadoop architecture.
    Downloads: 0 This Week
  • 22
    This software lets a user generate test data, which is useful for testing Hadoop or other data processing clusters.
    Downloads: 0 This Week
  • 23
    Framework for developing simple evolutionary algorithm and island model programs in a distributed environment, using the MapReduce programming model on Hadoop.
    Downloads: 0 This Week
  • 24
    A course with labs to help you explore parallelism in Java, from low level to the cloud, including threads, JSR166, J2EE, Hadoop and Google App Engine.
    Downloads: 0 This Week