SageMaker Spark is an open-source Spark library for Amazon SageMaker. With SageMaker Spark, you construct Spark ML Pipelines using Amazon SageMaker stages. These pipelines interleave native Spark ML stages with stages that interact with SageMaker training and model hosting. You can train on Amazon SageMaker from Spark DataFrames using Amazon-provided ML algorithms such as K-Means clustering or XGBoost, and make predictions on DataFrames against SageMaker endpoints hosting your trained models. If you have your own ML algorithms built into SageMaker-compatible Docker containers, you can also use SageMaker Spark to train and infer on DataFrames with those algorithms, all at Spark scale.

SageMaker Spark depends on hadoop-aws-2.8.1. To run Spark applications that depend on SageMaker Spark, you need to build Spark with Hadoop 2.8. However, if you are running Spark applications on EMR, you can use Spark built with Hadoop 2.7.

Features

  • SageMaker Spark needs to be added to both the driver and executor classpaths
  • You can run SageMaker Spark applications on an EMR cluster
  • EMR allows you to read and write data using the EMR FileSystem
  • Create your Spark Session and load your training and test data into DataFrames
  • SageMaker Spark provides several classes that extend SageMakerEstimator to run particular algorithms
  • Use SageMakerEstimator and SageMakerModel in a Spark Pipeline
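
The train-then-predict flow described above can be sketched in Scala roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the SageMaker Spark jars are already on the driver and executor classpaths, that AWS credentials are available to the application, and that the estimator's constructor parameters match the library version you use. The role ARN, region, S3 paths, and instance types are placeholders chosen for illustration, and the object name KMeansSketch is hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import com.amazonaws.services.sagemaker.sparksdk.IAMRole
import com.amazonaws.services.sagemaker.sparksdk.algorithms.KMeansSageMakerEstimator

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.getOrCreate()

    // Placeholder values (assumptions): substitute your own region, data, and role.
    val region = "us-east-1"
    val roleArn = "arn:aws:iam::account-id:role/rolename"

    // Load training and test data into DataFrames (libsvm format, 784 features assumed).
    val trainingData = spark.read.format("libsvm")
      .option("numFeatures", "784")
      .load(s"s3://sagemaker-sample-data-$region/spark/mnist/train/")
    val testData = spark.read.format("libsvm")
      .option("numFeatures", "784")
      .load(s"s3://sagemaker-sample-data-$region/spark/mnist/test/")

    // One of the classes that extend SageMakerEstimator; instance types are illustrative.
    val estimator = new KMeansSageMakerEstimator(
      sagemakerRole = IAMRole(roleArn),
      trainingInstanceType = "ml.p2.xlarge",
      trainingInstanceCount = 1,
      endpointInstanceType = "ml.c4.xlarge",
      endpointInitialInstanceCount = 1)
      .setK(10)
      .setFeatureDim(784)

    // fit() runs training on SageMaker and returns a SageMakerModel backed by an endpoint.
    val model = estimator.fit(trainingData)

    // transform() sends the DataFrame rows to the endpoint and appends the predictions.
    val predictions = model.transform(testData)
    predictions.show()
  }
}
```

Because the estimator and the resulting model follow the Spark ML Estimator and Transformer contracts, the same objects can be placed in a Spark Pipeline alongside native Spark ML stages. Running the sketch starts a SageMaker training job and a hosted endpoint in your AWS account.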

Categories

Libraries

License

Apache License V2.0

Additional Project Details

Programming Language

Scala

Related Categories

Scala Libraries

Registered

2022-07-11