Business Software for Amazon SageMaker Data Wrangler

Top Software that integrates with Amazon SageMaker Data Wrangler as of July 2025

Compare business software, products, and services to find the best solution for your business or organization. Use the filters on the left to drill down by category, pricing, features, organization size, organization type, region, user reviews, integrations, and more. View and sort the products and solutions that match your needs in the results below.

  • 1
    Salesforce

    Salesforce

    Salesforce

    Put the power of the #1 CRM to work, at a price that works for you. Launch and grow quickly with the leading AI CRM built for small businesses in any industry - available through the Starter Suite or Pro Suite. Connect marketing, sales, service, and commerce on one easy platform. Save time with quick setup and smart guidance. Harness unified data and AI to fuel your growth. Start simple with Starter Suite — an all-in-one CRM for small businesses. Scale smoothly with AI agents, integrated data, and apps in one platform. No installation needed, just sign up and go from your browser. Advance further with Pro Suite, the complete CRM that scales with your business. Automate tasks and customize your tools to deepen customer relationships and drive growth.
    Leader badge
    Starting Price: $25.00/month/user
  • 2
    Google Analytics
    Get to know your customers. Get a deeper understanding of your customers. Google Analytics gives you the free tools you need to analyze data for your business in one place. Google Analytics 4 (GA4) is the latest iteration of Google’s analytics platform, designed to provide a deeper and more comprehensive understanding of user behavior across websites and apps. Built with a privacy-centric approach, GA4 leverages event-based tracking instead of session-based tracking, enabling more flexible and detailed data collection. It offers advanced features like cross-platform tracking, machine learning-powered insights, and predictive analytics to help businesses better understand customer journeys and make data-driven decisions. With improved integration with Google Ads and customizable reporting, GA4 empowers organizations to optimize their marketing strategies while adhering to evolving privacy regulations.
  • 3
    Amazon Web Services (AWS)
    Whether you're looking for compute power, database storage, content delivery, or other functionality, AWS has the services to help you build sophisticated applications with increased flexibility, scalability and reliability. Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster. AWS has significantly more services, and more features within those services, than any other cloud provider–from infrastructure technologies like compute, storage, and databases–to emerging technologies, such as machine learning and artificial intelligence, data lakes and analytics, and Internet of Things. This makes it faster, easier, and more cost effective to move your existing applications to the cloud.
  • 4
    Facebook Ads
    Reach out to future customers and fans. You don’t have to be an expert to start advertising on Facebook. Create and run campaigns using simple self-serve tools, and track their performance with easy-to-read reports. More than two billion people use Facebook every month—so no matter what kind of audience you want to reach, you'll find them here. To choose the right ad objective, answer the question “what’s the most important outcome I want from this ad?” It could be sales on your website, downloads of your app or increased brand awareness. Using what you know about the people you want to reach—like age, location and other details—choose the demographics, interests and behaviors that best represent your audience. Next, choose where you want to run your ad—whether that’s on Facebook, Instagram, Messenger, Audience Network, or across them all. In this step, you can also choose to run ads on specific mobile devices.
  • 5
    Amazon S3
    Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides easy-to-use management features so you can organize your data and configure finely-tuned access controls to meet your specific business, organizational, and compliance requirements. Amazon S3 is designed for 99.999999999% (11 9's) of durability, and stores data for millions of applications for companies all around the world. Scale your storage resources up and down to meet fluctuating demands, without upfront investments or resource procurement cycles. Amazon S3 is designed for 99.999999999% (11 9’s) of data durability.
  • 6
    Snowflake

    Snowflake

    Snowflake

    Snowflake is a comprehensive AI Data Cloud platform designed to eliminate data silos and simplify data architectures, enabling organizations to get more value from their data. The platform offers interoperable storage that provides near-infinite scale and access to diverse data sources, both inside and outside Snowflake. Its elastic compute engine delivers high performance for any number of users, workloads, and data volumes with seamless scalability. Snowflake’s Cortex AI accelerates enterprise AI by providing secure access to leading large language models (LLMs) and data chat services. The platform’s cloud services automate complex resource management, ensuring reliability and cost efficiency. Trusted by over 11,000 global customers across industries, Snowflake helps businesses collaborate on data, build data applications, and maintain a competitive edge.
    Starting Price: $2 compute/month
  • 7
    Amazon Athena
    Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets. Athena is out-of-the-box integrated with AWS Glue Data Catalog, allowing you to create a unified metadata repository across various services, crawl data sources to discover schemas and populate your Catalog with new and modified table and partition definitions, and maintain schema versioning.
  • 8
    pandas

    pandas

    pandas

    pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language. Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format. Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form.Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets. Time series-functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data.
  • 9
    Amazon Redshift
    More customers pick Amazon Redshift than any other cloud data warehouse. Redshift powers analytical workloads for Fortune 500 companies, startups, and everything in between. Companies like Lyft have grown with Redshift from startups to multi-billion dollar enterprises. No other data warehouse makes it as easy to gain new insights from all your data. With Redshift you can query petabytes of structured and semi-structured data across your data warehouse, operational database, and your data lake using standard SQL. Redshift lets you easily save the results of your queries back to your S3 data lake using open formats like Apache Parquet to further analyze from other analytics services like Amazon EMR, Amazon Athena, and Amazon SageMaker. Redshift is the world’s fastest cloud data warehouse and gets faster every year. For performance intensive workloads you can use the new RA3 instances to get up to 3x the performance of any cloud data warehouse.
    Starting Price: $0.25 per hour
  • 10
    Amazon SageMaker
    Amazon SageMaker is an advanced machine learning service that provides an integrated environment for building, training, and deploying machine learning (ML) models. It combines tools for model development, data processing, and AI capabilities in a unified studio, enabling users to collaborate and work faster. SageMaker supports various data sources, such as Amazon S3 data lakes and Amazon Redshift data warehouses, while ensuring enterprise security and governance through its built-in features. The service also offers tools for generative AI applications, making it easier for users to customize and scale AI use cases. SageMaker’s architecture simplifies the AI lifecycle, from data discovery to model deployment, providing a seamless experience for developers.
  • 11
    JSON

    JSON

    JSON

    JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language. JSON is built on two structures: 1. A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array. 2. An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence. These are universal data structures. Virtually all modern programming languages support them in one form or another.
    Starting Price: Free
  • 12
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • 13
    Apache Spark

    Apache Spark

    Apache Software Foundation

    Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application. Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
  • 14
    Amazon EMR
    Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run Petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. If you have existing on-premises deployments of open-source tools such as Apache Spark and Apache Hive, you can also run EMR clusters on AWS Outposts. Analyze data using open-source ML frameworks such as Apache Spark MLlib, TensorFlow, and Apache MXNet. Connect to Amazon SageMaker Studio for large-scale model training, analysis, and reporting.
  • 15
    PySpark

    PySpark

    PySpark

    PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark’s features such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning) and Spark Core. Spark SQL is a Spark module for structured data processing. It provides a programming abstraction called DataFrame and can also act as distributed SQL query engine. Running on top of Spark, the streaming feature in Apache Spark enables powerful interactive and analytical applications across both streaming and historical data, while inheriting Spark’s ease of use and fault tolerance characteristics.
  • 16
    Apache Parquet

    Apache Parquet

    The Apache Software Foundation

    We created Parquet to make the advantages of compressed, efficient columnar data representation available to any project in the Hadoop ecosystem. Parquet is built from the ground up with complex nested data structures in mind, and uses the record shredding and assembly algorithm described in the Dremel paper. We believe this approach is superior to simple flattening of nested namespaces. Parquet is built to support very efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Parquet allows compression schemes to be specified on a per-column level, and is future-proofed to allow adding more encodings as they are invented and implemented. Parquet is built to be used by anyone. The Hadoop ecosystem is rich with data processing frameworks, and we are not interested in playing favorites.
  • 17
    Amazon SageMaker Studio
    Amazon SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all machine learning (ML) development steps, from preparing data to building, training, and deploying your ML models, improving data science team productivity by up to 10x. You can quickly upload data, create new notebooks, train and tune models, move back and forth between steps to adjust experiments, collaborate seamlessly within your organization, and deploy models to production without leaving SageMaker Studio. Perform all ML development steps, from preparing raw data to deploying and monitoring ML models, with access to the most comprehensive set of tools in a single web-based visual interface. Amazon SageMaker Unified Studio is a comprehensive, AI and data development environment designed to streamline workflows and simplify the process of building and deploying machine learning models.
  • 18
    Amazon SageMaker Feature Store
    Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics. Features are used repeatedly by multiple teams and feature quality is critical to ensure a highly accurate model. Also, when features used to train models offline in batch are made available for real-time inference, it’s hard to keep the two feature stores synchronized. SageMaker Feature Store provides a secured and unified store for feature use across the ML lifecycle. Store, share, and manage ML model features for training and inference to promote feature reuse across ML applications. Ingest features from any data source including streaming and batch such as application logs, service logs, clickstreams, sensors, etc.
  • 19
    Amazon SageMaker Unified Studio
    Amazon SageMaker Unified Studio is a comprehensive, AI and data development environment designed to streamline workflows and simplify the process of building and deploying machine learning models. Built on Amazon DataZone, it integrates various AWS analytics and AI/ML services, such as Amazon EMR, AWS Glue, and Amazon Bedrock, into a single platform. Users can discover, access, and process data from various sources like Amazon S3 and Redshift, and develop generative AI applications. With tools for model development, governance, MLOps, and AI customization, SageMaker Unified Studio provides an efficient, secure, and collaborative environment for data teams.
  • 20
    SAP Cloud Platform
    Extend your business processes in the cloud. Extend SAP solutions in a fast and agile way without disrupting key business processes - leveraging existing investments and expertise. Rapidly develop robust and scalable cloud-native applications. Leverage your existing ABAP expertise to create new extensions or renovate existing custom apps. Innovate for business agility with cloud-native, low-code, and responsive event-driven applications. Accelerate outcomes with intelligent business process optimization. Discover, configure, extend and optimize business processes, connecting experience data to operational workflows. Gain impactful and actionable insights to anticipate business outcomes and uncover new revenue and growth opportunities. Harness the power of predictive analytics and machine learning capabilities. Embed real-time intelligence into your business applications. Advance and personalize the user experience for your customers, partners and employees.
  • Previous
  • You're on page 1
  • Next