Business Software for Amazon Elastic Container Service (Amazon ECS) - Page 3

Top Software that integrates with Amazon Elastic Container Service (Amazon ECS) as of July 2025 - Page 3

  • 1
    MetricFire

    Built by engineers for engineers, our Prometheus monitoring tool is easy to configure and set up, so you can start sending metrics right away. We take care of scaling your Prometheus, so you don't need to worry about it. We keep your data long-term, with 3x redundancy, so you can focus on applying the data rather than maintaining a database. Get updates and plugins without lifting a finger; we keep your Prometheus and Grafana stack updated for you. Everything you need to take control of your Prometheus metrics. Vendor lock-in isn't our thing. We believe you should still own your data, so you can request a full export at any time. That means you get all the benefits of an open-source tool with the security and stability of a SaaS tool. We store your data safely, with 3x redundancy, for up to one year. Scale without fear; we handle the hassle for you. Prometheus experts are available 24 hours a day.
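Since MetricFire ingests standard Prometheus metrics, a minimal sketch of the Prometheus text exposition format (the line format a Prometheus server scrapes) may help orient readers; the metric name and label values below are hypothetical:

```python
# Minimal sketch: rendering application metrics in the Prometheus text
# exposition format. The metric name and labels are hypothetical examples.

def format_metric(name, value, labels=None):
    """Render one sample as a Prometheus exposition-format line."""
    if labels:
        # Prometheus label sets are unordered; sort for a stable output.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{label_str}}} {value}"
    return f"{name} {value}"

line = format_metric("http_requests_total", 1027, {"method": "post", "code": "200"})
print(line)  # http_requests_total{code="200",method="post"} 1027
```

Any process exposing lines like this over HTTP can be scraped by Prometheus and, from there, forwarded to a hosted backend.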
  • 2
    Opsera

    You choose your tools, we take care of the rest. Put together the perfect CI/CD stack that fits your organization's goals with zero vendor lock-in. Eliminate manual scripts and stop building toolchain automation; free your engineers to focus on your core business. Pipeline workflows follow a declarative model, so you focus on what is required, not how it's accomplished, including software builds, security scans, unit testing, and deployments. With Blueprints, diagnose any failures from within Opsera using a console output of every step of your pipeline execution. Get comprehensive software delivery analytics across your CI/CD process in a unified view, including lead time, change failure rate, deployment frequency, and time to restore. Contextualized logs enable faster resolution and improved auditing and compliance.
  • 3
    PipeCD

    A unified continuous delivery solution for multiple application kinds on multi-cloud that empowers engineers to deploy faster with more confidence. A GitOps tool that enables deployment operations via pull requests on Git. The deployment pipeline UI clearly shows what is happening, with a separate log viewer for each individual deployment and real-time visualization of application state. Deployment notifications go to Slack and webhook endpoints, and insights show your delivery performance. Automated deployment analysis runs on metrics, logs, and emitted requests, and PipeCD automatically rolls back to the previous state as soon as analysis or a pipeline stage fails. It automatically detects configuration drift to notify and render the changes, and automatically triggers a new deployment when a defined event occurs (e.g. container image pushed, helm chart published, etc.). Supports single sign-on and role-based access control. Credentials are not exposed outside the cluster and not saved in the control plane.
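The automatic-rollback behavior described above can be sketched in a few lines: run pipeline stages in order and restore the previous state as soon as any stage (including metric analysis) fails. The stage names and the in-memory "state" dictionaries here are hypothetical, not PipeCD's actual data model:

```python
# Sketch of rollback-on-failure, assuming a simple stage list where each
# stage is a callable returning True (pass) or False (fail).

def run_pipeline(stages, current_state, new_state):
    """Apply each stage in order; roll back to current_state on first failure."""
    for name, stage in stages:
        if not stage(new_state):
            return {"status": "ROLLED_BACK", "failed_stage": name, "state": current_state}
    return {"status": "SUCCEEDED", "state": new_state}

stages = [
    ("CANARY", lambda s: True),            # canary deployment passes
    ("ANALYSIS", lambda s: s["healthy"]),  # automated analysis of metrics/logs
    ("FULL_ROLLOUT", lambda s: True),
]

result = run_pipeline(stages, {"version": "v1"}, {"version": "v2", "healthy": False})
print(result["status"], result["state"]["version"])  # ROLLED_BACK v1
```

When the ANALYSIS stage reports an unhealthy deployment, the pipeline never reaches full rollout and the previous version remains live.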
  • 4
    QueryPie

    QueryPie is a centralized platform to manage scattered data sources and security policies all in one place. Put your company on the fast track to success without changing the existing data environment. Data governance is vital to today's data-driven world. Ensure you're on the right side of data governance standards while giving many users access to growing amounts of critical information. Establish data access policies by including key attributes such as IP address and access time. Privilege types can be created based on SQL commands classified as DML, DCL, and DDL to secure data analysis and editing. Manage details of SQL events at a glance and discover user behavior and potential security concerns by browsing logs based on permissions. All histories can be exported as a file and used for reporting purposes.
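The DML/DCL/DDL grouping mentioned above follows the standard SQL command classes; a minimal sketch of how a privilege model might classify a statement by its leading keyword (the policy mechanics here are illustrative, not QueryPie's implementation):

```python
# Sketch: classifying SQL statements into the standard command classes so
# that privileges can be granted per class. Keyword sets follow common SQL
# convention; any real tool will parse statements far more carefully.

SQL_CLASSES = {
    "DML": {"SELECT", "INSERT", "UPDATE", "DELETE", "MERGE"},
    "DDL": {"CREATE", "ALTER", "DROP", "TRUNCATE"},
    "DCL": {"GRANT", "REVOKE"},
}

def classify(statement):
    """Return the command class of a SQL statement from its first keyword."""
    keyword = statement.strip().split()[0].upper()
    for cls, keywords in SQL_CLASSES.items():
        if keyword in keywords:
            return cls
    return "UNKNOWN"

print(classify("DROP TABLE users"))          # DDL
print(classify("grant select on t to bob"))  # DCL
```

A policy engine can then allow, block, or log a statement based on whether the user's privilege type covers its class.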
  • 5
    AWS Neuron

    Amazon Web Services

    AWS Neuron supports high-performance training on AWS Trainium-based Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instances. For model deployment, it supports high-performance and low-latency inference on AWS Inferentia-based Amazon EC2 Inf1 instances and AWS Inferentia2-based Amazon EC2 Inf2 instances. With Neuron, you can use popular frameworks such as TensorFlow and PyTorch to optimally train and deploy machine learning (ML) models on Amazon EC2 Trn1, Inf1, and Inf2 instances with minimal code changes and without lock-in to vendor-specific solutions. The AWS Neuron SDK, which supports Inferentia and Trainium accelerators, is natively integrated with PyTorch and TensorFlow, so you can continue using your existing workflows in these popular frameworks and get started with only a few lines of code changes. For distributed model training, the Neuron SDK supports libraries such as Megatron-LM and PyTorch Fully Sharded Data Parallel (FSDP).
  • 6
    DROPS

    DROPS is a release management tool designed to simplify, secure, and centralize the deployment of applications across data centers, hybrid, and multi-cloud infrastructures. It supports a wide range of platforms, integrates seamlessly with various CI/CD pipelines, and offers both agent-based and agentless operations. With features like full-stack release management, automated infrastructure provisioning, and 24/7 availability, DROPS aims to streamline deployment processes and ensure consistent, reliable delivery. The tool is flexible enough to manage both legacy and modern applications, catering to diverse enterprise needs. Select agent-based or agentless operation; there is no need for agent installation and management, as DROPS adapts to your configuration and, if agents are needed, provisions them automatically. Plan and organize your application deployment from the web console with no scripting needed, and ease collaboration between stakeholders and technical teams.
  • 7
    Amazon EC2 P5 Instances
    Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by NVIDIA H100 Tensor Core GPUs, and P5e and P5en instances powered by NVIDIA H200 Tensor Core GPUs deliver the highest performance in Amazon EC2 for deep learning and high-performance computing applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce the cost to train ML models by up to 40%. These instances help you iterate on your solutions at a faster pace and get to market more quickly. You can use P5, P5e, and P5en instances for training and deploying increasingly complex large language models and diffusion models powering the most demanding generative artificial intelligence applications. These applications include question-answering, code generation, video and image generation, and speech recognition. You can also use these instances to deploy demanding HPC applications at scale for pharmaceutical discovery.
  • 8
    Amazon EC2 Capacity Blocks for ML
    Amazon EC2 Capacity Blocks for ML enable you to reserve accelerated compute instances in Amazon EC2 UltraClusters for your machine learning workloads. The service supports Amazon EC2 P5en and P5e instances powered by NVIDIA H200 Tensor Core GPUs, P5 instances powered by NVIDIA H100 GPUs, and P4d instances powered by NVIDIA A100 GPUs, as well as Trn2 and Trn1 instances powered by AWS Trainium. You can reserve these instances for up to six months in cluster sizes ranging from 1 to 64 instances (512 GPUs or 1,024 Trainium chips), providing flexibility for various ML workloads, and reservations can be made up to eight weeks in advance. Because they are colocated in Amazon EC2 UltraClusters, Capacity Blocks offer low-latency, high-throughput network connectivity, facilitating efficient distributed training. This setup ensures predictable access to high-performance computing resources, allowing you to plan ML development confidently, run experiments, build prototypes, and accommodate future surges in demand for ML applications.
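The maximum cluster sizes quoted above follow from the per-instance accelerator counts: P5-family instances carry 8 GPUs each and Trn1/Trn2 instances carry 16 Trainium chips each, so 64 instances yield 512 GPUs or 1,024 chips. A back-of-the-envelope check:

```python
# Arithmetic check of the reservation limits: 64 instances at 8 GPUs per
# P5-family instance, or 16 Trainium chips per Trn1/Trn2 instance.

GPUS_PER_P5_INSTANCE = 8
CHIPS_PER_TRN_INSTANCE = 16
MAX_INSTANCES = 64

print(MAX_INSTANCES * GPUS_PER_P5_INSTANCE)    # 512 GPUs
print(MAX_INSTANCES * CHIPS_PER_TRN_INSTANCE)  # 1024 Trainium chips
```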
  • 9
    Amazon EC2 UltraClusters
    Amazon EC2 UltraClusters enable you to scale to thousands of GPUs or purpose-built machine learning accelerators, such as AWS Trainium, providing on-demand access to supercomputing-class performance. They democratize supercomputing for ML, generative AI, and high-performance computing developers through a simple pay-as-you-go model without setup or maintenance costs. UltraClusters consist of thousands of accelerated EC2 instances co-located in a given AWS Availability Zone, interconnected using Elastic Fabric Adapter (EFA) networking in a petabit-scale nonblocking network. This architecture offers high-performance networking and access to Amazon FSx for Lustre, a fully managed shared storage built on a high-performance parallel file system, enabling rapid processing of massive datasets with sub-millisecond latencies. EC2 UltraClusters provide scale-out capabilities for distributed ML training and tightly coupled HPC workloads, reducing training times.
  • 10
    Amazon EC2 Trn2 Instances
    Amazon EC2 Trn2 instances, powered by AWS Trainium2 chips, are purpose-built for high-performance deep learning training of generative AI models, including large language models and diffusion models. They offer up to 50% cost-to-train savings over comparable Amazon EC2 instances. Trn2 instances support up to 16 Trainium2 accelerators, providing up to 3 petaflops of FP16/BF16 compute power and 512 GB of high-bandwidth memory. To facilitate efficient data and model parallelism, Trn2 instances feature NeuronLink, a high-speed, nonblocking interconnect, and support up to 1600 Gbps of second-generation Elastic Fabric Adapter (EFAv2) network bandwidth. They are deployed in EC2 UltraClusters, enabling scaling up to 30,000 Trainium2 chips interconnected with a nonblocking petabit-scale network, delivering 6 exaflops of compute performance. The AWS Neuron SDK integrates natively with popular machine learning frameworks like PyTorch and TensorFlow.
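A rough consistency check of the figures above: at 3 petaflops per 16-chip instance, a 30,000-chip UltraCluster lands in the neighborhood of the quoted 6 exaflops (AWS rounds, and dense versus sparse compute figures differ, so this is an order-of-magnitude sanity check, not an exact derivation):

```python
# Order-of-magnitude check: scale the per-instance FP16/BF16 figure
# (3 petaflops across 16 Trainium2 chips) up to a 30,000-chip cluster.

PETAFLOPS_PER_INSTANCE = 3
CHIPS_PER_INSTANCE = 16
CLUSTER_CHIPS = 30_000

cluster_petaflops = CLUSTER_CHIPS / CHIPS_PER_INSTANCE * PETAFLOPS_PER_INSTANCE
print(cluster_petaflops / 1000)  # 5.625 exaflops, close to the quoted 6
```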
  • 11
    Amazon Elastic File System (EFS)
    Amazon Elastic File System (Amazon EFS) automatically grows and shrinks as you add and remove files, with no need for management or provisioning. Share code and other files in a secure, organized way to increase DevOps agility and respond faster to customer feedback. Persist and share data from your AWS containers and serverless applications with zero management required. Easy to use and scale, Amazon EFS offers the performance and consistency needed for machine learning and big data analytics workloads. Simplify persistent storage for modern content management system workloads. Get your products and services to market faster, more reliably, and securely at a lower cost. Create and configure shared file systems simply and quickly for AWS compute services, with no provisioning, deploying, patching, or maintenance required. Scale workloads on demand to petabytes of storage and gigabytes per second of throughput out of the box.
  • 12
    Splunk Infrastructure Monitoring
    The only real-time, analytics-driven multicloud monitoring solution for all environments (formerly SignalFx). Monitor any environment on a massively scalable streaming architecture. Open, flexible data collection and rapid visualizations of services in seconds. Purpose built for ephemeral and dynamic cloud-native environments at any scale (e.g., Kubernetes, container, serverless). Detect, visualize and resolve issues as soon as they arise. Monitor infrastructure performance in real-time at cloud scale through predictive streaming analytics. Over 200 pre-built integrations for cloud services and out-of-the-box dashboards for rapid visualization of your entire stack. Autodiscover, breakdown, group, and explore clouds, services and systems. Quickly and easily understand how your infrastructure behaves across different services, availability zones, Kubernetes clusters and more.
  • 13
    StackState

    StackState's Topology and Relationship-Based Observability platform lets you manage your dynamic IT environment more effectively by unifying performance data from your existing monitoring tools into a single topology, enabling: 1. Up to 80% decreased MTTR, by identifying the root cause and alerting the right teams with the correct information. 2. Up to 65% fewer outages, through real-time unified observability and more deliberate planning. 3. 3x faster releases, by giving time back to developers to increase implementations. Get started today with our free guided demo: https://www.stackstate.com/schedule-a-demo
  • 14
    AWS Deep Learning Containers
    Deep Learning Containers are Docker images that are preinstalled and tested with the latest versions of popular deep learning frameworks. Deep Learning Containers lets you deploy custom ML environments quickly without building and optimizing your environments from scratch. Deploy deep learning environments in minutes using prepackaged and fully tested Docker images. Build custom ML workflows for training, validation, and deployment through integration with Amazon SageMaker, Amazon EKS, and Amazon ECS.