PipelineOps

AWS ECS vs EKS: Choosing Container Orchestration in 2026

"ECS won't scale with us. If we don't move to EKS, we're going to hit a wall." That's what I heard the week after a senior engineer with deep Kubernetes experience joined our team.

TL;DR: We ran a three-month EKS pilot. EKS is technically more capable, but the right answer depends on your team's Kubernetes experience, operational scale, and multi-cloud requirements. Choosing ECS isn't settling — it's a deliberate decision. This post covers the real problems we hit during the pilot and the decision framework I now use to choose between the two.

What I Was Trying to Do

Our platform engineering team was running 20+ microservices on ECS (Fargate) at a SaaS company. Deployments were stable. Team size was five. Everything ran on AWS — no multi-cloud plans on the horizon.

The new engineer's case had three pillars:

  1. Portability: Moving to the Kubernetes API would make future migrations to other clouds or providers much easier.
  2. Ecosystem: We'd gain access to Helm charts, Operators, and the broader Kubernetes tooling landscape.
  3. Talent market: Kubernetes is the industry standard. Hiring and growing engineers would be easier if we were on it.

None of them were wrong. I couldn't clearly articulate why we shouldn't migrate, so we agreed to pilot it first and decide based on real experience. We launched two greenfield services on EKS and ran them alongside our ECS workloads for three months.

What Went Wrong (and Why)

We underestimated cluster upgrade overhead

Each Kubernetes minor version in EKS has a support window. Once that window closes, you either pay for Extended Support or upgrade — there's no ignoring it. Check the current schedule at the Amazon EKS Kubernetes release calendar.

Our first upgrade came due during the pilot. The process looked straightforward in the docs: upgrade the control plane, then update add-ons (kube-proxy, VPC CNI, CoreDNS) one by one, checking version compatibility at each step. Rolling-replace the worker nodes. Done.

In practice, it took two engineers the better part of two days. That's not a disaster, but it's a recurring tax we'd never paid with ECS. ECS has no control plane version for you to upgrade, and on Fargate even the execution environment is patched for you. The concept of "cluster upgrade" simply doesn't exist.

System pod resource consumption caught us off guard

EKS worker nodes run Kubernetes system components — kube-proxy, CoreDNS, the VPC CNI daemonset — and those components claim node resources before your application pods get anything.

On a t3.medium, the vCPU and memory available to our application were noticeably less than the instance spec (2 vCPU, 4 GiB). To predict node capacity correctly, we had to understand the kubelet's --kube-reserved and --system-reserved settings and account for them per node type. On ECS Fargate, the CPU and memory you specify in the task definition are what your task gets, with no node overhead to model.
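The capacity arithmetic can be sketched roughly as follows. Kubernetes derives a node's allocatable resources by subtracting kube-reserved, system-reserved, and the hard eviction threshold from capacity. The reservation figures below are hypothetical placeholders, not EKS defaults; the real values come from your nodes' kubelet configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_mib: int

    def minus(self, other: "Resources") -> "Resources":
        return Resources(
            self.cpu_millicores - other.cpu_millicores,
            self.memory_mib - other.memory_mib,
        )


def allocatable(
    capacity: Resources,
    kube_reserved: Resources,
    system_reserved: Resources,
    eviction_threshold: Resources,
) -> Resources:
    """Allocatable = capacity - kube-reserved - system-reserved - eviction threshold."""
    return (
        capacity.minus(kube_reserved)
        .minus(system_reserved)
        .minus(eviction_threshold)
    )


# Nominal t3.medium spec: 2 vCPU, 4 GiB.
T3_MEDIUM = Resources(cpu_millicores=2000, memory_mib=4096)

# Hypothetical reservations, for illustration only.
node = allocatable(
    T3_MEDIUM,
    kube_reserved=Resources(cpu_millicores=70, memory_mib=450),
    system_reserved=Resources(cpu_millicores=0, memory_mib=0),
    eviction_threshold=Resources(cpu_millicores=0, memory_mib=100),
)
print(node)
```

Requests from DaemonSet pods such as kube-proxy and the VPC CNI then come out of that allocatable figure, so the room left for application pods is smaller still.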

Double-layered access management

ECS access control lives entirely in AWS IAM. EKS doesn't work that way.

AWS IAM controls access to the cluster API. But access to resources inside the cluster (Namespaces, Deployments, Pods) is governed separately by Kubernetes RBAC. To give a developer access to EKS, you register their IAM role or user in the aws-auth ConfigMap (or the newer access entries API, which AWS now recommends), then create a RoleBinding for the Kubernetes user or group that identity maps to. That's two systems, and both need to be kept in sync.

Onboarding teammates who hadn't worked with Kubernetes before took longer than expected. Access audits also became a two-step process, one for IAM and one for RBAC.

Future flexibility has a present cost

"Multi-cloud portability" is a compelling argument — but the cost of that flexibility starts accruing today. Upgrade work, RBAC management, the Kubernetes learning curve: these are ongoing operational costs you pay every month in exchange for a future option you may or may not exercise.

Our requirement at the time was AWS-only. There was a vague roadmap item about exploring multi-cloud "someday," but nothing concrete. Paying certain, recurring costs for an uncertain, unscheduled benefit didn't make sense.

The Fix — Step by Step

After the pilot, we decided to move back to ECS. Not because EKS is bad — but because ECS was the right fit for where we were.

Decision framework

These four questions now guide any ECS vs. EKS conversation on my teams.

Question 1: Is multi-cloud or cloud portability a current requirement?

  • Yes → EKS (or self-managed Kubernetes)
  • No → Continue to next question

Question 2: Do you have existing dependencies on the Kubernetes ecosystem? (OSS tools that ship Helm charts only, Kubernetes Operators, etc.)

  • Yes → EKS
  • No → Continue to next question

Question 3: Does your team have at least one engineer with deep Kubernetes production experience?

  • Yes → EKS is viable
  • No → ECS strongly recommended — running EKS in production without Kubernetes expertise is a reliability risk

Question 4: What's your service count and team structure?

  • 30+ microservices, multiple teams deploying independently → EKS Namespace isolation earns its keep
  • Fewer than 30 → ECS is manageable at this scale

If you reach the end without landing on EKS, choose ECS.
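The four questions can be written down as a small function, which makes the order of evaluation explicit. This is one reading of the framework, not a formal rule: it treats a "yes" on Question 3 as "keep going to Question 4", and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamContext:
    needs_portability_now: bool       # Question 1
    has_k8s_ecosystem_deps: bool      # Question 2
    has_k8s_production_expert: bool   # Question 3
    service_count: int                # Question 4
    multiple_independent_teams: bool  # Question 4


def recommend(ctx: TeamContext) -> str:
    """Walk the questions in order; the first decisive answer wins."""
    if ctx.needs_portability_now:
        return "EKS"
    if ctx.has_k8s_ecosystem_deps:
        return "EKS"
    if not ctx.has_k8s_production_expert:
        return "ECS"
    if ctx.service_count >= 30 and ctx.multiple_independent_teams:
        return "EKS"
    # Reached the end without landing on EKS.
    return "ECS"


# A team shaped like the one in this post: AWS-only, no Helm or Operator
# dependencies, one Kubernetes expert, a service count under 30 (the exact
# number here is illustrative), and a single team.
pilot_team = TeamContext(
    needs_portability_now=False,
    has_k8s_ecosystem_deps=False,
    has_k8s_production_expert=True,
    service_count=20,
    multiple_independent_teams=False,
)
print(recommend(pilot_team))
```

Encoding it this way also shows what the framework leaves open, such as 30+ services owned by a single team, which falls through to ECS here.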

What to know when choosing ECS

Service discovery: ECS Service Connect provides Cloud Map-based service discovery. For most microservice-to-microservice communication, it's enough.

Deployment strategies: ECS supports rolling updates, Blue/Green via CodeDeploy integration, and external deployment controllers. Choosing ECS doesn't limit your deployment options.

Cost: ECS has no control plane charge. EKS charges per cluster for the control plane — check current pricing on the AWS EKS pricing page. On Fargate, you pay for the CPU and memory your tasks request.

What to know when choosing EKS

Build the upgrade schedule in from day one. Look up the support window for your chosen Kubernetes version and block upgrade time in your sprint calendar well in advance. Waiting until the deadline means upgrading under pressure, with no time to test properly.
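One way to turn that into calendar entries is to work backwards from the end-of-support date. The sketch below assumes a hypothetical deadline and hypothetical buffer lengths; take the real date from the EKS release calendar and pick buffers that match your own test cycle.

```python
from datetime import date, timedelta


def upgrade_plan(
    end_of_standard_support: date,
    buffer_weeks: int = 8,
    test_weeks: int = 2,
) -> dict:
    """Work backwards from the support deadline to two calendar dates."""
    # Finish the production upgrade well ahead of the deadline.
    production_by = end_of_standard_support - timedelta(weeks=buffer_weeks)
    # Start upgrading non-production clusters early enough to test properly.
    start_testing = production_by - timedelta(weeks=test_weeks)
    return {"start_testing": start_testing, "production_by": production_by}


# Hypothetical deadline, for illustration only.
plan = upgrade_plan(date(2026, 11, 30))
for milestone, day in plan.items():
    print(f"{milestone}: {day.isoformat()}")
```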

Size nodes with system overhead in mind. A small number of larger nodes typically has less proportional overhead than a large number of small nodes, because you're spreading fixed system pod costs across more application capacity per node.
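A rough sketch of that arithmetic, comparing two fleets with the same raw capacity. The fixed per-node overhead is a hypothetical figure, and in practice some reservations grow with node size, so treat this as a shape to reason with rather than a sizing tool.

```python
def usable_vcpu(node_count: int, vcpu_per_node: float, fixed_overhead_vcpu: float) -> float:
    """vCPU left for application pods after each node pays its fixed overhead."""
    return node_count * (vcpu_per_node - fixed_overhead_vcpu)


# Hypothetical: system pods and reservations cost 0.25 vCPU on every node.
OVERHEAD_VCPU = 0.25

# Both fleets provide 16 vCPU of raw capacity.
many_small = usable_vcpu(node_count=8, vcpu_per_node=2, fixed_overhead_vcpu=OVERHEAD_VCPU)
few_large = usable_vcpu(node_count=2, vcpu_per_node=8, fixed_overhead_vcpu=OVERHEAD_VCPU)

print(f"8 x 2 vCPU nodes: {many_small} vCPU usable")
print(f"2 x 8 vCPU nodes: {few_large} vCPU usable")
```

The trade-off is blast radius: fewer, larger nodes means each node failure takes out a bigger share of capacity.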

Document the IAM-to-RBAC mapping from the start. Create a runbook that maps AWS IAM roles to Kubernetes RoleBindings and own a process to keep it current. Cleaning this up retroactively is painful.
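A runbook like this is easier to keep current if the mapping is data that can be compared against the cluster. The sketch below only shows the comparison; the record fields, ARNs, and group names are hypothetical, and collecting the observed state (from access entries or aws-auth, plus RoleBindings) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class AccessMapping:
    iam_role_arn: str  # identity on the AWS side
    k8s_group: str     # group that identity maps to in the cluster
    namespace: str     # namespace holding the RoleBinding
    k8s_role: str      # Role the group is bound to


def find_drift(documented, observed) -> dict:
    """Compare the runbook against what was actually found in the cluster."""
    documented, observed = set(documented), set(observed)
    return {
        "missing_in_cluster": sorted(documented - observed),
        "undocumented_in_cluster": sorted(observed - documented),
    }


# Hypothetical runbook entries and cluster state.
runbook = [
    AccessMapping("arn:aws:iam::111122223333:role/payments-dev", "payments-devs", "payments", "developer"),
    AccessMapping("arn:aws:iam::111122223333:role/search-dev", "search-devs", "search", "developer"),
]
cluster = [
    AccessMapping("arn:aws:iam::111122223333:role/payments-dev", "payments-devs", "payments", "developer"),
    AccessMapping("arn:aws:iam::111122223333:role/contractor", "search-devs", "search", "developer"),
]

report = find_drift(runbook, cluster)
for kind, mappings in report.items():
    for mapping in mappings:
        print(kind, mapping.iam_role_arn, "->", f"{mapping.namespace}/{mapping.k8s_role}")
```

Running a check like this on a schedule turns the two-step access audit into a single report.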

What I'd Do Differently

Start with a decision framework, not a pilot.

When the EKS proposal came in, I didn't have a clear way to evaluate it. It sounded technically superior, so I found it hard to push back. Now I open with questions: What are our actual multi-cloud requirements? Do we have Kubernetes ecosystem dependencies? Who will own upgrades? The answers usually clarify the decision before any code is written.

Treat "future flexibility" as a hypothesis, not a requirement.

"We might need multi-cloud someday" is a hypothesis, not a requirement. Whether a hypothesis is worth paying for depends on its confidence and timeline. If you can say "there's an 80% chance we move to GCP within three years," that's a basis for a decision. "Maybe eventually" is not.

If you choose EKS, secure the expertise first.

EKS operations assume someone on the team knows Kubernetes deeply. Running EKS in production while simultaneously trying to learn Kubernetes is a reliability risk. The decision to adopt EKS and the decision to have a Kubernetes-experienced engineer on the team should be made together.

Kubernetes and Platform Engineering

The ECS vs. EKS question often surfaces a deeper one: Do you need Kubernetes to do Platform Engineering properly?

No. But Kubernetes is a powerful foundation for teams that are serious about Platform Engineering at scale.

Where Kubernetes fits in Platform Engineering

Platform Engineering's core goal is building an Internal Developer Platform (IDP) — an environment where development teams can ship without thinking about infrastructure. Kubernetes frequently serves as the substrate for that IDP.

Three characteristics make Kubernetes well-suited to this role:

1. Namespace-based team isolation

Multiple development teams can share a single cluster while deploying and managing their workloads independently within their own Namespaces. RBAC controls what each team can touch, so Team A can't accidentally affect Team B's resources. ECS can enforce environment-level isolation, but the fine-grained, per-team separation that Kubernetes Namespaces provide is harder to replicate natively.

2. Declarative APIs and GitOps

Every Kubernetes resource is expressed as a declarative YAML or JSON manifest. This pairs naturally with GitOps tooling like ArgoCD or Flux: merge to Git, cluster reflects the change. That flow — developers owning their manifests, platform team owning cluster policy — is the kind of "developer self-service" that Platform Engineering aims for.

Terraform and CDK can provide declarative management for ECS too, but the ecosystem depth isn't comparable.

3. CRDs as an abstraction layer

Kubernetes Custom Resource Definitions let you create your own APIs on top of the cluster. A platform team can define a kind: BackendService resource, and developers fill in a short spec: language runtime, port, storage needs. The platform layer, typically a controller watching those resources, translates that into a Kubernetes Deployment, an RDS instance, and an ALB listener, all provisioned automatically. Backstage, the open-source IDP framework from Spotify, also has stronger Kubernetes integration than anything available for ECS.
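To make the translation step concrete, here is a minimal sketch of the Deployment half of it, written as a plain function over dictionaries. The API group, the spec field names (image, port, replicas), and the defaults are all hypothetical; a real controller would be built on an operator framework and would also handle the RDS and ALB pieces.

```python
def backend_service_to_deployment(resource: dict) -> dict:
    """Render an apps/v1 Deployment manifest from a BackendService resource."""
    name = resource["metadata"]["name"]
    namespace = resource["metadata"].get("namespace", "default")
    spec = resource["spec"]
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "replicas": spec.get("replicas", 2),  # hypothetical platform default
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": spec["image"],
                            "ports": [{"containerPort": spec["port"]}],
                        }
                    ]
                },
            },
        },
    }


# What a developer would write (hypothetical schema).
backend_service = {
    "apiVersion": "platform.example.com/v1alpha1",
    "kind": "BackendService",
    "metadata": {"name": "orders", "namespace": "team-a"},
    "spec": {"image": "registry.example.com/orders:1.4.2", "port": 8080},
}

deployment = backend_service_to_deployment(backend_service)
print(deployment["kind"], deployment["metadata"]["name"], deployment["spec"]["replicas"])
```

The point of the abstraction is the size difference: the developer owns four lines of spec, and the platform team owns everything the function fills in.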

Why I still say "start with ECS"

All of that is real. And I still recommend ECS for teams earlier in their Platform Engineering journey.

Platform Engineering is about improving developer productivity. Kubernetes is one way to get there — not the only way, and not the first thing to reach for. The early stages — standardizing deployments, managing environments, building CI/CD pipelines — are fully achievable on ECS. Solve what's actually blocking developers before you take on Kubernetes complexity.

Kubernetes becomes the right choice when concrete requirements arrive: multiple teams deploying independently at scale, existing dependencies on Helm or Operators, or a need for an abstraction layer between developers and infrastructure.

You can start Platform Engineering on ECS. Kubernetes is one path to scaling it.

Key Takeaways

Choose ECS when:

  • You're AWS-only with no concrete multi-cloud requirements
  • You have no existing Kubernetes ecosystem dependencies
  • Your platform team is six engineers or fewer and wants to minimize operational overhead
  • Your deployment pipeline already works end-to-end with AWS-native services or GitHub Actions

Choose EKS when:

  • Multi-cloud or cloud portability is a current, concrete requirement
  • You depend on Helm-only OSS tools or Kubernetes Operators
  • Your team has at least one engineer with deep Kubernetes production experience
  • You're operating at a scale where multiple teams need independent Namespace-level isolation

Either way:

  • Design your deployment strategy (rolling, Blue/Green) before you need it
  • Implement health checks in the application, not just at the load balancer
  • Follow least-privilege for IAM roles — never use AdministratorAccess for a workload identity

FAQ

Q: Is ECS cheaper than EKS?

A: Generally yes, but it depends on your workload. ECS has no control plane charge. EKS charges per cluster for the control plane; see AWS EKS pricing for current rates. The underlying compute rates (EC2 or Fargate) are the same on both. The practical difference: on EKS, system pods consume a portion of each node's resources, so you may need slightly larger nodes to run the same application workload. For a reliable cost comparison, model it against your actual traffic and instance types.

Q: Can we migrate from ECS to EKS later if we change our minds?

A: Technically yes — but it's not a simple lift-and-shift. ECS Task Definitions need to be converted to Kubernetes Manifests, and ECS-specific features like Service Connect and the ALB integration work differently in Kubernetes. If migration is on your long-term roadmap, keep your Dockerfiles and application code portable now: clean startup/shutdown behavior, proper health check endpoints, no hard dependency on ECS-specific environment variables. That way the container layer is ready for either platform when the time comes.
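As a minimal sketch of what "portable" means at the container layer: both ECS and Kubernetes stop a container by sending SIGTERM and following up with SIGKILL after a grace period, and both can probe an HTTP endpoint. The /healthz path and port 8080 below are assumptions for illustration, not anything either platform requires.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health endpoint that either platform can probe (path is an assumption).
        if self.path == "/healthz":
            status, body = 200, b"ok"
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)


def main() -> None:
    server = make_server()

    def handle_sigterm(signum, frame):
        # shutdown() blocks until serve_forever() exits, so it has to run
        # on a different thread from the one that is serving.
        threading.Thread(target=server.shutdown).start()

    signal.signal(signal.SIGTERM, handle_sigterm)
    server.serve_forever()  # returns once shutdown() has been called
    server.server_close()   # release the socket before the process exits
```

Call main() from the container entrypoint. Nothing here reads ECS- or Kubernetes-specific environment variables, which is what keeps the image usable on both.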

Q: Can we use Fargate with both ECS and EKS?

A: Yes. ECS offers Fargate as a launch type, and EKS schedules pods onto Fargate through Fargate profiles; either way, you stop managing worker nodes. One important caveat for EKS on Fargate: DaemonSets are not supported. If you're running node-level log collectors, network policy agents, or security tools as DaemonSets, EKS on Fargate won't work for those components. ECS on Fargate has a similar limit (the daemon scheduling strategy is EC2-only), but its tooling already assumes the alternative: run the agent as a sidecar container in the task, as FireLens does for log routing.

Q: We use GitHub Actions self-hosted runners. Which is a better fit for hosting them?

A: If you're already on EKS, Actions Runner Controller (ARC) is the cleanest option. It's a Kubernetes Operator that scales runner Pods based on job queue depth. On ECS, the typical pattern is Lambda + Fargate Task (or Lambda + EC2) for ephemeral runners — I covered the setup in detail in another post on this blog. Functionally, both approaches deliver ephemeral runners. If you're already on EKS, ARC is less glue to write.

Q: Do we need AWS App Mesh or ECS Service Connect?

A: Not necessarily. If your microservice communication is straightforward (HTTP requests from Service A to Service B), ECS Service Connect's built-in service discovery is sufficient. Reach for a full service mesh only when you have concrete requirements for mutual TLS, fine-grained traffic shifting, or per-route observability. Check App Mesh's support status before adopting it for anything new: AWS has announced that it is discontinuing the service. Adding a service mesh upfront adds operational complexity and makes debugging harder without clear payoff. Solve the specific problem when it actually appears.


This post draws on SRE experience across multiple organizations. Details that could identify specific companies or individuals have been generalized.