Design High-Performing and Elastic Compute Solutions for SAA-C03

Learn how SAA-C03 frames EC2, Auto Scaling, Lambda, containers, Batch, EMR, and resource sizing for high-performing compute decisions.

Compute questions on SAA-C03 are really about fit plus elasticity. The exam wants to know whether the workload should run on EC2, serverless, containers, or a more specialized managed compute path, and then whether the scaling model matches the actual traffic pattern.

What AWS is explicitly testing

AWS ties this task to compute services such as EC2, Fargate, Batch, and EMR, plus distributed computing concepts, Auto Scaling, serverless patterns, container orchestration, and resource sizing.

Compute chooser

| Requirement | Strongest first fit | Why |
| --- | --- | --- |
| Long-running workload with deep OS control | EC2 | Highest control over the host and instance family |
| Bursty event-driven code without server management | Lambda | Strongest pay-per-use and operational simplicity for the right workloads |
| Containerized workload without server management | ECS on Fargate | Managed container runtime without EC2 fleet operations |
| Batch-oriented compute job scheduling | AWS Batch | Fits queued batch processing better than hand-rolled EC2 patterns |
| Large-scale data processing cluster | EMR | Better fit for distributed data processing than generic EC2 alone |
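For the bursty, event-driven row, the shape the exam tends to reward looks roughly like this — a sketch only, with a hypothetical `OrderProcessor` function consuming from an SQS queue (role, queue, and artifact names are illustrative):

```yaml
Resources:
  # Hypothetical event-driven worker; runtime, handler, and code location are illustrative.
  OrderProcessor:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: app.handler
      Role: !GetAtt OrderProcessorRole.Arn   # execution role defined elsewhere
      Code:
        S3Bucket: my-artifacts-bucket
        S3Key: order-processor.zip
      Timeout: 30

  # Lambda polls the queue and scales its concurrency with the backlog automatically,
  # which is the point: no instances to size or scale for bursty traffic.
  OrderQueueMapping:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      FunctionName: !Ref OrderProcessor
      EventSourceArn: !GetAtt OrderQueue.Arn  # queue defined elsewhere
      BatchSize: 10
```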

Runtime-fit questions that decide the answer

| Question | Why it matters on SAA-C03 |
| --- | --- |
| Does the workload need host-level control or a custom OS stack? | That often keeps the answer on EC2 rather than a managed serverless or container runtime |
| Is the load bursty, event-driven, and short-lived? | That often pushes the design toward Lambda |
| Is the team deploying containers but unwilling to manage EC2 worker fleets? | That points strongly toward Fargate |
| Is the workload queued or scheduled batch work rather than request-response serving? | That often points to AWS Batch |
| Is the problem really distributed data processing? | That can shift the answer toward EMR instead of generic compute fleets |
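For the Fargate answer, a minimal sketch of what the runtime model looks like — hypothetical names throughout, task sizing chosen arbitrarily:

```yaml
Resources:
  WebTaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      RequiresCompatibilities: [FARGATE]
      NetworkMode: awsvpc                        # required for Fargate tasks
      Cpu: '512'
      Memory: '1024'
      ExecutionRoleArn: !GetAtt TaskExecRole.Arn  # role for image pulls/logs, defined elsewhere
      ContainerDefinitions:
        - Name: web
          Image: 111122223333.dkr.ecr.us-east-1.amazonaws.com/web:latest
          PortMappings:
            - ContainerPort: 8080

  WebService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref WebCluster                   # cluster defined elsewhere
      LaunchType: FARGATE                        # no EC2 worker fleet to manage
      DesiredCount: 2
      TaskDefinition: !Ref WebTaskDef
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets: [subnet-app-a, subnet-app-b]
```

The design signal to notice is `LaunchType: FARGATE` plus the absence of any instance or AMI configuration: the team declares task size and count, and AWS owns the hosts.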

The scaling question AWS is really asking

Do not jump from “slow” to “bigger instance” by reflex. The real question is usually one of these:

  • should the component scale horizontally or vertically?
  • is the trigger traffic, queue depth, CPU, concurrency, or schedule?
  • should the component even be long-running, or should it become event-driven?

Scaling-signal chooser

| Signal | Strongest use case | Common mistake |
| --- | --- | --- |
| Request count or target response time | Web-facing stateless tiers | Scaling on CPU when the bottleneck is traffic-facing latency |
| Queue depth | Asynchronous consumers and workers | Treating batch backlog like ordinary web traffic |
| Concurrency | Lambda or heavily parallel execution paths | Ignoring downstream limits such as database connections |
| CPU or memory | Compute-bound workers | Using infrastructure metrics when the real problem is downstream I/O |
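The queue-depth row maps to a target-tracking policy on an SQS metric instead of CPU. A sketch, assuming a worker Auto Scaling group `WorkerAsg` and a queue named `jobs-queue` (both hypothetical); note that the more precise exam pattern is backlog per instance via metric math, while raw queue depth is shown here for brevity:

```yaml
  QueueDepthPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WorkerAsg   # worker ASG defined elsewhere
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        CustomizedMetricSpecification:
          Namespace: AWS/SQS
          MetricName: ApproximateNumberOfMessagesVisible
          Dimensions:
            - Name: QueueName
              Value: jobs-queue
          Statistic: Average
        # Add or remove workers to hold the visible backlog near this value;
        # the right target depends on per-message processing time.
        TargetValue: 100
```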

What to notice in elastic compute design

```mermaid
flowchart LR
  T["Traffic or queue depth"] --> M["Metric or signal"]
  M --> S["Scaling policy"]
  S --> C["Compute capacity change"]
```

The point is that elastic design starts with the right signal. If the scaling metric does not match the workload bottleneck, the architecture still fails even with Auto Scaling enabled.

Example: target tracking for a web tier

This is the kind of scaling shape SAA-C03 expects you to interpret correctly:

```yaml
Resources:
  WebAsg:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      # A launch template or launch configuration is also required in a real
      # stack; it is omitted here to keep the scaling shape in focus.
      MinSize: '2'
      MaxSize: '8'
      DesiredCapacity: '2'
      VPCZoneIdentifier:
        - subnet-app-a
        - subnet-app-b
      TargetGroupARNs:
        - arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/web-tg/123abc

  CpuTargetTrackingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebAsg
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 55
```

What to notice:

  • the scaling policy is attached to a multi-AZ Auto Scaling group, not one big instance
  • the signal must match the workload bottleneck or the scaling policy still underperforms
  • SAA-C03 often prefers elastic multi-instance design over vertical scaling unless the constraint clearly says otherwise
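When the trigger is a schedule rather than a live metric — the predictable daily-ramp pattern — the same group can take a scheduled action instead of (or alongside) target tracking. A sketch, assuming the `WebAsg` group above and a hypothetical weekday-morning ramp:

```yaml
  MorningScaleOut:
    Type: AWS::AutoScaling::ScheduledAction
    Properties:
      AutoScalingGroupName: !Ref WebAsg
      Recurrence: '0 8 * * MON-FRI'   # cron expression, evaluated in UTC by default
      MinSize: 4
      MaxSize: 12
      DesiredCapacity: 6
```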

Failure patterns worth recognizing

| Symptom | Strongest first check | Why |
| --- | --- | --- |
| The app scales out but still times out | Bottleneck layer versus scaling signal | The real limit may be storage, database, or downstream service behavior |
| The workload uses many short bursts but runs on oversized EC2 full time | Runtime model fit | Lambda or Fargate may be a stronger performance and operations fit |
| Consumers keep falling behind on queued work | Queue-depth scaling and worker design | This is an asynchronous scaling problem, not a load-balancer problem |
| Container adoption is increasing operational drag | ECS or Fargate versus EC2 fleet management | The runtime choice may be the real issue, not the application code |

Common traps

  • using vertical scaling when the workload should scale horizontally
  • choosing Lambda for tasks that need long runtime or heavy host customization
  • choosing EC2 when a serverless or container service would remove operational drag
  • forgetting that performance issues may come from storage or database design instead of compute size


Continue with 3.3 Database Solutions to connect compute elasticity to the data layer that usually limits it.