Design High-Performing and Elastic Compute Solutions for SAA-C03

Learn how SAA-C03 frames EC2, Auto Scaling, Lambda, containers, Batch, EMR, and resource sizing for high-performing compute decisions.

Compute questions on SAA-C03 are really about fit plus elasticity. The exam wants to know whether the workload should run on EC2, serverless, containers, or a more specialized managed compute path, and then whether the scaling model matches the actual traffic pattern.

What AWS is explicitly testing

AWS ties this task to compute services such as EC2, Fargate, Batch, and EMR, plus distributed computing concepts, Auto Scaling, serverless patterns, container orchestration, and resource sizing.

Compute chooser

| Requirement | Strongest first fit | Why |
| --- | --- | --- |
| Long-running workload with deep OS control | EC2 | Highest control over the host and instance family |
| Bursty event-driven code without server management | Lambda | Strongest pay-per-use and operational simplicity for the right workloads |
| Containerized workload without server management | ECS on Fargate | Managed container runtime without EC2 fleet operations |
| Batch-oriented compute job scheduling | AWS Batch | Fits queued batch processing better than hand-rolled EC2 patterns |
| Large-scale data processing cluster | EMR | Better fit for distributed data processing than generic EC2 alone |
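For the bursty, event-driven row, the shape the exam tends to reward looks roughly like this — a sketch only, with a hypothetical `OrderProcessor` function consuming from an SQS queue (role, queue, and artifact names are illustrative):

```yaml
Resources:
  # Hypothetical event-driven worker; runtime, handler, and code location are illustrative.
  OrderProcessor:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: app.handler
      Role: !GetAtt OrderProcessorRole.Arn   # execution role defined elsewhere
      Code:
        S3Bucket: my-artifacts-bucket
        S3Key: order-processor.zip
      Timeout: 30

  # Lambda polls the queue and scales its concurrency with the backlog automatically,
  # which is the point: no instances to size or scale for bursty traffic.
  OrderQueueMapping:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      FunctionName: !Ref OrderProcessor
      EventSourceArn: !GetAtt OrderQueue.Arn  # queue defined elsewhere
      BatchSize: 10
```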

Runtime-fit questions that decide the answer

| Question | Why it matters on SAA-C03 |
| --- | --- |
| Does the workload need host-level control or a custom OS stack? | That often keeps the answer on EC2 rather than a managed serverless or container runtime |
| Is the load bursty, event-driven, and short-lived? | That often pushes the design toward Lambda |
| Is the team deploying containers but unwilling to manage EC2 worker fleets? | That points strongly toward Fargate |
| Is the workload queued or scheduled batch work rather than request-response serving? | That often points to AWS Batch |
| Is the problem really distributed data processing? | That can shift the answer toward EMR instead of generic compute fleets |
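For the Fargate answer, a minimal sketch of what the runtime model looks like — hypothetical names throughout, task sizing chosen arbitrarily:

```yaml
Resources:
  WebTaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      RequiresCompatibilities: [FARGATE]
      NetworkMode: awsvpc                        # required for Fargate tasks
      Cpu: '512'
      Memory: '1024'
      ExecutionRoleArn: !GetAtt TaskExecRole.Arn  # role for image pulls/logs, defined elsewhere
      ContainerDefinitions:
        - Name: web
          Image: 111122223333.dkr.ecr.us-east-1.amazonaws.com/web:latest
          PortMappings:
            - ContainerPort: 8080

  WebService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref WebCluster                   # cluster defined elsewhere
      LaunchType: FARGATE                        # no EC2 worker fleet to manage
      DesiredCount: 2
      TaskDefinition: !Ref WebTaskDef
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets: [subnet-app-a, subnet-app-b]
```

The design signal to notice is `LaunchType: FARGATE` plus the absence of any instance or AMI configuration: the team declares task size and count, and AWS owns the hosts.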

The scaling question AWS is really asking

Do not jump from “slow” to “bigger instance” by reflex. The real question is usually one of these:

  • should the component scale horizontally or vertically?
  • is the trigger traffic, queue depth, CPU, concurrency, or schedule?
  • should the component even be long-running, or should it become event-driven?

Scaling-signal chooser

| Signal | Strongest use case | Common mistake |
| --- | --- | --- |
| Request count or target response time | Web-facing stateless tiers | Scaling on CPU when the bottleneck is traffic-facing latency |
| Queue depth | Asynchronous consumers and workers | Treating batch backlog like ordinary web traffic |
| Concurrency | Lambda or heavily parallel execution paths | Ignoring downstream limits such as database connections |
| CPU or memory | Compute-bound workers | Using infrastructure metrics when the real problem is downstream I/O |
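The queue-depth row maps to a target-tracking policy on an SQS metric instead of CPU. A sketch, assuming a worker Auto Scaling group `WorkerAsg` and a queue named `jobs-queue` (both hypothetical); note that the more precise exam pattern is backlog per instance via metric math, while raw queue depth is shown here for brevity:

```yaml
  QueueDepthPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WorkerAsg   # worker ASG defined elsewhere
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        CustomizedMetricSpecification:
          Namespace: AWS/SQS
          MetricName: ApproximateNumberOfMessagesVisible
          Dimensions:
            - Name: QueueName
              Value: jobs-queue
          Statistic: Average
        # Add or remove workers to hold the visible backlog near this value;
        # the right target depends on per-message processing time.
        TargetValue: 100
```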

What to notice in elastic compute design

```mermaid
flowchart LR
  T["Traffic or queue depth"] --> M["Metric or signal"]
  M --> S["Scaling policy"]
  S --> C["Compute capacity change"]
```

The point is that elastic design starts with the right signal. If the scaling metric does not match the workload bottleneck, the architecture still fails even with Auto Scaling enabled.

Example: target tracking for a web tier

This is the kind of scaling shape SAA-C03 expects you to interpret correctly:

```yaml
Resources:
  WebAsg:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      # A launch template or launch configuration is also required in a real
      # stack; it is omitted here to keep the scaling shape in focus.
      MinSize: '2'
      MaxSize: '8'
      DesiredCapacity: '2'
      VPCZoneIdentifier:
        - subnet-app-a
        - subnet-app-b
      TargetGroupARNs:
        - arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/web-tg/123abc

  CpuTargetTrackingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebAsg
      PolicyType: TargetTrackingScaling
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 55
```

What to notice:

  • the scaling policy is attached to a multi-AZ Auto Scaling group, not one big instance
  • the signal must match the workload bottleneck or the scaling policy still underperforms
  • SAA-C03 often prefers elastic multi-instance design over vertical scaling unless the constraint clearly says otherwise
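When the trigger is a schedule rather than a live metric — the predictable daily-ramp pattern — the same group can take a scheduled action instead of (or alongside) target tracking. A sketch, assuming the `WebAsg` group above and a hypothetical weekday-morning ramp:

```yaml
  MorningScaleOut:
    Type: AWS::AutoScaling::ScheduledAction
    Properties:
      AutoScalingGroupName: !Ref WebAsg
      Recurrence: '0 8 * * MON-FRI'   # cron expression, evaluated in UTC by default
      MinSize: 4
      MaxSize: 12
      DesiredCapacity: 6
```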

Failure patterns worth recognizing

| Symptom | Strongest first check | Why |
| --- | --- | --- |
| The app scales out but still times out | Bottleneck layer versus scaling signal | The real limit may be storage, database, or downstream service behavior |
| The workload uses many short bursts but runs on oversized EC2 full time | Runtime model fit | Lambda or Fargate may be a stronger performance and operations fit |
| Consumers keep falling behind on queued work | Queue-depth scaling and worker design | This is an asynchronous scaling problem, not a load-balancer problem |
| Container adoption is increasing operational drag | ECS or Fargate versus EC2 fleet management | The runtime choice may be the real issue, not the application code |

Common traps

  • using vertical scaling when the workload should scale horizontally
  • choosing Lambda for tasks that need long runtime or heavy host customization
  • choosing EC2 when a serverless or container service would remove operational drag
  • forgetting that performance issues may come from storage or database design instead of compute size


Continue with 3.3 Database Solutions to connect compute elasticity to the data layer that usually limits it.